[Solved] How can I impute values to outlier cells based on groups? [closed]

Using the following answer from n1k31t4 at https://datascience.stackexchange.com/questions/37717/imputation-missing-values-other-than-using-mean-median-in-python I was able to solve my problem:

    df[col] = df.groupby(['X', 'Y'])[col].transform(lambda x: x.median() if (np.abs(x) > 3).any() else x)

Note that, as written, this sets every value in a group to the group median whenever any value in that group has an absolute value above 3.
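If only the outlying cells should be replaced, rather than the whole group, a variant along these lines might work. This is a minimal sketch, not the linked answer's method: it assumes the column already holds standardized values (so that |x| > 3 marks an outlier, as in the one-liner above), and the sample data, the column name `col` and the helper `replace_outliers` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: two groups, each with one value whose magnitude exceeds 3.
df = pd.DataFrame({
    "X": ["a", "a", "a", "a", "b", "b", "b"],
    "Y": ["p", "p", "p", "p", "q", "q", "q"],
    "col": [0.1, -0.2, 5.0, 0.3, 1.0, -4.0, 0.5],
})

def replace_outliers(values, threshold=3.0):
    """Keep values with |x| <= threshold; replace the rest with the group median."""
    # The median is computed on the group as it is, outlier included.
    return values.where(values.abs() <= threshold, values.median())

# transform applies the helper to each (X, Y) group and keeps the original row order.
df["col"] = df.groupby(["X", "Y"])["col"].transform(replace_outliers)
print(df)
```

Missing values fail the `<=` comparison, so this sketch would also fill NaN cells with the group median; whether that is wanted depends on the data.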

[Solved] Is there way to replace ranged data (eg 18-25) by its mean in a dataframe?

There are several ways to transform this variable. In the picture I see that there are not only bins but also the value '55+', which needs to be considered. 1) One-liner:

    df['age'].apply(lambda x: np.mean([int(x.split('-')[0]), int(x.split('-')[1])]) if '+' not in x else x[:-1])

It checks whether the value contains '+' (like 55+); if yes, then the …
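The same idea can be written as a small named function, which is easier to read and test than the one-liner. This is a sketch under assumptions: the sample values and the helper name `range_to_number` are hypothetical, and an open-ended value such as '55+' is mapped to its lower bound as a float (the one-liner above leaves it as the string '55').

```python
import pandas as pd

# Hypothetical sample column with binned ages and one open-ended bin.
df = pd.DataFrame({"age": ["18-25", "26-35", "55+"]})

def range_to_number(value):
    """Return the midpoint of a 'low-high' range, or the lower bound of a 'low+' value."""
    if value.endswith("+"):
        return float(value[:-1])
    low, high = value.split("-")
    return (int(low) + int(high)) / 2

df["age_numeric"] = df["age"].apply(range_to_number)
print(df)
```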

[Solved] Pivoting a One-Hot-Encode Dataframe

Maybe I’m missing something, but doesn’t this work for you?

    agg = df.groupby('number_of_genres').agg('sum').T
    agg['totals'] = agg.sum(axis=1)

Edit: Solution via pivot_table

    agg = df.pivot_table(columns="number_of_genres", aggfunc="sum")
    agg['total'] = agg.sum(axis=1)
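To see what the groupby version produces, here is a small self-contained run. The genre columns and values are invented for illustration; only `number_of_genres` comes from the answer above.

```python
import pandas as pd

# Hypothetical one-hot encoded data: one row per title, one column per genre.
df = pd.DataFrame({
    "number_of_genres": [1, 1, 2, 2],
    "Action": [1, 0, 1, 1],
    "Comedy": [0, 1, 1, 0],
    "Drama": [0, 0, 0, 1],
})

# Sum the one-hot columns per group, then transpose so that genres become rows.
agg = df.groupby("number_of_genres").sum().T

# Row-wise total across all values of number_of_genres.
agg["total"] = agg.sum(axis=1)
print(agg)
```

After the transpose, each row is a genre and each column is a value of `number_of_genres`, so the `total` column counts how often each genre occurs overall.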

[Solved] Compare strings of a column in a dataframe with a set of words in a list

Ok, let’s assume we have a dataframe data and a list negative_words like this:

    data = pd.DataFrame({
        'Tweets': ['This is bad', 'This is terrible', 'This is good', 'This is great'],
    })
    negative_words = ['bad', 'terrible']

We can then do something like: 1) We can use a lambda function with any:

    # create lambda with any: …
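The original code is cut off here, so the following is only a guess at what a lambda-with-any approach could look like, not the author's exact code. The column name `is_negative` and the choice to match whole lowercase words are assumptions.

```python
import pandas as pd

data = pd.DataFrame({
    "Tweets": ["This is bad", "This is terrible", "This is good", "This is great"],
})
negative_words = ["bad", "terrible"]

# Flag a tweet when any of its lowercase words appears in negative_words.
# Splitting on whitespace matches whole words only, so 'badge' would not match 'bad'.
data["is_negative"] = data["Tweets"].apply(
    lambda tweet: any(word in negative_words for word in tweet.lower().split())
)
print(data)
```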

[Solved] How can I start writing a program in python where it reads an excel file with few records and generate more record for testing purpose

[Solved] Get stock data problems [closed]

There’s nothing wrong with your code. However, recent stock data is a Premium product from Quandl, and I presume you are just on the free subscription, hence your dataframe comes back empty. If you change the dates to 2017, you will get some results, but that’s as far as it goes on the free subscription …

[Solved] Dataframe: Computed row based on cell above and cell on the left

I think you need Series.cumsum, selecting the last row (the total row) with DataFrame.iloc:

    df = pd.DataFrame({
        'B': [4, 5, 4],
        'C': [7, 8, 9],
        'D': [1, 3, 5],
        'E': [5, 3, 6],
    })
    df.loc['sum'] = df.sum()
    df.loc['cumsum'] = df.iloc[-1].cumsum()
    # if need only cumsum row
    # df.loc['cumsum'] = df.sum().cumsum()
    print(df)

       B  C  D  E
    0  4  7  1  5
    1  5  8  3  3
    2  4  9  5  …
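As a quick check, the two routes mentioned in the answer (cumulating the appended 'sum' row, or cumulating `df.sum()` directly) should give the same row. The sketch below uses the same data; the variable names `direct` and `via_total_row` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "B": [4, 5, 4],
    "C": [7, 8, 9],
    "D": [1, 3, 5],
    "E": [5, 3, 6],
})

# Route 1: cumulate the column totals directly, left to right.
direct = df.sum().cumsum()

# Route 2: append a 'sum' row, then cumulate that last row across the columns.
df.loc["sum"] = df.sum()
via_total_row = df.iloc[-1].cumsum()
df.loc["cumsum"] = via_total_row
print(df)
```

The column totals are 13, 24, 9 and 14, so the running totals across the columns are 13, 37, 46 and 60.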

[Solved] Python pandas plotting and groupby [closed]

Because you changed the question, here is the updated answer. See the comments in the code:

    import pandas as pd
    import matplotlib.pyplot as plt
    from matplotlib import style
    style.use('ggplot')
    %matplotlib inline

    # read your dataframe and sort
    df = pd.read_clipboard()
    df.drop(columns=['length'], inplace=True)
    df.rename(columns={'Text.1': 'Text length'}, inplace=True)
    df.sort_values(['Text', 'Tag', 'Time'], inplace=True)
    x = list(df['Time'])  # set x axis …

[Solved] Pandas Python: KeyError Date

This looks like an Excel datetime format, known as a serial date. To convert from that serial date you can do this:

    data['Date'].apply(lambda x: datetime.fromtimestamp((x - 25569) * 86400.0))

Which outputs:

    >>> data['Date'].apply(lambda x: datetime.fromtimestamp((x - 25569) * 86400.0))
    0   2013-02-25 10:00:00.288
    1   2013-02-26 10:00:00.288
    2   2013-02-27 10:00:00.288
    3   2013-02-28 10:00:00.288

To assign it …
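An alternative that stays inside pandas and avoids the local-timezone shift of `datetime.fromtimestamp` is to let `pd.to_datetime` count days from Excel's day zero. This is a sketch, not the answer's method: the serial values below are invented, and it assumes the workbook uses the default 1900 date system (serial 25569 corresponds to 1970-01-01, which puts day zero at 1899-12-30).

```python
import pandas as pd

# Hypothetical Excel serial dates; the fractional part is the time of day.
data = pd.DataFrame({"Date": [41330.5, 41331.5, 41332.5]})

# Interpret the numbers as days counted from Excel's day zero (1900 date system).
data["Date"] = pd.to_datetime(data["Date"], unit="D", origin="1899-12-30")
print(data)
```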

[Solved] Storing values in a CSV file into a list in python

Please never use the names of built-ins like list, type, id… as variable names, because doing so masks the built-in functions. If you later use list in the code, e.g.

    list = data['FirstNames'].tolist()
    # another solution for converting to list
    list1 = list(data['SecondNames'])

you get very weird errors, and debugging is very complicated. So you need:

    L = data['FirstNames'].tolist()

Or:

    L = list(data['FirstNames'])

Also can check …
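The failure can be reproduced in isolation. The sketch below confines the shadowing to a function so the rest of the script is unaffected; the sample names and the helper `shadowing_demo` are made up for illustration.

```python
import pandas as pd

# Hypothetical data standing in for the CSV contents.
data = pd.DataFrame({"FirstNames": ["Ann", "Bob"], "SecondNames": ["Lee", "Ray"]})

def shadowing_demo():
    """Show what happens when a variable named 'list' hides the built-in."""
    list = data["FirstNames"].tolist()  # 'list' is now a plain Python list, not the type
    try:
        list(data["SecondNames"])       # tries to call a list object
    except TypeError as exc:
        return str(exc)
    return None

error_message = shadowing_demo()
print(error_message)

# With non-conflicting names, both conversions work as expected.
first_names = data["FirstNames"].tolist()
second_names = list(data["SecondNames"])
```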

[Solved] I want a pandas script to line up values from one excel sheet to another based on the values in the first spreadsheet

Commented for explanation of approach. Have found two addresses where the ID from sheet2 comes back onto sheet1:

    import io
    sheeta = pd.read_csv(io.StringIO("""
       house_number  street  suburb
    0  43  Smith Street  Frewville
    1  45  Smith Street  Frewville
    2  47  Smith Street  Frewville
    3  49  Smith Street  Frewville
    4  51  Smith Street  Frewville
    5  53  Smith Street  Frewville
    …

[Solved] How can I turn this txt file to a pandas DataFrame?

As MrSmily2019 said, you will want to use https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html. It handles more than just CSV; it can read other delimited text files too. Additionally, you can turn text files into CSV. Your file seems to be tab-delimited (that is how it knows where to separate the fields) instead of comma-delimited. You can adjust the settings so it knows to do it on the …
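For a tab-delimited file, passing `sep="\t"` to `read_csv` is usually enough. In this minimal sketch the contents and column names are invented, and an in-memory string stands in for the real .txt file, whose layout is not shown in the question.

```python
import io
import pandas as pd

# Hypothetical tab-delimited text; with a real file, pass its path instead of StringIO.
raw_text = "name\tage\tcity\nAnn\t31\tOslo\nBob\t27\tLima\n"

# sep="\t" tells read_csv to split fields on tab characters rather than commas.
df = pd.read_csv(io.StringIO(raw_text), sep="\t")
print(df)
```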