pandas Archives - Page 8 of 10

[Solved] How can I impute values to outlier cells based on groups? [closed]

September 25, 2022 by Kirat

Using the following answer from n1k31t4 (https://datascience.stackexchange.com/questions/37717/imputation-missing-values-other-than-using-mean-median-in-python) I was able to solve my problem: df[col] = df.groupby(['X', 'Y'])[col].transform(lambda x: x.median() if (np.abs(x) > 3).any() else x)
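Note that, as written, the lambda returns the group median for the whole group whenever any value in it exceeds the threshold, so every cell in that group is overwritten. If only the outlier cells should be imputed, a minimal sketch could look like the one below. The column names X, Y and score, the sample values, and the assumption that the values are already standardized (so that an absolute value above 3 marks an outlier) are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'score' is assumed to be standardized already.
df = pd.DataFrame({
    "X": ["a", "a", "a", "b", "b", "b"],
    "Y": [1, 1, 1, 2, 2, 2],
    "score": [0.1, 0.2, 5.0, -0.3, -4.0, 0.4],
})

col = "score"

# Keep values with |x| <= 3; replace the others with the median of their group.
df[col] = df.groupby(["X", "Y"])[col].transform(
    lambda x: x.where(np.abs(x) <= 3, x.median())
)

print(df)
```

The group median here is computed with the outlier still included; whether that is acceptable depends on the data.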

[Solved] Is there way to replace ranged data (eg 18-25) by its mean in a dataframe?

September 24, 2022 by Kirat

There are several ways to transform this variable. From the picture, I see that there are not only bins but also the value '55+', which needs to be handled as well. 1) One-liner: df['age'].apply(lambda x: np.mean([int(x.split('-')[0]), int(x.split('-')[1])]) if '+' not in x else x[:-1]) It checks whether the value contains '+' (like 55+); if yes, then the … Read more
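The one-liner leaves open-ended values such as '55+' as strings ('55'), so the column ends up with mixed types. A small sketch that returns a number in both cases is shown below; the helper name age_to_number, the sample values, and the choice to map '55+' to 55 are assumptions made for illustration.

```python
import pandas as pd


def age_to_number(value: str) -> float:
    """Map '18-25' to the midpoint of the range and '55+' to its lower bound."""
    if value.endswith("+"):
        return float(value[:-1])
    low, high = value.split("-")
    return (int(low) + int(high)) / 2


# Hypothetical sample column.
df = pd.DataFrame({"age": ["18-25", "26-35", "55+"]})
df["age_numeric"] = df["age"].apply(age_to_number)

print(df)
```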

[Solved] Pivoting a One-Hot-Encode Dataframe

September 23, 2022 by Kirat

Maybe I’m missing something, but doesn’t this work for you? agg = df.groupby('number_of_genres').agg('sum').T agg['totals'] = agg.sum(axis=1) Edit: Solution via pivot_table: agg = df.pivot_table(columns="number_of_genres", aggfunc="sum") agg['total'] = agg.sum(axis=1)
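To see what the groupby version produces, here is a small sketch on made-up data. The genre columns (Action, Comedy, Drama) and their values are hypothetical; only number_of_genres comes from the answer.

```python
import pandas as pd

# Hypothetical one-hot encoded frame: one row per title.
df = pd.DataFrame({
    "number_of_genres": [1, 2, 2, 1],
    "Action": [1, 1, 0, 0],
    "Comedy": [0, 1, 1, 1],
    "Drama": [0, 0, 1, 0],
})

# Sum the one-hot columns per group, then transpose so genres become rows.
agg = df.groupby("number_of_genres").sum().T

# Row-wise total across all number_of_genres columns.
agg["total"] = agg.sum(axis=1)

print(agg)
```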

[Solved] Compare strings of a column in a dataframe with a set of words in a list

September 21, 2022 by Kirat

OK, let’s assume we have a dataframe data and a list negative_words like this: data = pd.DataFrame({ 'Tweets': ['This is bad', 'This is terrible', 'This is good', 'This is great'], }) negative_words = ['bad', 'terrible'] We can then do something like: 1) We can use a lambda function with any: # create lambda with any: … Read more
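The excerpt is cut off before the code, so the following is only a sketch of what a lambda with any could look like for this data, not necessarily the original answer. The column name is_negative and the whitespace-based word matching are assumptions.

```python
import pandas as pd

data = pd.DataFrame({
    "Tweets": ["This is bad", "This is terrible", "This is good", "This is great"],
})
negative_words = ["bad", "terrible"]

# Flag a tweet if any of its lowercased, whitespace-separated words is negative.
data["is_negative"] = data["Tweets"].apply(
    lambda tweet: any(word in negative_words for word in tweet.lower().split())
)

print(data)
```

Splitting on whitespace avoids substring hits such as 'bad' inside 'badge', but it does not strip punctuation.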

[Solved] How can I start writing a program in python where it reads an excel file with few records and generate more record for testing purpose

September 20, 2022 by Kirat


[Solved] Get stock data problems [closed]

September 19, 2022 by Kirat

There’s nothing wrong with your code. However, recent stock data is a premium product from Quandl, and I presume you are on the free subscription, which is why your dataframe comes back empty. If you change the dates to 2017, you will get some results, but that’s as far as it goes on the free subscription … Read more

[Solved] Dataframe: Computed row based on cell above and cell on the left

September 19, 2022 by Kirat

I think you need Series.cumsum, selecting the last row (the total row) with DataFrame.iloc: df = pd.DataFrame({ 'B': [4, 5, 4], 'C': [7, 8, 9], 'D': [1, 3, 5], 'E': [5, 3, 6], }) df.loc['sum'] = df.sum() df.loc['cumsum'] = df.iloc[-1].cumsum() # if you need only the cumsum row: # df.loc['cumsum'] = df.sum().cumsum() print(df) B C D E 0 4 7 1 5 1 5 8 3 3 2 4 9 5 … Read more
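For reference, here is the same approach as a self-contained script using the data from the answer. The column sums are 13, 24, 9 and 14, so the running total across the sum row is 13, 37, 46 and 60.

```python
import pandas as pd

df = pd.DataFrame({
    "B": [4, 5, 4],
    "C": [7, 8, 9],
    "D": [1, 3, 5],
    "E": [5, 3, 6],
})

# Append a row holding the column totals.
df.loc["sum"] = df.sum()

# Append a row holding the left-to-right running total of the 'sum' row.
df.loc["cumsum"] = df.iloc[-1].cumsum()

print(df)
```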

[Solved] Python pandas plotting and groupby [closed]

September 18, 2022 by Kirat

Because you changed the question, here is the updated answer. See the comments in the code: import pandas as pd import matplotlib.pyplot as plt from matplotlib import style style.use('ggplot') %matplotlib inline # read your dataframe and sort df = pd.read_clipboard() df.drop(columns=['length'], inplace=True) df.rename(columns={'Text.1': 'Text length'}, inplace=True) df.sort_values(['Text', 'Tag', 'Time'], inplace=True) x = list(df['Time']) # set x axis … Read more

[Solved] xlwings udf function erase next cell on excel

September 18, 2022 by Kirat

According to the xlwings documentation, it seems that an empty row and an empty column are needed at the bottom and to the right of the returned range. If they are missing, the existing cells will be overwritten. See http://docs.xlwings.org/en/stable/api.html#xlwings.xlwings.ret

[Solved] Pandas Python: KeyError Date

September 18, 2022 by Kirat

This looks like an Excel datetime format, known as a serial date. To convert from that serial date you can do this: data['Date'].apply(lambda x: datetime.fromtimestamp((x - 25569) * 86400.0)) Which outputs: >>> data['Date'].apply(lambda x: datetime.fromtimestamp((x - 25569) * 86400.0)) 0 2013-02-25 10:00:00.288 1 2013-02-26 10:00:00.288 2 2013-02-27 10:00:00.288 3 2013-02-28 10:00:00.288 To assign it … Read more
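The constant 25569 is the Excel serial number of 1970-01-01, and 86400 is the number of seconds in a day. Because datetime.fromtimestamp returns local time, the result shifts with the machine's time zone. As an alternative sketch (not the method from the answer), pandas can do the conversion directly with pd.to_datetime, counting days from 1899-12-30. The sample serial numbers below are hypothetical, and the approach assumes dates from the Windows (1900) Excel date system after February 1900.

```python
import pandas as pd

# Hypothetical Excel serial dates (whole days only).
data = pd.DataFrame({"Date": [41330, 41331, 41332, 41333]})

# Excel's day 0 corresponds to 1899-12-30 for dates after February 1900.
data["Date"] = pd.to_datetime(data["Date"], unit="D", origin="1899-12-30")

print(data)
```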

[Solved] I want a pandas script to line up values from one excel sheet to another based on the values in the first spreadsheet

September 16, 2022 by Kirat

The code is commented to explain the approach. I found two addresses where the ID from sheet2 comes back onto sheet1. import io sheeta = pd.read_csv(io.StringIO(""" house_number street suburb 0 43 Smith Street Frewville 1 45 Smith Street Frewville 2 47 Smith Street Frewville 3 49 Smith Street Frewville 4 51 Smith Street Frewville 5 53 Smith Street Frewville … Read more

[Solved] How to reuse the terminal after I ran a Bokeh app in the server

September 16, 2022 by Kirat

Press Ctrl + C to send an interrupt signal (SIGINT). This will close the Bokeh server so you can reuse the terminal. (This isn’t just for Bokeh; it also works for other applications!)

[Solved] How can I turn this txt file to a pandas DataFrame?

September 16, 2022 by Kirat

As MrSmily2019 said, you will want to use https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html. It handles more than just CSV; it can read other delimited text files too. Additionally, you can turn text files into CSV. Your file seems to be tab-delimited rather than comma-delimited (the delimiter is how it knows where to separate the values). You can adjust the settings so it knows to do it on the … Read more
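A minimal sketch of reading tab-delimited text with read_csv is shown below. The column names and values are made up, and io.StringIO stands in for the actual .txt file; with a real file you would pass its path instead.

```python
import io

import pandas as pd

# Hypothetical tab-delimited content standing in for the .txt file.
raw = "name\tage\tcity\nAnn\t31\tOslo\nBob\t45\tLima\n"

# sep="\t" tells read_csv to split columns on tabs instead of commas.
df = pd.read_csv(io.StringIO(raw), sep="\t")

print(df)
```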