pandas Archives - Page 4 of 10

[Solved] Create a Dataframe with lists of different sizes [closed]

November 25, 2022 by Kirat

Thanks, everyone. Sorry that I wasn’t clear. I just want to explore pandas more. I know this isn’t going to be useful in any real-world situation, but I’m curious. Below is the way I was able to achieve what I was looking for. Please let me know if there are any other efficient … Read more

[Solved] pandas column selection using boolean values from another dataframe

November 24, 2022 by Kirat

Let’s take a simplified example, so I can show the process here. language = pd.Index(['zh', 'zh', 'zh', 'zh', 'zh', 'zh', 'zh', 'zh', 'zh', 'zh', 'na', 'na', 'na', 'na', 'na', 'na', 'na', 'na', 'na', 'na'], dtype="object", name="Page") web = pd.DataFrame(columns=range(len(language))) web.shape (0, 20) language.shape (20,) So web has as many columns as language has entries, and you … Read more
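The excerpt is cut off before the selection step, so the following is only a sketch of one way it might continue, not the answer's own code. It assumes the goal is to keep the columns of `web` whose matching entry in `language` equals `'zh'`; the names `mask` and `zh_columns` are made up for illustration.

```python
import pandas as pd

# Same setup as in the excerpt: 10 'zh' labels followed by 10 'na' labels.
language = pd.Index(["zh"] * 10 + ["na"] * 10, dtype="object", name="Page")
web = pd.DataFrame(columns=range(len(language)))

# Comparing an Index with a scalar gives a boolean array, one entry per label.
mask = language == "zh"

# A boolean array of the right length can select columns through .loc.
zh_columns = web.loc[:, mask]

print(web.shape, language.shape, zh_columns.shape)
```

The mask has to be exactly as long as the number of columns in `web`, which is why the shapes are worth checking first.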

[Solved] Using python and pandas I need to paginate the results from a sql query in sets of 24 rows into a plotly table. How do I do this?

November 24, 2022 by Kirat

Using python and pandas I need to paginate the results from a sql query in sets of 24 rows into a plotly table. How do I do this? … Read more

[Solved] How to merge data into python

November 23, 2022 by Kirat

I’ll try to answer as far as I understand your question: df = pandas.read_csv('your file name', sep=';', header=None) df = df.groupby([1]).mean().reset_index() Then pick the column you need.
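A small sketch of what those two lines do, using an in-memory string instead of a file. The sample data is invented, and every column other than the grouping column is assumed to be numeric so that `mean()` applies to all of them.

```python
import io

import pandas as pd

# Hypothetical semicolon-separated data with no header row.
raw = "1;x;1.0\n2;x;3.0\n3;y;5.0\n"

# header=None makes pandas label the columns 0, 1, 2.
df = pd.read_csv(io.StringIO(raw), sep=";", header=None)

# Group on column 1 and average the remaining (numeric) columns.
df = df.groupby([1]).mean().reset_index()
print(df)
```

After `reset_index()` the grouping column is an ordinary column again, so the one you need can be picked with `df[2]`, for example.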

[Solved] Pandas – Count Vectorize Series of Transaction Activities by User [closed]

November 16, 2022 by Kirat

The pivot_table function in pandas should do what you want. For instance: import pandas as pd frame = pd.read_csv('myfile.csv', header=None) frame.columns = ['user_id', 'date', 'event_type'] frame_pivoted = frame.pivot_table(index='user_id', columns='event_type', aggfunc='count') In general, using vectorized pandas functions is much faster than for loops, although I haven’t compared the performance in your specific case. … Read more
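To see the shape of the result without a CSV file, here is a minimal sketch with made-up rows. Passing `values="date"` and `fill_value=0` is my own choice to get flat column labels and zeros instead of NaN; the original answer does not use them.

```python
import pandas as pd

# Hypothetical transaction log: one row per user event.
frame = pd.DataFrame(
    [
        ("u1", "2022-11-01", "login"),
        ("u1", "2022-11-02", "purchase"),
        ("u1", "2022-11-03", "login"),
        ("u2", "2022-11-01", "login"),
    ],
    columns=["user_id", "date", "event_type"],
)

# One row per user, one column per event type, cells hold event counts.
counts = frame.pivot_table(
    index="user_id",
    columns="event_type",
    values="date",
    aggfunc="count",
    fill_value=0,
)
print(counts)
```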

[Solved] splitting url and getting values from that URl in columns

November 16, 2022 by Kirat

Try using str.split, then chain another .str accessor so you can index into each row’s list. data = [{'ID': '1', 'URL': 'https://ckd.pdc.com/pdc/73ba5189-94fd-44aa-88d3-6b36aaa69b02/DDA1610095.zip'}] df = pd.DataFrame(data) print(df) ID URL 0 1 https://ckd.pdc.com/pdc/73ba5189-94fd-44aa-88d… #Get the file name and replace zip (probably a more elegant way to do this) df['Zest'] = df.URL.str.split('/').str[-1].str.replace('.zip', '', regex=False) #assign the type into the next column. … Read more
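Here is a runnable sketch of that idea. It assumes the separator is `/`, and the `Guid` column is a hypothetical name added only to show indexing a second path segment.

```python
import pandas as pd

data = [{"ID": "1",
         "URL": "https://ckd.pdc.com/pdc/73ba5189-94fd-44aa-88d3-6b36aaa69b02/DDA1610095.zip"}]
df = pd.DataFrame(data)

# Split every URL on "/" once; each cell becomes a list of path segments.
parts = df["URL"].str.split("/")

# .str[-1] picks the last segment per row; regex=False treats ".zip" literally.
df["Zest"] = parts.str[-1].str.replace(".zip", "", regex=False)

# The second-to-last segment, stored under a made-up column name.
df["Guid"] = parts.str[-2]
print(df[["ID", "Zest", "Guid"]])
```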

[Solved] How to find and calculate the number of duplicated rows between two different dataframe? [closed]

November 11, 2022 by Kirat

You can sort the columns c_x and c_y row-wise in both DataFrames; for the movies, use DataFrame.pivot, count the non-missing values with DataFrame.count, and append the result to df1: df2[['c_x','c_y']] = np.sort(df2[['c_x','c_y']], axis=1) df2['g'] = df2.groupby(['c_x','c_y']).cumcount().add(1) df2 = df2.pivot(index=['c_x','c_y'], columns="g", values="movie").add_prefix('movie') df2['number'] = df2.count(axis=1) print (df2) g movie1 movie2 number c_x c_y bob dan c f 2 uni a … Read more
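The pivot step can be tried on a toy frame. The rows below are invented (the original data is not shown in full), and `wide` is a name I chose so the input frame stays intact. Passing a list as `index` to `pivot` needs a reasonably recent pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical pairs; ("dan", "bob") and ("bob", "dan") should count as the same pair.
df2 = pd.DataFrame({
    "c_x": ["dan", "bob", "uni"],
    "c_y": ["bob", "dan", "amy"],
    "movie": ["c", "f", "a"],
})

# Sort the two names within each row so the pair order no longer matters.
df2[["c_x", "c_y"]] = np.sort(df2[["c_x", "c_y"]].to_numpy(), axis=1)

# Number the movies within each pair: 1, 2, ...
df2["g"] = df2.groupby(["c_x", "c_y"]).cumcount().add(1)

# Spread the movies into columns movie1, movie2, ... and count the filled cells.
wide = df2.pivot(index=["c_x", "c_y"], columns="g", values="movie").add_prefix("movie")
wide["number"] = wide.count(axis=1)
print(wide)
```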

[Solved] How to iterate a vectorized if/else statement over additional columns?

November 11, 2022 by Kirat

Option 1: You can nest numpy.where statements: org['LT'] = np.where(org['ID'].isin(ltlist_set), 1, np.where(org['ID2'].isin(ltlist_set), 2, 0)) Option 2: Alternatively, you can use pd.DataFrame.loc sequentially: org['LT'] = 0 # default value org.loc[org['ID2'].isin(ltlist_set), 'LT'] = 2 org.loc[org['ID'].isin(ltlist_set), 'LT'] = 1 Option 3: A third option is to use numpy.select: conditions = [org['ID'].isin(ltlist_set), org['ID2'].isin(ltlist_set)] values = [1, 2] org['LT'] = … Read more
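A quick way to convince yourself the three options agree is to run them side by side. The data below is made up, and since the excerpt is cut off before the `numpy.select` call, the `default=0` argument is my assumption about how it would be completed. The result columns `LT_where`, `LT_loc` and `LT_select` are names chosen for this comparison.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 matches on both ID and ID2, row 1 only on ID2.
org = pd.DataFrame({"ID": [1, 2, 3, 4], "ID2": [7, 1, 5, 8]})
ltlist_set = {1, 7}

in_id = org["ID"].isin(ltlist_set)
in_id2 = org["ID2"].isin(ltlist_set)

# Option 1: nested np.where, the first condition wins.
org["LT_where"] = np.where(in_id, 1, np.where(in_id2, 2, 0))

# Option 2: sequential .loc, the highest-priority rule is written last.
org["LT_loc"] = 0
org.loc[in_id2, "LT_loc"] = 2
org.loc[in_id, "LT_loc"] = 1

# Option 3: np.select picks the value of the first condition that is True.
org["LT_select"] = np.select([in_id, in_id2], [1, 2], default=0)
print(org)
```

With more than two or three conditions, `np.select` tends to stay readable where nested `np.where` calls do not.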

[Solved] How to check whether data of a row is in list, inside of np.where()?

November 11, 2022 by Kirat

You can use .isin() directly when filtering in pandas: recent_indicators_filtered = recent_indicators[recent_indicators['CountryCode'].isin(developed_countries)] You can also create a boolean column that is True for developed countries: recent_indicators['Developed'] = recent_indicators['CountryCode'].isin(developed_countries)
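For completeness, a small sketch with invented country codes. Since the question asks about `np.where`, the last line shows that the same boolean mask can be passed to it as well; the `Group` column and its labels are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up sample data.
recent_indicators = pd.DataFrame(
    {"CountryCode": ["USA", "IND", "DEU"], "Value": [1.0, 2.0, 3.0]}
)
developed_countries = ["USA", "DEU"]

# Boolean Series: True where the country code is in the list.
is_developed = recent_indicators["CountryCode"].isin(developed_countries)

# Use the mask to filter rows ...
recent_indicators_filtered = recent_indicators[is_developed]

# ... or to add a flag column, or to feed np.where.
recent_indicators["Developed"] = is_developed
recent_indicators["Group"] = np.where(is_developed, "developed", "other")
print(recent_indicators)
```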

[Solved] python program to find the null values in row and return the column name

November 10, 2022 by Kirat

You should edit your question so that it becomes clear to everyone, e.g. what format you used, what you have tried to get that output and so on. Actually, it’s an interesting problem. I assume you have above data in pandas’ dataframe format. It’s also important to add more tags related to your question. Anyway, … Read more

[Solved] Search for column values in another column and assign a value from the next column from the row found to another column

November 9, 2022 by Kirat

You can try creating a dictionary from the columns ['CheckStringHere','AssociatedValue1'] and replacing the values in the StringToCheck column: d = dict(df[['CheckStringHere','AssociatedValue1']].to_numpy()) df['FromNumber'] = df['StringToCheck'].replace(d) #or df['FromNumber'] = df['StringToCheck'].map(d).fillna(df['FromNumber']) print(df) StringToCheck FromNumber ToNumber CheckStringHere AssociatedValue1 \ 0 10T 56 AAA_ER 1 125T 16 FGGR_DBC 2 10T 56 3 125T 16 AssociatedValue2 0 1 2 58 3 24 … Read more
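The lookup can be tried on a toy frame. The values below are invented and do not reproduce the original table; the last row is there to show what `fillna` does when a string has no match in the dictionary.

```python
import pandas as pd

# Hypothetical data; "999T" does not occur in CheckStringHere.
df = pd.DataFrame({
    "StringToCheck": ["10T", "125T", "10T", "999T"],
    "FromNumber": [0, 0, 0, 7],
    "CheckStringHere": ["10T", "125T", "AAA_ER", "FGGR_DBC"],
    "AssociatedValue1": [56, 16, 58, 24],
})

# Each row of the two-column array becomes one key/value pair.
d = dict(df[["CheckStringHere", "AssociatedValue1"]].to_numpy())

# map() returns NaN for strings without a match; fillna keeps the old number there.
df["FromNumber"] = df["StringToCheck"].map(d).fillna(df["FromNumber"])
print(df[["StringToCheck", "FromNumber"]])
```

The `replace(d)` variant behaves differently for unmatched strings: it leaves the original string in place rather than producing NaN.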

[Solved] Flatten list of lists within dictionary values before processing in Pandas

November 7, 2022 by Kirat

As a follow-up to the original post: I managed to resolve the issue and flatten the lists within the dictionary with the help of the following generator function. Taken from here: def flatten(l): for el in l: if isinstance(el, collections.abc.Iterable) and not isinstance(el, str): for sub in flatten(el): yield sub else: yield el And … Read more
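A Python 3 sketch of such a flattening generator, applied to dictionary values. It uses `collections.abc.Iterable` and `str`/`bytes` in place of the Python 2 names `collections.Iterable` and `basestring`, and `yield from` in place of the inner loop. The sample dictionary is made up.

```python
from collections.abc import Iterable


def flatten(items):
    """Yield the leaves of an arbitrarily nested iterable, left to right."""
    for el in items:
        # Strings are iterable too, but should be treated as single values.
        if isinstance(el, Iterable) and not isinstance(el, (str, bytes)):
            yield from flatten(el)
        else:
            yield el


# Hypothetical dictionary whose values are lists of lists.
data = {"a": [[1, 2], [3, [4, 5]]], "b": ["x", ["y", "z"]]}
flat = {key: list(flatten(value)) for key, value in data.items()}
print(flat)
```

Once every value is a flat list, the dictionary is easier to pass on to pandas for further processing.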

[Solved] Probability Distribution Function Python

November 7, 2022 by Kirat

This is a minimal working example that shows a histogram. It only solves part of your question, but it can be a step towards your goal. Note that the histogram function returns the edges of each bin, so you have to average the two edges of a bin to get its center value. import numpy as … Read more
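The bin-center step can be sketched with `numpy.histogram` alone. The normally distributed sample, the seed, and the choice of 20 bins are arbitrary assumptions, not taken from the original answer.

```python
import numpy as np

# Hypothetical sample: 1000 draws from a standard normal distribution.
rng = np.random.default_rng(0)
samples = rng.normal(size=1000)

# density=True scales the bar heights so the total area equals 1.
density, edges = np.histogram(samples, bins=20, density=True)

# There is one more edge than there are bins; average neighbours to get centers.
centers = 0.5 * (edges[:-1] + edges[1:])
widths = np.diff(edges)

print(len(edges), len(centers), (density * widths).sum())
```

Plotting `density` against `centers` gives a rough empirical estimate of the probability density.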

[Solved] Forecasting basis the historical figures

November 7, 2022 by Kirat

You need to join the two DataFrames before you can multiply columns from both. merged_df = segmentallocation.merge(second, on=['year','month'], how='left', suffixes=('', '_second')) for c in interested_columns: merged_df['allocation'+str(c)] = merged_df['%of allocation'+str(c)] * merged_df[c] merged_df year month segment x y z k %of allocationx %of allocationy %of allocationz %of allocationk x_second y_second z_second k_second allocationx allocationy allocationz allocationk 0 2018 FEB A 2094663 … Read more
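A reduced sketch of the merge-then-multiply pattern. The numbers are invented, and to keep it short the totals `x` and `y` live only in the second frame here, so the `suffixes` argument has nothing to rename; in the original data both frames share those column names.

```python
import pandas as pd

# Hypothetical allocation shares per segment.
segmentallocation = pd.DataFrame({
    "year": [2018, 2018],
    "month": ["FEB", "FEB"],
    "segment": ["A", "B"],
    "%of allocationx": [0.25, 0.75],
    "%of allocationy": [0.5, 0.5],
})

# Hypothetical monthly totals to be split across the segments.
second = pd.DataFrame({"year": [2018], "month": ["FEB"], "x": [200], "y": [80]})

# A left join keeps every segment row and attaches the matching monthly totals.
merged_df = segmentallocation.merge(
    second, on=["year", "month"], how="left", suffixes=("", "_second")
)

interested_columns = ["x", "y"]
for c in interested_columns:
    merged_df["allocation" + c] = merged_df["%of allocation" + c] * merged_df[c]

print(merged_df)
```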

[Solved] Matplotlib spacing in xaxis

November 6, 2022 by Kirat

Is this what you want? Try adding the lines below to your code: plt.xticks(rotation=90) plt.gca().margins(x=0) plt.gcf().canvas.draw() tl = plt.gca().get_xticklabels() maxsize = max([t.get_window_extent().width for t in tl]) m = 0.2 # inch margin s = maxsize/plt.gcf().dpi*150+2*m margin = m/plt.gcf().get_size_inches()[0] plt.gcf().subplots_adjust(left=margin, right=1.-margin) plt.gcf().set_size_inches(s, plt.gcf().get_size_inches()[1])