pandas Archives - Page 5 of 10

[Solved] Removing last words in each row in pandas dataframe

November 5, 2022 by Kirat

We can try using str.replace here: df["first_middle_name"] = df["full_name"].str.replace(r"\s+\S+$", "", regex=True) The pattern matches the last run of whitespace together with the final word, so each row loses its last word. The original answer links to a demo of this regex replacement.
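A minimal sketch of the str.replace approach described above. The sample names are hypothetical; only the column names come from the answer.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the answer above.
df = pd.DataFrame({"full_name": ["John Ronald Tolkien", "Ada Lovelace", "Cher"]})

# Remove the trailing whitespace and last word from each row.
# regex=True is passed explicitly because recent pandas versions
# treat the pattern as a literal string by default.
df["first_middle_name"] = df["full_name"].str.replace(r"\s+\S+$", "", regex=True)
print(df)
```

A single-word value such as "Cher" contains no whitespace, so the pattern does not match and the value is left unchanged.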

[Solved] Read Excel file which has one of the column as Hyperlink through python

November 2, 2022 by Kirat

The pandas library does not yet have the functionality to parse hyperlinks. You could preprocess the Excel file with a library such as xlrd or openpyxl to extract the hyperlinks, then use them in pandas. Source: https://stackoverflow.com/a/45446810/7570485 There is also a feature request for this functionality: https://github.com/pandas-dev/pandas/issues/13439

[Solved] Merges and joins in pandas

November 1, 2022 by Kirat

See this. I had a DataFrame that was displayed like this, but it was still one intact DataFrame…

[Solved] Map column birthdates in python pandas df to astrology signs

October 30, 2022 by Kirat

You can apply the zodiac_sign function to the dataframe as follows:

    import pandas as pd
    from io import StringIO

    # Sample
    x = StringIO("""birthdate,answer,YEAR,MONTH-DAY
    1970-03-31,5,1970,03-31
    1970-05-25,9,1970,05-25
    1970-06-05,3,1970,06-05
    1970-08-28,2,1970,08-28
    """)

    df = pd.read_csv(x, sep=',')
    df['birthdate'] = pd.to_datetime(df['birthdate'])
    df['zodiac_sign'] = df['birthdate'].apply(lambda x: zodiac_sign(x.day, x.strftime("%B").lower()))
    print(df)

Output:

       birthdate  answer  YEAR MONTH-DAY zodiac_sign
    0 1970-03-31       5  1970     03-31       aries
    … Read more

[Solved] Left align the first column and center align the other columns in a Pandas table

October 26, 2022 by Kirat

The table can be formatted in pandas by combining the two missing formatting conditions on a single df. I made the following two changes to the original code. Hide index numbers with hide_index(): df[["Unit", "Abbreviation", "Storage"]].style.hide_index() (Styler.hide_index() was deprecated in pandas 1.4 and removed in 2.0; newer versions use .style.hide(axis="index") instead.) To apply to a subset of columns, you can use the subset parameter. Left align the first column … Read more

[Solved] Need Pictorial representation for the below question, (eg: histogram,swarmplot,etc)

October 24, 2022 by Kirat

I recommend https://bookdown.org/ndphillips/YaRrr/ as a good introduction to R that includes a big section on data visualisation.

[Solved] Removing duplicates every 5 minutes [closed]

October 24, 2022 by Kirat

Start by adding a DatTim column (of datetime type), built from the Date and Time columns: df['DatTim'] = pd.to_datetime(df.Date + ' ' + df.Time) Then, assuming that ID is an "ordinary" column (not the index), you should call groupby on the DatTim column with a 5-minute frequency. To each group apply drop_duplicates, with subset including only ID … Read more
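The steps above can be sketched as follows. This is not the answer's exact code: instead of groupby plus drop_duplicates per group, the sketch floors each timestamp to its 5-minute window and drops duplicates on the (window, ID) pair, which should select the same rows when the windows use the default alignment. The sample data and the helper column name bucket are assumptions.

```python
import pandas as pd

# Hypothetical sample data with the Date, Time and ID columns from the answer.
df = pd.DataFrame({
    "Date": ["2022-10-24"] * 5,
    "Time": ["10:00:10", "10:02:30", "10:04:59", "10:05:00", "10:07:15"],
    "ID": [1, 1, 2, 1, 1],
})
df["DatTim"] = pd.to_datetime(df.Date + " " + df.Time)

# Assign each row to its 5-minute window (10:00, 10:05, ...).
df["bucket"] = df["DatTim"].dt.floor("5min")

# Keep the first row per ID inside each window, then drop the helper column.
result = df.drop_duplicates(subset=["bucket", "ID"]).drop(columns="bucket")
print(result)
```

Here the second row repeats ID 1 inside the 10:00 window and the last row repeats ID 1 inside the 10:05 window, so both are removed.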

[Solved] pd.DataFrame(np.random.randn(8, 4), index=dates, columns=[‘A’, ‘B’, ‘C’, ‘D’])

October 24, 2022 by Kirat

Basically, np.random.randn returns random floats drawn from the standard normal distribution (mean = 0, variance = 1). Its arguments give the shape of the array you want back. For example, np.random.randn(1, 2) returns an array with one row and two columns. Similarly, you can pass more dimensions, as in np.random.randn(1, ..., 9) with further sizes in place of the dots, which gives you a more complicated multi-dimensional array. Since you … Read more
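A short sketch of the expression from the question title. The question does not show how dates is built, so the pd.date_range call and its start date are assumptions.

```python
import numpy as np
import pandas as pd

# Assumption: `dates` is a DatetimeIndex with 8 entries; the start date is illustrative.
dates = pd.date_range("2022-01-01", periods=8)

# 8 rows x 4 columns of standard-normal samples, labelled by date and column name.
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=["A", "B", "C", "D"])

# The arguments of randn are the dimensions of the returned array.
small = np.random.randn(1, 2)
print(small.shape)
print(df.shape)
```

The values differ on every run; only the shapes are fixed by the arguments.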

[Solved] Subsetting the data frame and applying cumulative operation on multiple columns

October 23, 2022 by Kirat

Hopefully I got it this time:

    subdf = df.iloc[3:, 1:4]
    df['flag'] = 1 if subdf.values.sum()/subdf.size >= 0.1 else 0

output:

      unit  A  B  C  row_num  flag
    0  ABC  1  1  1        7     1
    1  DEF  1  1  1        6     1
    2  GEH  1  1  1        5     1
    3  IJK  0  1  0        4     1
    4 … Read more

[Solved] How to split a string without given delimeter in Panda

October 23, 2022 by Kirat

Assuming your split criterion is a fixed number of characters (e.g. 5 here), you can use:

    df['dfnewcolumn1'] = df['dfcolumn'].str[:5]
    df['dfnewcolumn2'] = df['dfcolumn'].str[5:]

Result:

                    dfcolumn dfnewcolumn1      dfnewcolumn2
    0  PUEF2CarmenXFc034DpEd        PUEF2  CarmenXFc034DpEd
    1  PUEF2BalulanFc034CamH        PUEF2  BalulanFc034CamH
    2  CARF1BalulanFc013Baca        CARF1  BalulanFc013Baca

If your split criterion is the first digit in the string, you can use: df[['dfnewcolumn1', 'dfnewcolumnX']] … Read more

[Solved] Detect sign changes in an array [closed]

October 23, 2022 by Kirat

This simple code solves the problem in your example:

    l = [1, 2, 6, -3, -2, -5, 6, 7, 1, -1, -3]
    for i in range(0, len(l) - 1):
        # A positive product means both neighbours have the same sign.
        p = l[i] * l[i + 1]
        if p > 0:
            print('ok')
        elif l[i + 1] < 0:
            print('sell')
        else:
            print('buy')

And if you want a new column in your dataframe, … Read more
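The excerpt is cut off before the dataframe version, so the following is only one possible sketch, not the original answer's code. It applies the same rules as the loop above to a column; the column names value and signal and the helper label are hypothetical.

```python
import pandas as pd

# Same sample values as the loop above; the column names are hypothetical.
df = pd.DataFrame({"value": [1, 2, 6, -3, -2, -5, 6, 7, 1, -1, -3]})


def label(prev, cur):
    """Apply the loop's rules to one pair of consecutive values."""
    if prev * cur > 0:
        return "ok"
    return "sell" if cur < 0 else "buy"


# shift(1) lines each value up with its predecessor; the first row has none.
prev = df["value"].shift(1)
df["signal"] = [
    None if pd.isna(p) else label(p, c) for p, c in zip(prev, df["value"])
]
print(df)
```

Each row is labelled by comparing it with the row before it, so the first row stays empty and the column is one label longer than the printed output of the loop.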

[Solved] Pandas: filter data frame by category

October 22, 2022 by Kirat

You can use the pandas groupby method with a dict comprehension, which will do the job as shown below:

    >>> df
         X  Y
    0  Yes  1
    1   No  2
    2  Yes  3
    3  Yes  4
    4   No  2
    5   No  1
    6  Yes  0
    7   No  4
    8   No  4
    9   No  5
    >>> {k: v["Y"].tolist() for k,v … Read more
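The comprehension in the excerpt is truncated, so the sketch below is an assumption about how it could be completed: it iterates over df.groupby("X"), which yields (key, sub-frame) pairs. The data is the frame shown above; the name by_category is hypothetical.

```python
import pandas as pd

# The frame shown in the excerpt above.
df = pd.DataFrame({
    "X": ["Yes", "No", "Yes", "Yes", "No", "No", "Yes", "No", "No", "No"],
    "Y": [1, 2, 3, 4, 2, 1, 0, 4, 4, 5],
})

# Assumption: group on X and collect the Y values of each group into a list.
by_category = {k: v["Y"].tolist() for k, v in df.groupby("X")}
print(by_category)
```

Row order within each group follows the original frame, and the group keys come out sorted.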

[Solved] Can I change the dtype of mean function in Pandas? I want to change it to int type [closed]

October 21, 2022 by Kirat

You can then apply a transformation to the result. df['mean'] = df['mean'].apply(lambda x: round(x)) or, if you want to truncate: df['mean'] = df['mean'].apply(lambda x: int(x))
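As an alternative to apply with a lambda, the same two conversions can be sketched with vectorized Series methods. The sample values and the column names rounded and truncated are hypothetical, and the sketch assumes the column holds no missing values, since astype(int) raises on NaN.

```python
import pandas as pd

# Hypothetical column of means.
df = pd.DataFrame({"mean": [1.2, 2.5, 3.7, -1.7]})

# Round to the nearest integer (ties go to the even neighbour), then cast to int.
df["rounded"] = df["mean"].round().astype(int)

# Casting directly drops the fractional part, i.e. truncates toward zero.
df["truncated"] = df["mean"].astype(int)
print(df)
```

Note that 2.5 rounds to 2 here, which matches the behaviour of Python's built-in round used in the answer.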

[Solved] Extract numbers, letters, or punctuation from left side of string column in Python

October 21, 2022 by Kirat

Use Series.str.extract with DataFrame.pop to extract the columns:

    pat = r'([\x00-\x7F]+)([\u4e00-\u9fff]+.*$)'
    df[['office_name','company_info']] = df.pop('company_info').str.extract(pat)
    print (df)

       id  office_name  company_info
    0   1  05B01        北京企商联登记注册代理事务所(通合伙)
    1   2  Unit-D 608   华夏启商(北京企业管理有限公司)
    2   3  1004-1005    北京中睿智诚商业管理有限公司
    3   4  17/F(1706)   北京美泰德商务咨询有限公司
    4   5  A2006~A2007  北京新曙光会计服务有限公司
    5   6  2906-10      中国建筑与室内设计师网
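A self-contained sketch of the extract-and-pop step. The question's input frame is not shown in the excerpt, so the two input strings below are assumptions built by joining values from the output above without a separator.

```python
import pandas as pd

# Assumed input: office code and company name joined in one column.
df = pd.DataFrame({
    "id": [1, 2],
    "company_info": [
        "05B01北京企商联登记注册代理事务所(通合伙)",
        "Unit-D 608华夏启商(北京企业管理有限公司)",
    ],
})

# Group 1: leading run of ASCII characters.
# Group 2: first run of CJK characters plus everything after it.
pat = r"([\x00-\x7F]+)([\u4e00-\u9fff]+.*$)"

# pop removes the original column and returns it, so it can be replaced
# by the two extracted columns in a single statement.
df[["office_name", "company_info"]] = df.pop("company_info").str.extract(pat)
print(df)
```

If the real data has a space between the two parts, the first group would keep it as a trailing space, which str.strip could remove.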

[Solved] Loop: Results to be updated via += if on the same date, otherwise write next line

October 14, 2022 by Kirat

You can use DataFrame.append and then group by the column you want to use as an index. (DataFrame.append was removed in pandas 2.0; pd.concat([total_df, new_df]) is the replacement.) Let's say that total_df is the table to which you want to add new rows, new_df is the table containing those new rows, and date is the column to be used as the index. Then you can use: total_df.append(new_df).groupby(by='date').sum() … Read more
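A minimal sketch of the same idea using pd.concat, since DataFrame.append is no longer available in pandas 2.0 and later. The names total_df, new_df and date come from the answer; the result column and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical running totals and a batch of new rows.
total_df = pd.DataFrame({"date": ["2022-10-13", "2022-10-14"], "result": [10, 5]})
new_df = pd.DataFrame({"date": ["2022-10-14", "2022-10-15"], "result": [3, 7]})

# Stack both tables, then sum rows that share a date.
# Dates that appear only once simply become their own row.
combined = pd.concat([total_df, new_df]).groupby(by="date").sum()
print(combined)
```

The shared date 2022-10-14 is summed into one row, while the other two dates pass through unchanged.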