[Solved] Read Excel file which has one of the column as Hyperlink through python

The pandas library cannot parse hyperlinks from Excel cells as of yet. You could preprocess the Excel file with a library such as xlrd (which reads only legacy .xls files since version 2.0) or openpyxl to extract the hyperlinks, then use them in pandas. Source: https://stackoverflow.com/a/45446810/7570485 There is also a feature request for this functionality: https://github.com/pandas-dev/pandas/issues/13439
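
As a minimal sketch of the preprocessing idea, the snippet below builds a tiny workbook in memory, reads each cell's hyperlink target with openpyxl, and hands the result to pandas. The sheet layout, the column names (name, link_text, url) and the sample URL are assumptions made for illustration only; they do not come from the original question.

```python
from io import BytesIO

import pandas as pd
from openpyxl import Workbook, load_workbook

# Build a small in-memory workbook so the example is self-contained.
# (Hypothetical sample data: one linked cell, one plain cell.)
wb = Workbook()
ws = wb.active
ws.append(["name", "link"])
ws.append(["pandas", "pandas docs"])
ws.append(["no link", "plain text"])
ws["B2"].hyperlink = "https://pandas.pydata.org"

buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)

# Read it back with openpyxl and collect the display text plus link target.
sheet = load_workbook(buffer).active
rows = []
for name_cell, link_cell in sheet.iter_rows(min_row=2, max_col=2):
    target = link_cell.hyperlink.target if link_cell.hyperlink else None
    rows.append(
        {"name": name_cell.value, "link_text": link_cell.value, "url": target}
    )

# From here on the hyperlinks are ordinary column values in pandas.
df = pd.DataFrame(rows)
print(df)
```

With a real file, the in-memory buffer would be replaced by the path to the workbook.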

[Solved] Map column birthdates in python pandas df to astrology signs

You can apply the zodiac_sign function to the dataframe as follows:

    import pandas as pd
    from io import StringIO

    # Sample
    x = StringIO("""birthdate,answer,YEAR,MONTH-DAY
    1970-03-31,5,1970,03-31
    1970-05-25,9,1970,05-25
    1970-06-05,3,1970,06-05
    1970-08-28,2,1970,08-28
    """)

    df = pd.read_csv(x, sep=',')
    df['birthdate'] = pd.to_datetime(df['birthdate'])
    df['zodiac_sign'] = df['birthdate'].apply(lambda x: zodiac_sign(x.day, x.strftime("%B").lower()))
    print(df)

Output:

        birthdate  answer  YEAR MONTH-DAY zodiac_sign
    0  1970-03-31       5  1970     03-31       aries
    …
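
The excerpt calls zodiac_sign(day, month) without showing its definition. The sketch below is one hypothetical way such a function could look, assuming lowercase English month names (as produced by strftime("%B").lower() in an English locale) and the commonly quoted tropical zodiac date ranges, which vary by a day between sources. It is not the function from the original question.

```python
# Hypothetical lookup table: for each month, the day on which the sign
# changes, the sign before that day, and the sign from that day onwards.
# The cutoff days are an assumption based on commonly quoted ranges.
SIGN_CHANGES = {
    "january": (20, "capricorn", "aquarius"),
    "february": (19, "aquarius", "pisces"),
    "march": (21, "pisces", "aries"),
    "april": (20, "aries", "taurus"),
    "may": (21, "taurus", "gemini"),
    "june": (21, "gemini", "cancer"),
    "july": (23, "cancer", "leo"),
    "august": (23, "leo", "virgo"),
    "september": (23, "virgo", "libra"),
    "october": (23, "libra", "scorpio"),
    "november": (22, "scorpio", "sagittarius"),
    "december": (22, "sagittarius", "capricorn"),
}


def zodiac_sign(day, month):
    """Return the lowercase sign name for a day and a lowercase month name."""
    cutoff, before, from_cutoff = SIGN_CHANGES[month]
    return before if day < cutoff else from_cutoff


# The four sample birthdates from the excerpt.
samples = [(31, "march"), (25, "may"), (5, "june"), (28, "august")]
signs = [zodiac_sign(day, month) for day, month in samples]
print(signs)
```

The first sample agrees with the aries row shown in the excerpt's output.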

[Solved] Left align the first column and center align the other columns in a Pandas table

The table can be formatted in pandas by combining the two missing formatting conditions on a single df. I made the following two changes to the original code. Hide the index numbers with hide_index() (deprecated in pandas 1.4 and removed in 2.0; newer versions use hide(axis="index") instead):

    df[["Unit", "Abbreviation", "Storage"]].style.hide_index()

To apply a style to a subset of columns, you can use the subset parameter. Left align the first column …

[Solved] Removing duplicates every 5 minutes [closed]

Start by adding a DatTim column (of datetime type), built from the Date and Time columns:

    df['DatTim'] = pd.to_datetime(df.Date + ' ' + df.Time)

Then, assuming that ID is an "ordinary" column (not the index), you should call groupby on the DatTim column with a 5 min frequency. To each group, apply drop_duplicates, with subset including only ID …
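
The idea can be sketched as follows. Instead of groupby followed by apply, this version floors each timestamp to its 5-minute bin and drops rows that repeat an ID within the same bin; this is a swapped-in formulation that should select the same rows, not the code from the original answer. The sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data using the column names from the excerpt.
df = pd.DataFrame(
    {
        "Date": ["2020-01-01"] * 5,
        "Time": ["10:00:10", "10:02:30", "10:04:59", "10:05:00", "10:07:00"],
        "ID": [1, 1, 2, 1, 1],
    }
)

# Combine the Date and Time strings into one datetime column.
df["DatTim"] = pd.to_datetime(df["Date"] + " " + df["Time"])

# Floor every timestamp to the start of its 5-minute window.
bins = df["DatTim"].dt.floor("5min")

# A row is a repeat if the same ID already appeared in the same window.
is_repeat = df.assign(bin=bins).duplicated(subset=["bin", "ID"])
deduped = df[~is_repeat]
print(deduped)
```

Rows 1 and 4 are dropped because ID 1 already occurred earlier in their respective windows.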

[Solved] pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])

Basically, np.random.randn returns random floats sampled from the standard normal distribution (mean = 0, variance = 1). Its arguments give the shape of the array you would like back. For example, np.random.randn(1, 2) returns an array with one row and two columns. Similarly, you can pass more dimensions, written schematically as np.random.randn(1,.,.,.,9), which returns a higher-dimensional array. Since you …
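
A short sketch of the expression from the title. The excerpt does not show how dates is defined, so the date_range call below (eight consecutive days from an arbitrary start date) is an assumption.

```python
import numpy as np
import pandas as pd

# Assumed definition of `dates`: eight consecutive days.
dates = pd.date_range("2013-01-01", periods=8)

# randn(8, 4) draws an 8 x 4 array from the standard normal distribution.
values = np.random.randn(8, 4)

# One row per date, one column per letter.
df = pd.DataFrame(values, index=dates, columns=["A", "B", "C", "D"])
print(df)
```

The number of rows must match len(dates) and the number of columns must match the length of the columns list, otherwise the constructor raises a ValueError.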

[Solved] How to split a string without given delimiter in Pandas

Assuming your split criterion is a fixed number of characters (e.g. 5 here), you can use:

    df['dfnewcolumn1'] = df['dfcolumn'].str[:5]
    df['dfnewcolumn2'] = df['dfcolumn'].str[5:]

Result:

                    dfcolumn dfnewcolumn1      dfnewcolumn2
    0  PUEF2CarmenXFc034DpEd        PUEF2  CarmenXFc034DpEd
    1  PUEF2BalulanFc034CamH        PUEF2  BalulanFc034CamH
    2  CARF1BalulanFc013Baca        CARF1  BalulanFc013Baca

If your split criterion is the first digit in the string, you can use:

    df[['dfnewcolumn1', 'dfnewcolumnX']] …
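
Both criteria can be tried on the sample strings from the excerpt. The fixed-width split follows the text directly. The split at the first digit is truncated in the excerpt, so the regular expression below is only an assumed way to do it (everything up to and including the first digit goes into the first part) and the column names first_part and rest are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "dfcolumn": [
            "PUEF2CarmenXFc034DpEd",
            "PUEF2BalulanFc034CamH",
            "CARF1BalulanFc013Baca",
        ]
    }
)

# Fixed-width split: first five characters, then the remainder.
df["dfnewcolumn1"] = df["dfcolumn"].str[:5]
df["dfnewcolumn2"] = df["dfcolumn"].str[5:]

# Assumed variant: split right after the first digit in the string.
# Group 1 = non-digits followed by one digit, group 2 = everything else.
parts = df["dfcolumn"].str.extract(r"^(\D*\d)(.*)$")
df["first_part"] = parts[0]
df["rest"] = parts[1]

print(df)
```

For these three strings both approaches give the same pieces, because the first digit happens to be the fifth character in each.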

[Solved] Detect sign changes in an array [closed]

This is simple code that solves your problem/example:

    l = [1, 2, 6, -3, -2, -5, 6, 7, 1, -1, -3]
    for i in range(0, len(l) - 1):
        # The product of two neighbours is positive when they share a sign
        p = l[i] * l[i+1]
        if p > 0:
            print('ok')
        elif l[i+1] < 0:
            print('sell')
        else:
            print('buy')

And if you want a new column in your dataframe, …
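
The same rule can be written as a dataframe column. This is a sketch only: the rest of the original answer is truncated, so the column names value and signal, the empty label for the first row, and the use of shift with np.select are all assumptions rather than the author's code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"value": [1, 2, 6, -3, -2, -5, 6, 7, 1, -1, -3]})

# Previous element for every row; the first row has no predecessor (NaN).
prev = df["value"].shift(1)
product = prev * df["value"]

# Conditions are checked in order, mirroring the if / elif / else above.
df["signal"] = np.select(
    [prev.isna(), product > 0, df["value"] < 0],
    ["", "ok", "sell"],
    default="buy",
)
print(df)
```

Each label describes the transition from the previous row into the current one, so the first row is left empty.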

[Solved] Extract numbers, letters, or punctuation from left side of string column in Python

Use Series.str.extract with DataFrame.pop to extract the column:

    pat = r'([\x00-\x7F]+)([\u4e00-\u9fff]+.*$)'
    df[['office_name','company_info']] = df.pop('company_info').str.extract(pat)
    print (df)

       id  office_name  company_info
    0   1  05B01        北京企商联登记注册代理事务所(通合伙)
    1   2  Unit-D 608   华夏启商(北京企业管理有限公司)
    2   3  1004-1005    北京中睿智诚商业管理有限公司
    3   4  17/F(1706)   北京美泰德商务咨询有限公司
    4   5  A2006~A2007  北京新曙光会计服务有限公司
    5   6  2906-10      中国建筑与室内设计师网
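
A runnable sketch of the same pattern. The input strings are an assumption: they are reconstructed by joining the office_name and company_info values shown in the output above, since the original input is not part of the excerpt.

```python
import pandas as pd

# Assumed input, rebuilt from three rows of the printed result.
df = pd.DataFrame(
    {
        "id": [1, 2, 4],
        "company_info": [
            "05B01北京企商联登记注册代理事务所(通合伙)",
            "Unit-D 608华夏启商(北京企业管理有限公司)",
            "17/F(1706)北京美泰德商务咨询有限公司",
        ],
    }
)

# Group 1: leading run of ASCII characters.
# Group 2: first CJK character and everything after it.
pat = r"([\x00-\x7F]+)([\u4e00-\u9fff]+.*$)"

# pop removes the original column and returns it for extraction.
parts = df.pop("company_info").str.extract(pat)
df["office_name"] = parts[0]
df["company_info"] = parts[1]
print(df)
```

The ASCII group is greedy, so spaces, slashes and parentheses that appear before the first CJK character all end up in office_name.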

[Solved] Loop: Results to be updated via += if on the same date, otherwise write next line

You can use DataFrame.append and then group by the column you want to use as an index. Note that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0; on those versions, pd.concat([total_df, new_df]) takes its place. Let's say that total_df is the table to which you want to add new rows, new_df is the table containing those new rows, and date is the column to be used as the index. Then you can use:

    total_df.append(new_df).groupby(by='date').sum() …
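
A minimal sketch of the same idea with pd.concat, which also works on pandas 2.0 and later. The names total_df, new_df and date come from the text; the amount column and the sample values are made up for illustration.

```python
import pandas as pd

# Existing totals (hypothetical values).
total_df = pd.DataFrame(
    {"date": ["2021-01-01", "2021-01-02"], "amount": [10, 20]}
)

# New rows: one on an existing date, one on a new date.
new_df = pd.DataFrame(
    {"date": ["2021-01-02", "2021-01-03"], "amount": [5, 7]}
)

# Stack both tables, then sum the rows that share a date.
total_df = pd.concat([total_df, new_df]).groupby(by="date").sum()
print(total_df)
```

Rows that share a date are added together (the += case), while a new date simply becomes a new row.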