[Solved] Read Excel file which has one of the column as Hyperlink through python

The pandas library does not yet have the functionality to parse hyperlinks. You can preprocess the Excel file with a library such as xlrd or openpyxl to extract the hyperlinks, then use them in pandas. Source: https://stackoverflow.com/a/45446810/7570485

There is also a feature request for this functionality: https://github.com/pandas-dev/pandas/issues/13439
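The preprocessing step can be sketched with openpyxl as follows. This is a minimal sketch, not the linked answer's code: the workbook is built in memory so the example runs on its own, and the column names (`name`, `link_text`, `url`) and the sample URL are assumptions made for illustration.

```python
from io import BytesIO

import pandas as pd
from openpyxl import Workbook, load_workbook

# Build a tiny workbook in memory (illustrative data, not from the question).
wb = Workbook()
ws = wb.active
ws.append(["name", "link"])
ws.append(["pandas", "docs"])
ws["B2"].hyperlink = "https://pandas.pydata.org"  # attach a hyperlink to the cell

buf = BytesIO()
wb.save(buf)
buf.seek(0)

# Reload in normal (not read-only) mode and collect each cell's link target.
ws = load_workbook(buf).active
rows = []
for name_cell, link_cell in ws.iter_rows(min_row=2, max_col=2):
    target = link_cell.hyperlink.target if link_cell.hyperlink else None
    rows.append({"name": name_cell.value, "link_text": link_cell.value, "url": target})

df = pd.DataFrame(rows)
print(df)
```

For a real file, the in-memory buffer would be replaced by the path to the workbook.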

[Solved] Map column birthdates in python pandas df to astrology signs

You can apply the zodiac_sign function to the dataframe as follows:

    import pandas as pd
    from io import StringIO

    # Sample
    x = StringIO("""birthdate,answer,YEAR,MONTH-DAY
    1970-03-31,5,1970,03-31
    1970-05-25,9,1970,05-25
    1970-06-05,3,1970,06-05
    1970-08-28,2,1970,08-28
    """)
    df = pd.read_csv(x, sep=',')
    df['birthdate'] = pd.to_datetime(df['birthdate'])
    df['zodiac_sign'] = df['birthdate'].apply(lambda x: zodiac_sign(x.day, x.strftime("%B").lower()))
    print(df)

Output:

       birthdate  answer  YEAR MONTH-DAY zodiac_sign
    0 1970-03-31       5  1970     03-31 …
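The excerpt calls zodiac_sign without showing it, so here is a self-contained sketch with a hypothetical stand-in. The signature `zodiac_sign(day, month)` with a lowercase English month name is inferred from the call above. The date boundaries are one common convention and vary slightly between sources. `month_name()` is used instead of `strftime("%B")` so the month name does not depend on the system locale.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the question's zodiac_sign function.
# month -> (last day of the sign that opens the month, that sign, the next sign)
_CUTOFFS = {
    "january": (19, "capricorn", "aquarius"),
    "february": (18, "aquarius", "pisces"),
    "march": (20, "pisces", "aries"),
    "april": (19, "aries", "taurus"),
    "may": (20, "taurus", "gemini"),
    "june": (20, "gemini", "cancer"),
    "july": (22, "cancer", "leo"),
    "august": (22, "leo", "virgo"),
    "september": (22, "virgo", "libra"),
    "october": (22, "libra", "scorpio"),
    "november": (21, "scorpio", "sagittarius"),
    "december": (21, "sagittarius", "capricorn"),
}


def zodiac_sign(day, month):
    """Return the sign for a day number and a lowercase English month name."""
    last_day, opening_sign, next_sign = _CUTOFFS[month]
    return opening_sign if day <= last_day else next_sign


x = StringIO("""birthdate,answer,YEAR,MONTH-DAY
1970-03-31,5,1970,03-31
1970-05-25,9,1970,05-25
1970-06-05,3,1970,06-05
1970-08-28,2,1970,08-28
""")
df = pd.read_csv(x, sep=",")
df["birthdate"] = pd.to_datetime(df["birthdate"])
df["zodiac_sign"] = df["birthdate"].apply(
    lambda ts: zodiac_sign(ts.day, ts.month_name().lower())
)
print(df)
```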

[Solved] Left align the first column and center align the other columns in a Pandas table

The table can be pretty-formatted in Pandas by assembling the two missing formatting conditions into a single df. I made the following two changes to the original code.

Hide index numbers with hide_index(): df[["Unit", "Abbreviation", "Storage"]].style.hide_index()

To apply to a subset of columns, you can use the subset parameter. Left align the first …

[Solved] Removing duplicates every 5 minutes [closed]

Start by adding a DatTim column (of type datetime), taking source data from Date and Time: df['DatTim'] = pd.to_datetime(df.Date + ' ' + df.Time)

Then, assuming that ID is an "ordinary" column (not the index), call groupby on the DatTim column with a 5 min frequency. To each group apply drop_duplicates, with subset including only …
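The steps above can be sketched as follows. The `subset` argument is cut off in the excerpt, so `subset=["ID"]` is an assumption here, and the sample rows are made up. The per-group step is written as an explicit loop over the groups rather than `groupby(...).apply(...)`, because how `apply` treats the grouping column differs between pandas versions.

```python
import pandas as pd

# Made-up sample data; the column names Date, Time and ID come from the excerpt.
df = pd.DataFrame({
    "Date": ["2020-01-01"] * 5,
    "Time": ["10:00:10", "10:02:30", "10:04:59", "10:05:00", "10:07:00"],
    "ID": [1, 1, 2, 1, 1],
})
df["DatTim"] = pd.to_datetime(df["Date"] + " " + df["Time"])

# Group rows into 5-minute bins on DatTim, then drop duplicate IDs per bin.
# subset=["ID"] is an assumption: the original answer is truncated at this point.
groups = df.groupby(pd.Grouper(key="DatTim", freq="5min"))
deduped = pd.concat(g.drop_duplicates(subset=["ID"]) for _, g in groups)
print(deduped)
```

In this sample, rows 0 to 2 fall in the 10:00 bin and rows 3 to 4 in the 10:05 bin, so the second occurrence of ID 1 in each bin is dropped.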

[Solved] pd.DataFrame(np.random.randn(8, 4), index=dates, columns=[‘A’, ‘B’, ‘C’, ‘D’])

Basically, np.random.randn returns random float values drawn from a normal distribution with mean = 0 and variance = 1. It takes as arguments the shape of the array you would like returned. For example, np.random.randn(1,2) returns an array of one row and two columns. Similarly, you can give np.random.randn(1,.,.,.,9), which returns an array with more dimensions. Since …
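A short sketch of the call from the title follows. The title uses `dates` without defining it, so the `pd.date_range` call is a hypothetical stand-in for it. The values are random, so only the shapes are predictable.

```python
import numpy as np
import pandas as pd

# One row, two columns of standard-normal samples.
a = np.random.randn(1, 2)
print(a.shape)

# Hypothetical index: `dates` is not defined in the excerpt.
dates = pd.date_range("2013-01-01", periods=8)

# 8 rows x 4 columns of standard-normal samples, labeled by date and letter.
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=["A", "B", "C", "D"])
print(df)
```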

[Solved] How to split a string without given delimiter in Pandas

Assuming your split criterion is a fixed number of characters (e.g. 5 here), you can use:

    df['dfnewcolumn1'] = df['dfcolumn'].str[:5]
    df['dfnewcolumn2'] = df['dfcolumn'].str[5:]

Result:

                    dfcolumn dfnewcolumn1      dfnewcolumn2
    0  PUEF2CarmenXFc034DpEd        PUEF2  CarmenXFc034DpEd
    1  PUEF2BalulanFc034CamH        PUEF2  BalulanFc034CamH
    2  CARF1BalulanFc013Baca        CARF1  BalulanFc013Baca

If your split criterion is the first digit in the string, you can use: df[['dfnewcolumn1', …
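Both criteria can be tried on the sample rows with the sketch below. The fixed-width part follows the excerpt. The first-digit part is cut off in the excerpt, so the `str.extract` pattern and the column names `digit_part1` and `digit_part2` are assumptions, as is the choice to split right after the first digit.

```python
import pandas as pd

df = pd.DataFrame({
    "dfcolumn": [
        "PUEF2CarmenXFc034DpEd",
        "PUEF2BalulanFc034CamH",
        "CARF1BalulanFc013Baca",
    ]
})

# Fixed-width split: first 5 characters, then the rest.
df["dfnewcolumn1"] = df["dfcolumn"].str[:5]
df["dfnewcolumn2"] = df["dfcolumn"].str[5:]

# Assumed variant: split right after the first digit in the string.
# Group 1 = non-digits up to and including the first digit, group 2 = the rest.
parts = df["dfcolumn"].str.extract(r"^(\D*\d)(.*)$")
df["digit_part1"] = parts[0]
df["digit_part2"] = parts[1]

print(df)
```

On these rows the two approaches agree, because the first digit is always the fifth character.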

[Solved] Extract numbers, letters, or punctuation from left side of string column in Python

Use Series.str.extract with DataFrame.pop to extract the column:

    pat = r'([\x00-\x7F]+)([\u4e00-\u9fff]+.*$)'
    df[['office_name','company_info']] = df.pop('company_info').str.extract(pat)
    print (df)

       id  office_name  company_info
    0   1  05B01        北京企商联登记注册代理事务所(通合伙)
    1   2  Unit-D 608   华夏启商(北京企业管理有限公司)
    2   3  1004-1005    北京中睿智诚商业管理有限公司
    3   4  17/F(1706)   北京美泰德商务咨询有限公司
    4   5  A2006~A2007  北京新曙光会计服务有限公司
    5   6  2906-10      中国建筑与室内设计师网
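To try the pattern end to end, the sketch below rebuilds two input rows. The original input column is not shown in the excerpt, so joining the office name and company name without a separator is an assumption. The pattern and the pop/extract call are taken from the answer.

```python
import pandas as pd

# Assumed input: office name immediately followed by the company name.
df = pd.DataFrame({
    "id": [1, 4],
    "company_info": [
        "05B01北京企商联登记注册代理事务所(通合伙)",
        "17/F(1706)北京美泰德商务咨询有限公司",
    ],
})

# Group 1: leading run of ASCII characters.
# Group 2: first run of CJK characters plus everything after it.
pat = r'([\x00-\x7F]+)([\u4e00-\u9fff]+.*$)'
df[['office_name', 'company_info']] = df.pop('company_info').str.extract(pat)
print(df)
```

Because the ASCII group is greedy and includes spaces, an input with a space before the company name would leave a trailing space in `office_name`. Calling `.str.strip()` on that column afterwards would remove it.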