[Solved] How to split a string without a given delimiter in pandas


Assuming you want to split after a fixed number of characters (5 here), you can slice with the .str accessor:

df['dfnewcolumn1'] = df['dfcolumn'].str[:5]
df['dfnewcolumn2'] = df['dfcolumn'].str[5:]

Result:

                dfcolumn dfnewcolumn1      dfnewcolumn2
0  PUEF2CarmenXFc034DpEd        PUEF2  CarmenXFc034DpEd
1  PUEF2BalulanFc034CamH        PUEF2  BalulanFc034CamH
2  CARF1BalulanFc013Baca        CARF1  BalulanFc013Baca

If instead you want to split right after the first digit in the string, you can use:

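# Split once at the first non-digit that follows a digit; keep the left part.
df[['dfnewcolumn1', 'dfnewcolumnX']] = df['dfcolumn'].str.split(r'(?<=\d)\D', n=1, expand=True)
# Split once on everything up to and including the first digit; keep the right part.
df[['dfnewcolumnX', 'dfnewcolumn2']] = df['dfcolumn'].str.split(r'\D*\d', n=1, expand=True)
# Drop the throwaway helper column.
df = df.drop(columns="dfnewcolumnX")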
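To see why two splits and a throwaway column are needed, here is a rough sketch of what each pattern does to a single value. It uses plain `re.split` with `maxsplit=1`, on the assumption that this mirrors what `.str.split(..., n=1)` does per element; the variable names are only for illustration.

```python
import re

s = "PUEF2CarmenXFc034DpEd"  # first sample value from the data above

# First pattern: the separator is the single non-digit right after a digit,
# so that character ("C") is consumed and the right-hand piece is incomplete.
left, leftover = re.split(r"(?<=\d)\D", s, maxsplit=1)
print(left, leftover)        # PUEF2 armenXFc034DpEd

# Second pattern: the separator is everything up to and including the first
# digit, so the left-hand piece is empty and the right-hand piece is intact.
empty, right = re.split(r"\D*\d", s, maxsplit=1)
print(repr(empty), right)    # '' CarmenXFc034DpEd
```

Each split yields one usable half and one unusable half, which is why the unusable halves go into dfnewcolumnX and get dropped.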

Using the original data extended with a few more test cases:

                 dfcolumn
0   PUEF2CarmenXFc034DpEd
1   PUEF2BalulanFc034CamH
2   CARF1BalulanFc013Baca
3    CAF1BalulanFc013Baca
4  PUEFA2BalulanFc034CamH
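If you want to reproduce this, one way to build the test frame is sketched below. The rows are copied from the table above, and it assumes pandas is imported as `pd`.

```python
import pandas as pd

# Rows copied from the test data shown above
df = pd.DataFrame({
    "dfcolumn": [
        "PUEF2CarmenXFc034DpEd",
        "PUEF2BalulanFc034CamH",
        "CARF1BalulanFc013Baca",
        "CAF1BalulanFc013Baca",    # shorter prefix
        "PUEFA2BalulanFc034CamH",  # longer prefix
    ]
})
print(df)
```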

Run code:

df[['dfnewcolumn1', 'dfnewcolumnX']] = df['dfcolumn'].str.split(r'(?<=\d)\D', n=1, expand=True)
df[['dfnewcolumnX', 'dfnewcolumn2']] = df['dfcolumn'].str.split(r'\D*\d', n=1, expand=True)
df = df.drop(columns="dfnewcolumnX")

Result:

                 dfcolumn dfnewcolumn1      dfnewcolumn2
0   PUEF2CarmenXFc034DpEd        PUEF2  CarmenXFc034DpEd
1   PUEF2BalulanFc034CamH        PUEF2  BalulanFc034CamH
2   CARF1BalulanFc013Baca        CARF1  BalulanFc013Baca
3    CAF1BalulanFc013Baca         CAF1  BalulanFc013Baca
4  PUEFA2BalulanFc034CamH       PUEFA2  BalulanFc034CamH
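As a possible alternative (not the method used above), the same result can be obtained in one pass with `Series.str.extract` and two named capture groups, which avoids the temporary column. This is a minimal sketch under the same "split after the first digit" assumption; the `parts` variable is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "dfcolumn": [
        "PUEF2CarmenXFc034DpEd",
        "PUEF2BalulanFc034CamH",
        "CARF1BalulanFc013Baca",
        "CAF1BalulanFc013Baca",
        "PUEFA2BalulanFc034CamH",
    ]
})

# Group 1: any non-digits followed by the first digit.
# Group 2: the rest of the string.
# Named groups become the column names of the extracted frame.
parts = df["dfcolumn"].str.extract(
    r"^(?P<dfnewcolumn1>\D*\d)(?P<dfnewcolumn2>.*)$"
)

# Attach the two new columns to the original frame by index.
df = df.join(parts)
print(df)
```

Rows that contain no digit at all would not match the pattern and would end up as NaN in both new columns, so check for that if your data may include such values.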
