IIUC, this is a classic groupby+agg. You need to convert the dates to a datetime type for meaningful comparisons:
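
    import pandas as pd

    (df
     # parse both columns as datetimes so min/max compare chronologically
     .assign(DATE_1=pd.to_datetime(df['DATE_1']),
             DATE_2=pd.to_datetime(df['DATE_2']),
            )
     .groupby('ID')
     # earliest DATE_1 and latest DATE_2 for each ID
     .agg({'DATE_1': 'min', 'DATE_2': 'max'})
    )
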
output:

            DATE_1     DATE_2
    ID
    12  2012-01-01 2021-01-01
    13  2010-01-01 2021-01-01
    14  2012-01-01 2021-01-01
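
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The input frame is made up for illustration (the question's real data is not reproduced here), chosen so that the per-ID result matches the output above. The first few lines also show why the datetime conversion matters: on raw strings, `min` compares character by character, not chronologically.

```python
import pandas as pd

# Why convert first: strings compare lexicographically, not by date.
raw_dates = ["9/1/2011", "10/1/2012"]  # hypothetical month/day/year strings
string_min = min(raw_dates)            # "1" sorts before "9", so the later date wins
parsed_min = pd.to_datetime(pd.Series(raw_dates), format="%m/%d/%Y").min()

# Hypothetical input data (an assumption, not the asker's actual frame).
df = pd.DataFrame({
    "ID": [12, 12, 13, 13, 14],
    "DATE_1": ["2012-01-01", "2015-06-01", "2010-01-01", "2011-03-15", "2012-01-01"],
    "DATE_2": ["2020-01-01", "2021-01-01", "2021-01-01", "2019-05-05", "2021-01-01"],
})

out = (
    df
    # parse both date columns so min/max are chronological
    .assign(
        DATE_1=pd.to_datetime(df["DATE_1"]),
        DATE_2=pd.to_datetime(df["DATE_2"]),
    )
    .groupby("ID")
    # one row per ID: earliest DATE_1, latest DATE_2
    .agg({"DATE_1": "min", "DATE_2": "max"})
)
print(out)
```

Note that the two aggregates are computed independently per column, so the `DATE_1` and `DATE_2` in a result row may come from different input rows. If you need `ID` back as a regular column, chaining `.reset_index()` at the end should do it.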