Compare two dataframes and update the dataframe if the data is different [closed]

Accepted Answer

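If I understand the logic correctly, you want two things from df2: the rows whose id is new, and the rows whose id already exists in df1 but whose Name or score has changed. Boolean indexing with isin covers both.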

# imports
import pandas as pd
from io import StringIO

# sample data
s1 = """id Name score
111 Jack 2.17
112 Nick 1.11
113 Zoe 4.12"""

s2 = """id Name score
111 Jack 2.17
112 Sick 1.10
113 Zoe 4.12
114 Jay 12.3"""

df1 = pd.read_csv(StringIO(s1), sep=' ')
df2 = pd.read_csv(StringIO(s2), sep=' ')

# use boolean indexing with isin
# new id in df2 that is not in df1
df3 = df2[~df2['id'].isin(df1['id'])]
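# where id is the same but Name or score is different
# isin with a DataFrame argument compares cell by cell, aligned on index and
# column labels, so this assumes matching ids sit at the same row index
# all(axis=1) marks rows that are fully unchanged; axis must be a keyword on pandas 2.x
unchanged = df2[['Name', 'score']].isin(df1[['Name', 'score']]).all(axis=1)
# also does not include the data where id is in the newly created df3
df4 = df2[~(unchanged | df2['id'].isin(df3['id']))]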

df3

    id Name  score
3  114  Jay   12.3

df4

    id  Name  score
1  112  Sick    1.1
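
The isin comparison above lines rows up by index, so it only works while shared ids sit in the same positions in both frames. As an alternative sketch (not part of the original answer), the same two results can be obtained by joining on id with merge and its indicator flag. The names merged, new_rows, changed_rows and updated, and the '_old' suffix, are made up for illustration, and the sketch assumes id is unique within each frame.

```python
import pandas as pd
from io import StringIO

# same sample data as above
s1 = """id Name score
111 Jack 2.17
112 Nick 1.11
113 Zoe 4.12"""

s2 = """id Name score
111 Jack 2.17
112 Sick 1.10
113 Zoe 4.12
114 Jay 12.3"""

df1 = pd.read_csv(StringIO(s1), sep=' ')
df2 = pd.read_csv(StringIO(s2), sep=' ')

# left-join df2 onto df1 by id; the _merge column records where each id was found
# (assumption: id is unique in each frame, otherwise the join duplicates rows)
merged = df2.merge(df1, on='id', how='left', suffixes=('', '_old'), indicator=True)

cols = ['id', 'Name', 'score']

# ids that exist only in df2
new_rows = merged.loc[merged['_merge'] == 'left_only', cols]

# ids that exist in both frames but whose Name or score differs
in_both = merged['_merge'] == 'both'
differs = (merged['Name'] != merged['Name_old']) | (merged['score'] != merged['score_old'])
changed_rows = merged.loc[in_both & differs, cols]

# one possible way to fold the results back into df1:
# drop the stale rows, then append the changed and new ones
updated = pd.concat(
    [df1[~df1['id'].isin(changed_rows['id'])], changed_rows, new_rows]
).sort_values('id', ignore_index=True)

print(new_rows)
print(changed_rows)
print(updated)
```

Because the join key is id rather than the row index, this version should keep working if df2 arrives in a different order than df1. The scores are compared with exact float equality, which is fine for values read from the same text but may need a tolerance (for example numpy.isclose) for computed values.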