Comparing Pandas DataFrame Objects

Author: JIYIK Last Updated: 2025/05/02

This tutorial explains how to compare Pandas DataFrame objects in Python. We can use the `==` operator to compare DataFrames.

import pandas as pd

data_season1 = {
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [10, 8, 6, 5, 4],
}

data_season2 = {
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [7, 8, 6, 7, 4],
}

df_1 = pd.DataFrame(data_season1)
df_2 = pd.DataFrame(data_season2)

print("df_1:")
print(df_1)

print("")

print("df_2:")
print(df_2)

Output:

df_1:
        Player  Goals
0  Lewandowski     10
1       Haland      8
2      Ronaldo      6
3        Messi      5
4       Mbappe      4

df_2:
        Player  Goals
0  Lewandowski      7
1       Haland      8
2      Ronaldo      6
3        Messi      7
4       Mbappe      4

In this article, we will use the DataFrames df_1 and df_2 to demonstrate how to compare DataFrames.

Comparing Pandas DataFrame Objects Using the `==` Operator

import pandas as pd

data_season1 = {
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [10, 8, 6, 5, 4],
}

data_season2 = {
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [7, 8, 6, 7, 4],
}

df_1 = pd.DataFrame(data_season1)
df_2 = pd.DataFrame(data_season2)

print(df_1 == df_2)

Output:

   Player  Goals
0    True  False
1    True   True
2    True   True
3    True  False
4    True   True

The `==` operator compares the corresponding elements of df_1 and df_2. It returns True where the elements at a position are the same and False otherwise.

We can use the pandas.DataFrame.all() method with axis=1 to find out which rows in df_1 and df_2 are the same.
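As a small illustration of the element-wise behaviour described above, the sketch below compares two shortened DataFrames (a two-row subset chosen here for brevity, not the article's full data). It also shows DataFrame.eq(), the method form of the operator, and DataFrame.equals(), which returns a single boolean for the whole frame instead of a mask.

```python
import pandas as pd

# Two-row subset of the article's data, used only for illustration.
df_1 = pd.DataFrame({"Player": ["Lewandowski", "Messi"], "Goals": [10, 5]})
df_2 = pd.DataFrame({"Player": ["Lewandowski", "Messi"], "Goals": [7, 5]})

# Element-wise comparison: one boolean per cell.
elementwise = df_1 == df_2

# DataFrame.eq() is the method form; for identically labeled frames it
# produces the same mask as the operator.
via_method = df_1.eq(df_2)

# DataFrame.equals() answers "are the two frames identical?" with one boolean.
frames_identical = df_1.equals(df_2)

print(elementwise)
print(via_method.equals(elementwise))
print(frames_identical)
```

The mask is useful when we need to know where the frames differ; equals() is enough when we only need a yes/no answer.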

import pandas as pd

data_season1 = {
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [10, 8, 6, 5, 4],
}

data_season2 = {
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [7, 8, 6, 7, 4],
}

df_1 = pd.DataFrame(data_season1)
df_2 = pd.DataFrame(data_season2)

print((df_1 == df_2).all(axis=1))

Output:

0    False
1     True
2     True
3    False
4     True
dtype: bool

In the output, rows with the value True are identical in both DataFrames, meaning every column matches. Rows with the value False differ in at least one column.

We can use this boolean Series as a mask to list all rows where the values of df_1 and df_2 are different.
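The row-wise result can also be summarised. The following sketch, using the same data as above, counts the matching rows and collects the index labels of the rows that differ. The variable names are chosen here for illustration.

```python
import pandas as pd

df_1 = pd.DataFrame({
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [10, 8, 6, 5, 4],
})
df_2 = pd.DataFrame({
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [7, 8, 6, 7, 4],
})

# True for rows where every column matches.
rows_equal = (df_1 == df_2).all(axis=1)

# Booleans sum as 0/1, so the sum is the number of identical rows.
n_same = int(rows_equal.sum())

# Index labels of rows with at least one differing column.
differing_labels = rows_equal.index[~rows_equal].tolist()

print(n_same)
print(differing_labels)
```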

import pandas as pd

data_season1 = {
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [10, 8, 6, 5, 4],
}

data_season2 = {
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [7, 8, 6, 7, 4],
}

df_1 = pd.DataFrame(data_season1)
df_2 = pd.DataFrame(data_season2)

print(df_1[~(df_1 == df_2).all(axis=1)])

Output:

        Player  Goals
0  Lewandowski     10
3        Messi      5

It lists all the rows in df_1 whose values differ from the corresponding rows in df_2.

If df_1 and df_2 had different indexes, we would get an error such as ValueError: Can only compare identically-labeled DataFrame objects.
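The filtered output shows only the df_1 side of each difference. One possible way to see both sides at once is to join the differing rows of the two DataFrames on their shared index. This is a sketch; the suffixes _season1 and _season2 are names picked here for illustration.

```python
import pandas as pd

df_1 = pd.DataFrame({
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [10, 8, 6, 5, 4],
})
df_2 = pd.DataFrame({
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [7, 8, 6, 7, 4],
})

# True for rows where at least one column differs.
differs = ~(df_1 == df_2).all(axis=1)

# Join the differing rows on the index; suffixes keep the columns apart.
side_by_side = df_1[differs].join(
    df_2[differs], lsuffix="_season1", rsuffix="_season2"
)

print(side_by_side)
```

Pandas 1.1 and later also provide DataFrame.compare(), which reports only the differing cells of two identically labeled DataFrames.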

import pandas as pd

data_season1 = {
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [10, 8, 6, 5, 4],
}

data_season2 = {
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [7, 8, 6, 7, 4],
}

df_1 = pd.DataFrame(data_season1)
df_2 = pd.DataFrame(data_season2, index=["a", "b", "c", "d", "e"])

print(df_1 == df_2)

Output:

Traceback (most recent call last):
...
ValueError: Can only compare identically-labeled DataFrame objects

We can use the pandas.DataFrame.reset_index() method to reset the index and overcome the above problem.

import pandas as pd

data_season1 = {
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [10, 8, 6, 5, 4],
}

data_season2 = {
    "Player": ["Lewandowski", "Haland", "Ronaldo", "Messi", "Mbappe"],
    "Goals": [7, 8, 6, 7, 4],
}

df_1 = pd.DataFrame(data_season1)
df_2 = pd.DataFrame(data_season2, index=["a", "b", "c", "d", "e"])
df_2.reset_index(drop=True, inplace=True)

print(df_1 == df_2)

Output:

   Player  Goals
0    True  False
1    True   True
2    True   True
3    True  False
4    True   True

It resets the index of df_2 before comparing df_1 and df_2 so that both DataFrames have the same index, making the comparison possible.
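The example above modifies df_2 in place. If the original labels should be preserved, a non-mutating variant is possible, since reset_index() returns a new DataFrame when inplace is not set. The sketch below uses a two-row subset of the data and also catches the ValueError to show when it occurs.

```python
import pandas as pd

# Two-row subset of the article's data, used only for illustration.
df_1 = pd.DataFrame({"Player": ["Lewandowski", "Messi"], "Goals": [10, 5]})
df_2 = pd.DataFrame(
    {"Player": ["Lewandowski", "Messi"], "Goals": [7, 5]},
    index=["a", "b"],
)

# Comparing frames with different index labels raises ValueError.
try:
    df_1 == df_2
    labels_match = True
except ValueError as exc:
    labels_match = False
    print(f"Comparison failed: {exc}")

# reset_index() without inplace=True returns a copy; df_2 keeps its labels.
df_2_aligned = df_2.reset_index(drop=True)
result = df_1 == df_2_aligned

print(result)
print(df_2.index.tolist())
```

Note that resetting the index compares rows by position, so this only makes sense when both DataFrames list their rows in the same order.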

We also have to make sure that both DataFrames have the same number of rows before comparing them.
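A row-count check can be done up front by comparing the shape attributes. The helper below, compare_if_same_shape, is a hypothetical function written for this sketch and is not part of Pandas. It assumes both DataFrames use the same column labels in the same order.

```python
import pandas as pd


def compare_if_same_shape(left, right):
    """Hypothetical helper: return the element-wise mask, or None when
    the shapes (rows, columns) differ. Assumes identical column labels."""
    if left.shape != right.shape:
        return None
    # Compare by position so differing index labels do not raise.
    return left.reset_index(drop=True) == right.reset_index(drop=True)


df_1 = pd.DataFrame({"Player": ["Lewandowski", "Messi"], "Goals": [10, 5]})
df_2 = pd.DataFrame({"Player": ["Lewandowski", "Messi"], "Goals": [7, 5]})
df_short = pd.DataFrame({"Player": ["Lewandowski"], "Goals": [7]})

mask = compare_if_same_shape(df_1, df_2)
no_mask = compare_if_same_shape(df_1, df_short)

print(mask)
print(no_mask)
```

Returning None is only one possible design; raising an exception with both shapes in the message may be preferable when a mismatch should stop the program.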
