Join Two DataFrames in Pandas with Python

In this tutorial, you will learn how to join 2 different DataFrames in pandas using Python.

A DataFrame is a two-dimensional, table-like data structure in which each column holds the values of one variable and each row holds one value from each column.


In order to show you how to join two DataFrames in Pandas with Python, we need to have two DataFrames.

First, let's create two custom DataFrames. The code is given below.
DataFrame 1:

import pandas as pd

data1 = {
    'id': ['1', '2', '3', '4', '5'],
    'Name': ['Alex', 'Ben', 'Chetan', 'Dinesh', 'Ethan'],
}
d1 = pd.DataFrame(data1, columns=['id', 'Name'])
print(d1)

Output

The printed DataFrame has five rows, indexed 0 to 4, with the columns id and Name.


DataFrame 2:

import pandas as pd

data2 = {
    'id': ['4', '5', '8', '9', '10'],
    'Name': ['Felix', 'Chetan', 'Alex', 'Deepak', 'John'],
}
d2 = pd.DataFrame(data2, columns=['id', 'Name'])
print(d2)

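Output

The printed DataFrame again has five rows, indexed 0 to 4, with the columns id and Name. Only the ids 4 and 5 also appear in DataFrame 1.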

Here data1 and data2 are dictionaries: each key is a column name and each list holds that column's values, one per row. To convert these dictionaries into DataFrames we use the pd.DataFrame() constructor. The columns argument selects which keys to use as columns and sets their order.
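To make the role of the columns argument concrete, here is a minimal sketch. The names data, default_df and reordered_df are made up for this illustration and are not part of the tutorial's code.

```python
import pandas as pd

# Hypothetical sample data, shortened from the tutorial's data1.
data = {
    'id': ['1', '2', '3'],
    'Name': ['Alex', 'Ben', 'Chetan'],
}

# Without columns, the dictionary keys become the columns in insertion order.
default_df = pd.DataFrame(data)

# With columns, the listed keys are used in the order given.
reordered_df = pd.DataFrame(data, columns=['Name', 'id'])

print(list(default_df.columns))    # ['id', 'Name']
print(list(reordered_df.columns))  # ['Name', 'id']
```

In the tutorial's code the columns list matches the dictionary's key order, so passing it does not change the result.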

To join DataFrames we use the merge() function with the on argument, which specifies the column to join on.
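Both of our DataFrames also have a Name column that is not the join key. The sketch below shows how merge() handles that clash, using two small made-up frames called left and right. The suffixes parameter is a real merge() option, but the labels '_d1' and '_d2' are only an illustrative choice.

```python
import pandas as pd

# Hypothetical two-row frames that share both column names.
left = pd.DataFrame({'id': ['4', '5'], 'Name': ['Dinesh', 'Ethan']})
right = pd.DataFrame({'id': ['4', '5'], 'Name': ['Felix', 'Chetan']})

# 'id' is the join key. The non-key 'Name' columns clash, so pandas
# renames them with the default suffixes '_x' (left) and '_y' (right).
merged = pd.merge(left, right, on='id')
print(list(merged.columns))  # ['id', 'Name_x', 'Name_y']

# The suffixes parameter replaces the defaults with custom labels.
labelled = pd.merge(left, right, on='id', suffixes=('_d1', '_d2'))
print(list(labelled.columns))  # ['id', 'Name_d1', 'Name_d2']
```

This is why the merged results in the sections that follow contain Name_x and Name_y rather than a single Name column.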

This tutorial covers four ways to join two DataFrames. These are:

  • Inner Join
  • Right Join
  • Left Join
  • Outer Join
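As a quick preview of how the four options differ, the following sketch runs each of them on the tutorial's two DataFrames and records how many rows come back. The dictionary name row_counts is just a helper chosen for this example.

```python
import pandas as pd

d1 = pd.DataFrame({
    'id': ['1', '2', '3', '4', '5'],
    'Name': ['Alex', 'Ben', 'Chetan', 'Dinesh', 'Ethan'],
})
d2 = pd.DataFrame({
    'id': ['4', '5', '8', '9', '10'],
    'Name': ['Felix', 'Chetan', 'Alex', 'Deepak', 'John'],
})

# Run the same merge once per join type and count the resulting rows.
row_counts = {}
for how in ['inner', 'right', 'left', 'outer']:
    row_counts[how] = len(pd.merge(d1, d2, on='id', how=how))

print(row_counts)  # {'inner': 2, 'right': 5, 'left': 5, 'outer': 8}
```

Only ids 4 and 5 appear in both DataFrames, which explains the two rows for the inner join and the eight distinct ids for the outer join.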

Inner Join of two DataFrames in Pandas

Inner Join keeps only the rows whose id appears in both DataFrame 1 and DataFrame 2. We use the merge() function and pass 'inner' as the how argument.

df_inner = pd.merge(d1, d2, on='id', how='inner')

print(df_inner)

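Output

The result has two rows, for ids 4 and 5, the only ids present in both DataFrames. Because both inputs have a Name column, pandas labels them Name_x (from d1: Dinesh, Ethan) and Name_y (from d2: Felix, Chetan).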
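One way to convince yourself of what the inner join keeps is to compare its ids with a plain set intersection. This is only a sanity-check sketch, and common_ids is a name introduced here for illustration.

```python
import pandas as pd

d1 = pd.DataFrame({
    'id': ['1', '2', '3', '4', '5'],
    'Name': ['Alex', 'Ben', 'Chetan', 'Dinesh', 'Ethan'],
})
d2 = pd.DataFrame({
    'id': ['4', '5', '8', '9', '10'],
    'Name': ['Felix', 'Chetan', 'Alex', 'Deepak', 'John'],
})

# The ids that occur in both DataFrames, computed without pandas merging.
common_ids = set(d1['id']) & set(d2['id'])

df_inner = pd.merge(d1, d2, on='id', how='inner')

print(sorted(common_ids))                  # ['4', '5']
print(set(df_inner['id']) == common_ids)   # True
```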

Right Join of two DataFrames in Pandas

Right Join keeps every row of DataFrame 2 and adds the matching data from DataFrame 1. Where a row of DataFrame 2 has no match in DataFrame 1, the columns that come from DataFrame 1 contain NaN (null). We use the merge() function and pass 'right' as the how argument.

df_right = pd.merge(d1, d2, on='id', how='right')
print(df_right)

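Output

The result has five rows, one for each id in d2 (4, 5, 8, 9 and 10). Name_y holds the names from d2. Name_x, which comes from d1, holds Dinesh and Ethan for ids 4 and 5 and NaN for ids 8, 9 and 10.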
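To pick out the rows of a right join that found no partner in DataFrame 1, one option is to filter on the NaN values in the d1-side column. The sketch below assumes the default Name_x suffix, and unmatched is a name chosen for this example.

```python
import pandas as pd

d1 = pd.DataFrame({
    'id': ['1', '2', '3', '4', '5'],
    'Name': ['Alex', 'Ben', 'Chetan', 'Dinesh', 'Ethan'],
})
d2 = pd.DataFrame({
    'id': ['4', '5', '8', '9', '10'],
    'Name': ['Felix', 'Chetan', 'Alex', 'Deepak', 'John'],
})

df_right = pd.merge(d1, d2, on='id', how='right')

# Name_x comes from d1, so it is NaN wherever d2's id had no match in d1.
unmatched = df_right[df_right['Name_x'].isna()]

# The ids are strings, so sort them by their numeric value for display.
print(sorted(unmatched['id'], key=int))  # ['8', '9', '10']
```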

Left Join of two DataFrames in Pandas

Left Join keeps every row of DataFrame 1 and adds the matching data from DataFrame 2. Where a row of DataFrame 1 has no match in DataFrame 2, the columns that come from DataFrame 2 contain NaN (null). We use the merge() function and pass 'left' as the how argument.

df_left = pd.merge(d1, d2, on='id', how='left')
print(df_left)

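Output

The result has five rows, one for each id in d1 (1 to 5). Name_x holds the names from d1. Name_y, which comes from d2, is NaN for ids 1, 2 and 3 and holds Felix and Chetan for ids 4 and 5.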
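If you would rather not infer matches from NaN values, merge() also accepts indicator=True, which adds a _merge column recording where each row came from. The tutorial's own code does not use this option, so treat the following as an optional sketch. The name matched is hypothetical.

```python
import pandas as pd

d1 = pd.DataFrame({
    'id': ['1', '2', '3', '4', '5'],
    'Name': ['Alex', 'Ben', 'Chetan', 'Dinesh', 'Ethan'],
})
d2 = pd.DataFrame({
    'id': ['4', '5', '8', '9', '10'],
    'Name': ['Felix', 'Chetan', 'Alex', 'Deepak', 'John'],
})

# indicator=True adds a '_merge' column: 'both' for rows found in d1 and d2,
# 'left_only' for rows that exist only in d1.
df_left = pd.merge(d1, d2, on='id', how='left', indicator=True)
print(df_left[['id', '_merge']])

# Keep only the rows of d1 that found a partner in d2.
matched = df_left[df_left['_merge'] == 'both']
print(list(matched['id']))  # ['4', '5']
```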

Outer Join of two DataFrames in Pandas

Outer Join keeps every row from both DataFrame 1 and DataFrame 2. Where an id appears in only one of them, the columns that come from the other DataFrame are filled with NaN. We use the merge() function and pass 'outer' as the how argument.

df_outer = pd.merge(d1, d2, on='id', how='outer')
print(df_outer)

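Output

The result has eight rows, one for each distinct id across both DataFrames (1, 2, 3, 4, 5, 8, 9 and 10). Name_y is NaN for ids 1, 2 and 3, which exist only in d1, and Name_x is NaN for ids 8, 9 and 10, which exist only in d2.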
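A short way to check how many gaps the outer join had to fill is to count the NaN values per column. The sketch below assumes the default Name_x and Name_y suffixes, and missing_per_column is a name made up for this example. It does not rely on the row order of the result.

```python
import pandas as pd

d1 = pd.DataFrame({
    'id': ['1', '2', '3', '4', '5'],
    'Name': ['Alex', 'Ben', 'Chetan', 'Dinesh', 'Ethan'],
})
d2 = pd.DataFrame({
    'id': ['4', '5', '8', '9', '10'],
    'Name': ['Felix', 'Chetan', 'Alex', 'Deepak', 'John'],
})

df_outer = pd.merge(d1, d2, on='id', how='outer')

# Count the NaN cells in each column, converted to plain Python ints.
missing_per_column = {
    column: int(count) for column, count in df_outer.isna().sum().items()
}
print(missing_per_column)  # {'id': 0, 'Name_x': 3, 'Name_y': 3}
```

The three missing Name_x values belong to ids 8, 9 and 10, and the three missing Name_y values belong to ids 1, 2 and 3.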
