Exclude a particular column from a DataFrame in Python

In this article, we will look at how to exclude a particular column from a pandas DataFrame in Python.

Let us first create a DataFrame. For this, we need to import pandas, an open-source Python library for creating and manipulating tabular data. Look at the following code:

import pandas as pd

# Sample employee records; each dictionary key becomes a column.
details_of_employee = {
    "Name": ["Ruchita", "Avni", "Deepak", "Vish"],
    "Age": [23, 45, 21, 39],
    "Designation": ["C.A", "PHP Developer", "Android Developer", "Data Scientist"],
    "Salary": [34000, 45000, 56000, 89000],
    "Experience": [2, 3, 6, 7],
}

df = pd.DataFrame(details_of_employee)

print(df)

OUTPUT

      Name  Age        Designation  Salary  Experience
0  Ruchita   23                C.A   34000           2
1     Avni   45      PHP Developer   45000           3
2   Deepak   21  Android Developer   56000           6
3     Vish   39     Data Scientist   89000           7

We will perform all the operations on this DataFrame.

Exclude a particular column from a DataFrame in Python

Let us now look at ways to exclude a particular column of a pandas DataFrame in Python.

(i) dataframe.columns.difference()

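dataframe.columns.difference() returns the column labels of the DataFrame that are not in the list passed as an argument. Indexing the DataFrame with these labels creates a new DataFrame without the excluded column. By default the remaining labels are returned sorted, so the columns of the new DataFrame appear in alphabetical order. Look at the following code: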

new_df = df[df.columns.difference(['Experience'])]
print(new_df)

OUTPUT

   Age        Designation     Name  Salary
0   23                C.A  Ruchita   34000
1   45      PHP Developer     Avni   45000
2   21  Android Developer   Deepak   56000
3   39     Data Scientist     Vish   89000

In this case, we passed the column “Experience” as the argument. Hence, the new DataFrame contains every column except “Experience”.
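If the original column order matters, the sort parameter of Index.difference() can be set to False. The sketch below rebuilds the article's DataFrame so that it runs on its own; the variable names sorted_cols, ordered_cols and ordered_df are chosen here only for illustration.

```python
import pandas as pd

# Same sample data as the article's DataFrame.
df = pd.DataFrame({
    "Name": ["Ruchita", "Avni", "Deepak", "Vish"],
    "Age": [23, 45, 21, 39],
    "Designation": ["C.A", "PHP Developer", "Android Developer", "Data Scientist"],
    "Salary": [34000, 45000, 56000, 89000],
    "Experience": [2, 3, 6, 7],
})

# Default behaviour: the remaining labels come back sorted.
sorted_cols = df.columns.difference(["Experience"])

# sort=False leaves the remaining labels in their original order.
ordered_cols = df.columns.difference(["Experience"], sort=False)
ordered_df = df[ordered_cols]

print(list(sorted_cols))         # alphabetical order
print(list(ordered_df.columns))  # original order, without "Experience"
```

With sort=False, the new DataFrame keeps “Name” as the first column instead of moving “Age” to the front.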

(ii) dataframe.columns != 'column_name'

The expression dataframe.columns != 'column_name' compares every column label with 'column_name' and yields a boolean mask that is False only for the matching column. Passing this mask to dataframe.loc, an indexer that selects rows and columns by label or boolean mask, keeps every column except that one. Look at the following code:

new_df = df.loc[:, df.columns != 'Age']
print(new_df)

OUTPUT

      Name        Designation  Salary  Experience
0  Ruchita                C.A   34000           2
1     Avni      PHP Developer   45000           3
2   Deepak  Android Developer   56000           6
3     Vish     Data Scientist   89000           7

DataFrame.loc takes a row selector followed by a column selector. In this case, “:” selects all rows and df.columns != 'Age' selects all columns except “Age”. Hence, a new DataFrame is created without the “Age” column.
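To see what the column selector actually contains, it can help to print the mask before handing it to loc. The following is a minimal sketch that uses a two-row subset of the article's data so it runs on its own; the name mask is just an illustrative choice.

```python
import pandas as pd

# Two-row subset of the article's data, enough to show the mask.
df = pd.DataFrame({
    "Name": ["Ruchita", "Avni"],
    "Age": [23, 45],
    "Designation": ["C.A", "PHP Developer"],
    "Salary": [34000, 45000],
    "Experience": [2, 3],
})

# Comparing the column Index with a label gives one boolean per column.
mask = df.columns != "Age"
print(mask)

# loc keeps only the columns whose mask entry is True.
new_df = df.loc[:, mask]
print(list(new_df.columns))
```

The mask has one entry per column, and only the entry for “Age” is False.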

(iii) ~dataframe.columns.isin(['column_name'])

dataframe.columns.isin() returns a boolean mask that is True for every column whose label appears in the list passed as an argument. The ~ operator inverts this mask, so ~dataframe.columns.isin() is True for all the remaining columns. Passing the inverted mask to dataframe.loc therefore excludes the listed columns and selects the rest. Look at the following code:

new_df = df.loc[:, ~df.columns.isin(['Salary'])]
print(new_df)

OUTPUT

      Name  Age        Designation  Experience
0  Ruchita   23                C.A           2
1     Avni   45      PHP Developer           3
2   Deepak   21  Android Developer           6
3     Vish   39     Data Scientist           7

DataFrame.loc takes a row selector followed by a column selector. In this case, “:” selects all rows and ~df.columns.isin(['Salary']) selects all columns except “Salary”. Hence, a new DataFrame is created without the “Salary” column.

In this way, we can exclude a particular column from a DataFrame in Python.
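Because isin() accepts a list, the same pattern can exclude several columns at once. The sketch below uses a two-row subset of the article's data so it runs on its own. The label "Bonus" is a hypothetical column that does not exist in the data; it is included only to show that isin() skips unknown labels instead of raising an error.

```python
import pandas as pd

# Two-row subset of the article's data.
df = pd.DataFrame({
    "Name": ["Ruchita", "Avni"],
    "Age": [23, 45],
    "Designation": ["C.A", "PHP Developer"],
    "Salary": [34000, 45000],
    "Experience": [2, 3],
})

# Exclude two columns in a single step.
without_two = df.loc[:, ~df.columns.isin(["Age", "Salary"])]
print(list(without_two.columns))

# "Bonus" (hypothetical) is not a column, so nothing is excluded.
unchanged = df.loc[:, ~df.columns.isin(["Bonus"])]
print(list(unchanged.columns))
```

This makes isin() a convenient choice when the list of columns to exclude comes from elsewhere and may contain labels that are not present in the DataFrame.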

Thank You.

