How to rename columns in Pandas DataFrame?
In this article, we will study how to rename columns in a Pandas DataFrame using Python. Let’s first understand what Pandas and DataFrame are.
Pandas is an open-source library for Python. It is used for data manipulation and data analysis.
A DataFrame is a two-dimensional data structure. Data in a DataFrame is arranged in a tabular fashion, in rows and columns, and each column has a name (label).
Rename Columns in Pandas DataFrame
Step 1: Import Pandas
Importing Pandas is the first step for using DataFrame. Following is the code:
import pandas as pd
Step 2: Create DataFrame
Since we are learning how to rename the columns of a DataFrame, we first need a DataFrame to work with.
details = {
    'Name': ['Rani', 'Teju', 'Bhushan', 'Roshan'],
    'Age': [29, 26, 34, 67],
    'Salary': [23000, 67000, 80000, 56000],
    'Designation': ['C.A', 'Accountant', 'Data Scientist', 'Data Analyst'],
}
df = pd.DataFrame(details)
print(df)
OUTPUT
      Name  Age  Salary     Designation
0     Rani   29   23000             C.A
1     Teju   26   67000      Accountant
2  Bhushan   34   80000  Data Scientist
3   Roshan   67   56000    Data Analyst
Different Techniques used to rename columns of DataFrame:
(i) DataFrame.rename()
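rename() is a method of pandas.DataFrame. It changes the labels of rows and columns individually. It takes a mapping from old names to new names: the “index” parameter renames rows and the “columns” parameter renames columns. rename() returns a new DataFrame rather than modifying the original, so we assign the result back to df. To change the name of a column, use the following code:
df = df.rename(columns = {'Name':'NAME'})
print(df)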
OUTPUT
      NAME  Age  Salary     Designation
0     Rani   29   23000             C.A
1     Teju   26   67000      Accountant
2  Bhushan   34   80000  Data Scientist
3   Roshan   67   56000    Data Analyst
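Because rename() returns a new DataFrame, it is worth checking what happens to the original. The following is a minimal sketch on a small, made-up DataFrame (df_demo and renamed are illustrative names, not part of the example above):

```python
import pandas as pd

# Small illustrative DataFrame (hypothetical data)
df_demo = pd.DataFrame({'Name': ['Rani', 'Teju'], 'Age': [29, 26]})

# rename() returns a new DataFrame; df_demo itself is left unchanged
renamed = df_demo.rename(columns={'Name': 'NAME'})

print(list(df_demo.columns))   # original labels
print(list(renamed.columns))   # renamed labels

# Assign the result back to keep the new column name
df_demo = df_demo.rename(columns={'Name': 'NAME'})
print(list(df_demo.columns))
```

If the result is not assigned back (or stored in another variable), the renamed DataFrame is simply discarded.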
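rename() can also rename more than one column in a single call. Each old name is mapped to its new name in the same dictionary. Let’s look at the following code:
df = df.rename(columns = {'Age':'AGE','Salary':'SALARY'})
print(df)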
OUTPUT
      NAME  AGE  SALARY     Designation
0     Rani   29   23000             C.A
1     Teju   26   67000      Accountant
2  Bhushan   34   80000  Data Scientist
3   Roshan   67   56000    Data Analyst
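When every column needs the same transformation, such as upper-casing, rename() also accepts a function instead of a dictionary. A minimal sketch, assuming a small made-up DataFrame (df_demo, upper and prefixed are illustrative names):

```python
import pandas as pd

# Small illustrative DataFrame (hypothetical data)
df_demo = pd.DataFrame({'Name': ['Rani', 'Teju'], 'Age': [29, 26]})

# Pass a function: it is applied to every column label
upper = df_demo.rename(columns=str.upper)
print(list(upper.columns))

# Any callable works, for example a lambda that adds a prefix
prefixed = df_demo.rename(columns=lambda name: 'emp_' + name)
print(list(prefixed.columns))
```

A dictionary is the better fit for a few specific columns, while a function suits a rule that applies to all of them.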
(ii) By passing list of columns
In this method, we put the new column names into a list and assign that list to the columns attribute of the DataFrame. The list must contain one name for every column, in order. Let’s look at the following code:
df.columns = ['Name','Age','Income','Occupation']
print(df.columns)
OUTPUT
Index(['Name', 'Age', 'Income', 'Occupation'], dtype='object')
Let’s print the dataframe with new column names:
print(df)
OUTPUT
      Name  Age  Income      Occupation
0     Rani   29   23000             C.A
1     Teju   26   67000      Accountant
2  Bhushan   34   80000  Data Scientist
3   Roshan   67   56000    Data Analyst
Let us now try to update any one column name instead of all columns. Look at the following code:
df.columns = ['Income']
If we run the above code, it raises a ValueError with the following message:
ValueError: Length mismatch: Expected axis has 4 elements, new values have 1 elements
Hence, this method has one drawback: the list must contain a name for every column, even if we want to rename only a few of them.
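One possible way around this drawback, while still assigning to the columns attribute, is to build the full list from the existing names and swap only the one that should change. This is a sketch on a made-up DataFrame (df_demo and new_names are illustrative names), not the only way to do it:

```python
import pandas as pd

# Small illustrative DataFrame (hypothetical data)
df_demo = pd.DataFrame({'Name': ['Rani'], 'Age': [29], 'Salary': [23000]})

# Keep every existing name except 'Salary', which becomes 'Income'
new_names = ['Income' if col == 'Salary' else col for col in df_demo.columns]

# The list has the same length as the number of columns, so no ValueError
df_demo.columns = new_names
print(list(df_demo.columns))
```

For renaming just a few columns, rename() with a dictionary is usually simpler, since it needs only the names that change.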
(iii) Using axis = 1
A DataFrame is a two-dimensional data structure with rows and columns. In rename(), axis = 1 targets the column labels and axis = 0 targets the row labels. To update column names, we pass the mapping together with axis = 1 and assign the result back to df. Let’s look at the following code:
df = df.rename({'Income':'Salary','Occupation':'Designation'},axis = 1)
print(df)
OUTPUT
      Name  Age  Salary     Designation
0     Rani   29   23000             C.A
1     Teju   26   67000      Accountant
2  Bhushan   34   80000  Data Scientist
3   Roshan   67   56000    Data Analyst
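To make the difference between the two axes concrete, the sketch below applies a mapping first with axis = 0 and then with axis = 1 on a small made-up DataFrame (df_demo, rows_renamed and cols_renamed are illustrative names):

```python
import pandas as pd

# Small illustrative DataFrame (hypothetical data); row labels are 0 and 1
df_demo = pd.DataFrame({'Name': ['Rani', 'Teju'], 'Age': [29, 26]})

# axis=0: the mapping is applied to the row labels (the index)
rows_renamed = df_demo.rename({0: 'first'}, axis=0)
print(list(rows_renamed.index))
print(list(rows_renamed.columns))

# axis=1: the mapping is applied to the column labels
cols_renamed = df_demo.rename({'Age': 'Years'}, axis=1)
print(list(cols_renamed.index))
print(list(cols_renamed.columns))
```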
(iv) Using axis = “columns”
Column names can also be updated by setting the axis parameter to “columns”, which is equivalent to axis = 1 but easier to read. Let’s look at the following code:
df = df.rename({'Salary':'Payment'},axis = "columns")
print(df)
OUTPUT
      Name  Age  Payment     Designation
0     Rani   29    23000             C.A
1     Teju   26    67000      Accountant
2  Bhushan   34    80000  Data Scientist
3   Roshan   67    56000    Data Analyst
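The three spellings used in this article (the columns keyword, axis = 1 and axis = "columns") should give the same result. The sketch below checks that on a made-up DataFrame. It also shows the errors parameter of rename(), assuming a pandas version that supports it (0.25 or later): by default a name that does not exist is silently ignored, while errors = 'raise' turns it into a KeyError. The names df_demo, a, b, c, ignored and raised are illustrative.

```python
import pandas as pd

# Small illustrative DataFrame (hypothetical data)
df_demo = pd.DataFrame({'Name': ['Rani', 'Teju'], 'Salary': [23000, 67000]})

# Three equivalent ways to rename a column
a = df_demo.rename(columns={'Salary': 'Payment'})
b = df_demo.rename({'Salary': 'Payment'}, axis=1)
c = df_demo.rename({'Salary': 'Payment'}, axis='columns')
print(list(a.columns), list(b.columns), list(c.columns))

# By default, a missing column name is ignored without any error
ignored = df_demo.rename(columns={'Income': 'Payment'})
print(list(ignored.columns))

# errors='raise' reports the missing name instead
raised = False
try:
    df_demo.rename(columns={'Income': 'Payment'}, errors='raise')
except KeyError as exc:
    raised = True
    print('KeyError:', exc)
```

Using errors = 'raise' can help catch typos in column names, which would otherwise leave the DataFrame unchanged without any warning.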
Thank You.