How to Reindex and Rename Pandas Dataframe in Python
In this blog, we will learn how to re-index a Pandas Dataframe and rename its axes in Python. After forming a Dataframe, naming the columns and giving index labels to the records, one might want to re-index the Dataframe. By default, Pandas labels the rows of a Dataframe with the integers 0, 1, 2, 3 and so on.
Now suppose I want to label my records according to the data that they represent. I can do this with the index parameter when creating the Dataframe, where I specify the label for each record. If I later want a different set of index labels, for example because of an error made previously, the reindex() function can be used. Let us see how through an example.
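The difference between the default integer labels and a custom index can be seen in a minimal sketch. The variable names marks, default_df and labelled_df are made up for this illustration and are not part of the example that follows.

```python
import pandas as pd

# Illustrative data only: marks of two students in three subjects
marks = {"Arun": [11, 12, 14], "Karan": [9, 15, 14]}

# Without an explicit index, pandas labels the rows 0, 1, 2, ...
default_df = pd.DataFrame(marks)
print(list(default_df.index))    # [0, 1, 2]

# Passing index= at construction time attaches meaningful labels
labelled_df = pd.DataFrame(marks, index=["Maths", "Physics", "Chemistry"])
print(list(labelled_df.index))   # ['Maths', 'Physics', 'Chemistry']

# Rows can now be looked up by label instead of by position
print(labelled_df.loc["Physics", "Karan"])   # 15
```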
Re-indexing in Pandas Dataframe in Python
Let us make a Dataframe of three students, namely Arun, Karan and Aman, with their marks in three subjects: Maths, Physics and Chemistry. Here the three subjects are taken as the index. Now if I want English in the index in place of Physics, I will use the reindex() function. reindex() builds a new Dataframe with exactly the labels I ask for: the Physics record is left out, and English is added with NA (displayed by Pandas as NaN) in every column, because the Dataframe holds no data for that label. The fill_value parameter can be used to put a value of my choice into the English record instead of NA.
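To make the NA behaviour concrete, here is a small sketch that calls reindex() without fill_value, using the same marks as the example in this post. The name no_fill is only used for this illustration.

```python
import pandas as pd

Student_Data = {"Arun": [11, 12, 14], "Karan": [9, 15, 14], "Aman": [12, 13, 12]}
Table = pd.DataFrame(Student_Data, index=["Maths", "Physics", "Chemistry"])

# No fill_value: the new "English" row has no data, so it is filled with NaN
no_fill = Table.reindex(["Maths", "English", "Chemistry"])
print(no_fill)

# "Physics" is not in the new label list, so that row is absent from the result
print("Physics" in no_fill.index)            # False
print(no_fill.loc["English"].isna().all())   # True

# reindex() returns a new Dataframe; the original Table is unchanged
print("Physics" in Table.index)              # True
```

Because NaN is a floating-point value, the columns of no_fill are stored as floats rather than integers.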
Steps to follow for re-indexing
We will first form a Dataframe.
- Here the marks of Arun, Karan and Aman in the various subjects are stored in a variable named “Student_Data”
- The Dataframe is created through Pandas with “Student_Data” as the data, the names of the students as the columns and the subjects as the index. This Dataframe is stored in the variable “Table”
- Now to view the Dataframe we display Table
Code:
import pandas as pd

# Marks of each student in Maths, Physics and Chemistry
Student_Data = {'Arun': [11, 12, 14], 'Karan': [9, 15, 14], 'Aman': [12, 13, 12]}
Table = pd.DataFrame(Student_Data, columns=["Arun", "Karan", "Aman"],
                     index=["Maths", "Physics", "Chemistry"])
Table
Output:
| | Arun | Karan | Aman |
|---|---|---|---|
| Maths | 11 | 9 | 12 |
| Physics | 12 | 15 | 13 |
| Chemistry | 14 | 14 | 12 |
Now for re-indexing we follow these steps:
- We take “Table”, which is our Dataframe, and apply the reindex() function to it.
- In the function, we specify the new list of index labels. Labels that already exist keep their data, labels left out of the list are dropped, and labels that are new have no data. For example, if English is listed in place of Physics, the marks of all students in English will show as NA or Not Available, because the Dataframe does not have data on marks of students in English. With the fill_value parameter, the value given in fill_value is stored in place of “NA”
Code:
Table.reindex(["Maths","English","Chemistry"],fill_value=10)
Output:
| | Arun | Karan | Aman |
|---|---|---|---|
| Maths | 11 | 9 | 12 |
| English | 10 | 10 | 10 |
| Chemistry | 14 | 14 | 12 |
As we can see, English has been included in the index in place of Physics. The Physics marks are no longer present, and the marks of every student in English are 10, the number that we gave in fill_value.
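If the goal is only to change the label Physics to English while keeping the existing marks, the rename() function is usually the better fit. This is an addition to the steps above rather than part of them. The sketch below contrasts the two on the same data, and the names renamed and reindexed are only used here.

```python
import pandas as pd

Student_Data = {"Arun": [11, 12, 14], "Karan": [9, 15, 14], "Aman": [12, 13, 12]}
Table = pd.DataFrame(Student_Data, index=["Maths", "Physics", "Chemistry"])

# rename() changes the label only; the Physics marks stay in the row
renamed = Table.rename(index={"Physics": "English"})
print(renamed.loc["English"].tolist())     # [12, 15, 13]

# reindex() looks labels up; "English" does not exist, so fill_value is used
reindexed = Table.reindex(["Maths", "English", "Chemistry"], fill_value=10)
print(reindexed.loc["English"].tolist())   # [10, 10, 10]
```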
Renaming axis in Python
Let us get to the second part of our objective, renaming the axes. Taking the same example forward, the Dataframe “Table” does not give me a clear picture of what my rows and columns represent. How do I give a name to my columns and rows so that my Dataframe is well defined? Here the rename_axis() function plays an important role.
Steps to rename axis in Python
- We first use the function rename_axis() on the Dataframe “Table” and give the name “Subject”. By default, Pandas treats “Subject” as the name for the rows, that is, the index. We save this in a variable named “New_Table”
- Now we take “New_Table” and apply rename_axis() to it again. Here “Student_Name” is passed and axis is given as "columns". Through this, Pandas knows that “Student_Name” is for the columns and not for the rows
Code:
New_Table = Table.rename_axis("Subject")
New_Table
Output:
| | Arun | Karan | Aman |
|---|---|---|---|
| Subject | | | |
| Maths | 11 | 9 | 12 |
| Physics | 12 | 15 | 13 |
| Chemistry | 14 | 14 | 12 |
Code for renaming column axis:
New_Table.rename_axis("Student_Name",axis="columns")
Output:
| Student_Name | Arun | Karan | Aman |
|---|---|---|---|
| Subject | | | |
| Maths | 11 | 9 | 12 |
| Physics | 12 | 15 | 13 |
| Chemistry | 14 | 14 | 12 |
As we can see in the output, through this code we have named the two axes of the Dataframe and now it is clear what the rows and columns mean. This makes the Table easy to interpret and manipulate further.
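As a variation on the two steps above, rename_axis() also accepts index and columns keyword arguments, so both names can be set in a single call. The sketch below assumes a reasonably recent Pandas version, and the name Named_Table is only used for this illustration.

```python
import pandas as pd

Student_Data = {"Arun": [11, 12, 14], "Karan": [9, 15, 14], "Aman": [12, 13, 12]}
Table = pd.DataFrame(Student_Data, index=["Maths", "Physics", "Chemistry"])

# Name the row axis and the column axis in one call
Named_Table = Table.rename_axis(index="Subject", columns="Student_Name")

# The names are stored on the index and column objects
print(Named_Table.index.name)     # Subject
print(Named_Table.columns.name)   # Student_Name

# The original Table is unchanged and still has unnamed axes
print(Table.index.name)           # None
```

Only the axis names change; the marks and the row and column labels stay exactly as they were.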