Index Resetting in Pandas Dataframe in Python

In this tutorial, we will reset the index of a Pandas DataFrame in Python using the reset_index() method.

First, a word on Pandas itself. Pandas is an open-source Python library that provides tools for data analysis in fields such as finance and statistics. By convention, it is imported with “import pandas as pd”.

The Pandas library is widely used for data science work in Python, and its central object is the DataFrame.

Let us see more on Dataframes before we proceed with the main task.

What are Dataframes in Pandas Library?

A DataFrame is a two-dimensional, mutable, tabular data structure made up of labelled rows and columns. It holds data in a structured format that makes analysis and prediction easier. Each column can hold a different data type, which is why DataFrames are called heterogeneous.

There are many ways to create a DataFrame. Data stored in CSV files, Excel files, and similar sources is usually loaded into a DataFrame before analysis, and lists, arrays, and dictionaries can be converted into one directly. Here is the code for the dictionary case:

# import pandas
import pandas as pd 
  
# initializing data
dataset = {'Name':['Jeetu', 'Piku', 'Paro', 'Chetona', 'Rik'], 
    'Age':[25, 22, 27, 30, 29], 
    'Job':['TCS', 'Accenture', 'Amazon', 'Google', 'Capgemini'], 
    'Salary':['20000', '25000', '50000', '45000', '30000'] } 

# Convert dictionary into DataFrame 
df = pd.DataFrame(dataset) 

# print df
df

Here we converted a dictionary into a dataframe. This is the original dataset we will use for our task.

Output :

	Name	Age	Job	Salary
0	Jeetu	25	TCS	20000
1	Piku	22	Accenture	25000
2	Paro	27	Amazon	50000
3	Chetona	30	Google	45000
4	Rik	29	Capgemini	30000
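
Besides a dictionary of columns, the lists mentioned above can be converted as well. The snippet below is a small sketch of two such cases; the variable names (records, rows, df_from_records, df_from_rows) are made up for illustration and are not part of the dataset used in the rest of this tutorial.

```python
import pandas as pd

# A list of row dictionaries: each dictionary becomes one row.
records = [
    {"Name": "Jeetu", "Age": 25},
    {"Name": "Piku", "Age": 22},
]
df_from_records = pd.DataFrame(records)

# A list of lists carries no column names, so they are passed separately.
rows = [["Jeetu", 25], ["Piku", 22]]
df_from_rows = pd.DataFrame(rows, columns=["Name", "Age"])

# Both routes produce the same table with a default integer index.
print(df_from_records.equals(df_from_rows))
print(df_from_records.index.tolist())
```

In both cases Pandas assigns the default integer index starting at 0, just as it did for the dictionary above.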

How to use reset_index() for the Task?

Our task is to reset the index of a Pandas DataFrame in Python. Resetting is usually needed after an operation such as filtering produces a smaller DataFrame from a larger one: the surviving rows keep their original labels, so the index is left with gaps. Resetting renumbers the rows continuously, which makes the DataFrame easier to work with.

Before writing any code, we need to know what reset_index() does. As the name suggests, it resets the index of the DataFrame to the default integer index 0, 1, 2, and so on. By default the previous index is kept as a new column; it can also be discarded. Let us see the syntax.

DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')
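
To make the motivation concrete, here is a minimal sketch of an index with gaps after filtering, and of reset_index() closing them. The cut-down two-column dataset and the names older and renumbered are assumptions made for this illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jeetu", "Piku", "Paro", "Chetona", "Rik"],
    "Age": [25, 22, 27, 30, 29],
})

# Filtering keeps the original row labels, so label 1 is now missing.
older = df[df["Age"] >= 25]
print(older.index.tolist())        # [0, 2, 3, 4]

# By default reset_index() returns a new DataFrame and leaves `older` alone.
# drop=True discards the old labels instead of keeping them as a column.
renumbered = older.reset_index(drop=True)
print(renumbered.index.tolist())   # [0, 1, 2, 3]
print(older.index.tolist())        # [0, 2, 3, 4]
```

Since inplace defaults to False, the result has to be assigned to a variable; the approaches below pass inplace=True to modify the DataFrame directly instead.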

Approaching the Task

Approach 1 : Reset the index and keep the custom index as a column

To do this,

First, convert the original dictionary into a DataFrame with a custom index: pd.DataFrame(data, index=index), and store the result in df.
Next, call df.reset_index(inplace=True), where inplace=True means that df itself is modified instead of a new DataFrame being returned.
Print df.

# import pandas
import pandas as pd

# Define a dictionary containing employee data
dataset = {'Name':['Jeetu', 'Piku', 'Paro', 'Chetona', 'Rik'],
    'Age':[25, 22, 27, 30, 29],
    'Job':['TCS', 'Accenture', 'Amazon', 'Google', 'Capgemini'],
    'Salary':['20000', '25000', '50000', '45000', '30000'] }

# custom index labels (a list, because a set has no guaranteed order)
index = ['a', 'b', 'c', 'd', 'e']

# Convert dictionary into DataFrame with the custom index
df = pd.DataFrame(dataset, index=index)

# move the custom index into a column and restore the default integer index
df.reset_index(inplace = True)

df

Output :

	index	Name	Age	Job	Salary
0	a	Jeetu	25	TCS	20000
1	b	Piku	22	Accenture	25000
2	c	Paro	27	Amazon	50000
3	d	Chetona	30	Google	45000
4	e	Rik	29	Capgemini	30000

Here, you can see that the custom labels have moved into a column named index and a fresh integer index from 0 to 4 has taken their place, so both are kept.
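
The column is called index only because the custom index had no name of its own. As a small sketch on a shortened three-row dataset (an assumption for brevity), naming the index first with rename_axis() gives the new column that name; the name label and the variable labelled are arbitrary choices for this example.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Jeetu", "Piku", "Paro"], "Age": [25, 22, 27]},
    index=["a", "b", "c"],
)

# An unnamed index ends up in a column called "index".
print(df.reset_index().columns.tolist())   # ['index', 'Name', 'Age']

# Naming the index first gives the new column that name instead.
labelled = df.rename_axis("label").reset_index()
print(labelled.columns.tolist())           # ['label', 'Name', 'Age']
```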

Approach 2 : Use a custom index in place of the default index

For this,

Pass the custom index when creating the DataFrame: pd.DataFrame(data, index=index). The custom labels take the place of the default integer index, so reset_index() is not needed here.

# import pandas
import pandas as pd

# Initialize data
dataset = {'Name':['Jeetu', 'Piku', 'Paro', 'Chetona', 'Rik'],
    'Age':[25, 22, 27, 30, 29],
    'Job':['TCS', 'Accenture', 'Amazon', 'Google', 'Capgemini'],
    'Salary':['20000', '25000', '50000', '45000', '30000'] }

# new index (a list, because a set has no guaranteed order)
index = ['a', 'b', 'c', 'd', 'e']

# create the DataFrame with the new index
df = pd.DataFrame(dataset, index=index)

df

Output :

	Name	Age	Job	Salary
a	Jeetu	25	TCS	20000
b	Piku	22	Accenture	25000
c	Paro	27	Amazon	50000
d	Chetona	30	Google	45000
e	Rik	29	Capgemini	30000

You can see that the default integer index is gone and only the custom labels remain.
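
If the DataFrame already exists, the same result can be reached without rebuilding it. The following is a sketch of two options, using a shortened three-row dataset and the variable name by_name as assumptions for illustration: assigning to the index attribute, or promoting an existing column with set_index().

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Jeetu", "Piku", "Paro"], "Age": [25, 22, 27]})

# Assigning to .index replaces the labels; the list length must match the row count.
df.index = ["a", "b", "c"]
print(df.loc["b", "Name"])          # Piku

# set_index() turns an existing column into the index and returns a new DataFrame.
by_name = df.set_index("Name")
print(by_name.index.tolist())       # ['Jeetu', 'Piku', 'Paro']
print(by_name.columns.tolist())     # ['Age']
```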

Approach 3 : Reset the index and drop the custom index

For this,

Convert the given dictionary into a DataFrame with a custom index: pd.DataFrame(data, index=index).
Next, call df.reset_index(inplace=True, drop=True), where inplace=True means that the original DataFrame is modified, and drop=True means that the custom index is discarded instead of being kept as a column.

# import pandas
import pandas as pd

# initialize dataset with a dictionary
dataset = {'Name':['Jeetu', 'Piku', 'Paro', 'Chetona', 'Rik'],
    'Age':[25, 22, 27, 30, 29],
    'Job':['TCS', 'Accenture', 'Amazon', 'Google', 'Capgemini'],
    'Salary':['20000', '25000', '50000', '45000', '30000'] }

# new index (a list, because a set has no guaranteed order)
index = ['a', 'b', 'c', 'd', 'e']

# Convert the dictionary into DataFrame with the new index
df = pd.DataFrame(dataset, index=index)

# discard the custom index and restore the default integer index
df.reset_index(inplace = True, drop = True)

df

Output :

	Name	Age	Job	Salary
0	Jeetu	25	TCS	20000
1	Piku	22	Accenture	25000
2	Paro	27	Amazon	50000
3	Chetona	30	Google	45000
4	Rik	29	Capgemini	30000

Here, you can see that the custom index is removed and only the default integer index remains.
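
Dropping the old labels is also handy after sorting, which reorders the rows but leaves their labels attached. The snippet below is a sketch of that case, assuming a shortened three-row dataset; the names by_age and clean are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Jeetu", "Piku", "Paro"], "Age": [25, 22, 27]},
    index=["a", "b", "c"],
)

# Sorting moves the rows, and each row carries its label along.
by_age = df.sort_values("Age")
print(by_age.index.tolist())     # ['b', 'a', 'c']

# drop=True throws the shuffled labels away and numbers the rows from 0.
clean = by_age.reset_index(drop=True)
print(clean.index.tolist())      # [0, 1, 2]
print(clean["Name"].tolist())    # ['Piku', 'Jeetu', 'Paro']
```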

Thank you for going through this article.