Index Resetting in Pandas Dataframe in Python

In this tutorial, we will see how to reset the index of a Pandas Dataframe in Python. For this, we will use reset_index().

Furthermore, we come across a term: Pandas Dataframe. Let’s first see what Pandas is. Pandas is an open-source Python library that provides tools for working with data in fields such as data analysis, finance, and statistics. We use “import pandas as pd” for importing the library.

The Pandas library is very commonly used when we apply Python to Data Science problems. The most commonly used object in Pandas is the Dataframe.

Let us see more on Dataframes before we proceed with the main task.

What are Dataframes in Pandas Library?

Dataframes are 2-D mutable data structures in tabular form, that is, they consist of rows and columns of data. They represent data in a structured format and make data analysis and prediction easier. Moreover, each column can hold a different data type, hence dataframes are heterogeneous.

There are many ways to create dataframes. Datasets loaded from storage such as CSV files, Excel files, etc. are usually converted into a Pandas Dataframe before they are analysed. Also, lists, arrays, dictionaries, etc. can be converted into a dataframe directly. Let us see the code for it:

# import pandas
import pandas as pd 
  
# initializing data
dataset = {'Name':['Jeetu', 'Piku', 'Paro', 'Chetona', 'Rik'], 
    'Age':[25, 22, 27, 30, 29], 
    'Job':['TCS', 'Accenture', 'Amazon', 'Google', 'Capgemini'], 
    'Salary':['20000', '25000', '50000', '45000', '30000'] } 

# Convert dictionary into DataFrame 
df = pd.DataFrame(dataset) 

# print df
df 

Here we converted a dictionary into a dataframe. This is the original dataset we will use for our task.

Output :

      Name  Age        Job  Salary
0    Jeetu   25        TCS   20000
1     Piku   22  Accenture   25000
2     Paro   27     Amazon   50000
3  Chetona   30     Google   45000
4      Rik   29  Capgemini   30000
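As mentioned earlier, lists can be converted into a dataframe as well. The following is a minimal sketch of that, assuming a small list of rows; the names rows and df_rows are made up for this illustration and are not used in the rest of the tutorial.

```python
import pandas as pd

# Hypothetical sample rows, mirroring the first two employees above
rows = [
    ['Jeetu', 25, 'TCS'],
    ['Piku', 22, 'Accenture'],
]

# With a list of rows, the column names are supplied separately
df_rows = pd.DataFrame(rows, columns=['Name', 'Age', 'Job'])

print(df_rows)
```

Just like the dictionary version, the resulting dataframe gets the default integer index starting at 0.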

How to use reset_index() for the Task ?

Our task is to reset the index of a Pandas Dataframe in Python. Resetting is generally required when we derive a smaller dataframe from a large one, for example by filtering rows, and the remaining rows keep their original index labels, which are no longer continuous. Resetting restores continuous indexing and hence a more structured form of the dataframe.
Before proceeding with the coding, we need to know what the reset_index() function does. It simply does what it says in the name. It replaces the current index of the dataframe with the default integer index 0, 1, 2 and so on. By default the old index is kept as a regular column, and it can be discarded instead if the user chooses so. Let us see the syntax.

DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')
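To make the motivation concrete, here is a small sketch of the filtering case described above. The filter condition (Age greater than 24) and the names older and older_reset are assumptions chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jeetu', 'Piku', 'Paro', 'Chetona', 'Rik'],
    'Age': [25, 22, 27, 30, 29],
})

# Keep only the rows with Age > 24; the surviving rows keep their old labels
older = df[df['Age'] > 24]
print(older.index.tolist())        # [0, 2, 3, 4]

# drop=True discards the old labels instead of storing them in a column
older_reset = older.reset_index(drop=True)
print(older_reset.index.tolist())  # [0, 1, 2, 3]
```

Note that reset_index() is called here without inplace=True, so it returns a new dataframe and leaves older as it was.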

Approaching the Task

Approach 1 : Use new index without removing old index

To do this,

  • First, convert the original dictionary into a dataframe and pass the new index along with it. The command should look like this: pd.DataFrame(data, index=index). Store the resulting dataframe in df.
  • Next, use the command df.reset_index(inplace=True), where inplace=True means that df itself is modified instead of a new dataframe being returned.
  • Print df.
# import pandas  
import pandas as pd 
  
# Define a dictionary containing employee data 
dataset = {'Name':['Jeetu', 'Piku', 'Paro', 'Chetona', 'Rik'], 
                'Age':[25, 22, 27, 30, 29], 
                'Job':['TCS', 'Accenture', 'Amazon', 'Google', 'Capgemini'], 
                'Salary':['20000', '25000', '50000', '45000', '30000'] }
# use a list for the labels: a set is unordered, so the label order would be arbitrary
index = ['a', 'b', 'c', 'd', 'e'] 

# Convert dictionary into DataFrame 
df = pd.DataFrame(dataset, index=index) 

# move the labels into an 'index' column and restore the default integer index
df.reset_index(inplace = True) 

df 

Output :

   index     Name  Age        Job  Salary
0      a    Jeetu   25        TCS   20000
1      b     Piku   22  Accenture   25000
2      c     Paro   27     Amazon   50000
3      d  Chetona   30     Google   45000
4      e      Rik   29  Capgemini   30000

Here, you can see that both are intact: the new index is now a regular column named index, and the default integer index is back as the row labels.
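If you would rather not modify the original dataframe, reset_index() can also be called without inplace=True. The sketch below uses a shortened version of the dataset; the name df_reset is an assumption made for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Jeetu', 'Piku', 'Paro'], 'Age': [25, 22, 27]},
    index=['a', 'b', 'c'],
)

# Without inplace=True the result is a new DataFrame; df itself is untouched
df_reset = df.reset_index()

print(df.index.tolist())        # ['a', 'b', 'c']
print(df_reset.index.tolist())  # [0, 1, 2]
```

Assigning the result to a new name keeps both versions available, which can be handy while exploring data.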

Approach 2 : Use new index and remove old index

For this,

  • Just use pd.DataFrame(data, index=index), that is, pass the new index when creating the dataframe. The new index takes the place of the old default integer index.
# import pandas 
import pandas as pd 
  
# Initialize data
dataset = {'Name':['Jeetu', 'Piku', 'Paro', 'Chetona', 'Rik'], 
             'Age':[25, 22, 27, 30, 29],
              'Job':['TCS', 'Accenture', 'Amazon', 'Google', 'Capgemini'], 
             'Salary':['20000', '25000', '50000', '45000', '30000'] }

# new index (a list, because a set is unordered and would give an arbitrary label order)
index = ['a', 'b', 'c', 'd', 'e'] 

# add new index
df = pd.DataFrame(dataset, index=index) 

df 

Output :

      Name  Age        Job  Salary
a    Jeetu   25        TCS   20000
b     Piku   22  Accenture   25000
c     Paro   27     Amazon   50000
d  Chetona   30     Google   45000
e      Rik   29  Capgemini   30000

You can see that the old index, that is, the default integers, is gone.
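The same replacement can be done on a dataframe that already exists by assigning to its index attribute. This is a minimal sketch on a shortened dataset, not a step the approach above requires.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jeetu', 'Piku', 'Paro'], 'Age': [25, 22, 27]})
print(df.index.tolist())    # [0, 1, 2]

# Assigning to df.index replaces the labels; the length must match the row count
df.index = ['a', 'b', 'c']
print(df.index.tolist())    # ['a', 'b', 'c']

# Rows can now be looked up by the new labels
print(df.loc['b', 'Name'])  # Piku
```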

Approach 3 : Remove new index and restore the default index

For this,

  • Convert the given dictionary into a dataframe and pass the index along with it: pd.DataFrame(data, index=index)
  • Next, use the command df.reset_index(inplace=True, drop=True), where inplace=True means that the changes are made in the original dataframe. Moreover, drop=True means that the new index is discarded instead of being kept as a column.
# import pandas  
import pandas as pd 
  
# initialize dataset with a dictionary
dataset = {'Name':['Jeetu', 'Piku', 'Paro', 'Chetona', 'Rik'],
                   'Age':[25, 22, 27, 30, 29], 
                   'Job':['TCS', 'Accenture', 'Amazon', 'Google', 'Capgemini'],
                   'Salary':['20000', '25000', '50000', '45000', '30000'] }
# new index (a list, because a set is unordered)
index = ['a', 'b', 'c', 'd', 'e'] 

# Convert the dictionary into DataFrame 
df = pd.DataFrame(dataset, index=index) 

# remove index
df.reset_index(inplace = True, drop = True) 

df 

Output :

      Name  Age        Job  Salary
0    Jeetu   25        TCS   20000
1     Piku   22  Accenture   25000
2     Paro   27     Amazon   50000
3  Chetona   30     Google   45000
4      Rik   29  Capgemini   30000

Here, you can see that the new index is removed.
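The difference between Approach 1 and Approach 3 comes down to the drop argument. The short sketch below puts the two calls side by side on a shortened dataset; the names kept and dropped are assumptions made for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Jeetu', 'Piku', 'Paro'], 'Age': [25, 22, 27]},
    index=['a', 'b', 'c'],
)

# drop=False (the default): the labels are kept as a column named 'index'
kept = df.reset_index()
print(kept.columns.tolist())     # ['index', 'Name', 'Age']

# drop=True: the labels are discarded entirely
dropped = df.reset_index(drop=True)
print(dropped.columns.tolist())  # ['Name', 'Age']
```

In both cases the row labels of the result are the default integers 0, 1, 2.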

Thank you for going through this article.