How to add a new column to an existing DataFrame?


In this article, we will study how to add a new column to an existing DataFrame in Python using pandas. Before that, we will briefly review what a DataFrame is: a two-dimensional, labelled table made up of rows and columns.

Let us now create a DataFrame. To do so, we first need to import pandas. Look at the following code:

import pandas as pd

d = {'Name':['Rehan','Rutik','Riya','Ram'],
     'Age' :[23,45,78,34],
     'Occupation':['C.A','Accountant','Content Writer','PHP Developer']}

df = pd.DataFrame(d)

print(df)

OUTPUT

    Name    Age    Occupation
0   Rehan   23     C.A
1   Rutik   45     Accountant
2   Riya    78     Content Writer
3   Ram     34     PHP Developer
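
As a quick refresher on the structure just created, the sketch below inspects the column labels, the row index, and the shape of the DataFrame. It rebuilds the same data so that it can run on its own; the variable names columns, index, and shape are only illustrative.

```python
import pandas as pd

d = {'Name': ['Rehan', 'Rutik', 'Riya', 'Ram'],
     'Age': [23, 45, 78, 34],
     'Occupation': ['C.A', 'Accountant', 'Content Writer', 'PHP Developer']}
df = pd.DataFrame(d)

columns = list(df.columns)   # column labels, in dictionary key order
index = list(df.index)       # default integer row labels
shape = df.shape             # (number of rows, number of columns)

print(columns)
print(index)
print(shape)
```

The row labels on the left of the printed table (0 to 3) are the index; the methods below refer to rows by these labels.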

Updating the existing DataFrame with a new column

Let us now look at ways to add a new column to the existing DataFrame.

(i) DataFrame.insert()

A new column can be added to our existing DataFrame with this method. Its syntax is as follows:

DataFrame.insert(loc, column, value, allow_duplicates = False)

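loc: loc stands for location. It is the integer position (starting from 0) at which the column is inserted. It must be between 0 and the current number of columns.
column: The name of the column to be inserted.
value: The value to be inserted. It can be a single value (integer, float, string, etc.), which is repeated in every row, or a list, array, or Series with one value per row.
allow_duplicates: A boolean value. With the default False, insert() raises a ValueError if a column with the same name already exists in the DataFrame. True permits a duplicate column name.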

Look at the following code:

df.insert(3,'Salary',30000)

print(df)

OUTPUT

    Name  Age      Occupation Salary
0  Rehan   23             C.A  30000
1  Rutik   45      Accountant  30000
2   Riya   78  Content Writer  30000
3    Ram   34   PHP Developer  30000
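
The example above passes a single value, so every row gets the same salary. As a minimal sketch of the other parameters, the code below inserts a column with one value per row at position 1 and then shows what happens when the same column name is inserted again with allow_duplicates left at False. The City column and its values are made up for illustration and are not part of the original data.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Rehan', 'Rutik', 'Riya', 'Ram'],
                   'Age': [23, 45, 78, 34]})

# A list gives each row its own value; loc=1 makes it the second column.
# insert() changes df in place and returns None.
df.insert(1, 'City', ['Pune', 'Mumbai', 'Delhi', 'Nagpur'])
print(df)

# With allow_duplicates=False (the default), reusing a column name fails.
try:
    df.insert(0, 'City', 'Unknown')
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(duplicate_rejected)
```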

(ii) DataFrame.loc[row_no, column_name] = value

In the example above, insert() was given a single value, so every row received the same salary. With DataFrame.loc we can set the value of a new column one row at a time. Its syntax is as follows:

DataFrame.loc[row_no, column_name] = value

row_no: The index label of the row. With the default index, this is the same as the row position.
column_name: The name of the new column.
value: The value to be set in the given row.

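Look at the following code. It starts again from the original DataFrame, without the Salary column added earlier:

df = pd.DataFrame(d)

df.loc[0,'Salary'] = 30000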

print(df)

OUTPUT

     Name    Age   Occupation      Salary
0    Rehan   23    C.A             30000.0
1    Rutik   45    Accountant      NaN
2    Riya    78    Content Writer  NaN
3    Ram     34    PHP Developer   NaN

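In this example, we have given the row position as 0. Hence, 30000 is inserted at position 0. The other rows have no value yet, so pandas fills them with NaN, which is why the column is displayed as float (30000.0).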

Let’s look at one more example:

df.loc[2,'Salary'] = 89000 
print(df)

OUTPUT

    Name  Age      Occupation   Salary
0  Rehan   23             C.A  30000.0
1  Rutik   45      Accountant      NaN
2   Riya   78  Content Writer  89000.0
3    Ram   34   PHP Developer      NaN
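
To see why the Salary column prints as 30000.0 rather than 30000, the following sketch repeats the two assignments on a smaller copy of the data and checks which rows are still missing and what type the column has. The variable name missing is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Rehan', 'Rutik', 'Riya', 'Ram'],
                   'Age': [23, 45, 78, 34]})

df.loc[0, 'Salary'] = 30000   # creates the column; the other rows become NaN
df.loc[2, 'Salary'] = 89000   # fills one more row

missing = df['Salary'].isna().tolist()   # True where no value has been set
print(missing)
print(df['Salary'].dtype)                # NaN is a float, so the column is float
```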

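If we want to insert the same value in all rows, we can do it in the following way. (The Salary column already holds floats at this point, so recent pandas versions may keep that type and display 67000.0 rather than 67000.)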

df.loc[:,'Salary'] = 67000
print(df)

OUTPUT

    Name  Age      Occupation  Salary
0  Rehan   23             C.A   67000
1  Rutik   45      Accountant   67000
2   Riya   78  Content Writer   67000
3    Ram   34   PHP Developer   67000
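
The same loc syntax also accepts one value per row, or a condition that selects only some rows. The sketch below is an illustration under assumed data: the salary figures and the Age > 40 rule are invented for the example.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Rehan', 'Rutik', 'Riya', 'Ram'],
                   'Age': [23, 45, 78, 34]})

# One value per row, in row order (illustrative figures).
df.loc[:, 'Salary'] = [30000, 45000, 89000, 52000]

# A boolean condition in the row position updates only the matching rows.
df.loc[df['Age'] > 40, 'Salary'] = 95000

print(df)
```

Here only Rutik and Riya are older than 40, so only their salaries are overwritten.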

(iii) DataFrame.assign()

DataFrame.assign() also allows us to add a new column. Unlike insert(), it does not modify the existing DataFrame; it returns a new DataFrame that contains the extra column. Its syntax is as follows:

DataFrame.assign(column_name = list of values)

column_name: It is the name of the new column.
list of values: These are the values to be inserted in the new column, one per row.

Look at the following code, which stores the result back in df:

df = df.assign(Experience =[3,3,2,7])

print(df)

OUTPUT

    Name  Age      Occupation  Salary  Experience
0  Rehan   23             C.A   67000           3
1  Rutik   45      Accountant   67000           3
2   Riya   78  Content Writer   67000           2
3    Ram   34   PHP Developer   67000           7
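
Because assign() returns a new DataFrame, the original is left untouched unless the result is stored. The sketch below checks this and also shows that assign() accepts a function that computes the new column from existing ones. The names result and AgeAtStart are hypothetical and used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Rehan', 'Rutik', 'Riya', 'Ram'],
                   'Age': [23, 45, 78, 34]})

# assign() builds a new DataFrame; df itself is not changed.
result = df.assign(Experience=[3, 3, 2, 7])
print('Experience' in df.columns)
print('Experience' in result.columns)

# A callable receives the DataFrame and returns the values for the new column.
result = result.assign(AgeAtStart=lambda x: x['Age'] - x['Experience'])
print(result)
```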

Thank You.

