How to add a new column to an existing DataFrame?
In this article, we will study how to add a new column to an existing DataFrame in Python using pandas. Before that, we will quickly revise the concept of a DataFrame.
Let us now create a DataFrame. To do so, we first need to import pandas. Look at the following code:
import pandas as pd

d = {'Name': ['Rehan', 'Rutik', 'Riya', 'Ram'],
     'Age': [23, 45, 78, 34],
     'Occupation': ['C.A', 'Accountant', 'Content Writer', 'PHP Developer']}
df = pd.DataFrame(d)
print(df)
OUTPUT
    Name  Age      Occupation
0  Rehan   23             C.A
1  Rutik   45      Accountant
2   Riya   78  Content Writer
3    Ram   34   PHP Developer
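As a quick refresher, a DataFrame is a two-dimensional labelled table: every column has a name and every row has an index label. The following minimal sketch rebuilds the same DataFrame and inspects those pieces. The variable names shape, columns and index_labels are chosen here only for illustration.

```python
import pandas as pd

d = {'Name': ['Rehan', 'Rutik', 'Riya', 'Ram'],
     'Age': [23, 45, 78, 34],
     'Occupation': ['C.A', 'Accountant', 'Content Writer', 'PHP Developer']}
df = pd.DataFrame(d)

# shape is a (rows, columns) tuple
shape = df.shape
# column names, in the order the dictionary keys were given
columns = list(df.columns)
# row labels; by default pandas numbers the rows 0, 1, 2, ...
index_labels = list(df.index)

print(shape)
print(columns)
print(index_labels)
```

The row labels matter later on, because DataFrame.loc selects rows by these labels.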
Updating the existing DataFrame with a new column
Let us now look at ways to add a new column to the existing DataFrame.
(i) DataFrame.insert()
A new column can be added to our existing DataFrame with this method. Its syntax is as follows:
DataFrame.insert(loc, column, value, allow_duplicates = False)
- loc: loc stands for location. It is the integer position (counting from 0) at which the new column is placed in the DataFrame.
- column: The name of the column to be inserted.
- value: The value to be inserted. It can be a single value (integer, float, string, etc.), which is repeated for every row, or a list or Series with one value per row.
- allow_duplicates: A boolean. With the default, False, insert() raises a ValueError if a column with the same name already exists in the DataFrame.
Look at the following code:
df.insert(3, 'Salary', 30000)
print(df)
OUTPUT
    Name  Age      Occupation  Salary
0  Rehan   23             C.A   30000
1  Rutik   45      Accountant   30000
2   Riya   78  Content Writer   30000
3    Ram   34   PHP Developer   30000
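insert() is not limited to one repeated value. As a minimal sketch, the following passes a list with one entry per row, places the column in the middle of the DataFrame, and shows what happens when the column name already exists. The salary figures are made-up illustration values and are not part of the example above.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Rehan', 'Rutik', 'Riya', 'Ram'],
                   'Age': [23, 45, 78, 34]})

# Hypothetical salaries, one per row (the list length must match the row count)
salaries = [30000, 45000, 52000, 38000]

# Insert 'Salary' at position 1, i.e. between 'Name' and 'Age'
df.insert(1, 'Salary', salaries)
print(df)

# Inserting the same column name again raises ValueError,
# because allow_duplicates is False unless set otherwise
try:
    df.insert(0, 'Salary', 0)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
print(duplicate_rejected)
```

Note that insert() modifies the DataFrame in place and returns None, so its result should not be assigned back to df.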
(ii) DataFrame.loc[row_no, column_name] = value
In the example above, insert() was given a single value, so every row received the same salary. To set the value for individual rows, we can use this method. Its syntax is as follows:
DataFrame.loc[row_no, column_name] = value
- row_no: The index label of the row. With the default index (0, 1, 2, ...), the label is the same as the row's position.
- column_name: The name of the new column.
- value: The value to be set in the given row of that column.
Look at the following code. It starts again from the original DataFrame, which does not yet have a Salary column:
df.loc[0, 'Salary'] = 30000
print(df)
OUTPUT
    Name  Age      Occupation   Salary
0  Rehan   23             C.A  30000.0
1  Rutik   45      Accountant      NaN
2   Riya   78  Content Writer      NaN
3    Ram   34   PHP Developer      NaN
In this example, we have given the row position as 0. Hence, 30000 is inserted at position 0. The remaining rows have no value yet, so pandas shows them as NaN.
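The NaN entries and the 30000.0 in the output are connected: NaN is a floating-point value, so a column containing it is stored as floats. The following small sketch checks which rows are missing after a single assignment. It uses a shortened version of the DataFrame above, and the variable name missing is chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Rehan', 'Rutik', 'Riya', 'Ram'],
                   'Age': [23, 45, 78, 34]})

# 'Salary' does not exist yet, so this assignment creates the column
df.loc[0, 'Salary'] = 30000

# Only row 0 has a value; the other rows are missing (NaN)
missing = df['Salary'].isna().tolist()
print(missing)

# Read back the value that was set on row 0
print(df.loc[0, 'Salary'])
```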
Let’s look at one more example:
df.loc[2, 'Salary'] = 89000
print(df)
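OUTPUT
    Name  Age      Occupation   Salary
0  Rehan   23             C.A  30000.0
1  Rutik   45      Accountant      NaN
2   Riya   78  Content Writer  89000.0
3    Ram   34   PHP Developer      NaN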
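A colon in place of the row label selects every row, so the following assigns the same value to the entire column:
df.loc[:, 'Salary'] = 67000
print(df)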
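OUTPUT
Every row of the Salary column now holds 67000, replacing both the earlier values and the NaN entries.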
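The effect of the colon can be checked with a short sketch. Here the DataFrame is built directly with a partly filled Salary column, which is an assumption made to keep the example self-contained, and the whole column is then overwritten in one assignment.

```python
import pandas as pd

# A Salary column with two known values and two missing ones
df = pd.DataFrame({'Name': ['Rehan', 'Rutik', 'Riya', 'Ram'],
                   'Salary': [30000.0, float('nan'), 89000.0, float('nan')]})

# ':' selects every row, so all four entries are replaced
df.loc[:, 'Salary'] = 67000

# No missing values remain and every row holds the new value
print(df['Salary'].isna().any())
print(df['Salary'].tolist())
```

Because this overwrites existing values as well as missing ones, it is best used when the whole column is meant to be reset.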