Map External Values to Dataframe in Python | Pandas


In this tutorial, we will map external values onto a Pandas dataframe in Python. We will look at two ways to do this.

The task involves a Pandas Dataframe. Pandas is an open-source Python library for data analysis and statistical tasks. It has to be imported into the code before use. Dataframes are objects provided by the Pandas library.

Before proceeding further, let us learn more about Dataframes.

What are Dataframes?

Dataframes are mutable, two-dimensional data structures made up of three key elements: rows, columns, and data. They give data a clear tabular structure so that operations can be performed on it. Element-wise arithmetic can be applied to rows and columns that hold numeric data. The constructor syntax is:

pd.DataFrame(data, index, columns, dtype, copy)

Here is an example:

# import pandas
import pandas as pd 
  
# initializing data
dataset = {'Name':['Monica', 'Phoebe', 'Ross', 'Chandler', 'Rachel', 'Joey'], 
    'Age':[26, 23, 30, 28, 25, 29], 
    'Address':['Kolkata', 'Chennai', 'Agra', 'Mumbai', 'Delhi', 'Lucknow'] }

# Convert dictionary into DataFrame 
df = pd.DataFrame(dataset) 

# print df
df

Output :

	Name	Age	Address
0	Monica	26	Kolkata
1	Phoebe	23	Chennai
2	Ross	30	Agra
3	Chandler	28	Mumbai
4	Rachel	25	Delhi
5	Joey	29	Lucknow

We will use this dataset for our task.
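As a small illustration of the constructor parameters and the arithmetic mentioned above, the following sketch builds a dataframe from a list of rows. The variable name people and the derived column AgeNextYear are made up for this example; the ages are taken from the dataset above.

```python
import pandas as pd

# Rows of data; each inner list is one row with a single value (the age).
data = [[26], [23], [30]]

people = pd.DataFrame(
    data,
    index=["Monica", "Phoebe", "Ross"],  # row labels
    columns=["Age"],                     # column labels
    dtype="float64",                     # store the values as floats
)

# Arithmetic on a column is applied element-wise.
# "AgeNextYear" is a hypothetical column name used only for this sketch.
people["AgeNextYear"] = people["Age"] + 1

print(people)
```

Here index sets the row labels, columns sets the column labels, and dtype forces the type of the stored values.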

Different Approaches to the Task

There are several approaches to this task. Here, we will discuss two of them:

Approach 1 : Using map() function

For this:

Convert the dictionary dataset into a Pandas dataframe and specify the column names. It should look like this: pd.DataFrame(dataset, columns=['Name', 'Age', 'Address']).
Store the external values in a dictionary col that maps each name to the value for the new column.
Now, use the map() function on the Name column to look up each name in col and assign the result to a new column. The command will look like this: df["Employment"] = df["Name"].map(col).

# Creating dataframe
import pandas as pd

dataset = {'Name': ['Monica', 'Phoebe', 'Ross', 'Chandler', 'Rachel', 'Joey'],
           'Age': [26, 23, 30, 28, 25, 29],
           'Address': ['Kolkata', 'Chennai', 'Agra', 'Mumbai', 'Delhi', 'Lucknow']}

df = pd.DataFrame(dataset, columns=['Name', 'Age', 'Address'])

# external values: maps each name to an employer
col = {"Monica": "PWC",
       "Phoebe": "Cognizant",
       "Ross": "Microsoft",
       "Chandler": "Apple",
       "Rachel": "Philips",
       "Joey": "Samsung"}

# combine this new data with the existing DataFrame
df["Employment"] = df["Name"].map(col)

print(df)

Output :

	Name	Age	Address	Employment
0	Monica	26	Kolkata	PWC
1	Phoebe	23	Chennai	Cognizant
2	Ross	30	Agra	Microsoft
3	Chandler	28	Mumbai	Apple
4	Rachel	25	Delhi	Philips
5	Joey	29	Lucknow	Samsung

Here, you can see that the column 'Employment' has been added.
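One detail worth knowing about map(): a name that has no entry in the dictionary is mapped to NaN. The sketch below shows this with a deliberately incomplete dictionary. The column name EmploymentFilled and the placeholder "Unknown" are arbitrary choices made for this example.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Monica", "Phoebe", "Ross"]})

# Partial lookup table: there is deliberately no entry for "Ross".
col = {"Monica": "PWC", "Phoebe": "Cognizant"}

# Names that are missing from the dictionary are mapped to NaN.
df["Employment"] = df["Name"].map(col)

# Optionally fill the gaps with a placeholder ("Unknown" is an arbitrary label).
df["EmploymentFilled"] = df["Employment"].fillna("Unknown")

print(df)
```

If every name must have a value, checking the new column with isna() after the mapping is a simple way to catch missing entries.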

Approach 2 : Using replace() function

The replace() function used here is the Pandas DataFrame.replace() method. It replaces values in a dataframe with other values and returns a new dataframe. By default it matches whole cell values rather than parts of a string. We will use it to replace some of the existing values in a column with external values. For this:

Convert the dictionary dataset into a Pandas dataframe.
Create a dictionary col whose keys are the values to be replaced and whose values are their replacements.
Call replace() with a dictionary that names the column and the replacements to apply to it. The command will look like this: df.replace({"Name": col}).

# Create dataframe
import pandas as pd

dataset = {'Name': ['Monica', 'Phoebe', 'Ross', 'Chandler', 'Rachel', 'Joey'],
           'Age': [26, 23, 30, 28, 25, 29],
           'Address': ['Kolkata', 'Chennai', 'Agra', 'Mumbai', 'Delhi', 'Lucknow']}
df = pd.DataFrame(dataset, columns=['Name', 'Age', 'Address'])

# external values: maps each old name to its replacement
col = {"Monica": "Richard",
       "Ross": "Carol",
       "Joey": "Kathy"}

# replace with external values (only in the Name column)
df = df.replace({"Name": col})
print(df)

Output :

	Name	Age	Address
0	Richard	26	Kolkata
1	Phoebe	23	Chennai
2	Carol	30	Agra
3	Chandler	28	Mumbai
4	Rachel	25	Delhi
5	Kathy	29	Lucknow

Here you can see that 'Monica', 'Ross' and 'Joey' have been replaced by 'Richard', 'Carol' and 'Kathy' respectively.
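The two approaches treat unmatched values differently, which may help when choosing between them. The sketch below applies the same partial dictionary with replace() and with map() to a small Series. The variable names names, replaced and mapped are chosen only for this illustration.

```python
import pandas as pd

names = pd.Series(["Monica", "Phoebe", "Ross"], name="Name")

# Partial dictionary: there is no entry for "Phoebe".
col = {"Monica": "Richard", "Ross": "Carol"}

# replace() keeps values that have no entry in the dictionary.
replaced = names.replace(col)

# map() turns values that have no entry in the dictionary into NaN.
mapped = names.map(col)

print(replaced.tolist())
print(mapped.tolist())
```

In short, replace() suits updating a few values in an existing column, while map() suits building a new column from a lookup table that covers every value.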

Thank you for going through this article.