Map External Values to Dataframe in Python | Pandas

In this tutorial, we will map external values onto a Pandas dataframe in Python. We will look at two ways to do this.

You will come across the term Pandas Dataframe here. Pandas is an open-source library for data analysis and statistical tasks in Python. It has to be imported into the code before use. Dataframes are objects provided by the Pandas library.

Before proceeding further, let us look more closely at Dataframes.

What are Dataframes?

Dataframes are mutable, two-dimensional data structures made up of three key elements: rows, columns, and data. They are used to give unstructured data a clear, tabular structure so that tasks can be performed on it. Arithmetic operations can be applied across rows and columns. Let’s see the syntax for creating one:

pd.DataFrame(data, index, columns, dtype, copy)
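
To make the parameters above more concrete, here is a minimal sketch that passes data, index and columns explicitly. The sample values and the name df_small are hypothetical and chosen only for illustration; dtype and copy are optional and are left at their defaults here.

```python
import pandas as pd

# Hypothetical sample rows, used only for illustration
data = [["Monica", 26], ["Phoebe", 23]]

df_small = pd.DataFrame(
    data,
    index=["a", "b"],         # custom row labels instead of 0, 1
    columns=["Name", "Age"],  # column labels for the two fields
)

print(df_small)
```

If index is omitted, Pandas numbers the rows from 0, which is what the rest of this tutorial relies on.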

Let’s now see this in code:

# import pandas
import pandas as pd 
  
# initializing data
dataset = {'Name':['Monica', 'Phoebe', 'Ross', 'Chandler', 'Rachel', 'Joey'], 
    'Age':[26, 23, 30, 28, 25, 29], 
    'Address':['Kolkata', 'Chennai', 'Agra', 'Mumbai', 'Delhi', 'Lucknow'] }

# Convert dictionary into DataFrame 
df = pd.DataFrame(dataset) 

# print the DataFrame
print(df)

Output :

       Name  Age  Address
0    Monica   26  Kolkata
1    Phoebe   23  Chennai
2      Ross   30     Agra
3  Chandler   28   Mumbai
4    Rachel   25    Delhi
5      Joey   29  Lucknow

We will use this dataset for our task.
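
As a quick illustration of arithmetic on a column of this dataset, the sketch below adds a constant to every age and computes the average age. The variable names age_in_5_years and mean_age are hypothetical and not part of the original example.

```python
import pandas as pd

dataset = {'Name': ['Monica', 'Phoebe', 'Ross', 'Chandler', 'Rachel', 'Joey'],
           'Age': [26, 23, 30, 28, 25, 29],
           'Address': ['Kolkata', 'Chennai', 'Agra', 'Mumbai', 'Delhi', 'Lucknow']}
df = pd.DataFrame(dataset)

# Element-wise arithmetic: add 5 to every value in the Age column
age_in_5_years = df["Age"] + 5

# Aggregate arithmetic: average of the Age column
mean_age = df["Age"].mean()

print(age_in_5_years.tolist())
print(mean_age)
```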

Different Approaches to the Task

There are several approaches to this task. Here, we will discuss two of them:

Approach 1 : Using map() function

For this,

  • Convert the dictionary dataset into a Pandas dataframe and specify the column names: pd.DataFrame(dataset, columns=['Name', 'Age', 'Address']).
  • Store the external values in a dictionary col whose keys are the names and whose values are the new data.
  • Use the map() function to look up each value of the Name column in col and store the result in a new column: df["Employment"] = df["Name"].map(col).

# Creating dataframe
import pandas as pd

dataset = {'Name': ['Monica', 'Phoebe', 'Ross', 'Chandler', 'Rachel', 'Joey'],
           'Age': [26, 23, 30, 28, 25, 29],
           'Address': ['Kolkata', 'Chennai', 'Agra', 'Mumbai', 'Delhi', 'Lucknow']}

df = pd.DataFrame(dataset, columns=['Name', 'Age', 'Address'])

# external values for the new column, keyed by name
col = {"Monica": "PWC",
       "Phoebe": "Cognizant",
       "Ross": "Microsoft",
       "Chandler": "Apple",
       "Rachel": "Philips",
       "Joey": "Samsung"}

# combine this new data with existing DataFrame 
df["Employment"] = df["Name"].map(col) 

print(df) 

Output :

       Name  Age  Address  Employment
0    Monica   26  Kolkata         PWC
1    Phoebe   23  Chennai   Cognizant
2      Ross   30     Agra   Microsoft
3  Chandler   28   Mumbai       Apple
4    Rachel   25    Delhi     Philips
5      Joey   29  Lucknow     Samsung

Here, you can see that the column 'Employment' has been added.
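
One detail worth knowing about map() with a dictionary: any value that has no matching key becomes NaN. The sketch below shows this with a deliberately incomplete dictionary (the name partial and the placeholder "Unknown" are assumptions made for illustration), and one possible way to fill the gaps with fillna().

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Monica", "Phoebe", "Ross"]})

# Hypothetical partial mapping: "Ross" is deliberately left out
partial = {"Monica": "PWC", "Phoebe": "Cognizant"}

# "Ross" has no key in the dictionary, so its mapped value is NaN
mapped = df["Name"].map(partial)

# Replace the missing values with a placeholder string
filled = mapped.fillna("Unknown")

print(mapped)
print(filled)
```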

Approach 2 : Using replace() function

Here, replace() refers to the Pandas DataFrame.replace() method, not Python's string method of the same name. It replaces values in the dataframe with other values and returns a new dataframe. We will use it to replace some of the values in the Name column with external values. For this:

  • Convert the dictionary dataset into a Pandas dataframe.
  • Create a dictionary col whose keys are the values to be replaced and whose values are the replacements.
  • Pass a nested dictionary to replace() so that only the Name column is affected: df.replace({"Name": col}).

# Create dataframe
import pandas as pd

dataset = {'Name': ['Monica', 'Phoebe', 'Ross', 'Chandler', 'Rachel', 'Joey'],
           'Age': [26, 23, 30, 28, 25, 29],
           'Address': ['Kolkata', 'Chennai', 'Agra', 'Mumbai', 'Delhi', 'Lucknow']}
df = pd.DataFrame(dataset, columns=['Name', 'Age', 'Address'])

# Dictionary of values to replace and their replacements
col = {"Monica": "Richard",
       "Ross": "Carol",
       "Joey": "Kathy"}

# replace with external values (only in the Name column)
df = df.replace({"Name": col})
print(df)

Output :

       Name  Age  Address
0   Richard   26  Kolkata
1    Phoebe   23  Chennai
2     Carol   30     Agra
3  Chandler   28   Mumbai
4    Rachel   25    Delhi
5     Kathy   29  Lucknow

Here you can see that 'Monica', 'Ross' and 'Joey' have been replaced by 'Richard', 'Carol' and 'Kathy' respectively.
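
The two approaches treat unmatched values differently, which may help when choosing between them. The following sketch applies the same dictionary with both methods to a small, hypothetical dataframe: replace() leaves values without a match untouched, while map() turns them into NaN.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Monica", "Phoebe", "Ross"]})
col = {"Monica": "Richard", "Ross": "Carol"}

# replace(): "Phoebe" has no entry in col, so it is kept as-is
replaced = df.replace({"Name": col})["Name"]

# map(): "Phoebe" has no entry in col, so it becomes NaN
mapped = df["Name"].map(col)

print(replaced.tolist())
print(mapped.isna().tolist())
```

So replace() tends to suit partial updates of an existing column, whereas map() suits building a new column from a complete lookup table.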

Thank you for going through this article.
