Map External Values to Dataframe in Python | Pandas

In this tutorial, we will map external values onto a Pandas dataframe in Python. We will look at two ways to do this.

The term Pandas Dataframe needs a short introduction. Pandas is an open-source library for data analysis and statistical tasks in Python. It has to be imported before it can be used. Dataframes are objects provided by the Pandas library.

Before proceeding further, let us learn more about Dataframes.

What are Dataframes?

Dataframes are mutable, two-dimensional data structures with three key elements: rows, columns, and data. They give otherwise unstructured data a clear tabular structure so that operations can be performed on it. Arithmetic operations can be applied to numeric rows and columns. The constructor has the following syntax:

pd.DataFrame(data, index, columns, dtype, copy)

Here is an example:

# import pandas
import pandas as pd

# initializing data
dataset = {'Name': ['Monica', 'Phoebe', 'Ross', 'Chandler', 'Rachel', 'Joey'],
           'Age': [26, 23, 30, 28, 25, 29],
           'Address': ['Kolkata', 'Chennai', 'Agra', 'Mumbai', 'Delhi', 'Lucknow']}

# Convert dictionary into DataFrame
df = pd.DataFrame(dataset)

# print df
print(df)

Output :

       Name  Age  Address
0    Monica   26  Kolkata
1    Phoebe   23  Chennai
2      Ross   30     Agra
3  Chandler   28   Mumbai
4    Rachel   25    Delhi
5      Joey   29  Lucknow

We will use this dataset for our task.
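
Before moving on, here is a minimal sketch of two points made above: the index and columns parameters from the constructor syntax, and arithmetic on a column. The variable name people, the row labels, and the AgeInFiveYears column are hypothetical and only serve the illustration; the data is a subset of the dataset above.

```python
import pandas as pd

# Hypothetical subset of the tutorial's dataset
data = {"Name": ["Monica", "Phoebe", "Ross"], "Age": [26, 23, 30]}

# index sets custom row labels; columns selects and orders the dictionary keys
people = pd.DataFrame(data, index=["a", "b", "c"], columns=["Age", "Name"])

# Arithmetic on a numeric column is applied element-wise
people["AgeInFiveYears"] = people["Age"] + 5

print(people)
```

Passing columns with dictionary data only picks and reorders existing keys; it does not rename them.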

Different Approaches to the Task

There are several ways to do this. Here, we will discuss two of them:

Approach 1 : Using map() function

The steps are:

  • Convert the dictionary dataset into a Pandas dataframe and set the column names. It should look like this: pd.DataFrame(dataset, columns=['Name', 'Age', 'Address']).
  • Store the external values in a dictionary col, keyed by name.
  • Now, use the map() function to look up each value of the Name column in col and store the result as a new column. The command will look like this: df["Employment"] = df["Name"].map(col).
# Creating dataframe
import pandas as pd

dataset = {'Name': ['Monica', 'Phoebe', 'Ross', 'Chandler', 'Rachel', 'Joey'],
           'Age': [26, 23, 30, 28, 25, 29],
           'Address': ['Kolkata', 'Chennai', 'Agra', 'Mumbai', 'Delhi', 'Lucknow']}

df = pd.DataFrame(dataset, columns=['Name', 'Age', 'Address'])

# external values for the new column, keyed by name
col = {"Monica": "PWC",
       "Phoebe": "Cognizant",
       "Ross": "Microsoft",
       "Chandler": "Apple",
       "Rachel": "Philips",
       "Joey": "Samsung"}

# combine this new data with existing DataFrame
df["Employment"] = df["Name"].map(col)

print(df)

Output :

       Name  Age  Address  Employment
0    Monica   26  Kolkata         PWC
1    Phoebe   23  Chennai   Cognizant
2      Ross   30     Agra   Microsoft
3  Chandler   28   Mumbai       Apple
4    Rachel   25    Delhi     Philips
5      Joey   29  Lucknow     Samsung

Here, you can see that the column 'Employment' has been added.
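
One detail worth knowing about map(): a value with no matching key in the dictionary becomes NaN in the result. The sketch below illustrates this with a hypothetical, incomplete dictionary employers and a smaller dataframe df_partial; filling the gaps with the placeholder "Unknown" is just one possible choice.

```python
import pandas as pd

# Hypothetical smaller dataframe and an incomplete mapping (no entry for Phoebe)
df_partial = pd.DataFrame({"Name": ["Monica", "Phoebe", "Ross"]})
employers = {"Monica": "PWC", "Ross": "Microsoft"}

# Names without a key in the dictionary map to NaN
mapped = df_partial["Name"].map(employers)
print(mapped.isna().tolist())

# Replace the missing entries with a placeholder value
df_partial["Employment"] = mapped.fillna("Unknown")
print(df_partial)
```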

Approach 2 : Using replace() function

Here, replace() is the Pandas dataframe method that substitutes given values with other values and returns a new dataframe. It is distinct from Python's built-in string replace(), which works on parts of a string. We will use it to swap whole cell values for external values. The steps are:

  • Convert the dictionary dataset into a Pandas dataframe.
  • Create a dictionary col that maps each value to be replaced to its replacement.
  • Call the replace() function with a dictionary that names the column and the mapping to apply to it. The command will look like this: df.replace({"Name": col}).
# Create dataframe
import pandas as pd

dataset = {'Name': ['Monica', 'Phoebe', 'Ross', 'Chandler', 'Rachel', 'Joey'],
           'Age': [26, 23, 30, 28, 25, 29],
           'Address': ['Kolkata', 'Chennai', 'Agra', 'Mumbai', 'Delhi', 'Lucknow']}
df = pd.DataFrame(dataset, columns=['Name', 'Age', 'Address'])

# Map each old value to its replacement
col = {"Monica": "Richard",
       "Ross": "Carol",
       "Joey": "Kathy"}

# replace with external values (only in the Name column)
df = df.replace({"Name": col})
print(df)

Output :

       Name  Age  Address
0   Richard   26  Kolkata
1    Phoebe   23  Chennai
2     Carol   30     Agra
3  Chandler   28   Mumbai
4    Rachel   25    Delhi
5     Kathy   29  Lucknow

Here you can see that 'Monica', 'Ross' and 'Joey' have been replaced by 'Richard', 'Carol' and 'Kathy' respectively. The names without an entry in col are left unchanged.
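
The two approaches treat unmatched values differently, which can guide the choice between them. The following sketch compares them side by side on a hypothetical three-row dataframe names with a partial dictionary partners; both names are made up for this illustration.

```python
import pandas as pd

# Hypothetical data: the dictionary has no entry for Phoebe
names = pd.DataFrame({"Name": ["Monica", "Phoebe", "Ross"]})
partners = {"Monica": "Richard", "Ross": "Carol"}

# replace() returns a new dataframe and keeps unmatched values as they are
replaced = names.replace({"Name": partners})

# map() returns a new Series and turns unmatched values into NaN
mapped = names["Name"].map(partners)

print(replaced["Name"].tolist())
print(mapped.isna().tolist())
print(names["Name"].tolist())  # the original dataframe is not modified
```

So replace() suits partial substitutions inside an existing column, while map() suits building a new column from a lookup where missing entries should be visible.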

Thank you for going through this article.
