Map External Values to Dataframe in Python | Pandas
In this tutorial, we will map external values onto a Pandas dataframe in Python. We will show two ways to do this.
Pandas is an open-source Python library for data analysis and statistical tasks. It has to be imported before it can be used. Dataframes are objects provided by the Pandas library.
Before proceeding further, let us learn more about dataframes.
What are Dataframes?
Dataframes are mutable, two-dimensional data structures with three key elements: rows, columns, and data. They give unstructured data a clear, tabular structure so that tasks can be performed on it. Arithmetic operations can be applied to rows and columns that hold numeric data. The constructor syntax is:
pd.DataFrame(data, index, columns, dtype, copy)
Here is an example:
# import pandas
import pandas as pd

# initializing data
dataset = {'Name': ['Monica', 'Phoebe', 'Ross', 'Chandler', 'Rachel', 'Joey'],
           'Age': [26, 23, 30, 28, 25, 29],
           'Address': ['Kolkata', 'Chennai', 'Agra', 'Mumbai', 'Delhi', 'Lucknow']}

# Convert dictionary into DataFrame
df = pd.DataFrame(dataset)

# print df
print(df)
Output :
| Name | Age | Address |
0 | Monica | 26 | Kolkata |
1 | Phoebe | 23 | Chennai |
2 | Ross | 30 | Agra |
3 | Chandler | 28 | Mumbai |
4 | Rachel | 25 | Delhi |
5 | Joey | 29 | Lucknow |
We will use this dataset for our task.
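As a small illustration of the arithmetic mentioned earlier, the following sketch applies two operations to the Age column of this dataset. The variable names age_in_five_years and mean_age are made up for this example and are not part of Pandas.

```python
import pandas as pd

dataset = {
    'Name': ['Monica', 'Phoebe', 'Ross', 'Chandler', 'Rachel', 'Joey'],
    'Age': [26, 23, 30, 28, 25, 29],
    'Address': ['Kolkata', 'Chennai', 'Agra', 'Mumbai', 'Delhi', 'Lucknow'],
}
df = pd.DataFrame(dataset)

# Element-wise arithmetic on a column: every age is increased by 5.
# (age_in_five_years is a hypothetical name chosen for illustration.)
age_in_five_years = df['Age'] + 5

# Aggregation over a column: the mean of the six ages.
mean_age = df['Age'].mean()

print(age_in_five_years.tolist())
print(mean_age)
```

The addition returns a new Series and leaves df itself unchanged.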
Different Approaches to the Task
There are several ways to do this. Here, we will discuss two of them:
Approach 1 : Using map() function
For this,
- Convert the dictionary dataset into a Pandas dataframe and pass the column names. It should look like this: pd.DataFrame(dataset, columns=['Name', 'Age', 'Address']).
- Store the external values in a dictionary col, whose keys are the names and whose values are the entries for the new column.
- Now, use the map() function to look up each value of the Name column in col and assign the result as a new column. The command will look like this: df["Employment"] = df["Name"].map(col).
# Creating dataframe
import pandas as pd

dataset = {'Name': ['Monica', 'Phoebe', 'Ross', 'Chandler', 'Rachel', 'Joey'],
           'Age': [26, 23, 30, 28, 25, 29],
           'Address': ['Kolkata', 'Chennai', 'Agra', 'Mumbai', 'Delhi', 'Lucknow']}
df = pd.DataFrame(dataset, columns=['Name', 'Age', 'Address'])

# new column for dataframe
col = {"Monica": "PWC",
       "Phoebe": "Cognizant",
       "Ross": "Microsoft",
       "Chandler": "Apple",
       "Rachel": "Philips",
       "Joey": "Samsung"}

# combine this new data with existing DataFrame
df["Employment"] = df["Name"].map(col)
print(df)
Output :
| Name | Age | Address | Employment |
0 | Monica | 26 | Kolkata | PWC |
1 | Phoebe | 23 | Chennai | Cognizant |
2 | Ross | 30 | Agra | Microsoft |
3 | Chandler | 28 | Mumbai | Apple |
4 | Rachel | 25 | Delhi | Philips |
5 | Joey | 29 | Lucknow | Samsung |
Here, you can see that the column 'Employment' has been added.
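In the example above every name has an entry in col. When the dictionary does not cover every name, map() yields NaN for the missing ones. The sketch below shows this on a shortened dataframe; partial_col, the Employment_filled column, and the 'Unknown' placeholder are hypothetical names chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Monica', 'Phoebe', 'Ross']})

# Hypothetical partial lookup: 'Ross' is deliberately left out.
partial_col = {'Monica': 'PWC', 'Phoebe': 'Cognizant'}

# map() returns NaN for names that are not keys of the dictionary.
df['Employment'] = df['Name'].map(partial_col)

# One possible way to handle the gaps: fill them with a placeholder.
df['Employment_filled'] = df['Employment'].fillna('Unknown')

print(df)
```

Whether a placeholder is appropriate depends on the data; leaving the NaN values in place is equally valid if later steps handle them.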
Approach 2 : Using replace() function
Here, replace() is the Pandas DataFrame.replace() method, which substitutes values in a dataframe with other values and returns a new dataframe. We will use it to replace existing values in a column with external values. For this:
- Convert the dictionary dataset into a Pandas dataframe.
- Create a dictionary col that maps each value to be replaced to the value that will replace it.
- Call replace() with a dictionary that names the column and the mapping to apply to it. The command will look like this: df.replace({"Name": col}).
# Create dataframe
import pandas as pd

dataset = {'Name': ['Monica', 'Phoebe', 'Ross', 'Chandler', 'Rachel', 'Joey'],
           'Age': [26, 23, 30, 28, 25, 29],
           'Address': ['Kolkata', 'Chennai', 'Agra', 'Mumbai', 'Delhi', 'Lucknow']}
df = pd.DataFrame(dataset, columns=['Name', 'Age', 'Address'])

# Mapping of old values to new values
col = {"Monica": "Richard",
       "Ross": "Carol",
       "Joey": "Kathy"}

# replace with external values (only in the Name column)
df = df.replace({"Name": col})
print(df)
Output :
| Name | Age | Address |
0 | Richard | 26 | Kolkata |
1 | Phoebe | 23 | Chennai |
2 | Carol | 30 | Agra |
3 | Chandler | 28 | Mumbai |
4 | Rachel | 25 | Delhi |
5 | Kathy | 29 | Lucknow |
Here you can see that 'Monica', 'Ross' and 'Joey' have been replaced by 'Richard', 'Carol' and 'Kathy' respectively, while the names that do not appear in col are left unchanged.
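The two approaches treat values without a match differently, which the following sketch tries to make visible by applying the same partial dictionary with both methods. It works on a standalone Series for brevity; the variable names replaced and mapped are made up for this example.

```python
import pandas as pd

names = pd.Series(['Monica', 'Phoebe', 'Ross'], name='Name')

# Partial mapping: 'Phoebe' has no entry.
col = {'Monica': 'Richard', 'Ross': 'Carol'}

# replace() keeps values that have no entry in the dictionary.
replaced = names.replace(col)

# map() turns values that have no entry into NaN.
mapped = names.map(col)

print(replaced.tolist())
print(mapped.isna().tolist())
```

As a rough guide, replace() suits partial substitutions inside an existing column, while map() suits building a new column from a complete lookup.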
Thank you for going through this article.