Transform Pandas columns using map and apply

In this tutorial, we will use the map() and apply() methods to transform Pandas columns. When working with datasets, you will often need to transform and manipulate data. These methods are helpful when you need to convert one set of values into another, for example by using a function.

Using the map() method

The map() method is handy because it accepts three kinds of input: dictionaries, functions, and Series. Refer to Mapping values in a Pandas Dataframe for a step-by-step walk-through of the mapping process. Let's look at each of these scenarios.

Using the map method to map a dictionary

When you pass a dictionary to the map() method, it matches each column value against the keys of the dictionary. The value stored under the matching key becomes the entry in the resulting series, which we then add to our data frame as a new column.

Example:

import pandas as pd
data = pd.DataFrame({
    'e_name': ['Janu', 'Yash', 'Misande', 'Edward', 'Kate'],
    'e_age': [30, 40, 32, 67, 43],
    'e_score': [900, 950, 970, 820, 870],
    'e_income':[100000, 80000, 550000, 62000, 50000]})
print(data)
genders = {'Janu': 'Female', 'Yash': 'Male', 'Misande': 'Female', 'Edward': 'Male', 'Kate': 'Female'}
data['gender'] = data['e_name'].map(genders)
print(data)

Output:

    e_name  e_age  e_score  e_income
0     Janu     30      900    100000
1     Yash     40      950     80000
2  Misande     32      970    550000
3   Edward     67      820     62000
4     Kate     43      870     50000
    e_name  e_age  e_score  e_income  gender
0     Janu     30      900    100000  Female
1     Yash     40      950     80000    Male
2  Misande     32      970    550000  Female
3   Edward     67      820     62000    Male
4     Kate     43      870     50000  Female
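
The dictionary above has a key for every name. As a minimal sketch of what happens otherwise, the snippet below uses a hypothetical lookup table (partial_genders, not part of the original example) that is missing one name. Values without a matching key come back as NaN, which can then be filled with a default.

```python
import pandas as pd

# Hypothetical names and lookup table, used only for illustration.
names = pd.Series(['Janu', 'Yash', 'Kate'])
partial_genders = {'Janu': 'Female', 'Yash': 'Male'}  # no entry for 'Kate'

# 'Kate' has no matching key, so map() returns NaN for that row.
mapped = names.map(partial_genders)

# Replace the missing result with a default label.
filled = mapped.fillna('Unknown')
print(filled.tolist())
```

Checking for NaN after a dictionary mapping is a quick way to spot values that the dictionary does not cover.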

Using the map method to map a function

Here, we pass a function to the map() method. The function takes each value from the series and returns a new value, which becomes part of a new series.

Example:

mean_score = data['e_score'].mean()
def high_score(x):
    return x > mean_score
data['higher_score'] = data['e_score'].map(high_score)
print(data)

Output:

    e_name  e_age  e_score  e_income  gender  higher_score
0     Janu     30      900    100000  Female         False
1     Yash     40      950     80000    Male          True
2  Misande     32      970    550000  Female          True
3   Edward     67      820     62000    Male         False
4     Kate     43      870     50000  Female         False

We can also pass an anonymous lambda function, since we use the function only once. The code then becomes simpler:

data['higher_score'] = data['e_score'].map(lambda x: x > mean_score)

Using the map method to map an Indexed Series

Finally, we are going to learn how to pass a Pandas Series to the map() method. Each value in the calling series is looked up in the index of the passed series and replaced by the corresponding value.

Example:

last_names = pd.Series(['Smith', 'Taylor', 'Jones', 'Harris', 'Parker'], index=data['e_name'])
data['last_name'] = data['e_name'].map(last_names)
print(data)

Output:

    e_name  e_age  e_score  e_income  gender  higher_score last_name
0     Janu     30      900    100000  Female         False     Smith
1     Yash     40      950     80000    Male          True    Taylor
2  Misande     32      970    550000  Female          True     Jones
3   Edward     67      820     62000    Male         False    Harris
4     Kate     43      870     50000  Female         False    Parker
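
To make the lookup behaviour easier to see, here is a small self-contained sketch. The names first_names and lookup are hypothetical and the data is made up for illustration. The lookup Series is deliberately in a different order from the values being mapped, which shows that matching happens on the index labels and not on position.

```python
import pandas as pd

# Hypothetical data for illustration only.
first_names = pd.Series(['Kate', 'Janu', 'Yash'])

# The lookup Series is indexed by first name, in a different order.
lookup = pd.Series(['Smith', 'Taylor', 'Parker'],
                   index=['Janu', 'Yash', 'Kate'])

# Each first name is matched against lookup's index labels.
result = first_names.map(lookup)
print(result.tolist())  # ['Parker', 'Smith', 'Taylor']
```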

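Using the apply() method

The apply() method can be used on either a Pandas Series or a DataFrame. Unlike the map() method, it takes a function rather than a dictionary or a series. With axis=1, the function receives each row of the data frame in turn.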

Example:

def project(row):
    return row['e_age'] < 45 and row['e_income'] > 75000
data['project'] = data.apply(project, axis=1)
print(data)

Output (columns added in the earlier examples are omitted):

    e_name  e_age  e_score  e_income  project
0     Janu     30      900    100000     True
1     Yash     40      950     80000     True
2  Misande     32      970    550000     True
3   Edward     67      820     62000    False
4     Kate     43      870     50000    False
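
The example above applies a function to each row of a data frame. As a short sketch of the Series case, the snippet below applies a function to each value of a single column. The function income_band and its 90000 threshold are assumptions made up for this illustration.

```python
import pandas as pd

# Hypothetical income values for illustration only.
incomes = pd.Series([100000, 80000, 550000])

def income_band(income):
    # The 90000 threshold is an arbitrary choice for this sketch.
    return 'high' if income > 90000 else 'standard'

# On a Series, apply() calls the function once per value.
bands = incomes.apply(income_band)
print(bands.tolist())  # ['high', 'standard', 'high']
```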

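You can also pass extra arguments to the function used with apply(). This is done through the args parameter of apply(), which takes a tuple of additional positional arguments that are passed to the function after the row or value.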
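
A minimal sketch of args in use is shown below. The data frame staff, the function eligible, and its parameters max_age and min_income are hypothetical names chosen for this illustration. The thresholds match the ones hard-coded in the project() example.

```python
import pandas as pd

# Hypothetical data for illustration only.
staff = pd.DataFrame({
    'e_age': [30, 67, 43],
    'e_income': [100000, 62000, 50000]})

def eligible(row, max_age, min_income):
    # row is supplied by apply(); the other two come from args.
    return row['e_age'] < max_age and row['e_income'] > min_income

# args is a tuple of extra positional arguments for the function.
staff['project'] = staff.apply(eligible, axis=1, args=(45, 75000))
print(staff['project'].tolist())  # [True, False, False]
```

Passing the thresholds through args keeps the function reusable with different limits. Keyword arguments given to apply() are forwarded to the function in the same way.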
