How to Normalize a Pandas DataFrame Column

In this tutorial, you will learn how to normalize a Pandas DataFrame column with Python code. Normalizing here means rescaling the values of the column so that they fall in the range 0 to 1.
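To make the idea concrete, here is a minimal sketch of min-max normalization on a plain Python list. The three values are illustrative only and are not part of the tutorial's data.

```python
# Min-max normalization: x' = (x - min) / (max - min)
values = [10, 20, 30]  # illustrative values, not the tutorial's data

lo, hi = min(values), max(values)
normalized = [(v - lo) / (hi - lo) for v in values]

print(normalized)  # [0.0, 0.5, 1.0]
```

The smallest value maps to 0, the largest to 1, and everything else falls proportionally in between.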

First, import the required modules:

import pandas as pd
from sklearn import preprocessing

If you are working in a Jupyter notebook, also add the following line:

%matplotlib inline

This is an IPython magic command. It makes the graphs you create render inline, directly below the code cell, instead of opening in a separate window. It only works in IPython or Jupyter, so leave it out of a plain Python script.

Now let’s create the data that you will be working on:

data = {'data_range': [100,55,33,29,-57,56,93,-8,79,120]}
data_frame = pd.DataFrame(data)

Displaying data_frame shows our unnormalized data:
[Image: unnormalized data]
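
To inspect the data directly, a minimal sketch like the following prints the DataFrame along with its smallest and largest values. It rebuilds the same data_frame so that it can be run on its own.

```python
import pandas as pd

data = {'data_range': [100, 55, 33, 29, -57, 56, 93, -8, 79, 120]}
data_frame = pd.DataFrame(data)

print(data_frame)                      # all ten rows with their index
print(data_frame['data_range'].min())  # -57
print(data_frame['data_range'].max())  # 120
```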

We can also plot this unnormalized data as a bar graph:
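
The plotting command itself is not shown here. One way to draw such a bar graph, assuming pandas' built-in plotting (which relies on matplotlib) is what was intended, is sketched below. The y-axis label is an addition for readability.

```python
import pandas as pd

data = {'data_range': [100, 55, 33, 29, -57, 56, 93, -8, 79, 120]}
data_frame = pd.DataFrame(data)

# kind='bar' draws one vertical bar per row; plot() returns a matplotlib Axes
ax = data_frame.plot(kind='bar')
ax.set_ylabel('data_range')  # assumed label, added for readability
```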


The graph of our unnormalized data is:
[Image: graph of the unnormalized data]

The graph makes it clear that our data is unnormalized: the values range from -57 to 120. Now you will use scikit-learn's preprocessing tools to convert it into normalized data:

A = data_frame.values #returns an array
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(A)

Here, A is a NumPy array holding the DataFrame's values. MinMaxScaler() rescales each column to the range 0 to 1 by default and returns floats, and x_scaled contains our normalized data.
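
To see what the scaler is doing, the sketch below repeats the calculation by hand with (x - min) / (max - min) and compares it with the scikit-learn result. The names col and manual are introduced here for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn import preprocessing

data_frame = pd.DataFrame(
    {'data_range': [100, 55, 33, 29, -57, 56, 93, -8, 79, 120]}
)

# scikit-learn result, as in the tutorial
x_scaled = preprocessing.MinMaxScaler().fit_transform(data_frame.values)

# The same calculation by hand: (x - min) / (max - min)
col = data_frame['data_range']
manual = (col - col.min()) / (col.max() - col.min())

print(np.allclose(x_scaled.ravel(), manual.values))  # True
print(manual.min(), manual.max())                    # 0.0 1.0
```

If you only need a single column scaled to 0 to 1, the hand-written version avoids the scikit-learn dependency. The scaler is more convenient when the same scaling has to be reapplied to new data later.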
To view the normalized data, convert x_scaled back into a DataFrame:

normalized_dataframe = pd.DataFrame(x_scaled)

The result of the above command will be:
[Image: normalized data]
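
Note that pd.DataFrame(x_scaled) on its own labels the column 0 rather than data_range. A small sketch, assuming you want to keep the original labels, passes the source frame's columns and index along:

```python
import pandas as pd
from sklearn import preprocessing

data_frame = pd.DataFrame(
    {'data_range': [100, 55, 33, 29, -57, 56, 93, -8, 79, 120]}
)
x_scaled = preprocessing.MinMaxScaler().fit_transform(data_frame.values)

# Reuse the original column names and index so the result stays labeled
normalized_dataframe = pd.DataFrame(
    x_scaled, columns=data_frame.columns, index=data_frame.index
)

print(normalized_dataframe)
```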

Now you can plot the normalized data on a graph:
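
The plotting line is not shown here either. A minimal sketch, assuming the same pandas bar plot as for the unnormalized data, could look like this. Fixing the y-axis to 0 to 1 is an optional addition.

```python
import pandas as pd
from sklearn import preprocessing

data_frame = pd.DataFrame(
    {'data_range': [100, 55, 33, 29, -57, 56, 93, -8, 79, 120]}
)
x_scaled = preprocessing.MinMaxScaler().fit_transform(data_frame.values)
normalized_dataframe = pd.DataFrame(x_scaled)

# One bar per row; every bar height now lies between 0 and 1
ax = normalized_dataframe.plot(kind='bar')
ax.set_ylim(0, 1)  # optional: show the full normalized range
```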


[Image: plot of the normalized data]

We have now successfully normalized a Pandas DataFrame column in Python. I hope you enjoyed the task.

Also, read: Drop Rows and Columns in Pandas with Python Programming
