Seaborn Multiple Line Plot in Python
In this article, you will learn how to draw multiple line plots in Python using the seaborn module.
Visualization makes data easier to understand, because it lets us draw insights from the data, whether mathematical or statistical.
Dataset link is given at the bottom of this tutorial.
That is the power of Python visualization libraries, which can portray the entire story of the data in just a few plots. Python offers plenty of opportunities for exploring and visualizing data through modules such as matplotlib, seaborn, and plotly. Seaborn is often more convenient to work with than matplotlib because of the wide variety of plots and features it offers. A multiple line plot draws several lines on the same axes to show the relationship between two numeric attributes, with one line per group.
For plotting multiple line plots, first install the seaborn module into your system.
Install seaborn using pip
pip manages packages and libraries for Python. It also installs any dependencies of a package that are not already present.
Just a single pip install command gets all your installation work done. That is how concise Python is!
It is also possible to install seaborn with conda from the Anaconda terminal using the command:
conda install seaborn
Type the following command in your terminal. (In a Jupyter notebook cell, prefix it with an exclamation mark: !pip install seaborn)
pip install seaborn
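Because some seaborn parameters differ between releases, it can help to confirm which version was installed. A minimal check, assuming the installation succeeded:

```python
# Confirm that seaborn imports and report the installed version
import seaborn as sns

print(sns.__version__)
```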
Importing the required modules and packages in Python using the ‘import’ command.
For working with this dataset, we need to import the pandas, matplotlib, and seaborn modules.
- pandas handles data manipulation, processing, and analysis. In particular, it offers operations for manipulating data frames and time series, and it helps with the data cleaning part.
- matplotlib.pyplot is a module whose functions work with the figure: creating the figure, creating a plotting area in it, plotting lines in that area, adding labels, and so on.
- seaborn, a visualization library built on top of matplotlib, provides techniques for drawing attractive graphs.
Note: Matplotlib offers many basic visualizations like line, bar, scatter, and pie charts. Seaborn, on the other hand, offers numerous higher-level options like KDE plot, rugplot, boxplot, violin plot, swarm plot, heatmap, facetgrid, and regplot. Seaborn typically needs less code than matplotlib for the same plot.
# import pandas module for data analysis
import pandas as pd
# import seaborn and matplotlib libraries for visualization
import seaborn as sns
import matplotlib.pyplot as plt
A picture is worth a thousand words. With advanced tools, such a picture is drawn in just a few lines of code.
The seaborn module contains the function sns.lineplot(), which can draw both single-line and multiple-line plots depending on its parameters. Line plots work well when you want to analyze how one variable changes with respect to another.
syntax: lineplot in seaborn
seaborn.lineplot(x=None, y=None, hue=None, size=None, style=None, data=None,
    palette=None, hue_order=None, hue_norm=None, sizes=None, size_order=None,
    size_norm=None, dashes=True, markers=None, style_order=None, units=None,
    estimator='mean', ci=95, n_boot=1000, sort=True, err_style='band',
    err_kws=None, legend='brief', ax=None, **kwargs)
Note: this is the signature from older seaborn releases. From version 0.12 onward, the ci parameter is deprecated in favor of errorbar.
- x, y: names of the variables in the data set to plot on the x-axis and y-axis.
- data: data frame object pointing to the data set.
- hue: grouping variable that produces lines of different colors.
- size: grouping variable that produces lines of different widths.
- style: grouping variable that produces lines with different dashes and/or markers.
- palette: colors to use for the different categories of hue.
- hue_order: order in which the categories of the hue variable appear.
Let's begin by importing the CSV dataset on which we are going to perform the visualization. This is done with pandas, which reads the CSV file and converts it into a dataframe object that can be manipulated as required. We have imported an automobile data set containing the prices of different types of automobiles along with various other characteristics.
# read the CSV file into a dataframe
data = pd.read_csv(r'C:\Users\Kunwar\Downloads\Automobile_data_processed.csv')
data.head(10)
In the above code,
- the read_csv function of pandas imports the CSV file into the dataframe object ‘data’.
- head() method displays the specified number of rows from the first row. Here, it displays the first 10 rows.
check the size of the data frame:
data.shape # it will give the size in row-column format
Output: (159, 26)
shape is an attribute (not a method) that gives the size of the data as a tuple: the number of rows followed by the number of columns in the dataframe.
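The loading and inspection steps can also be tried without the CSV file. The sketch below is only an illustration: the rows are made up, and only the column names horsepower, price, and fuel-type are taken from this tutorial. It relies on read_csv accepting a file-like object in place of a path.

```python
import io

import pandas as pd

# Made-up rows standing in for the CSV file; only the column names come from the tutorial
csv_text = """horsepower,price,fuel-type
111,13495,gas
154,16500,gas
102,13950,gas
52,7099,diesel
"""

# read_csv accepts a file-like object as well as a file path
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.head(3))      # first 3 rows only
rows, cols = sample.shape  # shape is a (rows, columns) tuple
print(rows, cols)
```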
plot the single line graph:
horsepower and price are two continuous data variables in our data set. Let’s analyze the relationship between these two variables through a simple line plot.
plot the graph between the horsepower and price.
# plot the graph between x and y (both should be columns of the dataframe)
sns.lineplot(x = "horsepower", y = "price", data = data)
plt.show()
- x: represents horsepower on x-axis
- y: represents price on y-axis
- data: data frame object pointing to the entire data set.
Through this plot, we can see a roughly linear relationship between price and horsepower: as horsepower increases, the price of the vehicle increases as well. For example, for vehicles with a horsepower of 180, the price lies near 30000.
plot the multiple line graph:
Here, several lines are drawn on the same graph, and a legend label distinguishes each one. The legend appears in a corner of the image. To use seaborn's multiple line plot for exploring the relationship between two continuous variables, we use the hue argument. hue takes the name of a variable according to which the data is segregated, and a separate line is rendered for each group.
plot the graph between horsepower and price according to the fuel-type
# set the size of the figure
plt.figure(figsize = (20,12))
sns.lineplot(x = "horsepower", y = "price", data = data, hue = "fuel-type")
plt.show()
- plt.figure(): specifies the size of the figure we want to create.
- plt.show(): displays the figure.
Through this plot, we can again see the roughly linear relationship between price and horsepower that we saw earlier: as horsepower increases, the price of the vehicles increases as well. What is different in this plot is the hue argument. By specifying fuel-type in hue, we segregated the data into two groups, one with all the vehicles that run on gas and another with the diesel-driven vehicles, so we get two lines in the above figure. The orange line represents the relationship between price and horsepower for all vehicles with fuel-type diesel, and the blue line represents all vehicles with fuel-type gas.
The small rectangular box at the top right corner giving information about the type of line is a legend.
Now, we can easily say that a diesel vehicle with a horsepower of 120 has a price of somewhere around 25000.
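A value read off the plot like this can be cross-checked numerically by filtering the dataframe and averaging. The rows below are invented for the sketch, so the printed number illustrates the method rather than reproducing the real data set:

```python
import pandas as pd

# Invented rows; the real values live in the automobile data set
cars = pd.DataFrame({
    "horsepower": [68, 120, 120, 123, 110, 120],
    "price": [7800, 24000, 26000, 28000, 15000, 16000],
    "fuel-type": ["diesel", "diesel", "diesel", "diesel", "gas", "gas"],
})

# Keep diesel vehicles whose horsepower is exactly 120, then average their price
mask = (cars["fuel-type"] == "diesel") & (cars["horsepower"] == 120)
diesel_120_mean = cars.loc[mask, "price"].mean()
print(diesel_120_mean)
```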
For downloading the automobile data set and creating your visualizations, click on the link mentioned below: