How to draw a line for average value with matplotlib in Python?

Hello everyone!

This article is based on some exciting graph plotting problems and their solutions with Python programming.

Let us consider one such problem that you may well run into while practicing graph plotting. As the title of the article suggests, we want to draw a line through the average of the dependent values (Y) recorded for each value of the independent variable (X), using matplotlib's pyplot module in Python.

Let’s see how we can do this.

Plotting a random scatter plot for the above problem

Consider the following scatter plot with its corresponding Python program:

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(10)
x = np.random.randint(0, 30, 50)  # random x points
y = np.random.randint(0, 30, 50)  # random y points
fig = plt.figure(figsize=(10, 8))  # setting the figure size
plt.scatter(x, y, marker='.', color='r')  # plotting the scatter plot
plt.title('Scatter plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()

[Figure: scatter plot of the random x and y points]

We can see the x and the y values of the above plot by printing them.

print('x=',x.tolist(),'\n')
print('y=',y.tolist())

[Figure: printed lists of the x and y values]

As we can see, a single value of x can have several values of y. For example, for x=17 we have y=[23,15,24], i.e. three values of y for one value of x.
So, when drawing the line plot for x and y, we will plot the average of the y values wherever an x point has more than one y point.
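
To make the idea concrete, here is a minimal sketch of how the y values belonging to one x value can be picked out with a boolean mask. The arrays below are small hand-picked toy data (not the random data from the plot), and the names mask and ys_for_17 are chosen here purely for illustration.

```python
import numpy as np

# Toy data (an assumption for illustration): x = 17 appears three times
x = np.array([17, 4, 17, 9, 17, 4])
y = np.array([23, 8, 15, 11, 24, 2])

mask = x == 17          # True wherever x equals 17
ys_for_17 = y[mask]     # the y values paired with x = 17

print(ys_for_17.tolist())   # [23, 15, 24]
print(ys_for_17.mean())     # the single value the average line would use at x = 17
```

The average line needs exactly this mean, computed once for every distinct x.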

We can do this by taking each unique x point and finding all the y points that belong to it, then calculating the mean of these y points. These steps produce two lists: one contains the unique x points, and the other contains the mean of the y points corresponding to each unique x. After this, we will draw the line plot from these two lists.

Finding the mean

The Python code for the above process is given below:

import numpy as np

x1 = []  # new x points (unique values of x)
y1 = []  # new y points (mean of y for each unique x)

# sorted() keeps the unique x values in ascending order, so the line runs
# left to right; a plain set has no guaranteed iteration order
for xVal in sorted(set(x)):
    y_values = []  # list to store all the y values of this unique x
    for i in range(len(x)):          # |
        if x[i] == xVal:             # | This part extracts all the
            y_values.append(y[i])    # | y values of a unique x

    x1.append(xVal)  # x1 stores all the unique x points
    y1.append(np.mean(y_values))  # y1 stores the mean of the y values for each unique x

After getting x1 and y1, we can draw the line plot with x1 as the independent variable and y1 as the dependent variable.
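
The same grouping can also be done without a nested loop. The following is only a sketch of one alternative, assuming x and y are one-dimensional NumPy arrays of equal length. It uses toy data, and the names unique_x and mean_y are chosen here for illustration; they play the roles of x1 and y1.

```python
import numpy as np

# Toy data (an assumption for illustration)
x = np.array([3, 1, 3, 2, 1, 3])
y = np.array([10, 4, 20, 7, 6, 30])

# unique_x holds the sorted unique x values; inverse maps every original
# point to the index of its x value inside unique_x
unique_x, inverse = np.unique(x, return_inverse=True)

sums = np.bincount(inverse, weights=y)   # sum of y for each unique x
counts = np.bincount(inverse)            # number of points for each unique x
mean_y = sums / counts                   # mean of y for each unique x

print(unique_x.tolist())   # [1, 2, 3]
print(mean_y.tolist())     # [5.0, 7.0, 20.0]
```

Because np.unique returns its values in ascending order, the resulting line is drawn from left to right without any extra sorting step.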

Plotting the average line plot

See the code below, which draws the average line on top of the previous scatter plot.

import numpy as np
import matplotlib.pyplot as plt

# x1 and y1 are the lists computed in the previous step from these same x and y
np.random.seed(10)
x = np.random.randint(0, 30, 50)  # random x points
y = np.random.randint(0, 30, 50)  # random y points
fig = plt.figure(figsize=(10, 8))  # setting the figure size
plt.scatter(x, y, marker='.', color='r')  # plotting the scatter plot
plt.plot(x1, y1, color='b')  # plotting the average line plot
plt.title('Scatter plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()

Output:

[Figure: scatter plot with the average line drawn through it]
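
If you want to reuse these steps, they can be wrapped in a single function. The sketch below is one possible way to do it, not the only one: plot_average_line is a hypothetical helper name, the Agg backend is selected only so the script also runs on a machine without a display, and average_line.png is an arbitrary output file name.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove this line for an interactive window
import matplotlib.pyplot as plt
import numpy as np


def plot_average_line(ax, x, y):
    """Scatter the raw points on ax and overlay the mean of y for each unique x.

    Hypothetical helper written for this article; returns the line object.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    unique_x = np.unique(x)  # sorted unique x values
    mean_y = np.array([y[x == value].mean() for value in unique_x])
    ax.scatter(x, y, marker='.', color='r', label='data points')
    (line,) = ax.plot(unique_x, mean_y, color='b', label='average of y')
    ax.legend()
    return line


np.random.seed(10)
x = np.random.randint(0, 30, 50)  # random x points
y = np.random.randint(0, 30, 50)  # random y points

fig, ax = plt.subplots(figsize=(10, 8))
line = plot_average_line(ax, x, y)
ax.set_title('Scatter plot with average line')
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')
fig.savefig('average_line.png')  # arbitrary file name chosen for this sketch
```

The legend makes it easier to tell the raw points from the average line when the figure is shared without the code.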

I hope you liked the article. Comment if you have any doubts or suggestions regarding this article.
