Find variance of a list in Python

This article explains how to find the variance of the numbers in a list. We will look at three methods to find the variance of a list in Python. You can use whichever of the three you prefer; each method is simple and straightforward.

Let’s consider a common, simple list for all the 3 examples.
arr = [4,5,6,7]

It is important to know the formula for variance before implementing it in a program. Variance is the average of the squared differences from the mean. The formula below divides by N, which makes it the population variance.

variance = Σ (Xi − Xm)² / N, where
Xi = the ith observation,
Xm = the mean of all observations,
N = the total number of observations.
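To see the formula at work on arr, the sketch below walks through each term by hand. The variable names (n, mean, squared_diffs) are illustrative only and are not part of any library.

```python
arr = [4, 5, 6, 7]

n = len(arr)            # N = 4
mean = sum(arr) / n     # Xm = 22 / 4 = 5.5

# Squared difference of each observation from the mean
squared_diffs = [(x - mean) ** 2 for x in arr]
print(squared_diffs)    # [2.25, 0.25, 0.25, 2.25]

# Average of the squared differences: 5.0 / 4
variance = sum(squared_diffs) / n
print(variance)         # 1.25
```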

Let’s calculate the variance of our list arr in Python.

Method 1: Mean -> List Comprehension -> Variance

This method can be broken down into three simple steps:

1. Find the mean of all the elements in the list.
2. Using a list comprehension, find the squared difference between each element and the mean.
3. Calculate the variance as the sum of all the squared differences divided by the number of elements.

def variance_1(arr):
  mean = sum(arr)/len(arr)  #step 1
  temp = [(i-mean)**2 for i in arr]  #step 2
  variance = sum(temp)/len(arr)  #step 3
  return variance

Method 2: Using statistics module of Python

The function statistics.pvariance(data), from Python’s built-in statistics module, returns the population variance of the data passed to it.

import statistics
def variance_2(arr):
  return statistics.pvariance(arr)
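The statistics module also provides statistics.variance, which computes the sample variance (dividing by N − 1 rather than N). A minimal sketch of the difference, using the same list as above:

```python
import statistics

arr = [4, 5, 6, 7]

# Population variance: sum of squared differences divided by N
population_var = statistics.pvariance(arr)
print(population_var)   # 1.25

# Sample variance: the same sum divided by N - 1 (here 5 / 3)
sample_var = statistics.variance(arr)
print(sample_var)
```

Use pvariance when the list holds the entire population and variance when it is only a sample drawn from a larger one.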

Method 3: Using NumPy library

The NumPy library can calculate the variance of 1-D as well as higher-dimensional arrays (2-D, 3-D, etc.). The function np.var(array) returns the variance of the array passed to it.

import numpy as np
def variance_3(arr):
  return np.var(arr)
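By default, np.var divides by N, so it matches the population formula given earlier. Its ddof parameter ("delta degrees of freedom") changes the divisor to N − ddof. A short sketch, assuming NumPy is installed:

```python
import numpy as np

arr = [4, 5, 6, 7]

# ddof=0 is the default: divide by N (population variance)
population_var = np.var(arr)

# ddof=1: divide by N - 1 (sample variance)
sample_var = np.var(arr, ddof=1)

print(population_var, sample_var)
```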

Now that we have defined 3 functions to calculate variance, let’s see their results for our list arr.

arr = [4,5,6,7]
print("original array:", arr)
print("Variance of the data using method 1:", variance_1(arr))
print("Variance of the data using method 2:", variance_2(arr))
print("Variance of the data using method 3:", variance_3(arr))

Output:

original array: [4, 5, 6, 7]
Variance of the data using method 1: 1.25
Variance of the data using method 2: 1.25
Variance of the data using method 3: 1.25

Extra Tip: When working with arrays of more than one dimension, use the NumPy library and set the axis parameter of np.var. By default (axis=None), the variance is computed over the flattened array; pass axis=0, axis=1, etc. to compute it along that axis instead.
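As a rough illustration of the axis parameter, the sketch below uses a small made-up 2-D array (the name data and its values are assumptions chosen for this example):

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

# axis=None (the default): variance of all six values together
overall = np.var(data)

# axis=0: one variance per column, each computed down the rows
per_column = np.var(data, axis=0)   # every column has variance 2.25

# axis=1: one variance per row, each computed across the columns
per_row = np.var(data, axis=1)      # every row has variance 2/3

print(overall)
print(per_column)
print(per_row)
```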

Also, go ahead and modify the code above to use it with your own data. I hope you learned something new. Let me know in the comments if you have any questions. Cheers!
