# Find variance of a list in Python

This article is going to help you understand how to find variance of numbers ordered in a list. We will look at 3 methods to find the variance of a list in Python. You can implement any of the three discussed methods you like. Each of the method is simple and straightforward.

Let’s consider a common, simple list for all the 3 examples.

arr = [4,5,6,7]

It is important to know the formula of variance when implementing it in a program. Variance refers to the average of squared differences from the mean.

variance = Σ (Xi – Xm)2 / N ; where,

Xi = ith observation ;

Xm = mean of all observations ;

N = total number of observation

Let’s calculate variance for over list arr in Python.

### Method 1: Mean -> List Comprehension -> Variance

This method can be enlisted in simple steps:

- Find mean of all the elements in the list
- Using List comprehension find the squared differences of each element with mean
- Calculate variance as the sum of all the squared differences divided by mean

def variance_1(arr): mean = sum(arr)/len(arr) #step 1 temp = [(i-mean)**2 for i in arr] #step 2 variance = sum(temp)/len(arr) #step 3 return variance

### Method 2: Using statistics module of Python

The function statistics.pvariance(array) returns the variance of the inputted “array” as a parameter.

import statistics def variance_2(arr): return statistics.pvariance(arr)

### Method 3: Using NumPy library

The NumPy library can be used to calculate variance for 1-D as well as higher dimensional array (2-D, 3-D, etc.). It uses the function NumPy.var(array) and returns the variance of the inputted “array” as a parameter.

import numpy as np def variance_3(arr): return np.var(arr)

Now that we have defined 3 functions to calculate variance, let’s see their results for our list arr.

arr = [4,5,6,7] print("original array: ", arr) print("Variance of the data using method 1: ", variance_1(arr)) print("Variance of the data using method 2: ", variance_3(arr)) print("Variance of the data using method 3: ", variance_3(arr))

Output:

original array: [4, 5, 6, 7] Variance of the data using method 1: 1.25 Variance of the data using method 2: 1.25 Variance of the data using method 3: 1.25

**Extra Tip: **When using arrays in dimensions higher than 1D, use NumPy library and set parameter “axis=0(default)”. Change the axis parameter along which you need to calculate variance.

Also, go ahead and modify the code above to use it for your own data. I hope you learned something new. Let me know in the comments if you have any doubts. Cheers!

Further Reading:

## Leave a Reply