How to compute the covariance of a given data frame using DataFrame.cov() in Pandas
In this tutorial, we will learn how to compute the covariance of a given data frame using DataFrame.cov(). The method computes the pairwise covariance of the columns and returns the result as a covariance matrix, which is commonly used in statistical data analysis. If the data frame contains NaN values, they are excluded automatically: each covariance is computed only from the rows in which both columns have values. So, let’s begin the tutorial.
Parameters of DataFrame.cov()
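The parameter covered in this tutorial is:
pandas.DataFrame.cov(min_periods=None)
min_periods: An optional integer. It is the minimum number of observations required per pair of columns to produce a valid result. If a pair of columns has fewer shared non-NaN observations than this, the corresponding entry in the matrix is NaN.
If no argument is passed, the covariance matrix is computed from all available observations for each pair of columns.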
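To make the effect of min_periods concrete, here is a minimal sketch on a small made-up frame. The column names a and b and their values are illustrative assumptions and are not part of the examples in this tutorial. The two columns share only two rows in which both values are present, so requiring three observations leaves their covariance undefined.

```python
import pandas as pd

# Hypothetical data: 'a' and 'b' each have three values,
# but only rows 0 and 3 have a value in both columns.
df = pd.DataFrame({
    "a": [1.0, 2.0, None, 4.0],
    "b": [2.0, None, 6.0, 8.0],
})

# Default call: every pair of columns is computed from its shared rows.
default_cov = df.cov()

# Require at least 3 shared observations per pair of columns.
strict_cov = df.cov(min_periods=3)

print(default_cov)
print(strict_cov)
```

In the second matrix, the diagonal entries should still be filled in, because each column has three values of its own, while the off-diagonal entries should be NaN.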
Example 1
Let us consider a data frame consisting of the following two columns.
import pandas as pd

# Build a data frame with two numeric columns
data = {'f': [30, 190, 583, 200, 1], 's': [9, 35, 678, 265, 909]}
d = pd.DataFrame(data)
print(d)
OUTPUT:
     f    s
0   30    9
1  190   35
2  583  678
3  200  265
4    1  909
Using cov() without any parameters
We will now use the cov() method on the above data frame.
import pandas as pd

data = {'f': [30, 190, 583, 200, 1], 's': [9, 35, 678, 265, 909]}
d = pd.DataFrame(data)
# Covariance matrix of the columns, computed from all rows
print(d.cov())
OUTPUT:
          f          s
f  53821.70   18846.55
s  18846.55  159633.20
This is the covariance matrix.
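The entries of this matrix can be checked by hand. The sketch below recomputes two of them from the sample covariance formula, that is, the sum of the products of deviations from the column means divided by n - 1. The variable names (f_dev, s_dev, manual_cov_fs, manual_var_f) are chosen here for illustration.

```python
import pandas as pd

d = pd.DataFrame({'f': [30, 190, 583, 200, 1],
                  's': [9, 35, 678, 265, 909]})

n = len(d)

# Deviation of each value from its column mean.
f_dev = d['f'] - d['f'].mean()
s_dev = d['s'] - d['s'].mean()

# Sample covariance of f and s: sum of products of deviations / (n - 1).
manual_cov_fs = (f_dev * s_dev).sum() / (n - 1)

# The variance of a column is its covariance with itself (the diagonal entry).
manual_var_f = (f_dev ** 2).sum() / (n - 1)

print(manual_cov_fs)
print(manual_var_f)
```

Up to floating-point rounding, the two printed values should agree with the f/s and f/f entries of the matrix above (18846.55 and 53821.70).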
Example 2
Let us now consider a data frame whose two columns contain missing values.
import pandas as pd

# None entries become NaN in the data frame
data = {'f': [30, None, 583, None, 1], 's': [9, None, 678, 265, 909]}
d = pd.DataFrame(data)
print(d)
OUTPUT:
       f      s
0   30.0    9.0
1    NaN    NaN
2  583.0  678.0
3    NaN  265.0
4    1.0  909.0
Using cov() with the min_periods parameter
We will now use the cov() method on the above data frame.
import pandas as pd

data = {'f': [30, None, 583, None, 1], 's': [9, None, 678, 265, 909]}
d = pd.DataFrame(data)
# Require at least 3 shared non-NaN observations per pair of columns
print(d.cov(min_periods=3))
OUTPUT:
               f          s
f  107562.333333   34902.50
s   34902.500000  163480.25
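Here, the final matrix contains no NaN values. With min_periods set to 3, every pair of columns has at least three rows in which both values are present (three for f with f, three for f with s, and four for s with s), so every covariance can be computed.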
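As a rough check of this behaviour, the sketch below counts the shared non-NaN observations for each pair of columns and then raises the threshold to 4. The names present, shared_counts and strict are illustrative, and min_periods=4 is chosen here only to show what happens when the requirement is not met.

```python
import pandas as pd

d = pd.DataFrame({'f': [30, None, 583, None, 1],
                  's': [9, None, 678, 265, 909]})

# 1 where a value is present, 0 where it is NaN.
present = d.notna().astype(int)

# Entry (x, y) counts the rows in which both column x and column y have values.
shared_counts = present.T.dot(present)
print(shared_counts)

# With min_periods=4, only a pair with at least 4 shared rows gets a value.
strict = d.cov(min_periods=4)
print(strict)
```

Only the pair of s with itself has four observations, so in the stricter matrix that entry should keep its value while every entry involving f becomes NaN.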
So, we have observed the ways to determine the covariance of a data frame.