Calculate Intraclass Correlation Coefficient in Python

In this tutorial, we will learn about the Intraclass Correlation Coefficient, a statistical measure with many practical applications.

Intraclass Correlation Coefficient

The Intraclass Correlation Coefficient (ICC) is used to evaluate ratings or other measurements made by multiple observers or judges on the same subjects. Here, judges are simply the people doing the evaluating. The ICC quantifies how closely the observers agree when evaluating the same thing, and so indicates how reliable the ratings are. In short, it tells you whether the evaluators are in agreement or not.

ICC partitions the total variance into subject, rater, and residual variance:

  • Subject Variance: It represents the extent of differences between the subjects on the measured variable.
  • Rater Variance: It represents the extent to which raters differ systematically in the scores they give, for example one judge scoring consistently higher than another.
  • Residual Variance: The rest of the variance, which is not accounted for by either subjects or raters.
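To make the decomposition concrete, here is a minimal pure-Python sketch. The ratings table is invented for illustration, and the variable names are my own. It splits the total sum of squares into the three parts listed above and then computes one common ICC form, the two-way random-effects, absolute-agreement, single-rater ICC, usually written ICC(2,1). It assumes balanced data, meaning every rater scores every subject.

```python
# Toy ratings, invented for illustration: rows are subjects, columns are raters.
ratings = [
    [7, 8, 7],
    [5, 6, 5],
    [9, 9, 8],
    [4, 5, 3],
]

n = len(ratings)       # number of subjects
k = len(ratings[0])    # number of raters
grand_mean = sum(sum(row) for row in ratings) / (n * k)

subject_means = [sum(row) / k for row in ratings]
rater_means = [sum(row[j] for row in ratings) / n for j in range(k)]

# Sum of squares for each source of variation
ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
ss_raters = n * sum((m - grand_mean) ** 2 for m in rater_means)
ss_residual = sum(
    (ratings[i][j] - subject_means[i] - rater_means[j] + grand_mean) ** 2
    for i in range(n)
    for j in range(k)
)

# Mean squares: each sum of squares divided by its degrees of freedom
ms_subjects = ss_subjects / (n - 1)
ms_raters = ss_raters / (k - 1)
ms_residual = ss_residual / ((n - 1) * (k - 1))

# ICC(2,1): two-way random effects, absolute agreement, single rater
icc2_1 = (ms_subjects - ms_residual) / (
    ms_subjects + (k - 1) * ms_residual + k * (ms_raters - ms_residual) / n
)

print("Subjects SS:", round(ss_subjects, 3))
print("Raters SS:  ", round(ss_raters, 3))
print("Residual SS:", round(ss_residual, 3))
print("ICC(2,1):   ", round(icc2_1, 3))
```

For balanced data the three sums of squares add up to the total, which is what "dividing the total variance" means in practice.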

Code

Before starting, make sure you have installed the pingouin library. If not, install it using the command below:

pip install pingouin

I will be using the sample dataset that ships with the pingouin library. The dataset consists of different samples of wine, each of which is evaluated and scored by different judges.

import pingouin as pg

# Load the sample dataset
data = pg.read_dataset('icc')

data.head(10)
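The sample dataset stores one rating per row, with Wine, Judge, and Scores columns. If your own ratings sit in a wide table instead (one column per judge), one way to reshape them is pandas' melt. The table below is hypothetical and the judge labels are made up. This is only a sketch of the reshaping step, not something pingouin requires you to do this way.

```python
import pandas as pd

# Hypothetical wide table: one row per wine, one column per judge.
wide = pd.DataFrame(
    {
        "Wine": [1, 2, 3],
        "A": [7, 5, 9],
        "B": [8, 6, 9],
    }
)

# Reshape to long format: one row per (wine, judge) pair.
long_df = wide.melt(id_vars="Wine", var_name="Judge", value_name="Scores")

print(long_df)
```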

Now calculate the ICC scores using pg.intraclass_corr(), which is called here with 4 parameters:

  • data: Pass the dataset here
  • targets: It refers to the subjects that will be evaluated. Here, it’s the Wine column.
  • raters: It refers to the judges or observers who will be evaluating the subjects. Here, it’s the Judge column.
  • ratings: It refers to the ratings or scores given by the judges or observers. Here, it’s the Scores column.

# Calculate ICC scores
icc_result = pg.intraclass_corr(data=data, targets='Wine', raters='Judge', ratings='Scores')

icc_result

Here, the function returns 6 types of ICC scores, along with their descriptions. The other columns are explained below:

  • F: It is the F-value of the test, the ratio of the mean square between subjects to the mean square of the error term (the within-subject mean square for the one-way model, the residual mean square for the two-way models).
  • df1: It represents the numerator degrees of freedom. Mathematically, it is the number of subjects minus 1.
  • df2: It represents the denominator degrees of freedom. With n subjects and k raters, it is n(k - 1) for the one-way model and (n - 1)(k - 1) for the two-way models.
  • pval: It is the p-value of the F-test, a statistical measure often used in hypothesis testing. Here, the test assesses whether the variability between subjects is statistically significant relative to the error.
  • CI95%: It represents the 95% confidence interval around the ICC estimate.
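As a rough illustration of where F, df1, and df2 come from, the sketch below computes them by hand for a small invented ratings table. To the best of my understanding, the one-way model divides the between-subjects mean square by the within-subjects mean square, and the two-way models divide it by the residual mean square, but treat this as an assumption rather than a statement of pingouin's exact internals. The function name icc_f_tests is hypothetical.

```python
def icc_f_tests(ratings):
    """Return (one_way, two_way) dicts holding F, df1 and df2.

    `ratings` is a balanced table: rows are subjects, columns are raters.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    subj_means = [sum(row) / k for row in ratings]
    rater_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_subjects = k * sum((m - grand) ** 2 for m in subj_means)
    ss_within = sum(
        (x - subj_means[i]) ** 2 for i, row in enumerate(ratings) for x in row
    )
    ss_raters = n * sum((m - grand) ** 2 for m in rater_means)
    ss_error = ss_within - ss_raters  # residual after removing rater effects

    df1 = n - 1                        # numerator df: subjects minus 1
    df2_one_way = n * (k - 1)          # denominator df, one-way model
    df2_two_way = (n - 1) * (k - 1)    # denominator df, two-way models

    ms_subjects = ss_subjects / df1
    one_way = {"F": ms_subjects / (ss_within / df2_one_way),
               "df1": df1, "df2": df2_one_way}
    two_way = {"F": ms_subjects / (ss_error / df2_two_way),
               "df1": df1, "df2": df2_two_way}
    return one_way, two_way


# Toy ratings, invented for illustration: 4 subjects scored by 3 raters.
toy = [
    [7, 8, 7],
    [5, 6, 5],
    [9, 9, 8],
    [4, 5, 3],
]
one_way, two_way = icc_f_tests(toy)
print("one-way:", one_way)
print("two-way:", two_way)
```

The p-value is then the upper-tail probability of the F distribution with (df1, df2) degrees of freedom, which can be obtained with scipy.stats.f.sf(F, df1, df2). It is left out here to keep the sketch free of third-party dependencies.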
