This chapter provides quick-start R code for computing the statistical measures used to analyze inter-rater reliability or agreement. These include:
- Cohen’s Kappa: It can be used for either two nominal or two ordinal variables. It accounts for strict agreements between observers. It is most appropriate for two nominal variables.
- Weighted Kappa: It should be considered for two ordinal variables only. It allows partial agreement.
- Light’s Kappa: the average of the pairwise Cohen’s Kappa values, for use with more than two raters (categorical variables).
- Fleiss’ Kappa: for two or more categorical variables (nominal or ordinal).
- Intraclass correlation coefficient (ICC): for continuous or ordinal data.
There are many R packages and functions for inter-rater agreement analyses, including:
| Measure | R function [package] |
|---|---|
| Cohen’s kappa | Kappa() [vcd], kappa2() [irr] |
| Weighted kappa | Kappa() [vcd], kappa2() [irr] |
| Light’s kappa | kappam.light() [irr] |
| Fleiss’ kappa | kappam.fleiss() [irr] |
| ICC | icc() [irr], ICC() [psych] |
In the next sections, we’ll use only the functions from the irr package. Make sure you have installed it.
Load the package:
# install.packages("irr")
library(irr)
We’ll use the following demo data sets:
- The psychiatric diagnoses data provided by 6 raters [irr package]. A total of 30 patients were enrolled and classified by each of the raters into 5 nominal categories (Fleiss and others 1971): 1. Depression, 2. Personality Disorder, 3. Schizophrenia, 4. Neurosis, 5. Other.
- The anxiety data [irr package], which contains the anxiety ratings of 20 subjects, rated by 3 raters on an ordinal scale. Values range from 1 (not anxious at all) to 6 (extremely anxious).
Inspect the data:
# Diagnoses data
data("diagnoses", package = "irr")
head(diagnoses[, 1:3])
##                    rater1                  rater2                  rater3
## 1             4. Neurosis             4. Neurosis             4. Neurosis
## 2 2. Personality Disorder 2. Personality Disorder 2. Personality Disorder
## 3 2. Personality Disorder        3. Schizophrenia        3. Schizophrenia
## 4                5. Other                5. Other                5. Other
## 5 2. Personality Disorder 2. Personality Disorder 2. Personality Disorder
## 6           1. Depression           1. Depression        3. Schizophrenia
# Anxiety data
data("anxiety", package = "irr")
head(anxiety, 4)
##   rater1 rater2 rater3
## 1      3      3      2
## 2      3      6      1
## 3      3      4      4
## 4      4      6      4
Cohen’s Kappa: two raters
Cohen’s kappa corresponds to the unweighted kappa. It can be used for two nominal or two ordinal categorical variables.
kappa2(diagnoses[, c("rater1", "rater2")], weight = "unweighted")
##  Cohen's Kappa for 2 Raters (Weights: unweighted)
##
##  Subjects = 30
##    Raters = 2
##     Kappa = 0.651
##
##         z = 7
##   p-value = 2.63e-12
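To make the statistic less of a black box, here is a minimal base R sketch of the textbook definition, kappa = (po - pe) / (1 - pe), where po is the observed proportion of agreement and pe is the agreement expected by chance from the raters' marginal proportions. The helper name `cohen_kappa_manual()` and the toy ratings are hypothetical and only for illustration; they are not part of the irr package or its data sets.

```r
# Hypothetical helper: unweighted Cohen's kappa for two raters, base R only
cohen_kappa_manual <- function(r1, r2) {
  # Use a common set of categories so the table is square
  lv <- sort(unique(c(as.character(r1), as.character(r2))))
  tab <- table(factor(r1, levels = lv), factor(r2, levels = lv))
  p <- tab / sum(tab)                  # cell proportions
  po <- sum(diag(p))                   # observed agreement
  pe <- sum(rowSums(p) * colSums(p))   # chance agreement from the marginals
  (po - pe) / (1 - pe)
}

# Toy ratings from two imaginary raters (not from the irr data sets)
rater_a <- c("yes", "yes", "no", "no", "yes", "no", "yes", "no",  "yes", "no")
rater_b <- c("yes", "no",  "no", "no", "yes", "no", "yes", "yes", "yes", "no")

kappa_hat <- cohen_kappa_manual(rater_a, rater_b)
kappa_hat
```

In this toy example the raters agree on 8 of 10 subjects (po = 0.8) and each uses both categories half of the time (pe = 0.5), so kappa is 0.6.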
Weighted kappa: ordinal scales
Weighted kappa should be considered only when the ratings are made on an ordinal scale, as in the following example.
kappa2(anxiety[, c("rater1", "rater2")], weight = "equal")
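As a hedged illustration of how the weighting scheme affects the estimate, the sketch below runs `kappa2()` on the same two raters with both weighted options that irr accepts: `"equal"` (linear weights) and `"squared"` (quadratic weights). The object names are ours, and no particular values are claimed here; run the code to see the estimates.

```r
library(irr)
data("anxiety", package = "irr")

# Same pair of raters as in the example above
pair <- anxiety[, c("rater1", "rater2")]

# Linear weights: disagreements are penalized in proportion to their distance
k_equal <- kappa2(pair, weight = "equal")

# Quadratic weights: large disagreements are penalized more heavily
k_squared <- kappa2(pair, weight = "squared")

# Compare the two estimates side by side
c(equal = k_equal$value, squared = k_squared$value)
```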
Light’s kappa: multiple raters
Light’s kappa returns the average of the pairwise Cohen’s kappa values when you have more than two raters.
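The call that produces the output reported for this section is not included in the text. A call consistent with that output (30 subjects, 3 raters) is sketched here, under the assumption that the first three raters of the diagnoses data are used:

```r
library(irr)
data("diagnoses", package = "irr")

# Assumption: the first three raters, matching "Raters = 3" in the reported output
light_res <- kappam.light(diagnoses[, 1:3])
light_res
```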
##  Light's Kappa for m Raters
##
##  Subjects = 30
##    Raters = 3
##     Kappa = 0.555
##
##         z = NaN
##   p-value = NaN
Fleiss’ kappa: multiple raters
The raters are not assumed to be the same for all subjects.
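As with Light’s kappa, the generating call is not shown in the text. A call consistent with the reported output (30 subjects, 3 raters) would look like this, again assuming the first three raters of the diagnoses data:

```r
library(irr)
data("diagnoses", package = "irr")

# Assumption: the first three raters, matching "Raters = 3" in the reported output
fleiss_res <- kappam.fleiss(diagnoses[, 1:3])
fleiss_res
```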
##  Fleiss' Kappa for m Raters
##
##  Subjects = 30
##    Raters = 3
##     Kappa = 0.534
##
##         z = 9.89
##   p-value = 0
Intraclass correlation coefficients: continuous scales
The ICC is used when ratings are made on a continuous or ordinal scale. Read more in the chapter on the intraclass correlation coefficient:
icc(anxiety, model = "twoway", type = "agreement", unit = "single")
##  Single Score Intraclass Correlation
##
##    Model: twoway
##    Type : agreement
##
##    Subjects = 20
##      Raters = 3
##    ICC(A,1) = 0.198
##
##  F-Test, H0: r0 = 0 ; H1: r0 > 0
##  F(19,39.7) = 1.83 , p = 0.0543
##
##  95%-Confidence Interval for ICC Population Values:
##   -0.039 < ICC < 0.494
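If you need the numbers for a report rather than the printed summary, the object returned by `icc()` can be stored and its components read directly. This is a small sketch; the variable names `icc_res` and `icc_summary` are ours.

```r
library(irr)
data("anxiety", package = "irr")

# Store the result instead of only printing it
icc_res <- icc(anxiety, model = "twoway", type = "agreement", unit = "single")

# Point estimate, 95% confidence bounds and F-test p-value
icc_summary <- c(
  icc     = icc_res$value,
  lower   = icc_res$lbound,
  upper   = icc_res$ubound,
  p_value = icc_res$p.value
)
round(icc_summary, 3)
```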
This article describes how to compute the different inter-rater agreement measures using the irr R package.
Fleiss, J.L., and others. 1971. “Measuring Nominal Scale Agreement Among Many Raters.” Psychological Bulletin 76 (5): 378–82.