{"id":10304,"date":"2019-11-07T02:56:37","date_gmt":"2019-11-07T00:56:37","guid":{"rendered":"https:\/\/www.datanovia.com\/en\/?post_type=dt_lessons&#038;p=10304"},"modified":"2019-11-07T08:09:58","modified_gmt":"2019-11-07T06:09:58","slug":"cohens-kappa-in-r-for-two-categorical-variables","status":"publish","type":"dt_lessons","link":"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/","title":{"rendered":"Cohen&#8217;s Kappa in R: For Two Categorical Variables"},"content":{"rendered":"<div id=\"rdoc\">\n<p><strong>Cohen\u2019s kappa<\/strong> <span class=\"citation\">(Jacob Cohen 1960, <span class=\"citation\">J Cohen (1968)<\/span>)<\/span> is used to measure the agreement of two raters (i.e., \u201cjudges\u201d, \u201cobservers\u201d) or methods rating on categorical scales. This process of measuring the extent to which two raters assign the same categories or score to the same subject is called <em>inter-rater reliability<\/em>.<\/p>\n<p>Traditionally, the inter-rater reliability was measured as simple overall percent agreement, calculated as the number of cases where both raters agree divided by the total number of cases considered.<\/p>\n<p>This percent agreement is criticized due to its inability to take into account random or expected agreement by chance, which is the proportion of agreement that you would expect two raters to have based simply on chance.<\/p>\n<p>The Cohen\u2019s kappa is a commonly used measure of agreement that removes this chance agreement. In other words, it accounts for the possibility that raters actually guess on at least some variables due to uncertainty.<\/p>\n<p>There are many situation where you can calculate the Cohen\u2019s Kappa. For example, you might use the Cohen\u2019s kappa to determine the agreement between two doctors in diagnosing patients into \u201cgood\u201d, \u201cintermediate\u201d and \u201cbad\u201d prognostic cases.<\/p>\n<div class=\"block\">\n<p>The <strong>Cohen\u2019s kappa<\/strong> can be used for two categorical variables, which can be either two nominal or two ordinal variables. 
This percent agreement is criticized because it cannot take into account random or expected agreement by chance, that is, the proportion of agreement that you would expect two raters to reach based simply on chance.

Cohen's kappa is a commonly used measure of agreement that removes this chance agreement. In other words, it accounts for the possibility that raters actually guess on at least some ratings due to uncertainty.

There are many situations in which you can calculate Cohen's kappa. For example, you might use it to determine the agreement between two doctors in classifying patients into "good", "intermediate" and "bad" prognostic cases.

> Cohen's kappa can be used for two categorical variables, which can be either two nominal or two ordinal variables. Other variants exist, including:
>
> - **Weighted kappa**, to be used only for ordinal variables.
> - **Light's kappa**, which is the average of all possible two-rater Cohen's kappas when there are more than two raters (Conger 1980).
> - **Fleiss kappa**, which is an adaptation of Cohen's kappa for n raters, where n can be 2 or more.

This chapter describes how to measure inter-rater agreement using Cohen's kappa and Light's kappa.

You will learn:

- The **basics**, the **formula** and a step-by-step explanation for **manual calculation**
- Examples of R code to **compute Cohen's kappa** for two raters
- How to **calculate Light's kappa** for more than two raters
- **Interpretation of the kappa coefficient**

Contents:

- Basics and manual calculations
  - Formula
  - Kappa for 2x2 tables
  - Kappa for two categorical variables with multiple levels
- Interpretation: Magnitude of the agreement
- Assumptions and requirements
- Statistical hypotheses
- Example of data
- Computing Kappa
  - Kappa for two raters
  - Kappa for more than two raters
- Report
- Summary
- References

Related Book: [Inter-Rater Reliability Essentials: Practical Guide in R](https://www.datanovia.com/en/product/inter-rater-reliability-essentials-practical-guide-in-r/)

## Basics and manual calculations

### Formula

The **formula of Cohen's kappa** is defined as follows:

![Cohen's kappa formula](https://www.datanovia.com/en/wp-content/uploads/dn-tutorials/inter-rater-reliability/images/cohen-s-kappa-formula.png)

- Po: proportion of observed agreement
- Pe: proportion of chance agreement

> Kappa can range from -1 (no agreement) to +1 (perfect agreement):
>
> - when k = 0, the agreement is no better than what would be obtained by chance;
> - when k is negative, the agreement is less than the agreement expected by chance;
> - when k is positive, the rater agreement exceeds chance agreement.
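In R notation, the formula is simply `(Po - Pe)/(1 - Pe)`; wrapped as a small helper of our own (Po and Pe are computed in the worked examples below):

```r
# Cohen's kappa from the proportions of observed (Po) and chance (Pe) agreement
cohen_kappa <- function(Po, Pe) {
  (Po - Pe) / (1 - Pe)
}
```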
### Kappa for 2x2 tables

To explain how to calculate the observed and expected agreement, let's consider the following contingency table. Two clinical psychologists were asked to diagnose whether each of 70 individuals is in depression or not.

**Data structure**:

```
##        Doctor2
## Doctor1 Yes No Total
##   Yes   a   b  R1   
##   No    c   d  R2   
##   Total C1  C2 N
```

Where:

- a, b, c and d are the **observed** (O) counts of individuals;
- N = a + b + c + d is the total table count;
- R1 and R2 are the totals of rows 1 and 2, respectively. These are the **row margins** in statistics jargon;
- C1 and C2 are the totals of columns 1 and 2, respectively. These are the **column margins**.

**Example of data**:

```
##        doctor2
## doctor1 yes no Sum
##     yes  25 10  35
##     no   15 20  35
##     Sum  40 30  70
```

**Proportion of observed agreement**. The total observed agreement count is the sum of the diagonal entries. The proportion of observed agreement is `sum(diagonal.values)/N`, where N is the total table count:

- 25 participants were diagnosed yes by the two doctors;
- 20 participants were diagnosed no by both.

So, `Po = (a + d)/N = (25 + 20)/70 = 0.643`.

**Proportion of chance agreement**. The expected proportion of agreement is calculated as follows.

Step 1. Determine the probability that both doctors would randomly say yes:

- (a) Doctor 1 says yes to 35/70 (0.5) participants. This is the row 1 marginal proportion, `row1.sum/N`.
- (b) Doctor 2 says yes to 40/70 (0.57) participants. This is the column 1 marginal proportion, `column1.sum/N`.
- (c) The probability of both doctors saying yes randomly is `0.5 * 0.57 = 0.285`. This is the product of the row 1 and column 1 marginal proportions.

Step 2. Determine the probability that both doctors would randomly say no:

- (a) Doctor 1 says no to 35/70 (0.5) participants. This is the row 2 marginal proportion, `row2.sum/N`.
- (b) Doctor 2 says no to 30/70 (0.428) participants. This is the column 2 marginal proportion, `column2.sum/N`.
- (c) The probability of both doctors saying no randomly is `0.5 * 0.428 = 0.214`. This is the product of the row 2 and column 2 marginal proportions.

So, the total expected probability of agreement by chance is `Pe = 0.285 + 0.214 = 0.499`. Technically, this is the sum of the products of the row and column marginal proportions: `Pe = sum(rows.marginal.proportions x columns.marginal.proportions)`.

**Cohen's kappa**. Finally, Cohen's kappa is `(0.643 - 0.499)/(1 - 0.499) ≈ 0.29` (0.286 with unrounded intermediate values).
### Kappa for two categorical variables with multiple levels

In the previous section, we demonstrated how to manually compute the kappa value for a 2x2 table (binary ratings: yes vs no). This generalizes to categorical variables with multiple levels as follows.

The rating scores from the two raters can be summarized in a k x k contingency table, where k is the number of categories.

**Example of a k x k contingency table** to assess agreement about k categories by two different raters:

```
##           rater2
## rater1     Level.1 Level.2 Level... Level.k Total
##   Level.1  n11     n12     ...      n1k     n1+  
##   Level.2  n21     n22     ...      n2k     n2+  
##   Level... ...     ...     ...      ...     ...  
##   Level.k  nk1     nk2     ...      nkk     nk+  
##   Total    n+1     n+2     ...      n+k     N
```

**Terminology**:

- The column "Total" (`n1+, n2+, ..., nk+`) contains the sums of the rows, known as **row margins** or marginal counts. The total of a given row `i` is named `ni+`.
- The row "Total" (`n+1, n+2, ..., n+k`) contains the sums of the columns, known as **column margins**. The total of a given column `i` is named `n+i`.
- N is the total sum over all table cells.
- For a given row/column, the **marginal proportion** is the row/column margin divided by N. This is also known as the marginal frequency or probability. For a row `i`, the marginal proportion is `Pi+ = ni+/N`; similarly, for a given column `i`, it is `P+i = n+i/N`.
- For each table cell, the proportion is the cell count divided by N.

The **proportion of observed agreement** (Po) is the sum of the diagonal proportions, which corresponds to the proportion of cases in each category for which the two raters agreed on the assignment.

![Proportion of observed agreement formula](https://www.datanovia.com/en/wp-content/uploads/dn-tutorials/inter-rater-reliability/images/proportion-of-observed-agreement-formula-2.png)

The **proportion of chance agreement** (Pe) is the sum of the products of the row and column marginal proportions: `Pe = sum(row.marginal.proportions x column.marginal.proportions)`.

![Proportion of expected (chance) agreement formula](https://www.datanovia.com/en/wp-content/uploads/dn-tutorials/inter-rater-reliability/images/proportion-of-expected-agreement-formula-2.png)

So, Cohen's kappa can be calculated by plugging Po and Pe into the formula `k = (Po - Pe)/(1 - Pe)`.

**Kappa confidence intervals**. For large sample sizes, the standard error (SE) of kappa can be computed as follows (Fleiss and Cohen 1973; Fleiss, Cohen, and Everitt 1969; Friendly, Meyer, and Zeileis 2015):

![Standard error of kappa](https://www.datanovia.com/en/wp-content/uploads/dn-tutorials/inter-rater-reliability/images/kappa-standard-error-formula.png)

Once SE(k) is calculated, a `100(1 - alpha)%` confidence interval for kappa may be computed using the standard normal distribution:

![Confidence interval of kappa](https://www.datanovia.com/en/wp-content/uploads/dn-tutorials/inter-rater-reliability/images/kappa-confidence-interval-formula.png)

For example, the formula of the 95% confidence interval is `k +/- 1.96 x SE`.
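As a rough numeric check on the 2x2 example above, here is a minimal sketch. It uses the simple large-sample approximation `SE = sqrt(Po*(1 - Po)/(N*(1 - Pe)^2))` (an assumption on our part; `vcd::Kappa()`, used later in this chapter, computes a fuller asymptotic variance, so its intervals can differ slightly):

```r
# Approximate 95% CI for kappa on the 2x2 example (large-sample sketch)
N  <- 70
Po <- 45/70   # observed agreement
Pe <- 0.5     # chance agreement (exact value; 0.285 + 0.214 above was rounded)
k  <- (Po - Pe)/(1 - Pe)
se <- sqrt(Po*(1 - Po)/(N*(1 - Pe)^2))
c(kappa = k, lower = k - 1.96*se, upper = k + 1.96*se)
```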
**R code to compute Cohen's kappa step by step**:

```r
# Contingency table
xtab <- as.table(rbind(c(25, 10), c(15, 20)))
# Descriptive statistics
diagonal.counts <- diag(xtab)
N <- sum(xtab)
row.marginal.props <- rowSums(xtab)/N
col.marginal.props <- colSums(xtab)/N
# Compute kappa (k)
Po <- sum(diagonal.counts)/N
Pe <- sum(row.marginal.props * col.marginal.props)
k <- (Po - Pe)/(1 - Pe)
k
```

> In the following sections, you will learn single-line R functions to compute kappa.

## Interpretation: Magnitude of the agreement

In most applications, there is usually more interest in the magnitude of kappa than in its statistical significance. The following classification has been suggested for interpreting the strength of agreement based on the Cohen's kappa value (Altman 1999; Landis and Koch 1977).

| Value of k  | Strength of agreement |
|-------------|-----------------------|
| < 0         | Poor                  |
| 0.01 - 0.20 | Slight                |
| 0.21 - 0.40 | Fair                  |
| 0.41 - 0.60 | Moderate              |
| 0.61 - 0.80 | Substantial           |
| 0.81 - 1.00 | Almost perfect        |

However, this interpretation allows very little agreement among raters to be described as "substantial". According to the table, a kappa of 0.61 is already considered substantial, yet this can immediately be seen as problematic depending on the field: it implies that almost 40% of the data in the dataset are faulty. In healthcare research, this could lead to recommendations for changing practice based on faulty evidence. For a clinical laboratory, having 40% of the sample evaluations be wrong would be an extremely serious quality problem (McHugh 2012).

This is the reason that many texts recommend 80% agreement as the minimum acceptable inter-rater agreement.
Any kappa below 0.60 indicates inadequate agreement among the raters, and little confidence should be placed in the study results.

> Fleiss et al. (2003) stated that, for most purposes,
>
> - values greater than 0.75 or so may be taken to represent excellent agreement beyond chance,
> - values below 0.40 or so may be taken to represent poor agreement beyond chance, and
> - values between 0.40 and 0.75 may be taken to represent fair to good agreement beyond chance.

Another logical interpretation of kappa, from McHugh (2012), is suggested in the table below:

| Value of k  | Level of agreement | % of data that are reliable |
|-------------|--------------------|-----------------------------|
| 0 - 0.20    | None               | 0 - 4%                      |
| 0.21 - 0.39 | Minimal            | 4 - 15%                     |
| 0.40 - 0.59 | Weak               | 15 - 35%                    |
| 0.60 - 0.79 | Moderate           | 35 - 63%                    |
| 0.80 - 0.90 | Strong             | 64 - 81%                    |
| Above 0.90  | Almost perfect     | 82 - 100%                   |

In the table above, the column "% of data that are reliable" corresponds to the squared kappa, an analogue of the squared correlation coefficient, which is directly interpretable.
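For instance, the kappa of 0.65 obtained later in this chapter corresponds to roughly 42% reliable data:

```r
# Squared kappa as the "% of data that are reliable" (McHugh 2012)
k <- 0.65
k^2  # ~0.42, i.e. about 42% of the data can be considered reliable
```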
## Assumptions and requirements

Your data should meet the following assumptions for computing Cohen's kappa:

1. You have **two outcome categorical variables**, which can be **ordinal** or **nominal**.
2. The two outcome variables have exactly the **same categories**.
3. You have **paired observations**: each subject is categorized twice, by **two independent raters or methods**.
4. The **same two raters** are used for all participants.

## Statistical hypotheses

- **Null hypothesis** (H0): `kappa = 0`. The agreement is the same as chance agreement.
- **Alternative hypothesis** (Ha): `kappa ≠ 0`. The agreement is different from chance agreement.

## Example of data

We'll use the psychiatric diagnoses data provided by two clinical doctors. 30 patients were enrolled and classified by each of the two doctors into 5 categories (Fleiss et al. 1971): 1. Depression, 2. Personality Disorder, 3. Schizophrenia, 4. Neurosis, 5. Other.

The data is organized into the following 5x5 contingency table:

```r
# Demo data
diagnoses <- as.table(rbind(
  c(7, 1, 2, 3, 0), c(0, 8, 1, 1, 0),
  c(0, 0, 2, 0, 0), c(0, 0, 0, 1, 0),
  c(0, 0, 0, 0, 4)
  ))
categories <- c("Depression", "Personality Disorder",
                "Schizophrenia", "Neurosis", "Other")
dimnames(diagnoses) <- list(Doctor1 = categories, Doctor2 = categories)
diagnoses
```

```
##                       Doctor2
## Doctor1                Depression Personality Disorder Schizophrenia Neurosis Other
##   Depression                    7                    1             2        3     0
##   Personality Disorder          0                    8             1        1     0
##   Schizophrenia                 0                    0             2        0     0
##   Neurosis                      0                    0             0        1     0
##   Other                         0                    0             0        0     4
```

## Computing Kappa

### Kappa for two raters

The R function `Kappa()` [vcd package] can be used to compute both unweighted and weighted kappa. The unweighted version corresponds to Cohen's kappa, which is our concern in this chapter. The weighted kappa should be considered only for ordinal variables; it is described in the chapter on weighted kappa.

```r
# install.packages("vcd")
library("vcd")
# Compute kappa
res.k <- Kappa(diagnoses)
res.k
```

```
##            value    ASE    z Pr(>|z|)
## Unweighted 0.651 0.0997 6.53 6.47e-11
## Weighted   0.633 0.1194 5.30 1.14e-07
```

```r
# Confidence intervals
confint(res.k)
```

```
##             
## Kappa          lwr   upr
##   Unweighted 0.456 0.847
##   Weighted   0.399 0.867
```

> Note that, in the above results, `ASE` is the asymptotic standard error of the kappa value.

> In our example, Cohen's kappa (k) = 0.65, which represents fair to good agreement according to the Fleiss et al. (2003) classification. This is confirmed by the obtained p-value (p < 0.05), indicating that our calculated kappa is significantly different from zero.
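The z statistic and p-value reported by `Kappa()` follow directly from the estimate and its ASE; the snippet below recomputes them from the output shown above:

```r
# Recompute the z statistic and two-sided p-value for the unweighted kappa
k   <- 0.651               # kappa estimate from the output above
ase <- 0.0997              # its asymptotic standard error (ASE)
z   <- k / ase             # ~6.53
p   <- 2 * pnorm(-abs(z))  # ~6.5e-11, matching Pr(>|z|) above
c(z = z, p = p)
```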
### Kappa for more than two raters

If there are more than 2 raters, the average of all possible two-rater kappas is known as **Light's kappa** (Conger 1980). You can compute it using the function `kappam.light()` [irr package], which takes as input a matrix or data frame whose columns are raters and whose rows are individuals.

```r
# install.packages("irr")
library(irr)
# Load a demo data set: diagnoses of 30 patients by 6 raters
# (one row per patient, one column per rater)
data("diagnoses", package = "irr")
head(diagnoses[, 1:3], 4)
```

Note that loading this data set replaces the 5x5 contingency table named `diagnoses` created above: here `diagnoses` is a data frame in which each cell holds the diagnosis assigned to one patient by one rater.

```r
# Compute Light's kappa between the first 3 raters
kappam.light(diagnoses[, 1:3])
```

The printed result reports the number of subjects and raters, the Light's kappa estimate, and a z statistic with its p-value for the test of `kappa = 0`.
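Since Light's kappa is just the mean of all pairwise Cohen's kappas, it can also be computed "by hand". Below is a minimal sketch (the helper name `light_kappa` is ours; `ratings` is any matrix or data frame with one column per rater):

```r
library(vcd)

# Light's kappa as the mean of all pairwise unweighted Cohen's kappas
light_kappa <- function(ratings) {
  rater.pairs <- combn(ncol(ratings), 2)  # all pairs of rater columns
  ks <- apply(rater.pairs, 2, function(p) {
    # Align factor levels so the pairwise contingency table is square
    lev <- sort(unique(c(as.character(ratings[, p[1]]),
                         as.character(ratings[, p[2]]))))
    tab <- table(factor(ratings[, p[1]], levels = lev),
                 factor(ratings[, p[2]], levels = lev))
    Kappa(tab)$Unweighted["value"]  # unweighted = Cohen's kappa
  })
  mean(ks)
}

# light_kappa(diagnoses[, 1:3])  # should agree with kappam.light()
```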
## Report

Cohen's kappa was computed to assess the agreement between two doctors in diagnosing psychiatric disorders in 30 patients. There was good agreement between the two doctors, kappa = 0.65 (95% CI, 0.46 to 0.84), p < 0.0001.

## Summary

This chapter describes the basics and the formula of Cohen's kappa. Additionally, we show how to compute and interpret the kappa coefficient in R. We also provide examples of R code for computing Light's kappa, the average of all possible two-rater kappas, for use when you have more than two raters.

Other variants of Cohen's kappa include the weighted kappa (for two ordinal variables) and the Fleiss kappa (for two or more raters), each described in its own chapter.

## References

Altman, Douglas G. 1999. *Practical Statistics for Medical Research*. Chapman & Hall/CRC Press.

Cohen, J. 1968. "Weighted Kappa: Nominal Scale Agreement with Provision for Scaled Disagreement or Partial Credit." *Psychological Bulletin* 70 (4): 213–20. doi:10.1037/h0026256.

Cohen, Jacob. 1960. "A Coefficient of Agreement for Nominal Scales." *Educational and Psychological Measurement* 20 (1): 37–46. doi:10.1177/001316446002000104.

Conger, A. J. 1980. "Integration and Generalization of Kappas for Multiple Raters." *Psychological Bulletin* 88 (2): 322–28.

Fleiss, J. L., and others. 1971. "Measuring Nominal Scale Agreement Among Many Raters." *Psychological Bulletin* 76 (5): 378–82.

Fleiss, Joseph L., and Jacob Cohen. 1973. "The Equivalence of Weighted Kappa and the Intraclass Correlation Coefficient as Measures of Reliability." *Educational and Psychological Measurement* 33 (3): 613–19. doi:10.1177/001316447303300309.

Fleiss, Joseph L., Jacob Cohen, and B. S. Everitt. 1969. "Large Sample Standard Errors of Kappa and Weighted Kappa." *Psychological Bulletin* 72 (5): 323–27.

Fleiss, Joseph L., Bruce Levin, and Myunghee Cho Paik. 2003. *Statistical Methods for Rates and Proportions*. 3rd ed. John Wiley & Sons.

Friendly, Michael, D. Meyer, and A. Zeileis. 2015. *Discrete Data Analysis with R: Visualization and Modeling Techniques for Categorical and Count Data*. 1st ed. Chapman & Hall/CRC.

Landis, J. R., and G. G. Koch. 1977. "The Measurement of Observer Agreement for Categorical Data." *Biometrics* 33 (1): 159–74.

McHugh, Mary. 2012. "Interrater Reliability: The Kappa Statistic." *Biochemia Medica* 22 (3): 276–82. doi:10.11613/BM.2012.031.
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"12 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/\",\"url\":\"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/\",\"name\":\"Cohen's Kappa in R: Best Reference - Datanovia\",\"isPartOf\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/P1040186.JPG.jpg\",\"datePublished\":\"2019-11-07T00:56:37+00:00\",\"dateModified\":\"2019-11-07T06:09:58+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/#primaryimage\",\"url\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/P1040186.JPG.jpg\",\"contentUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/P1040186.JPG.jpg\",\"width\":1024,\"height\":512},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.datanovia.com\/en\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Lessons\",\"item\":\"https:\/\/www.datanovia.com\/en\/lessons\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Cohen&#8217;s Kappa in R: For Two Categorical Variables\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#website\",\"url\":\"https:\/\/www.datanovia.com\/en\/\",\"name\":\"Datanovia\",\"description\":\"Data Mining and Statistics for Decision Support\",\"publisher\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.datanovia.com\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#organization\",\"name\":\"Datanovia\",\"url\":\"https:\/\/www.datanovia.com\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png\",\"contentUrl\":\"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png\",\"width\":98,\"height\":99,\"caption\":\"Datanovia\"},\"image\":{\"@id\":\"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Cohen's Kappa in R: Best Reference - Datanovia","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/","og_locale":"en_US","og_type":"article","og_title":"Cohen's Kappa in R: Best Reference - Datanovia","og_description":"This chapter describes the basics and the formula of the Cohen\u2019s kappa for two and more variables. Additionally, we show how to compute and interpret the kappa coefficient in R.","og_url":"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/","og_site_name":"Datanovia","article_modified_time":"2019-11-07T06:09:58+00:00","og_image":[{"width":1024,"height":512,"url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/P1040186.JPG.jpg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"12 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/","url":"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/","name":"Cohen's Kappa in R: Best Reference - Datanovia","isPartOf":{"@id":"https:\/\/www.datanovia.com\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/#primaryimage"},"image":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/#primaryimage"},"thumbnailUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/P1040186.JPG.jpg","datePublished":"2019-11-07T00:56:37+00:00","dateModified":"2019-11-07T06:09:58+00:00","breadcrumb":{"@id":"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/#primaryimage","url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/P1040186.JPG.jpg","contentUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2019\/05\/P1040186.JPG.jpg","width":1024,"height":512},{"@type":"BreadcrumbList","@id":"https:\/\/www.datanovia.com\/en\/lessons\/cohens-kappa-in-r-for-two-categorical-variables\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.datanovia.com\/en\/"},{"@type":"ListItem","position":2,"name":"Lessons","item":"https:\/\/www.datanovia.com\/en\/lessons\/"},{"@type":"ListItem","position":3,"name":"Cohen&#8217;s Kappa in R: For Two Categorical Variables"}]},{"@type":"WebSite","@id":"https:\/\/www.datanovia.com\/en\/#website","url":"https:\/\/www.datanovia.com\/en\/","name":"Datanovia","description":"Data Mining and Statistics for Decision 
Support","publisher":{"@id":"https:\/\/www.datanovia.com\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.datanovia.com\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.datanovia.com\/en\/#organization","name":"Datanovia","url":"https:\/\/www.datanovia.com\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/","url":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png","contentUrl":"https:\/\/www.datanovia.com\/en\/wp-content\/uploads\/2018\/09\/datanovia-logo.png","width":98,"height":99,"caption":"Datanovia"},"image":{"@id":"https:\/\/www.datanovia.com\/en\/#\/schema\/logo\/image\/"}}]}},"multi-rating":{"mr_rating_results":[]},"_links":{"self":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/10304","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons"}],"about":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/types\/dt_lessons"}],"author":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/comments?post=10304"}],"version-history":[{"count":1,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/10304\/revisions"}],"predecessor-version":[{"id":10312,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/dt_lessons\/10304\/revisions\/10312"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/media\/9173"}],"wp:attachment":[{"href":"https:\/\/www.datanovia.com\/en\/wp-json\/wp\/v2\/media?parent=10304"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}