Cohen's kappa


Cohen's kappa coefficient (κ) is a statistic that measures inter-rater reliability for qualitative (categorical) items. It is generally considered more robust than a simple percent-agreement calculation, because κ takes into account the possibility of agreement occurring by chance. Cohen's kappa measures the agreement between two raters who each classify N items into C mutually exclusive categories. The formula for κ is:

\[ \kappa = \frac{p_o - p_e}{1 - p_e} \]

where \(p_o\) is the relative observed agreement among raters (the proportion of items on which both raters agree), and \(p_e\) is the hypothetical probability of chance agreement, calculated from the observed data as the probability of each rater randomly assigning each category. If the raters are in complete agreement, then κ = 1. If there is no agreement among the raters other than what would be expected by chance (as given by \(p_e\)), then κ = 0.
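
For example, suppose (with purely hypothetical counts) two raters each classify the same 100 items as "yes" or "no". They agree on 40 "yes" items and 30 "no" items; overall, rater A says "yes" 60 times and rater B says "yes" 50 times. Then

\[ p_o = \frac{40 + 30}{100} = 0.70, \qquad p_e = 0.60 \times 0.50 + 0.40 \times 0.50 = 0.50, \qquad \kappa = \frac{0.70 - 0.50}{1 - 0.50} = 0.40. \]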

Background

The kappa statistic was introduced by Jacob Cohen in 1960 as a measure of agreement for nominal scales. It is used in various fields such as healthcare, where it helps to assess the reliability of diagnostic tests, and in machine learning, where it is used to measure the performance of classification algorithms.

Calculation

To calculate Cohen's kappa, the number of categories into which assignments can be made must be fixed, and the assignments of each item into these categories by the two raters must be known. The formula involves the calculation of several probabilities:

  • \(p_o\), the observed agreement, is calculated by summing, over all categories, the proportion of items that both raters assigned to that same category.
  • \(p_e\), the expected agreement by chance, is calculated by considering the agreement that would occur if both raters assign categories randomly, based on the marginal totals of the categories.
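
As a concrete sketch of this calculation (not taken from any particular library; the function name and example data below are purely illustrative), Cohen's kappa can be computed in Python from two lists of labels:

from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Compute Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must label the same items")
    n = len(rater_a)

    # Observed agreement p_o: proportion of items on which the raters chose the same category.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement p_e: for each category, the product of the two raters'
    # marginal proportions, summed over all categories used by either rater.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((counts_a[c] / n) * (counts_b[c] / n) for c in set(counts_a) | set(counts_b))

    return (p_o - p_e) / (1 - p_e)

# The hypothetical 2x2 example from the introduction: kappa comes out to 0.40.
rater_a = ["yes"] * 60 + ["no"] * 40
rater_b = ["yes"] * 40 + ["no"] * 20 + ["yes"] * 10 + ["no"] * 30
print(round(cohens_kappa(rater_a, rater_b), 2))  # 0.4

This reproduces the worked example above; for production use, an established implementation such as scikit-learn's sklearn.metrics.cohen_kappa_score may be preferred.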

Interpretation

The value of κ can be interpreted as follows:

  • A κ of 1 indicates perfect agreement.
  • A κ greater than 0 but less than 1 indicates agreement better than chance but short of perfect.
  • A κ of 0 indicates no agreement better than chance.
  • A κ less than 0 indicates less agreement than expected by chance.

Landis and Koch (1977) provided a commonly used interpretation of the kappa statistic, suggesting that values ≤ 0 indicate no agreement, 0.01–0.20 slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, and 0.81–1.00 almost perfect agreement.
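
This scale can be expressed as a small helper, shown here as an illustrative Python sketch (the function name is hypothetical; the thresholds follow the Landis and Koch bands quoted above):

def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch (1977) verbal label."""
    if kappa <= 0:
        return "no agreement"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"

print(landis_koch_label(0.40))  # fair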

Limitations

While Cohen's kappa is a widely used and informative statistic for measuring inter-rater reliability, it has limitations. It assumes that the two raters have equal status and that the categories are mutually exclusive. Furthermore, kappa can be affected by several factors, including the number of categories, the distribution of observations across those categories, and the prevalence of the condition being rated.

Applications

Cohen's kappa is used in a variety of settings to assess the reliability of categorical assignments. In healthcare, it is used to evaluate the consistency of diagnostic tests between different raters. In psychology, it helps in assessing the reliability of categorical diagnoses. In content analysis and machine learning, kappa provides a measure of the agreement between human annotators or between an algorithm and a human annotator.
