Matthews correlation coefficient

From WikiMD's Wellness Encyclopedia

The Matthews correlation coefficient (MCC) is a measure used in machine learning and bioinformatics to assess the quality of binary (two-class) classifications. It takes all four outcomes into account (true positives, true negatives, false positives, and false negatives) and is generally regarded as a balanced measure that can be used even when the classes are of very different sizes. The MCC is in essence a correlation coefficient between the observed and predicted binary classifications, and it returns a value between -1 and +1. A coefficient of +1 represents a perfect prediction, 0 indicates a prediction no better than random, and -1 indicates total disagreement between prediction and observation.

Definition

The Matthews correlation coefficient is calculated from the four counts of the confusion matrix using the formula:

\[ MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP+FP) \times (TP+FN) \times (TN+FP) \times (TN+FN)}} \]

where:

  • TP = True Positives
  • TN = True Negatives
  • FP = False Positives
  • FN = False Negatives
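The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `confusion_counts`, `mcc`, and `pearson` are chosen here for the example, labels are assumed to be encoded as 0 (negative) and 1 (positive), and returning 0.0 when the denominator is zero is one common convention rather than part of the definition. The sketch also checks the statement that the MCC is a correlation coefficient by comparing it with the Pearson correlation of the two label vectors.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for labels encoded as 0 (negative) and 1 (positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from the four confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumption: the formula is undefined here; 0.0 is a common convention.
        return 0.0
    return numerator / denominator


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Illustrative data (made up for this example): 4 positives, 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("TP, TN, FP, FN:", tp, tn, fp, fn)
print("MCC from counts:", mcc(tp, tn, fp, fn))
print("Pearson on labels:", pearson(y_true, y_pred))
```

For this data the counts are TP = 3, TN = 5, FP = 1, FN = 1, so the formula gives (3 × 5 - 1 × 1) / √(4 × 4 × 6 × 6) = 14/24, and the Pearson correlation of the label vectors agrees up to floating-point rounding.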

Application

The MCC is used in several fields. In bioinformatics it serves to evaluate the performance of sequence search methods, and in machine learning and data mining it serves to assess the performance of classification models. It is particularly useful when the classes are imbalanced, that is, when the instances of one class significantly outnumber the instances of the other.

Advantages

  • Balanced: The MCC uses all four confusion-matrix counts, so it reflects performance on both the positive and the negative class rather than on one of them alone.
  • Interpretable: Because the MCC is a correlation coefficient, its value reads on a fixed scale: +1 is perfect agreement, 0 is chance-level prediction, and -1 is total disagreement.
  • Applicable to imbalanced datasets: Unlike accuracy, the MCC does not reward a classifier for simply predicting the majority class in an imbalanced dataset.
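The point about imbalanced datasets can be illustrated with a small sketch. The counts below are invented for the example (5 positives and 95 negatives, with a classifier that labels almost everything as negative), and the helper names `accuracy` and `mcc` are chosen here, not taken from any library.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from the four confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumption: undefined case mapped to 0.0 by convention.
        return 0.0
    return numerator / denominator


# Hypothetical imbalanced test set: 5 positives, 95 negatives.
# The classifier finds only 1 of the 5 positives and raises 1 false alarm.
tp, fn = 1, 4
fp, tn = 1, 94

print("accuracy:", accuracy(tp, tn, fp, fn))
print("MCC:     ", mcc(tp, tn, fp, fn))
```

Here the accuracy is 0.95 because the majority class dominates the count of correct predictions, while the MCC is roughly 0.29, which reflects that four of the five positives were missed.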

Limitations

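  • Sensitivity to small sample sizes: With very few samples, a single changed prediction can shift the MCC substantially, so the value can be overly optimistic or pessimistic. The MCC is also undefined when any of the four sums in the denominator is zero, for example when the classifier predicts only one class.
  • Not directly applicable to multi-class problems: The formula above is defined only for binary classification tasks. Multi-class problems require a multi-class generalization of the MCC or other measures computed from the multi-class confusion matrix.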
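As an illustration of the multi-class case, the sketch below implements one commonly used generalization of the MCC, often attributed to Gorodkin and referred to as the R_K statistic. It is offered as an assumption about what "multi-class versions of the MCC" may mean in practice, not as the only possible definition. The function name `multiclass_mcc` and the convention that `confusion[i][j]` counts samples of true class i predicted as class j are choices made for this example.

```python
import math


def multiclass_mcc(confusion):
    """Multi-class generalization of the MCC from a K x K confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(k))
    # How often each class truly occurs (row sums) and is predicted (column sums).
    true_counts = [sum(confusion[i]) for i in range(k)]
    pred_counts = [sum(confusion[i][j] for i in range(k)) for j in range(k)]

    numerator = correct * total - sum(
        t * p for t, p in zip(true_counts, pred_counts)
    )
    denominator = math.sqrt(
        (total ** 2 - sum(p * p for p in pred_counts))
        * (total ** 2 - sum(t * t for t in true_counts))
    )
    if denominator == 0:
        # Assumption: undefined case mapped to 0.0 by convention.
        return 0.0
    return numerator / denominator


# Binary case written as [[TN, FP], [FN, TP]]; should match the binary formula.
binary = [[5, 1], [1, 3]]
print("binary:", multiclass_mcc(binary))

# Hypothetical three-class matrix with every sample classified correctly.
perfect = [[5, 0, 0], [0, 5, 0], [0, 0, 5]]
print("perfect three-class:", multiclass_mcc(perfect))
```

For a 2 × 2 matrix this expression reduces to the binary formula: with TP = 3, TN = 5, FP = 1, FN = 1 both give 14/24.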

Comparison with Other Metrics

The MCC is often compared with other classification metrics such as precision and recall, the F1 score, and accuracy. Accuracy is the most intuitive performance measure, but it can be misleading when the classes are imbalanced. The F1 score is the harmonic mean of precision and recall and balances the two, but it does not take true negatives into account. The MCC considers all four cells of the confusion matrix and therefore provides a more comprehensive measure of classification performance.
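The remark that the F1 score ignores true negatives can be made concrete with a short sketch. The two scenarios below are invented for illustration: they share the same TP, FP, and FN and differ only in the number of true negatives. The helper names `f1_score` and `mcc` are defined here for the example.

```python
import math


def f1_score(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall (no TN term)."""
    return 2 * tp / (2 * tp + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from the four confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumption: undefined case mapped to 0.0 by convention.
        return 0.0
    return numerator / denominator


tp, fp, fn = 45, 5, 5

# Same positive-class behaviour, very different negative-class behaviour.
for tn in (5, 945):
    print(
        f"TN={tn:4d}  F1={f1_score(tp, fp, fn):.3f}  MCC={mcc(tp, tn, fp, fn):.3f}"
    )
```

The F1 score is 0.9 in both scenarios, whereas the MCC is 0.4 when only 5 of the 10 negatives are classified correctly and about 0.895 when 945 of 950 are.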


Contributors: Prab R. Tumpati, MD