Kolmogorov–Smirnov test

From WikiMD's Wellness Encyclopedia

The Kolmogorov–Smirnov test (K–S test) is a nonparametric test used in statistics to compare a sample with a reference probability distribution (one-sample K–S test) or to compare two samples with each other (two-sample K–S test). It is named after Andrey Kolmogorov and Nikolai Smirnov. The test statistic is the largest vertical distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution, or between the empirical distribution functions of the two samples. The null hypothesis is that the sample was drawn from the reference distribution (one-sample case) or that both samples were drawn from the same distribution (two-sample case).

Definition

The Kolmogorov–Smirnov statistic is the supremum, taken over all values of \(x\), of the absolute difference between two distribution functions. For a one-sample K–S test, the test statistic is:

\[D_n = \sup_x |F_n(x) - F(x)|\]

where \(F_n(x)\) is the empirical distribution function of the sample, that is, the fraction of the \(n\) observations that are less than or equal to \(x\), and \(F(x)\) is the cumulative distribution function of the reference distribution. For a two-sample K–S test, the test statistic is:

\[D_{n,m} = \sup_x |F_{n}(x) - G_{m}(x)|\]

where \(F_{n}(x)\) and \(G_{m}(x)\) are the empirical distribution functions of the first and second samples, and \(n\) and \(m\) are the respective sample sizes.
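Both statistics can be computed directly from sorted data, because an empirical distribution function is a step function and the supremum is therefore reached at an observed value. The following is a minimal sketch in plain Python, not a reference implementation; the function names `normal_cdf`, `ks_statistic_one_sample`, and `ks_statistic_two_sample` are hypothetical names chosen for this illustration.

```python
import math
from bisect import bisect_right


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, used here as an example reference F(x)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic_one_sample(sample, cdf):
    """Return D_n = sup_x |F_n(x) - F(x)| for a continuous reference CDF."""
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # At the i-th sorted point (0-based) the ECDF jumps from i/n to (i+1)/n,
        # so the gap to F must be checked just before and just after the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_statistic_two_sample(first, second):
    """Return D_{n,m} = sup_x |F_n(x) - G_m(x)| for two samples."""
    xs, ys = sorted(first), sorted(second)
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The difference of two ECDFs only changes at observed values,
    # so it is enough to evaluate it at every pooled observation.
    for v in xs + ys:
        f_n = bisect_right(xs, v) / n  # fraction of first sample <= v
        g_m = bisect_right(ys, v) / m  # fraction of second sample <= v
        d = max(d, abs(f_n - g_m))
    return d


if __name__ == "__main__":
    data = [-1.2, -0.3, 0.1, 0.8, 1.9]
    print("one-sample D:", ks_statistic_one_sample(data, normal_cdf))
    print("two-sample D:", ks_statistic_two_sample(data, [0.5, 1.1, 2.4, 3.0]))
```

The sketch returns only the statistic; turning it into a p-value requires the null distribution of \(D_n\) or \(D_{n,m}\), which is not covered here.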

Applications

The K–S test is used in two main ways: as a goodness-of-fit test, to check whether empirical data are consistent with a theoretical model, and as a two-sample test, to check whether two data sets share a common distribution whose form is not known. It is applied in statistics, economics, psychology, and environmental science, among other fields.

Advantages and Limitations

One of the main advantages of the K–S test is that it is nonparametric: it does not require the data to follow any particular family of distributions. However, the test has less power than some alternatives, such as the Anderson–Darling test, especially for small sample sizes or when the distributions differ mainly in the tails.
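One way to see why the statistic is less sensitive in the tails, offered as an illustrative sketch rather than a full derivation: under the null hypothesis, \(n F_n(x)\) follows a binomial distribution with parameters \(n\) and \(F(x)\), so

\[\operatorname{Var}[F_n(x)] = \frac{F(x)\,(1 - F(x))}{n}\]

This variance shrinks toward zero as \(F(x)\) approaches 0 or 1, so large values of \(|F_n(x) - F(x)|\) tend to occur near the median, and an unweighted supremum rarely picks up deviations in the tails. The Anderson–Darling statistic compensates by dividing the squared difference by the same factor:

\[A^2 = n \int_{-\infty}^{\infty} \frac{(F_n(x) - F(x))^2}{F(x)\,(1 - F(x))} \, dF(x)\]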

Implementation

The K–S test is implemented in many statistical software packages, including R (ks.test), Python's SciPy library (scipy.stats.kstest and scipy.stats.ks_2samp), and MATLAB (kstest and kstest2). These implementations typically provide both one-sample and two-sample tests, along with options to adjust for the effect of discrete data or ties.
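As an illustration of the SciPy interface, the sketch below runs a one-sample test against the standard normal distribution and a two-sample test on two simulated data sets. The data, the seed, and the variable names are assumptions made for this example, and the reference distribution is fixed in advance rather than fitted to the sample.

```python
import numpy as np
from scipy import stats

# Simulated data for illustration only (the seed and sizes are arbitrary).
rng = np.random.default_rng(seed=0)
normal_sample = rng.normal(loc=0.0, scale=1.0, size=200)
shifted_sample = rng.normal(loc=0.5, scale=1.0, size=200)

# One-sample test: compare the sample with the standard normal CDF.
one_sample = stats.kstest(normal_sample, "norm")

# Two-sample test: compare the two empirical distribution functions.
two_sample = stats.ks_2samp(normal_sample, shifted_sample)

print(f"one-sample: D = {one_sample.statistic:.4f}, p = {one_sample.pvalue:.4f}")
print(f"two-sample: D = {two_sample.statistic:.4f}, p = {two_sample.pvalue:.4f}")
```

A small p-value is evidence against the null hypothesis; the significance threshold is a choice left to the analyst.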



Contributors: Prab R. Tumpati, MD