Kolmogorov–Smirnov test
The Kolmogorov–Smirnov test (K–S test) is a nonparametric test used in statistics to compare a sample with a reference probability distribution (one-sample K–S test), or to compare two samples (two-sample K–S test). It applies to continuous, one-dimensional distributions and is named after Andrey Kolmogorov and Nikolai Smirnov. The K–S test quantifies a distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution, or between the empirical distribution functions of two samples. The null hypothesis of the test is that the sample is drawn from the reference distribution (in the one-sample case), or that the two samples are drawn from the same distribution (in the two-sample case).
Definition
The Kolmogorov–Smirnov statistic is the largest absolute difference between the two distribution functions being compared. For a one-sample K–S test, the test statistic is:
\[D_n = \sup_x |F_n(x) - F(x)|\]
where \(F_n(x)\) is the empirical distribution function of the sample, that is, the fraction of the \(n\) observations that are less than or equal to \(x\), and \(F(x)\) is the cumulative distribution function of the reference distribution. For a two-sample K–S test, the test statistic is:
\[D_{n,m} = \sup_x |F_{n}(x) - G_{m}(x)|\]
where \(F_{n}(x)\) and \(G_{m}(x)\) are the empirical distribution functions of the two samples, with \(n\) and \(m\) being the sizes of the samples, respectively.
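Both statistics can be computed directly from these definitions. The following is a minimal sketch using only the Python standard library. The function names `ks_one_sample` and `ks_two_sample`, the helper `uniform_cdf`, and the sample values are illustrative assumptions, not part of any particular package. It relies on the fact that an empirical distribution function is a step function, so the supremum only needs to be checked at the observed data points.

```python
from bisect import bisect_right


def ks_one_sample(sample, cdf):
    """Return D_n = sup_x |F_n(x) - F(x)| for a continuous reference CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # F_n jumps from i/n to (i + 1)/n at x, so check both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_two_sample(a, b):
    """Return D_{n,m} = sup_x |F_n(x) - G_m(x)| for two samples."""
    xs, ys = sorted(a), sorted(b)
    n, m = len(xs), len(ys)
    d = 0.0
    # The difference of two step functions only changes at a data point.
    for v in xs + ys:
        f_n = bisect_right(xs, v) / n  # fraction of a that is <= v
        g_m = bisect_right(ys, v) / m  # fraction of b that is <= v
        d = max(d, abs(f_n - g_m))
    return d


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1] (illustrative reference)."""
    return min(max(x, 0.0), 1.0)


# Made-up data, for illustration only.
print(ks_one_sample([0.1, 0.4, 0.7], uniform_cdf))
print(ks_two_sample([1.0, 2.0, 3.0], [2.5, 3.5, 4.5]))
```

For the made-up data above, the one-sample statistic is 0.3 up to floating-point rounding (attained just at the largest observation, where \(F_n\) reaches 1 while \(F\) is 0.7), and the two-sample statistic is 2/3.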
Applications
The one-sample K–S test is widely used as a goodness-of-fit test, to check whether empirical data are consistent with a fully specified theoretical model. The two-sample test is used to compare two data sets when the form of their underlying distribution is not known. Both are applied in fields such as economics, psychology, and environmental science, among others.
Advantages and Limitations
One of the main advantages of the K–S test is that it is nonparametric: for a continuous reference distribution, the distribution of the test statistic under the null hypothesis does not depend on which distribution is being tested. However, the test has less power than some alternatives, such as the Anderson-Darling test, especially for small sample sizes or when the distributions differ mainly in the tails.
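One way to see the nonparametric property numerically is that a strictly increasing transformation, applied to both the data and the reference distribution, leaves the statistic unchanged. The sketch below assumes an Exp(1) reference distribution and the transformation \(y = x^2\). The function names and the sample values are made up for illustration.

```python
import math


def ks_statistic(sample, cdf):
    """One-sample statistic D_n = sup_x |F_n(x) - F(x)|."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # Check the gap just after and just before the jump of F_n at x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def exp_cdf(x):
    """CDF of the exponential distribution with rate 1."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0


def squared_exp_cdf(y):
    """CDF of Y = X**2 when X follows Exp(1): P(Y <= y) = P(X <= sqrt(y))."""
    return exp_cdf(math.sqrt(y)) if y > 0 else 0.0


# Made-up positive observations, for illustration only.
sample = [0.05, 0.3, 0.9, 1.4, 2.2, 3.1]

d_original = ks_statistic(sample, exp_cdf)
d_transformed = ks_statistic([x ** 2 for x in sample], squared_exp_cdf)

# The two values agree up to floating-point rounding.
print(d_original, d_transformed)
```

Because squaring preserves the ordering of positive values, each transformed observation sits at the same height on both distribution functions as the original one, so the largest gap is the same.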
Implementation
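The K–S test has been implemented in various statistical software packages, including R (`ks.test`), Python's SciPy library (`scipy.stats.kstest` and `scipy.stats.ks_2samp`), and MATLAB (`kstest` and `kstest2`). These implementations provide both one-sample and two-sample tests and, depending on the package, a choice between exact and asymptotic p-values and one-sided alternatives. Because the classical theory assumes continuous data, results for discrete data or samples with ties should be interpreted with care.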
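As an illustration, the following sketch runs both tests with SciPy on simulated data. It assumes NumPy and SciPy are installed. The seed, sample sizes, and distribution parameters are arbitrary choices made for this example.

```python
import numpy as np
from scipy import stats

# Simulated data with a fixed seed; all values here are illustrative.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=200)
y = rng.normal(loc=0.5, scale=1.0, size=150)

# One-sample test: compare x with the standard normal distribution.
one_sample = stats.kstest(x, "norm")

# Two-sample test: compare x with y.
two_sample = stats.ks_2samp(x, y)

print("one-sample:", one_sample.statistic, one_sample.pvalue)
print("two-sample:", two_sample.statistic, two_sample.pvalue)
```

A small p-value is evidence against the null hypothesis. In this setup the second sample is shifted by 0.5, so the two-sample test would typically report a smaller p-value than the one-sample test.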
See Also
- Nonparametric statistics
- Cumulative distribution function
- Empirical distribution function
- Anderson-Darling test
Contributors: Prab R. Tumpati, MD