Bhattacharyya distance

From WikiMD's Wellness Encyclopedia

The Bhattacharyya distance is a measure of divergence between two probability distributions. It is used in statistics, pattern recognition, and machine learning to quantify how much two distributions, or the samples or populations they describe, overlap. It is named after the Indian statistician Anil Kumar Bhattacharyya, who introduced the concept in 1943.

Definition

Given two discrete or continuous probability distributions \(P\) and \(Q\) over the same domain \(X\), the Bhattacharyya distance \(D_B\) is defined as:

\[D_B(P, Q) = -\ln\left( \sum_{x \in X} \sqrt{p(x)\,q(x)} \right)\]

for discrete distributions, or

\[D_B(P, Q) = -\ln\left( \int_{X} \sqrt{p(x)\,q(x)} \, dx \right)\]

for continuous distributions, where \(p(x)\) and \(q(x)\) are the probability mass functions (discrete case) or probability density functions (continuous case) of \(P\) and \(Q\), respectively.
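As an illustration of the discrete formula, here is a minimal Python sketch. The function name `bhattacharyya_distance` and the example probabilities are assumptions chosen for this example; the inputs are assumed to be two probability vectors listed over the same outcomes in the same order.

```python
import math

def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions.

    p and q are sequences of probabilities over the same outcomes,
    given in the same order and each summing to 1 (assumed, not checked).
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same outcomes")
    # The sum inside the logarithm: sum over x of sqrt(p(x) * q(x)).
    overlap = math.fsum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    if overlap == 0.0:
        return math.inf  # the distributions share no outcomes
    return -math.log(overlap)

# Illustrative values: a fair coin versus a heavily biased coin.
p = [0.5, 0.5]
q = [0.9, 0.1]
d = bhattacharyya_distance(p, q)
print(round(d, 4))  # about 0.1116
```

For the continuous case the sum would be replaced by a numerical integral of \(\sqrt{p(x)q(x)}\) over the domain.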

Properties

The Bhattacharyya distance has several important properties:

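  • It is non-negative, i.e., \(D_B(P, Q) \geq 0\), because the sum or integral inside the logarithm is at most 1; it is infinite when \(P\) and \(Q\) have no overlapping support.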
  • It is symmetric, meaning \(D_B(P, Q) = D_B(Q, P)\).
  • \(D_B(P, Q) = 0\) if and only if \(P = Q\), indicating that the two distributions are identical.

However, the Bhattacharyya distance is not a true metric, because it does not satisfy the triangle inequality.
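The properties above can be checked numerically on a small case. The following sketch uses three hand-picked two-outcome distributions (illustrative values, not taken from any dataset) for which the direct distance from \(P\) to \(R\) exceeds the sum of the distances through the intermediate distribution \(Q\), so the triangle inequality fails.

```python
import math

def bhattacharyya_distance(p, q):
    """Discrete Bhattacharyya distance; p and q are probability vectors."""
    overlap = math.fsum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.inf if overlap == 0.0 else -math.log(overlap)

# Illustrative distributions over two outcomes.
p = [0.99, 0.01]
q = [0.50, 0.50]
r = [0.01, 0.99]

d_pq = bhattacharyya_distance(p, q)
d_qr = bhattacharyya_distance(q, r)
d_pr = bhattacharyya_distance(p, r)

# Symmetry: swapping the arguments gives the same value.
symmetric = math.isclose(d_pq, bhattacharyya_distance(q, p))

# Identity: the distance from a distribution to itself is zero
# (up to floating-point rounding).
self_distance = bhattacharyya_distance(p, p)

# Triangle inequality would require d_pr <= d_pq + d_qr; here it fails.
violates_triangle = d_pr > d_pq + d_qr

print(symmetric, abs(self_distance) < 1e-12, violates_triangle)
```

A single counterexample like this is enough to show that the triangle inequality does not hold in general.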

Applications

The Bhattacharyya distance is used in various applications, including:

  • Feature selection: In pattern recognition and machine learning, it can be used to select features that maximize the distance between classes, thereby improving classification performance.
  • Image processing: It is applied in image segmentation and registration to measure the similarity between histograms of different images or image regions.
  • Statistical analysis: It quantifies the divergence between two distributions, which is useful in hypothesis testing and data analysis.
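The histogram comparison mentioned under image processing can be sketched as follows. This is only an illustration under stated assumptions: the grey-level lists are made-up values standing in for pixel intensities from three image regions, the helper `histogram` is a hypothetical name, and four equal-width bins on the range 0 to 256 are an arbitrary choice.

```python
import math

def histogram(values, n_bins, lo, hi):
    """Normalized histogram over n_bins equal-width bins on [lo, hi].

    All values are assumed to lie within [lo, hi].
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        # Clamp the index so that v == hi falls in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]

def bhattacharyya_distance(p, q):
    """Discrete Bhattacharyya distance between two normalized histograms."""
    overlap = math.fsum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.inf if overlap == 0.0 else -math.log(overlap)

# Made-up 8-bit grey levels from three hypothetical image regions.
region_a = [10, 12, 15, 40, 42, 45, 200, 210]
region_b = [11, 14, 18, 41, 44, 70, 205, 215]     # close to region_a
region_c = [20, 30, 100, 110, 120, 130, 140, 220]  # spread across the range

h_a = histogram(region_a, 4, 0, 256)
h_b = histogram(region_b, 4, 0, 256)
h_c = histogram(region_c, 4, 0, 256)

d_ab = bhattacharyya_distance(h_a, h_b)
d_ac = bhattacharyya_distance(h_a, h_c)
print(d_ab < d_ac)  # region_b's histogram is closer to region_a's
```

The result depends on the binning: coarser bins hide small differences, while very fine bins can leave histograms with no shared non-empty bin and an infinite distance.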

Example

Consider two univariate Gaussian distributions \(P \sim N(\mu_1, \sigma_1^2)\) and \(Q \sim N(\mu_2, \sigma_2^2)\). The Bhattacharyya distance between them has the closed form:

\[D_B(P, Q) = \frac{1}{4} \ln\left( \frac{1}{4} \left( \frac{\sigma_1^2}{\sigma_2^2} + \frac{\sigma_2^2}{\sigma_1^2} + 2 \right) \right) + \frac{1}{4} \left( \frac{(\mu_1 - \mu_2)^2}{\sigma_1^2 + \sigma_2^2} \right)\]

The first term depends only on the variances and vanishes when \(\sigma_1 = \sigma_2\); the second term depends on the separation of the means and vanishes when \(\mu_1 = \mu_2\).
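The Gaussian formula can be checked against the integral definition. The sketch below evaluates the closed form and compares it with a simple midpoint-rule approximation of \(-\ln \int \sqrt{p(x)q(x)}\,dx\). The function names, the integration range of -30 to 30, and the parameter values are assumptions made for this example only.

```python
import math

def gaussian_bhattacharyya(mu1, sigma1, mu2, sigma2):
    """Closed-form distance between N(mu1, sigma1^2) and N(mu2, sigma2^2)."""
    v1, v2 = sigma1 ** 2, sigma2 ** 2
    variance_term = 0.25 * math.log(0.25 * (v1 / v2 + v2 / v1 + 2.0))
    mean_term = 0.25 * (mu1 - mu2) ** 2 / (v1 + v2)
    return variance_term + mean_term

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def numeric_bhattacharyya(mu1, sigma1, mu2, sigma2,
                          lo=-30.0, hi=30.0, n=60_000):
    """Midpoint-rule approximation of -ln(integral of sqrt(p * q) dx)."""
    dx = (hi - lo) / n
    overlap = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx  # midpoint of the i-th subinterval
        overlap += math.sqrt(normal_pdf(x, mu1, sigma1)
                             * normal_pdf(x, mu2, sigma2)) * dx
    return -math.log(overlap)

# Illustrative parameters: N(0, 1) versus N(1, 4).
closed = gaussian_bhattacharyya(0.0, 1.0, 1.0, 2.0)
numeric = numeric_bhattacharyya(0.0, 1.0, 1.0, 2.0)
print(round(closed, 4), round(numeric, 4))  # both about 0.1616
```

With these parameters the variance term contributes about 0.1116 and the mean term exactly 0.05. The integration range must be wide enough to cover both densities for the numerical value to agree with the closed form.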

Contributors: Prab R. Tumpati, MD