Kullback–Leibler divergence

Kullback–Leibler divergence (KLD), also known as relative entropy, is a measure of how one probability distribution diverges from a second, reference probability distribution. Applications of the KLD can be found across fields such as information theory, machine learning, statistics, and data science. It quantifies the amount of information lost when one distribution is used to approximate another.

Definition

The Kullback–Leibler divergence of a true probability distribution P from a reference probability distribution Q, both defined on the same probability space Ω, is defined as:

D_{\mathrm{KL}}(P \parallel Q) = \sum_{x \in \Omega} P(x) \, \log\frac{P(x)}{Q(x)}

for discrete probability distributions, and

D_{\mathrm{KL}}(P \parallel Q) = \int_{\Omega} p(x) \, \log\frac{p(x)}{q(x)} \, dx

for continuous probability distributions, where p and q denote the probability density functions of P and Q, respectively; in both formulas the logarithm is the natural logarithm.
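
As a concrete illustration of the discrete formula, the following Python sketch computes DKL(P‖Q) directly from the definition. The helper name kl_divergence and the two example distributions are illustrative choices, not part of any standard library:

```python
import numpy as np

def kl_divergence(p, q):
    """Discrete KL divergence D_KL(P || Q) = sum_x P(x) * log(P(x) / Q(x)).

    Assumes p and q are probability vectors over the same finite space and
    that q(x) > 0 wherever p(x) > 0 (otherwise the divergence is infinite).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Restrict to the support of P; terms with P(x) = 0 contribute 0 by convention.
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Example: a fair coin P approximated by a biased coin Q.
p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, q))  # information lost (in nats) when Q is used to approximate P
```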

Properties

  • Non-negativity: The KLD is always non-negative, DKL(P‖Q) ≥ 0, and equals zero if and only if P and Q are identical (for discrete distributions) or equal almost everywhere (for continuous distributions).
  • Asymmetry: It is not symmetric; in general DKL(P‖Q) ≠ DKL(Q‖P), as the numerical sketch after this list illustrates.
  • Not a true metric: Because of its asymmetry and the fact that it does not satisfy the triangle inequality, KLD is not a true metric.
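
These properties can be checked numerically. The sketch below uses SciPy's scipy.stats.entropy, which returns the discrete KL divergence when given two distributions; the particular distributions here are arbitrary examples:

```python
import numpy as np
from scipy.stats import entropy  # entropy(p, q) computes D_KL(P || Q) for discrete P, Q

p = np.array([0.4, 0.4, 0.2])
q = np.array([0.6, 0.3, 0.1])

d_pq = entropy(p, q)  # D_KL(P || Q)
d_qp = entropy(q, p)  # D_KL(Q || P)

print(d_pq >= 0, d_qp >= 0)            # non-negativity: both values are >= 0
print(np.isclose(d_pq, d_qp))          # asymmetry: False in general -- the two directions differ
print(np.isclose(entropy(p, p), 0.0))  # zero when the two distributions coincide
```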

Applications

Kullback–Leibler divergence has a wide range of applications:

  • In Information Theory, it measures the information gained when revising one's beliefs from the prior distribution Q to the posterior distribution P.
  • In Machine Learning and Statistics, KLD is used for tasks such as clustering and dimensionality reduction, and serves as a loss function in training neural networks.
  • In Data Science, it quantifies how much two data distributions differ, which is useful in anomaly detection and other predictive modeling tasks; see the histogram-comparison sketch after this list.
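
As one illustration of the data-science use case, the following sketch compares a new batch of samples to a reference sample by binning both into histograms and computing the KL divergence between the binned distributions. The simulated data, the bin edges, and the small smoothing constant that avoids zero-probability bins are all arbitrary choices made for this example:

```python
import numpy as np
from scipy.stats import entropy

rng = np.random.default_rng(0)

# Reference behaviour: samples from a standard normal distribution.
reference = rng.normal(loc=0.0, scale=1.0, size=10_000)
# New batch of data whose distribution has drifted.
new_batch = rng.normal(loc=0.5, scale=1.5, size=10_000)

# Discretise both samples onto a common set of bins and add a small
# constant so no bin has zero probability (which would make D_KL infinite).
bins = np.linspace(-6, 6, 41)
p, _ = np.histogram(new_batch, bins=bins)
q, _ = np.histogram(reference, bins=bins)
p = (p + 1e-9) / (p + 1e-9).sum()
q = (q + 1e-9) / (q + 1e-9).sum()

drift_score = entropy(p, q)  # D_KL(new batch || reference)
print(f"KL drift score: {drift_score:.4f}")  # larger values suggest the distributions differ more
```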

