Negative binomial distribution

From WikiMD's Wellness Encyclopedia

Thomae's function like distribution

The negative binomial distribution is a discrete probability distribution that models the number of successes in a sequence of independent and identically distributed Bernoulli trials before a specified (non-random) number of failures occurs. It is also known as the Pascal distribution (a name usually reserved for an integer number of failures) or the Pólya distribution (usually the real-valued case). It generalizes the geometric distribution and is useful for modeling over-dispersed count data, where the variance exceeds the mean.

Definition

The negative binomial distribution is defined by two parameters, \(r\) and \(p\). Here, \(r\) is the number of failures at which the experiment stops, and \(p\) is the probability of success on each trial. The probability mass function (PMF) for observing \(k\) successes before the \(r\)-th failure is:

\[ P(X = k) = \binom{k+r-1}{k} (1-p)^r p^k \]

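where \(k = 0, 1, 2, \ldots\) is the number of successes, \(r > 0\), and \(0 < p < 1\). The term \(\binom{k+r-1}{k}\) is a binomial coefficient; when \(r\) is not an integer, it is evaluated through the gamma function as \(\frac{\Gamma(k+r)}{k!\,\Gamma(r)}\).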
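The PMF above can be evaluated directly. The following is a minimal sketch using only the Python standard library; the function name `neg_binomial_pmf` is a hypothetical helper chosen for illustration, and the coefficient is computed on the log scale through the gamma function so that non-integer \(r\) is also handled.

```python
import math

def neg_binomial_pmf(k: int, r: float, p: float) -> float:
    """P(X = k): probability of k successes before the r-th failure.

    p is the probability of success, following the convention in the text.
    """
    if k < 0:
        return 0.0
    # log of the generalized binomial coefficient Gamma(k + r) / (k! * Gamma(r))
    log_coeff = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    # log of (1 - p)^r * p^k
    log_terms = r * math.log1p(-p) + k * math.log(p)
    return math.exp(log_coeff + log_terms)

# The probabilities over k = 0, 1, 2, ... should sum to (nearly) 1.
total = sum(neg_binomial_pmf(k, r=3, p=0.4) for k in range(200))
print(total)
```

Libraries may use a different parameterization. For example, `scipy.stats.nbinom` counts failures before a fixed number of successes, so its `p` plays the role of \(1-p\) here.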

Properties

Mean

The mean, or expected value, of a negative binomial distribution is given by:

\[ \mu = \frac{rp}{1-p} \]

Variance

The variance of the distribution is:

\[ \sigma^2 = \frac{rp}{(1-p)^2} \]

Since \(\sigma^2 = \frac{\mu}{1-p}\) and \(0 < 1-p < 1\), the variance is always greater than the mean, which is a characteristic of over-dispersed data.
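The mean and variance formulas can be checked numerically by summing over the PMF. This is a rough sketch, not a proof: the helper `neg_binomial_pmf` and the parameter values \(r = 2.5\), \(p = 0.3\) are arbitrary choices for illustration, and the infinite sum is truncated at 300 terms.

```python
import math

def neg_binomial_pmf(k, r, p):
    # PMF of k successes before the r-th failure (p = success probability)
    log_coeff = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coeff + r * math.log1p(-p) + k * math.log(p))

r, p = 2.5, 0.3          # illustrative values, not taken from the article
ks = range(300)          # truncation point; the tail beyond it is negligible here
probs = [neg_binomial_pmf(k, r, p) for k in ks]

# Moments computed directly from the PMF
mean_numeric = sum(k * q for k, q in zip(ks, probs))
var_numeric = sum((k - mean_numeric) ** 2 * q for k, q in zip(ks, probs))

# Closed-form expressions from the text
mean_formula = r * p / (1 - p)
var_formula = r * p / (1 - p) ** 2

print(mean_numeric, mean_formula)
print(var_numeric, var_formula)
```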

Relation to Other Distributions

The negative binomial distribution includes other distributions as special or limiting cases. For \(r=1\), it reduces to the geometric distribution, which here models the number of successes before the first failure, with \(P(X = k) = (1-p)\,p^k\). It also converges to the Poisson distribution with mean \(\lambda\) as \(r \to \infty\) while the mean \(\frac{rp}{1-p} = \lambda\) is held fixed, so that \(p \to 0\).
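Both relationships can be illustrated numerically. The sketch below is only a sanity check under stated assumptions: the helper names are hypothetical, the Poisson limit is taken by fixing the mean at \(\lambda = 4\) and setting \(p = \lambda / (r + \lambda)\), and the comparison looks only at the first few dozen values of \(k\).

```python
import math

def neg_binomial_pmf(k, r, p):
    # PMF of k successes before the r-th failure (p = success probability)
    log_coeff = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coeff + r * math.log1p(-p) + k * math.log(p))

def poisson_pmf(k, lam):
    # Poisson PMF, computed on the log scale for stability
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

# Special case r = 1: compare against the geometric PMF (1 - p) * p**k
p = 0.35
geometric_gap = max(
    abs(neg_binomial_pmf(k, 1, p) - (1 - p) * p ** k) for k in range(50)
)

# Poisson limit: hold the mean r*p/(1-p) at lam, so p = lam / (r + lam)
lam = 4.0

def max_gap_to_poisson(r):
    p_r = lam / (r + lam)
    return max(
        abs(neg_binomial_pmf(k, r, p_r) - poisson_pmf(k, lam)) for k in range(60)
    )

gaps = {r: max_gap_to_poisson(r) for r in (1, 10, 100, 1000, 10000)}
print(geometric_gap)
for r, gap in gaps.items():
    print(r, gap)
```

The largest pointwise difference from the Poisson PMF shrinks as \(r\) grows, which matches the limiting statement above.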

Applications

The negative binomial distribution is widely used in fields such as biology, ecology, and insurance, typically for count data whose variance exceeds its mean. In biology, it can model the number of trials until a certain number of mutations occur. In ecology, it is used to model the distribution of species abundance. In insurance, it can model the number of claims or losses before a certain threshold is reached.


Contributors: Prab R. Tumpati, MD