Categories: Mathematics, Statistics.

Binomial distribution

The binomial distribution is a discrete probability distribution describing a Bernoulli process: a set of \(N\) independent trials, each of which has only two possible outcomes, “success” with probability \(p\) and “failure” with probability \(q = 1 - p\). The binomial distribution then gives the probability that \(n\) out of the \(N\) trials succeed:

\[\begin{aligned} \boxed{ P_N(n) = \binom{N}{n} \: p^n (1 - p)^{N - n} } \end{aligned}\]

The first factor is known as the binomial coefficient, which counts the number of microstates (i.e. permutations) that have \(n\) successes out of \(N\) trials. These happen to be the coefficients appearing in the expansion of the polynomial \((a + b)^N\), and can be read off of Pascal’s triangle. It is defined as follows:

\[\begin{aligned} \boxed{ \binom{N}{n} = \frac{N!}{n! (N - n)!} } \end{aligned}\]

The remaining factor \(p^n (1 - p)^{N - n}\) is then just the probability of attaining each microstate.
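The boxed formula translates directly into code; below is a minimal Python sketch (the function name `binom_pmf` is ours) using the standard library’s `math.comb` for the binomial coefficient:

```python
from math import comb

def binom_pmf(n, N, p):
    """Probability of exactly n successes in N independent trials,
    each succeeding with probability p."""
    return comb(N, n) * p**n * (1 - p)**(N - n)

# The probabilities over all n = 0..N sum to 1, since
# sum_n C(N,n) p^n q^(N-n) = (p + q)^N = 1.
total = sum(binom_pmf(n, 10, 0.3) for n in range(11))
```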

To find the mean number of successes \(\mu\), the trick is to treat \(p\) and \(q\) as independent variables:

\[\begin{aligned} \mu &= \sum_{n = 0}^N n \binom{N}{n} p^n q^{N - n} = \sum_{n = 0}^N \binom{N}{n} \Big( p \pdv{(p^n)}{p} \Big) q^{N - n} \\ &= p \pdv{p} \sum_{n = 0}^N \binom{N}{n} p^n q^{N - n} = p \pdv{p} (p + q)^N = N p (p + q)^{N - 1} \end{aligned}\]

By inserting \(q = 1 - p\), we find the following expression for the mean:

\[\begin{aligned} \boxed{ \mu = N p } \end{aligned}\]
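This identity is easy to verify numerically; a quick Python check (the parameter values are arbitrary):

```python
from math import comb

N, p = 20, 0.35
pmf = [comb(N, n) * p**n * (1 - p)**(N - n) for n in range(N + 1)]

# The mean is sum_n n * P_N(n), which should come out to N * p = 7.
mu = sum(n * P for n, P in enumerate(pmf))
```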

Next, we use the same trick to calculate \(\overline{n^2}\) (the mean of the squared number of successes):

\[\begin{aligned} \overline{n^2} &= \sum_{n = 0}^N n^2 \binom{N}{n} p^n q^{N - n} = \sum_{n = 0}^N \binom{N}{n} \Big( p \pdv{p} \Big)^2 p^n q^{N - n} \\ &= \Big( p \pdv{p} \Big)^2 \sum_{n = 0}^N \binom{N}{n} p^n q^{N - n} = \Big( p \pdv{p} \Big)^2 (p + q)^N \\ &= N p \pdv{p} p (p + q)^{N - 1} = N p \big( (p + q)^{N - 1} + (N - 1) p (p + q)^{N - 2} \big) \\ &= N p + N^2 p^2 - N p^2 \end{aligned}\]

Using this and the earlier expression for \(\mu\), we find the variance \(\sigma^2\):

\[\begin{aligned} \sigma^2 &= \overline{n^2} - \mu^2 = N p + N^2 p^2 - N p^2 - N^2 p^2 = N p (1 - p) \end{aligned}\]

Once again, by inserting \(q = 1 - p\), we find the following expression for the variance:

\[\begin{aligned} \boxed{ \sigma^2 = N p q } \end{aligned}\]
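The same numerical check extends to the second moment and the variance (again with arbitrary parameters):

```python
from math import comb

N, p = 20, 0.35
q = 1 - p
pmf = [comb(N, n) * p**n * q**(N - n) for n in range(N + 1)]

mu = sum(n * P for n, P in enumerate(pmf))           # should equal N p
n2bar = sum(n**2 * P for n, P in enumerate(pmf))     # N p + N^2 p^2 - N p^2
var = n2bar - mu**2                                  # should equal N p q
```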

As \(N\) grows to infinity, the binomial distribution turns into the continuous normal distribution. We demonstrate this by taking the Taylor expansion of its natural logarithm \(\ln\!\big(P_N(n)\big)\) around the mean \(\mu = Np\):

\[\begin{aligned} \ln\!\big(P_N(n)\big) &= \sum_{m = 0}^\infty \frac{(n - \mu)^m}{m!} D_m(\mu) \quad \mathrm{where} \quad D_m(n) = \dv[m]{\ln\!\big(P_N(n)\big)}{n} \end{aligned}\]

We use Stirling’s approximation to calculate all these factorials:

\[\begin{aligned} \ln\!\big(P_N(n)\big) &= \ln(N!) - \ln(n!) - \ln\!\big((N - n)!\big) + n \ln(p) + (N - n) \ln(q) \\ &\approx \ln(N!) - n \big( \ln(n)\!-\!\ln(p)\!-\!1 \big) - (N\!-\!n) \big( \ln(N\!-\!n)\!-\!\ln(q)\!-\!1 \big) \end{aligned}\]

For \(D_0(\mu)\), we need to use a stronger version of Stirling’s approximation to get a non-zero result. We take advantage of \(N - N p = N q\):

\[\begin{aligned} D_0(\mu) &= \ln(N!) - \ln\!\big((N p)!\big) - \ln\!\big((N q)!\big) + N p \ln(p) + N q \ln(q) \\ &= \Big( N \ln(N) - N + \frac{1}{2} \ln(2\pi N) \Big) - \Big( N p \ln(N p) - N p + \frac{1}{2} \ln(2\pi N p) \Big) \\ &\qquad - \Big( N q \ln(N q) - N q + \frac{1}{2} \ln(2\pi N q) \Big) + N p \ln(p) + N q \ln(q) \\ &= N \ln(N) - N (p + q) \ln(N) + N (p + q) - N - \frac{1}{2} \ln(2\pi N p q) \\ &= - \frac{1}{2} \ln(2\pi N p q) = \ln\!\Big( \frac{1}{\sqrt{2\pi \sigma^2}} \Big) \end{aligned}\]
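Since \(D_0(\mu) = \ln(P_N(\mu))\), this result can be checked against an exact evaluation of the distribution at its mean; a short Python sketch (we pick \(N\) and \(p\) so that \(\mu = N p\) is an integer):

```python
from math import comb, log, pi

N, p = 400, 0.5
q = 1 - p
mu = int(N * p)               # 200, an integer by choice of N and p
var = N * p * q               # sigma^2 = N p q = 100

ln_P_at_mu = log(comb(N, mu) * p**mu * q**(N - mu))
stirling = -0.5 * log(2 * pi * var)   # D_0(mu) = ln(1 / sqrt(2 pi sigma^2))

# The two agree up to corrections that vanish as N grows.
```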

Next, we expect that \(D_1(\mu) = 0\), because \(\mu\) is the maximum. This is indeed the case:

\[\begin{aligned} D_1(n) &= - \big( \ln(n)\!-\!\ln(p)\!-\!1 \big) + \big( \ln(N\!-\!n)\!-\!\ln(q)\!-\!1 \big) - 1 + 1 \\ &= - \ln(n) + \ln(N - n) + \ln(p) - \ln(q) \\ D_1(\mu) &= \ln(N q) - \ln(N p) + \ln(p) - \ln(q) = \ln(N p q) - \ln(N p q) = 0 \end{aligned}\]

For the same reason, we expect that \(D_2(\mu)\) is negative. We find the following expression:

\[\begin{aligned} D_2(n) &= - \frac{1}{n} - \frac{1}{N - n} \qquad D_2(\mu) = - \frac{1}{Np} - \frac{1}{Nq} = - \frac{p + q}{N p q} = - \frac{1}{\sigma^2} \end{aligned}\]
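A central finite difference of \(\ln(P_N(n))\) at \(n = \mu\) reproduces this value numerically; a small Python sketch (we choose \(N\) and \(p\) so that \(\mu\) and \(\sigma^2\) are round integers):

```python
from math import comb, log

N, p = 900, 1 / 3
q = 1 - p
mu = 300                      # N p
var = 200                     # N p q

def ln_P(n):
    return log(comb(N, n) * p**n * q**(N - n))

# Discrete second derivative of ln(P) at the mean; it should be
# close to D_2(mu) = -1 / sigma^2 = -0.005.
d2 = ln_P(mu + 1) - 2 * ln_P(mu) + ln_P(mu - 1)
```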

The higher-order derivatives tend to zero for large \(N\), so we discard them:

\[\begin{aligned} D_3(n) = \frac{1}{n^2} - \frac{1}{(N - n)^2} \qquad D_4(n) = - \frac{2}{n^3} - \frac{2}{(N - n)^3} \qquad \cdots \end{aligned}\]

Putting everything together, for large \(N\), the Taylor series approximately becomes:

\[\begin{aligned} \ln\!\big(P_N(n)\big) \approx D_0(\mu) + \frac{(n - \mu)^2}{2} D_2(\mu) = \ln\!\Big( \frac{1}{\sqrt{2\pi \sigma^2}} \Big) - \frac{(n - \mu)^2}{2 \sigma^2} \end{aligned}\]

Thus, as \(N\) goes to infinity, the binomial distribution becomes a Gaussian:

\[\begin{aligned} \boxed{ \lim_{N \to \infty} P_N(n) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\!\Big(\!-\!\frac{(n - \mu)^2}{2 \sigma^2} \Big) } \end{aligned}\]
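This limit can also be seen numerically by comparing the two distributions at a moderately large \(N\); a Python sketch (the parameters are arbitrary):

```python
from math import comb, exp, pi, sqrt

N, p = 1000, 0.3
q = 1 - p
mu, var = N * p, N * p * q    # mu = 300, sigma^2 = 210

def binom(n):
    return comb(N, n) * p**n * q**(N - n)

def gauss(n):
    return exp(-(n - mu)**2 / (2 * var)) / sqrt(2 * pi * var)

# Within one standard deviation of the mean (sigma ~ 14.5), the
# relative deviation is already below about 1% at this N.
rel_err = max(abs(binom(n) - gauss(n)) / binom(n) for n in range(286, 315))
```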


© "Prefetch". Licensed under CC BY-SA 4.0.