The **binomial distribution** is a discrete probability distribution describing a **Bernoulli process**: a set of \(N\) independent trials where each has only two possible outcomes, “success” and “failure”, the former with probability \(p\) and the latter with \(q = 1 - p\). The binomial distribution then gives the probability that \(n\) out of the \(N\) trials succeed:

\[\begin{aligned} \boxed{ P_N(n) = \binom{N}{n} \: p^n (1 - p)^{N - n} } \end{aligned}\]

The first factor is known as the **binomial coefficient**, which counts the number of microstates (i.e. distinct orderings of successes and failures) that have \(n\) successes out of \(N\) trials. The binomial coefficients are also the coefficients in the expansion of the polynomial \((a + b)^N\), and can be read off from Pascal’s triangle. They are defined as follows:

\[\begin{aligned} \boxed{ \binom{N}{n} = \frac{N!}{n! (N - n)!} } \end{aligned}\]
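As a small illustration, with \(N = 4\) chosen arbitrarily, the expansion of the polynomial reproduces the row \(1, 4, 6, 4, 1\) of Pascal’s triangle, and the middle coefficient agrees with the definition:

\[\begin{aligned} (a + b)^4 = a^4 + 4 a^3 b + 6 a^2 b^2 + 4 a b^3 + b^4 \qquad \binom{4}{2} = \frac{4!}{2! \: 2!} = \frac{24}{4} = 6 \end{aligned}\]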

The remaining factor \(p^n (1 - p)^{N - n}\) is then just the probability of attaining each microstate.
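The formula can be evaluated directly, as in the following minimal Python sketch. The function name `binomial_pmf` is our own choice for illustration, and the coin-flip values are arbitrary:

```python
from math import comb

def binomial_pmf(n, N, p):
    """Probability of exactly n successes in N independent trials."""
    # Number of microstates with n successes, times the probability of each one.
    return comb(N, n) * p**n * (1 - p)**(N - n)

# Example: 2 heads in 4 fair coin flips, i.e. 6 microstates of probability 1/16 each.
print(binomial_pmf(2, 4, 0.5))  # 0.375
```

Summing `binomial_pmf(n, N, p)` over all \(n\) from \(0\) to \(N\) should give \(1\) up to rounding, since \((p + q)^N = 1\).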

To find the mean number of successes \(\mu\), the trick is to treat \(p\) and \(q\) as independent variables while differentiating:

\[\begin{aligned} \mu &= \sum_{n = 0}^N n \binom{N}{n} p^n q^{N - n} = \sum_{n = 0}^N \binom{N}{n} \Big( p \pdv{(p^n)}{p} \Big) q^{N - n} \\ &= p \pdv{p} \sum_{n = 0}^N \binom{N}{n} p^n q^{N - n} = p \pdv{p} (p + q)^N = N p (p + q)^{N - 1} \end{aligned}\]

By inserting \(q = 1 - p\), we find the following expression for the mean:

\[\begin{aligned} \boxed{ \mu = N p } \end{aligned}\]

Next, we use the same trick to calculate \(\overline{n^2}\) (the mean of the squared number of successes):

\[\begin{aligned} \overline{n^2} &= \sum_{n = 0}^N n^2 \binom{N}{n} p^n q^{N - n} = \sum_{n = 0}^N \binom{N}{n} \Big( p \pdv{p} \Big)^2 p^n q^{N - n} \\ &= \Big( p \pdv{p} \Big)^2 \sum_{n = 0}^N \binom{N}{n} p^n q^{N - n} = \Big( p \pdv{p} \Big)^2 (p + q)^N \\ &= N p \pdv{p} p (p + q)^{N - 1} = N p \big( (p + q)^{N - 1} + (N - 1) p (p + q)^{N - 2} \big) \\ &= N p + N^2 p^2 - N p^2 \end{aligned}\]

Using this and the earlier expression for \(\mu\), we find the variance \(\sigma^2\):

\[\begin{aligned} \sigma^2 &= \overline{n^2} - \mu^2 = N p + N^2 p^2 - N p^2 - N^2 p^2 = N p (1 - p) \end{aligned}\]

Once again, by inserting \(q = 1 - p\), we find the following expression for the variance:

\[\begin{aligned} \boxed{ \sigma^2 = N p q } \end{aligned}\]
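Both results can be checked numerically by summing over the distribution directly. The sketch below does this in Python; the helper name `moments` and the values of \(N\) and \(p\) are arbitrary choices for illustration:

```python
from math import comb

def moments(N, p):
    """Mean and variance of the binomial distribution by direct summation."""
    q = 1 - p
    pmf = [comb(N, n) * p**n * q**(N - n) for n in range(N + 1)]
    mean = sum(n * P for n, P in enumerate(pmf))
    mean_sq = sum(n**2 * P for n, P in enumerate(pmf))
    # Variance is the mean of the square minus the square of the mean.
    return mean, mean_sq - mean**2

N, p = 20, 0.3  # arbitrary illustrative values
mean, var = moments(N, p)
print(mean, N * p)           # both close to 6.0
print(var, N * p * (1 - p))  # both close to 4.2
```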

As \(N\) grows to infinity, the binomial distribution turns into the continuous normal distribution. We demonstrate this by taking the Taylor expansion of its natural logarithm \(\ln\!\big(P_N(n)\big)\) around the mean \(\mu = Np\):

\[\begin{aligned} \ln\!\big(P_N(n)\big) &= \sum_{m = 0}^\infty \frac{(n - \mu)^m}{m!} D_m(\mu) \quad \mathrm{where} \quad D_m(n) = \dv[m]{\ln\!\big(P_N(n)\big)}{n} \end{aligned}\]

We use Stirling’s approximation \(\ln(k!) \approx k \ln(k) - k\) for the factorials that depend on \(n\):

\[\begin{aligned} \ln\!\big(P_N(n)\big) &= \ln(N!) - \ln(n!) - \ln\!\big((N - n)!\big) + n \ln(p) + (N - n) \ln(q) \\ &\approx \ln(N!) - n \big( \ln(n)\!-\!\ln(p)\!-\!1 \big) - (N\!-\!n) \big( \ln(N\!-\!n)\!-\!\ln(q)\!-\!1 \big) \end{aligned}\]

For \(D_0(\mu)\), we need to use a stronger version of Stirling’s approximation to get a non-zero result. We take advantage of \(N - N p = N q\):

\[\begin{aligned} D_0(\mu) &= \ln(N!) - \ln\!\big((N p)!\big) - \ln\!\big((N q)!\big) + N p \ln(p) + N q \ln(q) \\ &= \Big( N \ln(N) - N + \frac{1}{2} \ln(2\pi N) \Big) - \Big( N p \ln(N p) - N p + \frac{1}{2} \ln(2\pi N p) \Big) \\ &\qquad - \Big( N q \ln(N q) - N q + \frac{1}{2} \ln(2\pi N q) \Big) + N p \ln(p) + N q \ln(q) \\ &= N \ln(N) - N (p + q) \ln(N) + N (p + q) - N - \frac{1}{2} \ln(2\pi N p q) \\ &= - \frac{1}{2} \ln(2\pi N p q) = \ln\!\Big( \frac{1}{\sqrt{2\pi \sigma^2}} \Big) \end{aligned}\]
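This result for \(D_0(\mu)\) can be compared against the exact logarithm of the distribution at the mean. The Python sketch below does so under a few assumptions of our own: the exact \(\ln(k!)\) is obtained from `math.lgamma(k + 1)`, the helper names are hypothetical, and \(p = 0.3\) is an arbitrary choice:

```python
from math import lgamma, log, pi

def log_pmf(n, N, p):
    """Exact ln(P_N(n)), using ln(k!) = lgamma(k + 1)."""
    return (lgamma(N + 1) - lgamma(n + 1) - lgamma(N - n + 1)
            + n * log(p) + (N - n) * log(1 - p))

def d0_error(N, p):
    """Exact ln(P_N(mu)) minus the approximation -ln(2 pi N p q) / 2."""
    approx = -0.5 * log(2 * pi * N * p * (1 - p))
    return log_pmf(N * p, N, p) - approx

p = 0.3  # arbitrary illustrative value
for N in (10, 100, 1000, 10000):
    print(N, d0_error(N, p))
```

The printed difference should shrink as \(N\) grows, which is consistent with the approximation becoming exact in the limit.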

Next, we expect that \(D_1(\mu) = 0\), because \(\mu\) is the maximum. This is indeed the case:

\[\begin{aligned} D_1(n) &= - \big( \ln(n)\!-\!\ln(p)\!-\!1 \big) + \big( \ln(N\!-\!n)\!-\!\ln(q)\!-\!1 \big) - 1 + 1 \\ &= - \ln(n) + \ln(N - n) + \ln(p) - \ln(q) \\ D_1(\mu) &= \ln(N q) - \ln(N p) + \ln(p) - \ln(q) = \ln(N p q) - \ln(N p q) = 0 \end{aligned}\]

For the same reason, we expect that \(D_2(\mu)\) is negative. We find the following expression:

\[\begin{aligned} D_2(n) &= - \frac{1}{n} - \frac{1}{N - n} \qquad D_2(\mu) = - \frac{1}{Np} - \frac{1}{Nq} = - \frac{p + q}{N p q} = - \frac{1}{\sigma^2} \end{aligned}\]

For \(n\) close to \(\mu\), the higher-order derivatives tend to zero faster than \(D_2\) as \(N\) grows, so we discard them:

\[\begin{aligned} D_3(n) = \frac{1}{n^2} - \frac{1}{(N - n)^2} \qquad D_4(n) = - \frac{2}{n^3} - \frac{2}{(N - n)^3} \qquad \cdots \end{aligned}\]
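To make the scaling explicit, we can evaluate the first discarded derivative at the mean, using \(p + q = 1\):

\[\begin{aligned} D_3(\mu) = \frac{1}{N^2 p^2} - \frac{1}{N^2 q^2} = \frac{q^2 - p^2}{N^2 p^2 q^2} = \frac{q - p}{N^2 p^2 q^2} \end{aligned}\]

This falls off as \(1 / N^2\), whereas \(D_2(\mu) = - 1 / (N p q)\) only falls off as \(1 / N\). For deviations \(n - \mu\) of the order of \(\sigma \propto \sqrt{N}\), the third-order term of the Taylor series is therefore of order \(N^{-1/2}\), while the second-order term stays of order one.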

Putting everything together, for large \(N\), the Taylor series approximately becomes:

\[\begin{aligned} \ln\!\big(P_N(n)\big) \approx D_0(\mu) + \frac{(n - \mu)^2}{2} D_2(\mu) = \ln\!\Big( \frac{1}{\sqrt{2\pi \sigma^2}} \Big) - \frac{(n - \mu)^2}{2 \sigma^2} \end{aligned}\]

Thus, as \(N\) goes to infinity, the binomial distribution becomes a Gaussian:

\[\begin{aligned} \boxed{ \lim_{N \to \infty} P_N(n) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\!\Big(\!-\!\frac{(n - \mu)^2}{2 \sigma^2} \Big) } \end{aligned}\]
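The convergence can be illustrated numerically by comparing the binomial distribution with a Gaussian of the same \(\mu\) and \(\sigma^2\). The following Python sketch is only an illustration: the function names are our own, and \(p = 0.3\) and the values of \(N\) are arbitrary:

```python
from math import comb, exp, pi, sqrt

def binomial_pmf(n, N, p):
    """Exact binomial probability of n successes in N trials."""
    return comb(N, n) * p**n * (1 - p)**(N - n)

def gaussian(n, N, p):
    """Gaussian with the binomial mean N p and variance N p q."""
    mu, var = N * p, N * p * (1 - p)
    return exp(-(n - mu)**2 / (2 * var)) / sqrt(2 * pi * var)

def max_abs_error(N, p):
    """Largest absolute difference between the two over n = 0, ..., N."""
    return max(abs(binomial_pmf(n, N, p) - gaussian(n, N, p))
               for n in range(N + 1))

for N in (10, 100, 1000):
    print(N, max_abs_error(N, 0.3))
```

The largest difference should decrease as \(N\) increases, in line with the limit above.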

- H. Gould, J. Tobochnik,
*Statistical and thermal physics*, 2nd edition, Princeton.

© "Prefetch". Licensed under CC BY-SA 4.0.