source/know/concept/binomial-distribution/index.md


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220

---
title: "Binomial distribution"
date: 2021-02-26
categories:
- Statistics
- Mathematics
layout: "concept"
---

The **binomial distribution** is a discrete probability distribution
describing a **Bernoulli process**: a set of independent $N$ trials where
each has only two possible outcomes, "success" and "failure",
the former with probability $p$ and the latter with $q = 1 - p$.
The binomial distribution then gives the probability
that $n$ out of the $N$ trials succeed:

$$\begin{aligned}
    \boxed{
        P_N(n) = \binom{N}{n} \: p^n q^{N - n}
    }
\end{aligned}$$

The first factor is known as the **binomial coefficient**, which describes the
number of microstates (i.e. permutations) that have $n$ successes out of $N$ trials.
These happen to be the coefficients in the polynomial $(a + b)^N$,
and can be read off of Pascal's triangle.
It is defined as follows:

$$\begin{aligned}
    \boxed{
        \binom{N}{n} = \frac{N!}{n! (N - n)!}
    }
\end{aligned}$$

The remaining factor $p^n (1 - p)^{N - n}$ is then just the
probability of attaining each microstate.

The expected or mean number of successes $\mu$ after $N$ trials is as follows:

$$\begin{aligned}
    \boxed{
        \mu = N p
    }
\end{aligned}$$

<div class="accordion">
<input type="checkbox" id="proof-mean"/>
<label for="proof-mean">Proof</label>
<div class="hidden">
<label for="proof-mean">Proof.</label>
The trick is to treat $p$ and $q$ as independent until the last moment:

$$\begin{aligned}
    \mu
    &= \sum_{n = 0}^N n \binom{N}{n} p^n q^{N - n}
    = \sum_{n = 0}^N \binom{N}{n} \Big( p \pdv{(p^n)}{p} \Big) q^{N - n}
    \\
    &= p \pdv{}{p}\sum_{n = 0}^N \binom{N}{n} p^n q^{N - n}
    = p \pdv{}{p}(p + q)^N
    = N p (p + q)^{N - 1}
\end{aligned}$$

Inserting $q = 1 - p$ then gives the desired result.
</div>
</div>

Meanwhile, we find the following variance $\sigma^2$,
with $\sigma$ being the standard deviation:

$$\begin{aligned}
    \boxed{
        \sigma^2 = N p q
    }
\end{aligned}$$

<div class="accordion">
<input type="checkbox" id="proof-var"/>
<label for="proof-var">Proof</label>
<div class="hidden">
<label for="proof-var">Proof.</label>
We use the same trick to calculate $\overline{n^2}$
(the mean squared number of successes):

$$\begin{aligned}
    \overline{n^2}
    &= \sum_{n = 0}^N n^2 \binom{N}{n} p^n q^{N - n}
    = \sum_{n = 0}^N n \binom{N}{n} \Big( p \pdv{}{p}\Big)^2 p^n q^{N - n}
    \\
    &= \Big( p \pdv{}{p}\Big)^2 \sum_{n = 0}^N \binom{N}{n} p^n q^{N - n}
    = \Big( p \pdv{}{p}\Big)^2 (p + q)^N
    \\
    &= N p \pdv{}{p}p (p + q)^{N - 1}
    = N p \big( (p + q)^{N - 1} + (N - 1) p (p + q)^{N - 2} \big)
    \\
    &= N p + N^2 p^2 - N p^2
\end{aligned}$$

Using this and the earlier expression $\mu = N p$, we find the variance $\sigma^2$:

$$\begin{aligned}
    \sigma^2
    &= \overline{n^2} - \mu^2
    = N p + N^2 p^2 - N p^2 - N^2 p^2
    = N p (1 - p)
\end{aligned}$$

By inserting $q = 1 - p$, we arrive at the desired expression.
</div>
</div>

As $N \to \infty$, the binomial distribution
turns into the continuous normal distribution,
a fact that is sometimes called the **de Moivre-Laplace theorem**:

$$\begin{aligned}
    \boxed{
        \lim_{N \to \infty} P_N(n) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp\!\Big(\!-\!\frac{(n - \mu)^2}{2 \sigma^2} \Big)
    }
\end{aligned}$$

<div class="accordion">
<input type="checkbox" id="proof-normal"/>
<label for="proof-normal">Proof</label>
<div class="hidden">
<label for="proof-normal">Proof.</label>
We take the Taylor expansion of $\ln\!\big(P_N(n)\big)$
around the mean $\mu = Np$:

$$\begin{aligned}
    \ln\!\big(P_N(n)\big)
    &= \sum_{m = 0}^\infty \frac{(n - \mu)^m}{m!} D_m(\mu)
    \quad \mathrm{where} \quad
    D_m(n) = \dvn{m}{\ln\!\big(P_N(n)\big)}{n}
\end{aligned}$$

We use Stirling's approximation to calculate the factorials in $D_m$:

$$\begin{aligned}
    \ln\!\big(P_N(n)\big)
    &= \ln(N!) - \ln(n!) - \ln\!\big((N - n)!\big) + n \ln(p) + (N - n) \ln(q)
    \\
    &\approx \ln(N!) - n \big( \ln(n)\!-\!\ln(p)\!-\!1 \big) - (N\!-\!n) \big( \ln(N\!-\!n)\!-\!\ln(q)\!-\!1 \big)
\end{aligned}$$

For $D_0(\mu)$, we need to use a stronger version of Stirling's approximation
to get a non-zero result. We take advantage of $N - N p = N q$:

$$\begin{aligned}
    D_0(\mu)
    &= \ln(N!) - \ln\!\big((N p)!\big) - \ln\!\big((N q)!\big) + N p \ln(p) + N q \ln(q)
    \\
    &= \Big( N \ln(N) - N + \frac{1}{2} \ln(2\pi N) \Big)
    - \Big( N p \ln(N p) - N p + \frac{1}{2} \ln(2\pi N p) \Big) \\
    &\qquad - \Big( N q \ln(N q) - N q + \frac{1}{2} \ln(2\pi N q) \Big)
    + N p \ln(p) + N q \ln(q)
    \\
    &= N \ln(N) - N (p + q) \ln(N) + N (p + q) - N - \frac{1}{2} \ln(2\pi N p q)
    \\
    &= - \frac{1}{2} \ln(2\pi N p q)
    = \ln\!\Big( \frac{1}{\sqrt{2\pi \sigma^2}} \Big)
\end{aligned}$$

Next, we expect that $D_1(\mu) = 0$, because $\mu$ is the maximum.
This is indeed the case:

$$\begin{aligned}
    D_1(n)
    &= - \big( \ln(n)\!-\!\ln(p)\!-\!1 \big) + \big( \ln(N\!-\!n)\!-\!\ln(q)\!-\!1 \big) - 1 + 1
    \\
    &= - \ln(n) + \ln(N - n) + \ln(p) - \ln(q)
    \\
    D_1(\mu)
    &= \ln(N q) - \ln(N p) + \ln(p) - \ln(q)
    = \ln(N p q) - \ln(N p q)
    = 0
\end{aligned}$$

For the same reason, we expect that $D_2(\mu)$ is negative.
We find the following expression:

$$\begin{aligned}
    D_2(n)
    &= - \frac{1}{n} - \frac{1}{N - n}
    \qquad
    D_2(\mu)
    = - \frac{1}{Np} - \frac{1}{Nq}
    = - \frac{p + q}{N p q}
    = - \frac{1}{\sigma^2}
\end{aligned}$$

The higher-order derivatives tend to zero for $N \to \infty$, so we discard them:

$$\begin{aligned}
    D_3(n)
    = \frac{1}{n^2} - \frac{1}{(N - n)^2}
    \qquad
    D_4(n)
    = - \frac{2}{n^3} - \frac{2}{(N - n)^3}
    \qquad
    \cdots
\end{aligned}$$

Putting everything together, for large $N$,
the Taylor series approximately becomes:

$$\begin{aligned}
    \ln\!\big(P_N(n)\big)
    \approx D_0(\mu) + \frac{(n - \mu)^2}{2} D_2(\mu)
    = \ln\!\Big( \frac{1}{\sqrt{2\pi \sigma^2}} \Big) - \frac{(n - \mu)^2}{2 \sigma^2}
\end{aligned}$$

Taking $\exp$ of this expression then yields a normalized Gaussian distribution.
</div>
</div>


## References
1.  H. Gould, J. Tobochnik,
    *Statistical and thermal physics*, 2nd edition,
    Princeton.