Categories: Cryptography.

# Von Neumann extractor

The **Von Neumann extractor** is a simple example of a **randomness extractor**:
given a stream of “imperfectly random” bits,
it extracts the entropy, and outputs a “perfectly random” stream.

As input, the Von Neumann extractor expects a stream of independent (uncorrelated) bits, i.e. the result of a Bernoulli process, where each bit is $0$ with probability $p$, and $1$ with probability $1 \!-\! p$. Crucially, $p$ does not need to be $1/2$; there may be a bias.

The extractor will output a uniformly random stream with $p = 1/2$. Given input bits $a_1, a_2, ...$, it achieves this by looking at the bits in pairs $(a_1, a_2)$, $(a_3, a_4)$, etc. Then:

- If $a_n = a_{n+1}$, it discards both bits.
- If $a_n \neq a_{n+1}$, it keeps the first bit $a_n$, and discards $a_{n+1}$.

Evidently, the first case $a_n = a_{n+1}$ occurs with the following probabilities:

$\begin{aligned} P(0, 0) = p^2 \qquad \qquad P(1, 1) = (1 - p)^2 \end{aligned}$Meanwhile, the second case $a_n \neq a_{n+1}$ occurs with probabilities given by:

$\begin{aligned} P(0, 1) = p (p - 1) \qquad \qquad P(1, 0) = (p - 1) p \end{aligned}$Crucially, they are equal; $P(0, 1) = P(1, 0)$. Therefore, if the extractor encounters an input pair satisfying $a_n \neq a_{n+1}$, the first bit $a_n$ is $0$ or $1$ with a 50-50 probability, regardless of $p$. Since the extractor only keeps those bits, its output is guaranteed to be “perfectly random”.

Clearly, because it discards many of the bits, the output stream will have a length $N_\mathrm{out} < N_\mathrm{in}$. The exact value of $N_\mathrm{out}$ is as follows, where $P(0, 1) + P(1, 0)$ is the probability that we keep a bit, and the factor $1/2$ is due to us discarding half of the pair even in that case:

$\begin{aligned} N_\mathrm{out} = \frac{1}{2} N_\mathrm{in} \Big( P(0, 1) + P(1, 0) \Big) = \frac{1}{2} N_\mathrm{in} \Big( 2 p (p - 1) \Big) = N_\mathrm{in} p (p - 1) \end{aligned}$The key assumption that allows the Von Neumann extractor to work is that there is no correlation at all between the bits. In practice, this may be difficult to achieve, in which case a more complex randomness extraction scheme is needed.

## References

- J.B. Brask,
*Quantum information: lecture notes*, 2021, unpublished.