| author | Prefetch | 2023-02-20 18:08:31 +0100 |
|---|---|---|
| committer | Prefetch | 2023-02-20 18:08:31 +0100 |
| commit | 75636ed8772512bdf38e3dec431888837eaddc5d (patch) | |
| tree | 4beef6d770f1745b3667b3f9816987a29a79d42a | |
| parent | 7f65c526132ee98d59d1a2b53d08c4b49330af03 (diff) | |
Improve knowledge base
| -rw-r--r-- | source/know/concept/bells-theorem/index.md | 266 |
| -rw-r--r-- | source/know/concept/chsh-inequality/index.md | 273 |
| -rw-r--r-- | source/know/concept/lagrange-multiplier/index.md | 2 |
| -rw-r--r-- | source/know/concept/pulay-mixing/index.md | 121 |

4 files changed, 365 insertions(+), 297 deletions(-)
diff --git a/source/know/concept/bells-theorem/index.md b/source/know/concept/bells-theorem/index.md index a01bf9e..1589a7a 100644 --- a/source/know/concept/bells-theorem/index.md +++ b/source/know/concept/bells-theorem/index.md @@ -17,13 +17,13 @@ Suppose that we have two spin-1/2 particles, called $$A$$ and $$B$$, in an entangled [Bell state](/know/concept/bell-state/): $$\begin{aligned} - \Ket{\Psi^{-}} - = \frac{1}{\sqrt{2}} \Big( \Ket{\uparrow \downarrow} - \Ket{\downarrow \uparrow} \Big) + \ket{\Psi^{-}} + = \frac{1}{\sqrt{2}} \Big( \ket{\uparrow \downarrow} - \ket{\downarrow \uparrow} \Big) \end{aligned}$$ Since they are entangled, -if we measure the $$z$$-spin of particle $$A$$, and find e.g. $$\Ket{\uparrow}$$, -then particle $$B$$ immediately takes the opposite state $$\Ket{\downarrow}$$. +if we measure the $$z$$-spin of particle $$A$$, and find e.g. $$\ket{\uparrow}$$, +then particle $$B$$ immediately takes the opposite state $$\ket{\downarrow}$$. The point is that this collapse is instant, regardless of the distance between $$A$$ and $$B$$. @@ -69,21 +69,29 @@ $$\begin{aligned} \end{aligned}$$ The product of the outcomes of $$A$$ and $$B$$ then has the following expectation value. -Note that we only multiply $$A$$ and $$B$$ for shared $$\lambda$$-values: -this is what makes it a **local** hidden variable: +Note that we multiply $$A$$ and $$B$$ at the same $$\lambda$$-value, +hence it is a *local* hidden variable: $$\begin{aligned} - \Expval{A_a B_b} - = \int \rho(\lambda) \: A(\vec{a}, \lambda) \: B(\vec{b}, \lambda) \dd{\lambda} + \expval{A_a B_b} + \equiv \int \rho(\lambda) \: A(\vec{a}, \lambda) \: B(\vec{b}, \lambda) \dd{\lambda} \end{aligned}$$ -From this, two inequalities can be derived, -which both prove Bell's theorem. +From this, we can make several predictions about LHV theories, +which turn out to disagree with various theoretical +and experimental results in quantum mechanics. +The two most famous LHV predictions are +the **Bell inequality** and +the [CHSH inequality](/know/concept/chsh-inequality/). + ## Bell inequality -If $$\vec{a} = \vec{b}$$, then we know that $$A$$ and $$B$$ always have opposite spins: +We present Bell's original proof of his theorem. 
+If $$\vec{a} = \vec{b}$$, then we know that +measuring $$A$$ and $$B$$ gives them opposite spins, +because they start in the entangled state $$\ket{\Psi^{-}}$$: $$\begin{aligned} A(\vec{a}, \lambda) @@ -94,7 +102,7 @@ $$\begin{aligned} The expectation value of the product can therefore be rewritten as follows: $$\begin{aligned} - \Expval{A_a B_b} + \expval{A_a B_b} = - \int \rho(\lambda) \: A(\vec{a}, \lambda) \: A(\vec{b}, \lambda) \dd{\lambda} \end{aligned}$$ @@ -102,7 +110,7 @@ Next, we introduce an arbitrary third direction $$\vec{c}$$, and use the fact that $$( A(\vec{b}, \lambda) )^2 = 1$$: $$\begin{aligned} - \Expval{A_a B_b} - \Expval{A_a B_c} + \expval{A_a B_b} - \expval{A_a B_c} &= - \int \rho(\lambda) \Big( A(\vec{a}, \lambda) \: A(\vec{b}, \lambda) - A(\vec{a}, \lambda) \: A(\vec{c}, \lambda) \Big) \dd{\lambda} \\ &= - \int \rho(\lambda) \Big( 1 - A(\vec{b}, \lambda) \: A(\vec{c}, \lambda) \Big) A(\vec{a}, \lambda) \: A(\vec{b}, \lambda) \dd{\lambda} @@ -114,7 +122,7 @@ Taking the absolute value of the whole left, and of the integrand on the right, we thus get: $$\begin{aligned} - \Big| \Expval{A_a B_b} - \Expval{A_a B_c} \Big| + \Big| \expval{A_a B_b} - \expval{A_a B_c} \Big| &\le \int \rho(\lambda) \Big( 1 - A(\vec{b}, \lambda) \: A(\vec{c}, \lambda) \Big) \: \Big| A(\vec{a}, \lambda) \: A(\vec{b}, \lambda) \Big| \dd{\lambda} \\ @@ -122,24 +130,24 @@ $$\begin{aligned} \end{aligned}$$ Since $$\rho(\lambda)$$ is a normalized probability density function, -we arrive at the **Bell inequality**: +we arrive at the Bell inequality: $$\begin{aligned} \boxed{ - \Big| \Expval{A_a B_b} - \Expval{A_a B_c} \Big| - \le 1 + \Expval{A_b B_c} + \Big| \expval{A_a B_b} - \expval{A_a B_c} \Big| + \le 1 + \expval{A_b B_c} } \end{aligned}$$ Any theory involving an LHV $$\lambda$$ must obey this inequality. -The problem, however, is that quantum mechanics dictates the expectation values -for the state $$\Ket{\Psi^{-}}$$: +The problem, however, is that quantum mechanics dictates +the expectation values for the state $$\ket{\Psi^{-}}$$: $$\begin{aligned} - \Expval{A_a B_b} = - \vec{a} \cdot \vec{b} + \expval{A_a B_b} = - \vec{a} \cdot \vec{b} \end{aligned}$$ -Finding directions which violate the Bell inequality is easy: +Finding directions that violate the Bell inequality is easy: for example, if $$\vec{a}$$ and $$\vec{b}$$ are orthogonal, and $$\vec{c}$$ is at a $$\pi/4$$ angle to both of them, then the left becomes $$0.707$$ and the right $$0.293$$, @@ -147,222 +155,6 @@ which clearly disagrees with the inequality, meaning that LHVs are impossible. -## CHSH inequality - -The **Clauser-Horne-Shimony-Holt** or simply **CHSH inequality** -takes a slightly different approach, and is more useful in practice. - -Consider four spin directions, two for $$A$$ called $$\vec{a}_1$$ and $$\vec{a}_2$$, -and two for $$B$$ called $$\vec{b}_1$$ and $$\vec{b}_2$$. 
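For a concrete feel of this violation, the following small numerical check (an illustrative addition; the direction vectors and helper names are arbitrary choices) computes the correlation $$\expval{A_a B_b}$$ directly from the singlet state $$\ket{\Psi^{-}}$$ and the Pauli matrices, and confirms that the orthogonal-plus-diagonal choice of directions above breaks the Bell inequality:

```python
import numpy as np

# Pauli matrices and the singlet state |Psi-> = (|ud> - |du>) / sqrt(2)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

def correlation(a, b):
    """Quantum expectation <Psi-| (a.sigma) x (b.sigma) |Psi->, which equals -a.b"""
    A = a[0] * sx + a[1] * sy + a[2] * sz
    B = b[0] * sx + b[1] * sy + b[2] * sz
    return np.real(psi.conj() @ np.kron(A, B) @ psi)

# a and b orthogonal, c at pi/4 to both (all in the x-z plane)
a = np.array([0.0, 0.0, 1.0])
b = np.array([1.0, 0.0, 0.0])
c = np.array([1.0, 0.0, 1.0]) / np.sqrt(2)

lhs = abs(correlation(a, b) - correlation(a, c))  # ~0.707
rhs = 1 + correlation(b, c)                       # ~0.293
print(lhs, rhs, lhs <= rhs)                       # the inequality fails: False
```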
-Let us introduce the following abbreviations: - -$$\begin{aligned} - A_1 &= A(\vec{a}_1, \lambda) - \qquad \quad - A_2 = A(\vec{a}_2, \lambda) - \\ - B_1 &= B(\vec{b}_1, \lambda) - \qquad \quad - B_2 = B(\vec{b}_2, \lambda) -\end{aligned}$$ - -From the definition of the expectation value, -we know that the difference is given by: - -$$\begin{aligned} - \Expval{A_1 B_1} - \Expval{A_1 B_2} - = \int \rho(\lambda) \Big( A_1 B_1 - A_1 B_2 \Big) \dd{\lambda} -\end{aligned}$$ - -We introduce some new terms and rearrange the resulting expression: - -$$\begin{aligned} - \Expval{A_1 B_1} - \Expval{A_1 B_2} - &= \int \rho(\lambda) \Big( A_1 B_1 - A_1 B_2 \pm A_1 B_1 A_2 B_2 \mp A_1 B_1 A_2 B_2 \Big) \dd{\lambda} - \\ - &= \int \rho(\lambda) A_1 B_1 \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda} - - \!\int \rho(\lambda) A_1 B_2 \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda} -\end{aligned}$$ - -Taking the absolute value of both sides -and invoking the triangle inequality then yields: - -$$\begin{aligned} - \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big| - &= \bigg|\! \int \rho(\lambda) A_1 B_1 \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda} - - \!\int \rho(\lambda) A_1 B_2 \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda} \!\bigg| - \\ - &\le \bigg|\! \int \rho(\lambda) A_1 B_1 \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda} \!\bigg| - + \bigg|\! \int \rho(\lambda) A_1 B_2 \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda} \!\bigg| -\end{aligned}$$ - -Using the fact that the product of $$A$$ and $$B$$ is always either $$-1$$ or $$+1$$, -we can reduce this to: - -$$\begin{aligned} - \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big| - &\le \int \rho(\lambda) \Big| A_1 B_1 \Big| \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda} - + \!\int \rho(\lambda) \Big| A_1 B_2 \Big| \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda} - \\ - &\le \int \rho(\lambda) \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda} - + \!\int \rho(\lambda) \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda} -\end{aligned}$$ - -Evaluating these integrals gives us the following inequality, -which holds for both choices of $$\pm$$: - -$$\begin{aligned} - \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big| - &\le 2 \pm \Expval{A_2 B_2} \pm \Expval{A_2 B_1} -\end{aligned}$$ - -We should choose the signs such that the right-hand side is as small as possible, that is: - -$$\begin{aligned} - \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big| - &\le 2 \pm \Big( \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big) - \\ - &\le 2 - \Big| \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big| -\end{aligned}$$ - -Rearranging this and once again using the triangle inequality, -we get the CHSH inequality: - -$$\begin{aligned} - 2 - &\ge \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big| + \Big| \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big| - \\ - &\ge \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} + \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big| -\end{aligned}$$ - -The quantity on the right-hand side is sometimes called the **CHSH quantity** $$S$$, -and measures the correlation between the spins of $$A$$ and $$B$$: - -$$\begin{aligned} - \boxed{ - S \equiv \Expval{A_2 B_1} + \Expval{A_2 B_2} + \Expval{A_1 B_1} - \Expval{A_1 B_2} - } -\end{aligned}$$ - -The CHSH inequality places an upper bound on the magnitude of $$S$$ -for LHV-based theories: - -$$\begin{aligned} - \boxed{ - |S| \le 2 - } -\end{aligned}$$ - - -## Tsirelson's bound - -Quantum physics can violate the CHSH inequality, but by how much? -Consider the following two-particle operator, -whose expectation value is the CHSH quantity, i.e. 
$$S = \expval{\hat{S}}$$: - -$$\begin{aligned} - \hat{S} - = \hat{A}_2 \otimes \hat{B}_1 + \hat{A}_2 \otimes \hat{B}_2 + \hat{A}_1 \otimes \hat{B}_1 - \hat{A}_1 \otimes \hat{B}_2 -\end{aligned}$$ - -Where $$\otimes$$ is the tensor product, -and e.g. $$\hat{A}_1$$ is the Pauli matrix for the $$\vec{a}_1$$-direction. -The square of this operator is then given by: - -$$\begin{aligned} - \hat{S}^2 - = \quad &\hat{A}_2^2 \otimes \hat{B}_1^2 + \hat{A}_2^2 \otimes \hat{B}_1 \hat{B}_2 - + \hat{A}_2 \hat{A}_1 \otimes \hat{B}_1^2 - \hat{A}_2 \hat{A}_1 \otimes \hat{B}_1 \hat{B}_2 - \\ - + &\hat{A}_2^2 \otimes \hat{B}_2 \hat{B}_1 + \hat{A}_2^2 \otimes \hat{B}_2^2 - + \hat{A}_2 \hat{A}_1 \otimes \hat{B}_2 \hat{B}_1 - \hat{A}_2 \hat{A}_1 \otimes \hat{B}_2^2 - \\ - + &\hat{A}_1 \hat{A}_2 \otimes \hat{B}_1^2 + \hat{A}_1 \hat{A}_2 \otimes \hat{B}_1 \hat{B}_2 - + \hat{A}_1^2 \otimes \hat{B}_1^2 - \hat{A}_1^2 \otimes \hat{B}_1 \hat{B}_2 - \\ - - &\hat{A}_1 \hat{A}_2 \otimes \hat{B}_2 \hat{B}_1 - \hat{A}_1 \hat{A}_2 \otimes \hat{B}_2^2 - - \hat{A}_1^2 \otimes \hat{B}_2 \hat{B}_1 + \hat{A}_1^2 \otimes \hat{B}_2^2 - \\ - = \quad &\hat{A}_2^2 \otimes \hat{B}_1^2 + \hat{A}_2^2 \otimes \hat{B}_2^2 + \hat{A}_1^2 \otimes \hat{B}_1^2 + \hat{A}_1^2 \otimes \hat{B}_2^2 - \\ - + &\hat{A}_2^2 \otimes \acomm{\hat{B}_1}{\hat{B}_2} - \hat{A}_1^2 \otimes \acomm{\hat{B}_1}{\hat{B}_2} - + \acomm{\hat{A}_1}{\hat{A}_2} \otimes \hat{B}_1^2 - \acomm{\hat{A}_1}{\hat{A}_2} \otimes \hat{B}_2^2 - \\ - + &\hat{A}_1 \hat{A}_2 \otimes \comm{\hat{B}_1}{\hat{B}_2} - \hat{A}_2 \hat{A}_1 \otimes \comm{\hat{B}_1}{\hat{B}_2} -\end{aligned}$$ - -Spin operators are unitary, so their square is the identity, -e.g. $$\hat{A}_1^2 = \hat{I}$$. Therefore $$\hat{S}^2$$ reduces to: - -$$\begin{aligned} - \hat{S}^2 - &= 4 \: (\hat{I} \otimes \hat{I}) + \comm{\hat{A}_1}{\hat{A}_2} \otimes \comm{\hat{B}_1}{\hat{B}_2} -\end{aligned}$$ - -The *norm* $$\norm{\hat{S}^2}$$ of this operator -is the largest possible expectation value $$\expval{\hat{S}^2}$$, -which is the same as its largest eigenvalue. -It is given by: - -$$\begin{aligned} - \Norm{\hat{S}^2} - &= 4 + \Norm{\comm{\hat{A}_1}{\hat{A}_2} \otimes \comm{\hat{B}_1}{\hat{B}_2}} - \\ - &\le 4 + \Norm{\comm{\hat{A}_1}{\hat{A}_2}} \Norm{\comm{\hat{B}_1}{\hat{B}_2}} -\end{aligned}$$ - -We find a bound for the norm of the commutators by using the triangle inequality, such that: - -$$\begin{aligned} - \Norm{\comm{\hat{A}_1}{\hat{A}_2}} - = \Norm{\hat{A}_1 \hat{A}_2 - \hat{A}_2 \hat{A}_1} - \le \Norm{\hat{A}_1 \hat{A}_2} + \Norm{\hat{A}_2 \hat{A}_1} - \le 2 \Norm{\hat{A}_1 \hat{A}_2} - \le 2 -\end{aligned}$$ - -And $$\norm{\comm{\hat{B}_1}{\hat{B}_2}} \le 2$$ for the same reason. 
-The norm is the largest eigenvalue, therefore: - -$$\begin{aligned} - \Norm{\hat{S}^2} - \le 4 + 2 \cdot 2 - = 8 - \quad \implies \quad - \Norm{\hat{S}} - \le \sqrt{8} - = 2 \sqrt{2} -\end{aligned}$$ - -We thus arrive at **Tsirelson's bound**, -which states that quantum mechanics can violate -the CHSH inequality by a factor of $$\sqrt{2}$$: - -$$\begin{aligned} - \boxed{ - |S| - \le 2 \sqrt{2} - } -\end{aligned}$$ - -Importantly, this is a *tight* bound, -meaning that there exist certain spin measurement directions -for which Tsirelson's bound becomes an equality, for example: - -$$\begin{aligned} - \hat{A}_1 = \hat{\sigma}_z - \qquad - \hat{A}_2 = \hat{\sigma}_x - \qquad - \hat{B}_1 = \frac{\hat{\sigma}_z + \hat{\sigma}_x}{\sqrt{2}} - \qquad - \hat{B}_2 = \frac{\hat{\sigma}_z - \hat{\sigma}_x}{\sqrt{2}} -\end{aligned}$$ - -Using the fact that $$\Expval{A_a B_b} = - \vec{a} \cdot \vec{b}$$, -it can then be shown that $$S = 2 \sqrt{2}$$ in this case. - - ## References 1. D.J. Griffiths, D.F. Schroeter, diff --git a/source/know/concept/chsh-inequality/index.md b/source/know/concept/chsh-inequality/index.md new file mode 100644 index 0000000..984bae6 --- /dev/null +++ b/source/know/concept/chsh-inequality/index.md @@ -0,0 +1,273 @@ +--- +title: "CHSH inequality" +sort_title: "CHSH inequality" +date: 2023-02-05 +categories: +- Physics +- Quantum mechanics +- Quantum information +layout: "concept" +--- + +The **Clauser-Horne-Shimony-Holt (CHSH) inequality** +is an alternative proof of [Bell's theorem](/know/concept/bells-theorem/), +which takes a slightly different approach +and is more useful in practice. + +Suppose there is a local hidden variable (LHV) $$\lambda$$ +with an unknown probability density $$\rho$$: + +$$\begin{aligned} + \int \rho(\lambda) \dd{\lambda} = 1 + \qquad \quad + \rho(\lambda) \ge 0 +\end{aligned}$$ + +Given two spin-1/2 particles $$A$$ and $$B$$, +measuring their spins along arbitrary directions $$\vec{a}$$ and $$\vec{b}$$ +would give each an eigenvalue $$\pm 1$$. We write this as: + +$$\begin{aligned} + A(\vec{a}, \lambda) = \pm 1 + \qquad \quad + B(\vec{b}, \lambda) = \pm 1 +\end{aligned}$$ + +If $$A$$ and $$B$$ start in an entangled [Bell state](/know/concept/bell-state/), +e.g. $$\ket{\Psi^{-}}$$, then we expect a correlation between their measurements results. +The product of the outcomes of $$A$$ and $$B$$ is: + +$$\begin{aligned} + \Expval{A_a B_b} + \equiv \int \rho(\lambda) \: A(\vec{a}, \lambda) \: B(\vec{b}, \lambda) \dd{\lambda} +\end{aligned}$$ + +So far, we have taken the same path as for proving Bell's inequality, +but for the CHSH inequality we must now diverge. + + + +## Deriving the inequality + +Consider four spin directions, two for $$A$$ called $$\vec{a}_1$$ and $$\vec{a}_2$$, +and two for $$B$$ called $$\vec{b}_1$$ and $$\vec{b}_2$$. 
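Before going through the algebra, it can help to see what an LHV model is allowed to do. The sketch below is an illustrative aside (the particular strategy and all names are our own arbitrary choices): it draws $$\lambda$$ as a uniformly random unit vector, lets each particle answer deterministically via $$A = \mathrm{sign}(\vec{a} \cdot \lambda)$$ and $$B = -\mathrm{sign}(\vec{b} \cdot \lambda)$$, and estimates the combination of correlators derived below; no choice of the four directions pushes it past $$2$$:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unit_vectors(n):
    """Sample n uniformly distributed unit vectors."""
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def chsh_quantity(a1, a2, b1, b2, samples=20_000):
    """Estimate S for the deterministic LHV strategy A = sign(a.lam), B = -sign(b.lam)."""
    lam = random_unit_vectors(samples)
    def E(a, b):
        return np.mean(np.sign(lam @ a) * -np.sign(lam @ b))
    return E(a2, b1) + E(a2, b2) + E(a1, b1) - E(a1, b2)

# try many random choices of the four measurement directions:
# being an LHV model, it never exceeds |S| = 2 (up to sampling noise)
best = max(abs(chsh_quantity(*random_unit_vectors(4))) for _ in range(200))
print(best)
```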
+Let us introduce the following abbreviations: + +$$\begin{aligned} + A_1 \equiv A(\vec{a}_1, \lambda) + \qquad \quad + A_2 \equiv A(\vec{a}_2, \lambda) + \qquad \quad + B_1 \equiv B(\vec{b}_1, \lambda) + \qquad \quad + B_2 \equiv B(\vec{b}_2, \lambda) +\end{aligned}$$ + +From the definition of the expectation value, +we know that the difference is given by: + +$$\begin{aligned} + \Expval{A_1 B_1} - \Expval{A_1 B_2} + = \int \rho(\lambda) \Big( A_1 B_1 - A_1 B_2 \Big) \dd{\lambda} +\end{aligned}$$ + +We introduce some new terms and rearrange the resulting expression: + +$$\begin{aligned} + \Expval{A_1 B_1} - \Expval{A_1 B_2} + &= \int \rho(\lambda) \Big( A_1 B_1 - A_1 B_2 \pm A_1 B_1 A_2 B_2 \mp A_1 B_1 A_2 B_2 \Big) \dd{\lambda} + \\ + &= \int \rho(\lambda) A_1 B_1 \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda} + - \!\int \rho(\lambda) A_1 B_2 \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda} +\end{aligned}$$ + +Taking the absolute value of both sides +and invoking the triangle inequality then yields: + +$$\begin{aligned} + \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big| + &= \bigg|\! \int \rho(\lambda) A_1 B_1 \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda} + - \!\int \rho(\lambda) A_1 B_2 \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda} \!\bigg| + \\ + &\le \bigg|\! \int \rho(\lambda) A_1 B_1 \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda} \!\bigg| + + \bigg|\! \int \rho(\lambda) A_1 B_2 \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda} \!\bigg| +\end{aligned}$$ + +Using the fact that the product of the spin eigenvalues of $$A$$ and $$B$$ +is always either $$-1$$ or $$+1$$ for all directions, +we can reduce this to: + +$$\begin{aligned} + \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big| + &\le \int \rho(\lambda) \Big| A_1 B_1 \Big| \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda} + + \!\int \rho(\lambda) \Big| A_1 B_2 \Big| \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda} + \\ + &\le \int \rho(\lambda) \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda} + + \!\int \rho(\lambda) \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda} +\end{aligned}$$ + +Evaluating these integrals gives us the following inequality, +which holds for both choices of $$\pm$$: + +$$\begin{aligned} + \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big| + &\le 2 \pm \Expval{A_2 B_2} \pm \Expval{A_2 B_1} +\end{aligned}$$ + +We should choose the signs such that the right-hand side is as small as possible, that is: + +$$\begin{aligned} + \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big| + &\le 2 \pm \Big( \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big) + \\ + &\le 2 - \Big| \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big| +\end{aligned}$$ + +Rearranging this and once again using the triangle inequality, +we get the CHSH inequality: + +$$\begin{aligned} + 2 + &\ge \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big| + \Big| \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big| + \\ + &\ge \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} + \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big| +\end{aligned}$$ + +The quantity on the right-hand side is sometimes called the **CHSH quantity** $$S$$, +and measures the correlation between the spins of $$A$$ and $$B$$: + +$$\begin{aligned} + \boxed{ + S \equiv \Expval{A_2 B_1} + \Expval{A_2 B_2} + \Expval{A_1 B_1} - \Expval{A_1 B_2} + } +\end{aligned}$$ + +The CHSH inequality places an upper bound on the magnitude of $$S$$ +for LHV-based theories: + +$$\begin{aligned} + \boxed{ + |S| \le 2 + } +\end{aligned}$$ + + + +## Tsirelson's bound + +Quantum physics can violate the CHSH inequality, but by how much? +Consider the following two-particle operator, +whose expectation value is the CHSH quantity, i.e. 
$$S = \expval{\hat{S}}$$: + +$$\begin{aligned} + \hat{S} + = \hat{A}_2 \otimes \hat{B}_1 + \hat{A}_2 \otimes \hat{B}_2 + \hat{A}_1 \otimes \hat{B}_1 - \hat{A}_1 \otimes \hat{B}_2 +\end{aligned}$$ + +Where $$\otimes$$ is the tensor product, +and e.g. $$\hat{A}_1$$ is the Pauli matrix for the $$\vec{a}_1$$-direction. +The square of this operator is then given by: + +$$\begin{aligned} + \hat{S}^2 + = \quad &\hat{A}_2^2 \otimes \hat{B}_1^2 + \hat{A}_2^2 \otimes \hat{B}_1 \hat{B}_2 + + \hat{A}_2 \hat{A}_1 \otimes \hat{B}_1^2 - \hat{A}_2 \hat{A}_1 \otimes \hat{B}_1 \hat{B}_2 + \\ + + &\hat{A}_2^2 \otimes \hat{B}_2 \hat{B}_1 + \hat{A}_2^2 \otimes \hat{B}_2^2 + + \hat{A}_2 \hat{A}_1 \otimes \hat{B}_2 \hat{B}_1 - \hat{A}_2 \hat{A}_1 \otimes \hat{B}_2^2 + \\ + + &\hat{A}_1 \hat{A}_2 \otimes \hat{B}_1^2 + \hat{A}_1 \hat{A}_2 \otimes \hat{B}_1 \hat{B}_2 + + \hat{A}_1^2 \otimes \hat{B}_1^2 - \hat{A}_1^2 \otimes \hat{B}_1 \hat{B}_2 + \\ + - &\hat{A}_1 \hat{A}_2 \otimes \hat{B}_2 \hat{B}_1 - \hat{A}_1 \hat{A}_2 \otimes \hat{B}_2^2 + - \hat{A}_1^2 \otimes \hat{B}_2 \hat{B}_1 + \hat{A}_1^2 \otimes \hat{B}_2^2 + \\ + = \quad &\hat{A}_2^2 \otimes \hat{B}_1^2 + \hat{A}_2^2 \otimes \hat{B}_2^2 + \hat{A}_1^2 \otimes \hat{B}_1^2 + \hat{A}_1^2 \otimes \hat{B}_2^2 + \\ + + &\hat{A}_2^2 \otimes \acomm{\hat{B}_1}{\hat{B}_2} - \hat{A}_1^2 \otimes \acomm{\hat{B}_1}{\hat{B}_2} + + \acomm{\hat{A}_1}{\hat{A}_2} \otimes \hat{B}_1^2 - \acomm{\hat{A}_1}{\hat{A}_2} \otimes \hat{B}_2^2 + \\ + + &\hat{A}_1 \hat{A}_2 \otimes \comm{\hat{B}_1}{\hat{B}_2} - \hat{A}_2 \hat{A}_1 \otimes \comm{\hat{B}_1}{\hat{B}_2} +\end{aligned}$$ + +Spin operators are unitary, so their square is the identity, +e.g. $$\hat{A}_1^2 = \hat{I}$$. Therefore $$\hat{S}^2$$ reduces to: + +$$\begin{aligned} + \hat{S}^2 + &= 4 \: (\hat{I} \otimes \hat{I}) + \comm{\hat{A}_1}{\hat{A}_2} \otimes \comm{\hat{B}_1}{\hat{B}_2} +\end{aligned}$$ + +The *norm* $$\norm{\hat{S}^2}$$ of this operator +is the largest possible expectation value $$\expval{\hat{S}^2}$$, +which is the same as its largest eigenvalue. +It is given by: + +$$\begin{aligned} + \Norm{\hat{S}^2} + &= 4 + \Norm{\comm{\hat{A}_1}{\hat{A}_2} \otimes \comm{\hat{B}_1}{\hat{B}_2}} + \\ + &\le 4 + \Norm{\comm{\hat{A}_1}{\hat{A}_2}} \Norm{\comm{\hat{B}_1}{\hat{B}_2}} +\end{aligned}$$ + +We find a bound for the norm of the commutators by using the triangle inequality, such that: + +$$\begin{aligned} + \Norm{\comm{\hat{A}_1}{\hat{A}_2}} + = \Norm{\hat{A}_1 \hat{A}_2 - \hat{A}_2 \hat{A}_1} + \le \Norm{\hat{A}_1 \hat{A}_2} + \Norm{\hat{A}_2 \hat{A}_1} + \le 2 \Norm{\hat{A}_1 \hat{A}_2} + \le 2 +\end{aligned}$$ + +And $$\norm{\comm{\hat{B}_1}{\hat{B}_2}} \le 2$$ for the same reason. 
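As a quick cross-check of where this is heading (an illustrative addition; the specific direction choice is ours), one can take the quantum prediction $$\expval{A_a B_b} = -\vec{a} \cdot \vec{b}$$ for the singlet state, pick $$\vec{a}_1 \perp \vec{a}_2$$, and put $$\vec{b}_1$$ along $$\vec{a}_1 + \vec{a}_2$$ and $$\vec{b}_2$$ along $$\vec{a}_2 - \vec{a}_1$$; the CHSH quantity then lands exactly on $$2\sqrt{2}$$:

```python
import numpy as np

def E(a, b):
    """Singlet-state correlation: <A_a B_b> = -a.b"""
    return -np.dot(a, b)

# one choice of coplanar directions that saturates Tsirelson's bound
a1 = np.array([0.0, 0.0, 1.0])   # z
a2 = np.array([1.0, 0.0, 0.0])   # x
b1 = (a1 + a2) / np.sqrt(2)
b2 = (a2 - a1) / np.sqrt(2)

S = E(a2, b1) + E(a2, b2) + E(a1, b1) - E(a1, b2)
print(abs(S), 2 * np.sqrt(2))    # both print 2.8284271...
```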
+The norm is the largest eigenvalue, therefore: + +$$\begin{aligned} + \Norm{\hat{S}^2} + \le 4 + 2 \cdot 2 + = 8 + \quad \implies \quad + \Norm{\hat{S}} + \le \sqrt{8} + = 2 \sqrt{2} +\end{aligned}$$ + +We thus arrive at **Tsirelson's bound**, +which states that quantum mechanics can violate +the CHSH inequality by a factor of $$\sqrt{2}$$: + +$$\begin{aligned} + \boxed{ + |S| + \le 2 \sqrt{2} + } +\end{aligned}$$ + +Importantly, this is a *tight* bound, +meaning that there exist certain spin measurement directions +for which Tsirelson's bound becomes an equality, for example: + +$$\begin{aligned} + \hat{A}_1 = \hat{\sigma}_z + \qquad + \hat{A}_2 = \hat{\sigma}_x + \qquad + \hat{B}_1 = \frac{\hat{\sigma}_z + \hat{\sigma}_x}{\sqrt{2}} + \qquad + \hat{B}_2 = \frac{\hat{\sigma}_z - \hat{\sigma}_x}{\sqrt{2}} +\end{aligned}$$ + +Fundamental quantum mechanics says that +$$\Expval{A_a B_b} = - \vec{a} \cdot \vec{b}$$, +so $$S = 2 \sqrt{2}$$ in this case. + + + +## References +1. D.J. Griffiths, D.F. Schroeter, + *Introduction to quantum mechanics*, 3rd edition, + Cambridge. +2. J.B. Brask, + *Quantum information: lecture notes*, + 2021, unpublished. diff --git a/source/know/concept/lagrange-multiplier/index.md b/source/know/concept/lagrange-multiplier/index.md index ce5418f..9fb61a8 100644 --- a/source/know/concept/lagrange-multiplier/index.md +++ b/source/know/concept/lagrange-multiplier/index.md @@ -102,7 +102,7 @@ by demanding it is stationary: $$\begin{aligned} 0 - = \nabla \mathcal{L} + = \nabla' \mathcal{L} &= \bigg( \pdv{\mathcal{L}}{x}, \pdv{\mathcal{L}}{y}, \pdv{\mathcal{L}}{\lambda} \bigg) \\ &= \bigg( \pdv{f}{x} + \lambda \pdv{g}{x}, \:\:\: \pdv{f}{y} + \lambda \pdv{g}{y}, \:\:\: g \bigg) diff --git a/source/know/concept/pulay-mixing/index.md b/source/know/concept/pulay-mixing/index.md index 6e809dd..81051f1 100644 --- a/source/know/concept/pulay-mixing/index.md +++ b/source/know/concept/pulay-mixing/index.md @@ -8,68 +8,70 @@ layout: "concept" --- Some numerical problems are most easily solved *iteratively*, -by generating a series $$\rho_1$$, $$\rho_2$$, etc. -converging towards the desired solution $$\rho_*$$. +by generating a series of "solutions" $$f_1$$, $$f_2$$, etc. +converging towards the true $$f_\infty$$. **Pulay mixing**, also often called **direct inversion in the iterative subspace** (DIIS), can speed up the convergence for some types of problems, and also helps to avoid periodic divergences. The key concept it relies on is the **residual vector** $$R_n$$ -of the $$n$$th iteration, which in some way measures the error of the current $$\rho_n$$. -Its exact definition varies, -but is generally along the lines of the difference between -the input of the iteration and the raw resulting output: +of the $$n$$th iteration, which measures the error of the current $$f_n$$. +Its exact definition can vary, +but it is generally the difference between +the input $$f_n$$ of the $$n$$th iteration +and the raw resulting output $$f_n^\mathrm{new}$$: $$\begin{aligned} R_n - = R[\rho_n] - = \rho_n^\mathrm{new}[\rho_n] - \rho_n + \equiv R[f_n] + \equiv f_n^\mathrm{new}[f_n] - f_n \end{aligned}$$ -It is not always clear what to do with $$\rho_n^\mathrm{new}$$. -Directly using it as the next input ($$\rho_{n+1} = \rho_n^\mathrm{new}$$) +It is not always clear what to do with $$f_n^\mathrm{new}$$. 
+Directly using it as the next input ($$f_{n+1} = f_n^\mathrm{new}$$) often leads to oscillation, -and linear mixing ($$\rho_{n+1} = (1\!-\!f) \rho_n + f \rho_n^\mathrm{new}$$) +and linear mixing ($$f_{n+1} = (1\!-\!c) f_n + c f_n^\mathrm{new}$$) can take a very long time to converge properly. Pulay mixing offers an improvement. -The idea is to construct the next iteration's input $$\rho_{n+1}$$ -as a linear combination of the previous inputs $$\rho_1$$, $$\rho_2$$, ..., $$\rho_n$$, -such that it is as close as possible to the optimal $$\rho_*$$: +The idea is to construct the next iteration's input $$f_{n+1}$$ +as a linear combination of the previous inputs $$f_1$$, $$f_2$$, ..., $$f_n$$, +such that it is as close as possible to the optimal $$f_\infty$$: $$\begin{aligned} \boxed{ - \rho_{n+1} - = \sum_{m = 1}^n \alpha_m \rho_m + f_{n+1} + = \sum_{m = 1}^n \alpha_m f_m } \end{aligned}$$ To do so, we make two assumptions. -Firstly, the current $$\rho_n$$ is already close to $$\rho_*$$, -so that such a linear combination makes sense. -Secondly, the iteration is linear, -such that the raw output $$\rho_{n+1}^\mathrm{new}$$ -is also a linear combination with the *same coefficients*: +First, that the current $$f_n$$ is already close to $$f_\infty$$, +so such a linear combination makes sense. +Second, that the iteration is linear, +so the raw output $$f_{n+1}^\mathrm{new}$$ +is also a linear combination *with the same coefficients*: $$\begin{aligned} - \rho_{n+1}^\mathrm{new} - = \sum_{m = 1}^n \alpha_m \rho_m^\mathrm{new} + f_{n+1}^\mathrm{new} + = \sum_{m = 1}^n \alpha_m f_m^\mathrm{new} \end{aligned}$$ -We will return to these assumptions later. -The point is that $$R_{n+1}$$ is also a linear combination: +We will revisit these assumptions later. +The point is that $$R_{n+1}$$ can now also be written +as a linear combination of old residuals $$R_m$$: $$\begin{aligned} R_{n+1} - = \rho_{n+1}^\mathrm{new} - \rho_{n+1} - = \sum_{m = 1}^n \alpha_m \rho_m^\mathrm{new} - \sum_{m = 1}^n \alpha_m \rho_m + = f_{n+1}^\mathrm{new} - f_{n+1} + = \sum_{m = 1}^n \alpha_m f_m^\mathrm{new} - \sum_{m = 1}^n \alpha_m f_m = \sum_{m = 1}^n \alpha_m R_m \end{aligned}$$ The goal is to choose the coefficients $$\alpha_m$$ such that the norm of the error $$|R_{n+1}| \approx 0$$, -subject to the following constraint to preserve the normalization of $$\rho_{n+1}$$: +subject to the following constraint to preserve the normalization of $$f_{n+1}$$: $$\begin{aligned} \sum_{m=1}^n \alpha_m = 1 @@ -79,20 +81,19 @@ We thus want to minimize the following quantity, where $$\lambda$$ is a [Lagrange multiplier](/know/concept/lagrange-multiplier/): $$\begin{aligned} - \Inprod{R_{n+1}}{R_{n+1}} + \lambda \sum_{m = 1}^n \alpha_m^* - = \sum_{m=1}^n \alpha_m^* \Big( \sum_{k=1}^n \alpha_k \Inprod{R_m}{R_k} + \lambda \Big) + \inprod{R_{n+1}}{R_{n+1}} + \lambda \sum_{m = 1}^n \alpha_m^* + = \sum_{m=1}^n \alpha_m^* \Big( \sum_{k=1}^n \alpha_k \inprod{R_m}{R_k} + \lambda \Big) \end{aligned}$$ By differentiating the right-hand side with respect to $$\alpha_m^*$$ and demanding that the result is zero, -we get a system of equations that we can write in matrix form, -which is cheap to solve: +we get a cheap-to-solve system of equations, in matrix form: $$\begin{aligned} \begin{bmatrix} - \Inprod{R_1}{R_1} & \cdots & \Inprod{R_1}{R_n} & 1 \\ + \inprod{R_1}{R_1} & \cdots & \inprod{R_1}{R_n} & 1 \\ \vdots & \ddots & \vdots & \vdots \\ - \Inprod{R_n}{R_1} & \cdots & \Inprod{R_n}{R_n} & 1 \\ + \inprod{R_n}{R_1} & \cdots & \inprod{R_n}{R_n} & 1 \\ 
1 & \cdots & 1 & 0 \end{bmatrix} \cdot @@ -106,48 +107,50 @@ $$\begin{aligned} \end{aligned}$$ From this, we can also see that the Lagrange multiplier -$$\lambda = - \Inprod{R_{n+1}}{R_{n+1}}$$, +$$\lambda = - \inprod{R_{n+1}}{R_{n+1}}$$, where $$R_{n+1}$$ is the *predicted* residual of the next iteration, subject to the two assumptions. +This fact makes $$\lambda$$ a useful measure of convergence. -However, in practice, the earlier inputs $$\rho_1$$, $$\rho_2$$, etc. -are much further from $$\rho_*$$ than $$\rho_n$$, -so usually only the most recent $$N\!+\!1$$ inputs $$\rho_{n - N}$$, ..., $$\rho_n$$ are used: +In practice, the earlier inputs $$f_1$$, $$f_2$$, etc. +are much further from $$f_\infty$$ than $$f_n$$, +so usually only the most recent $$N\!+\!1$$ inputs $$f_{n - N}, ..., f_n$$ are used. +This also keeps the matrix small: $$\begin{aligned} - \rho_{n+1} - = \sum_{m = n-N}^n \alpha_m \rho_m + f_{n+1} + = \sum_{m = n-N}^n \alpha_m f_m \end{aligned}$$ -You might be confused by the absence of any $$\rho_m^\mathrm{new}$$ -in the creation of $$\rho_{n+1}$$, as if the iteration's outputs are being ignored. +You might be confused by the absence of any $$f_m^\mathrm{new}$$ +in the creation of $$f_{n+1}$$, as if the iteration's outputs are being ignored. This is due to the first assumption, -which states that $$\rho_n^\mathrm{new}$$ and $$\rho_n$$ are already similar, +which states that $$f_n^\mathrm{new}$$ and $$f_n$$ are already similar, such that they are basically interchangeable. -Speaking of which, about those assumptions: -while they will clearly become more accurate as $$\rho_n$$ approaches $$\rho_*$$, -they might be very dubious in the beginning. -A consequence of this is that the early iterations might get "trapped" -in a suboptimal subspace spanned by $$\rho_1$$, $$\rho_2$$, etc. -To say it another way, we would be varying $$n$$ coefficients $$\alpha_m$$ -to try to optimize a $$D$$-dimensional $$\rho_{n+1}$$, -where in general $$D \gg n$$, at least in the beginning. - -There is an easy fix to this problem: -add a small amount of the raw residual $$R_m$$ -to "nudge" $$\rho_{n+1}$$ towards the right subspace, +Although those assumptions will clearly become more realistic as $$f_n \to f_\infty$$, +they might be very dubious at first. +Consequently, the early iterations may get "trapped" +in a suboptimal subspace spanned by $$f_1$$, $$f_2$$, etc. +Think of it like this: +we would be varying up to $$n$$ coefficients $$\alpha_m$$ +to try to optimize a $$D$$-dimensional $$f_{n+1}$$, where usually $$D \gg n$$. +It is almost impossible to find a decent optimum in this way! + +This problem is easy to fix, +by mixing in a small amount of the raw residuals $$R_m$$ +to "nudge" $$f_{n+1}$$ towards the right subspace, where $$\beta \in [0,1]$$ is a tunable parameter: $$\begin{aligned} \boxed{ - \rho_{n+1} - = \sum_{m = N}^n \alpha_m (\rho_m + \beta R_m) + f_{n+1} + = \sum_{m = N}^n \alpha_m (f_m + \beta R_m) } \end{aligned}$$ -In other words, we end up introducing a small amount of the raw outputs $$\rho_m^\mathrm{new}$$, -while still giving more weight to iterations with smaller residuals. +In this way, the raw outputs $$f_m^\mathrm{new}$$ are (rightfully) included via $$R_m$$, +but we still give more weight to iterations with smaller residuals. Pulay mixing is very effective for certain types of problems, e.g. density functional theory, |
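To make the whole procedure concrete, here is a minimal sketch of Pulay mixing for a generic fixed-point problem $$f = g(f)$$, assuming plain vectors and an ordinary dot product for $$\inprod{R_m}{R_k}$$; the function names, the toy problem, the history length and the value of $$\beta$$ are all arbitrary illustrative choices:

```python
import numpy as np

def pulay_mixing(g, f0, n_iter=50, history=5, beta=0.3, tol=1e-10):
    """Iterate f -> g(f), building each new input from Pulay/DIIS-mixed old inputs."""
    fs, Rs = [np.asarray(f0, dtype=float)], []
    for _ in range(n_iter):
        f = fs[-1]
        R = g(f) - f                              # residual R_n = f_n^new - f_n
        Rs.append(R)
        if np.linalg.norm(R) < tol:
            break
        fs_used, Rs_used = fs[-history:], Rs[-history:]
        m = len(Rs_used)
        # bordered matrix of residual inner products <R_i|R_j>, plus the constraint row/column
        M = np.ones((m + 1, m + 1))
        M[:m, :m] = [[np.dot(Ri, Rj) for Rj in Rs_used] for Ri in Rs_used]
        M[-1, -1] = 0.0
        rhs = np.zeros(m + 1)
        rhs[-1] = 1.0                             # enforces sum(alpha) = 1
        # least squares for robustness when the residuals become nearly dependent
        alpha = np.linalg.lstsq(M, rhs, rcond=None)[0][:m]
        # next input: combination of old inputs, nudged along their residuals
        fs.append(sum(a * (fi + beta * Ri)
                      for a, fi, Ri in zip(alpha, fs_used, Rs_used)))
    return fs[-1]

# toy example: solve f = cos(f) component-wise (fixed point ~0.739085)
print(pulay_mixing(np.cos, np.zeros(3)))
```

The returned coefficients also contain the Lagrange multiplier as their last entry, which (as noted above) predicts the next residual norm and can serve as a convergence measure.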