Diffstat (limited to 'source/know/concept')
-rw-r--r-- source/know/concept/bells-theorem/index.md | 266
-rw-r--r-- source/know/concept/chsh-inequality/index.md | 273
-rw-r--r-- source/know/concept/lagrange-multiplier/index.md | 2
-rw-r--r-- source/know/concept/pulay-mixing/index.md | 121
4 files changed, 365 insertions, 297 deletions
diff --git a/source/know/concept/bells-theorem/index.md b/source/know/concept/bells-theorem/index.md
index a01bf9e..1589a7a 100644
--- a/source/know/concept/bells-theorem/index.md
+++ b/source/know/concept/bells-theorem/index.md
@@ -17,13 +17,13 @@ Suppose that we have two spin-1/2 particles, called $$A$$ and $$B$$,
in an entangled [Bell state](/know/concept/bell-state/):
$$\begin{aligned}
- \Ket{\Psi^{-}}
- = \frac{1}{\sqrt{2}} \Big( \Ket{\uparrow \downarrow} - \Ket{\downarrow \uparrow} \Big)
+ \ket{\Psi^{-}}
+ = \frac{1}{\sqrt{2}} \Big( \ket{\uparrow \downarrow} - \ket{\downarrow \uparrow} \Big)
\end{aligned}$$
Since they are entangled,
-if we measure the $$z$$-spin of particle $$A$$, and find e.g. $$\Ket{\uparrow}$$,
-then particle $$B$$ immediately takes the opposite state $$\Ket{\downarrow}$$.
+if we measure the $$z$$-spin of particle $$A$$, and find e.g. $$\ket{\uparrow}$$,
+then particle $$B$$ immediately takes the opposite state $$\ket{\downarrow}$$.
The point is that this collapse is instant,
regardless of the distance between $$A$$ and $$B$$.
@@ -69,21 +69,29 @@ $$\begin{aligned}
\end{aligned}$$
The product of the outcomes of $$A$$ and $$B$$ then has the following expectation value.
-Note that we only multiply $$A$$ and $$B$$ for shared $$\lambda$$-values:
-this is what makes it a **local** hidden variable:
+Note that we multiply $$A$$ and $$B$$ at the same $$\lambda$$-value,
+hence it is a *local* hidden variable:
$$\begin{aligned}
- \Expval{A_a B_b}
- = \int \rho(\lambda) \: A(\vec{a}, \lambda) \: B(\vec{b}, \lambda) \dd{\lambda}
+ \expval{A_a B_b}
+ \equiv \int \rho(\lambda) \: A(\vec{a}, \lambda) \: B(\vec{b}, \lambda) \dd{\lambda}
\end{aligned}$$
-From this, two inequalities can be derived,
-which both prove Bell's theorem.
+From this, we can make several predictions about LHV theories,
+which turn out to disagree with various theoretical
+and experimental results in quantum mechanics.
+The two most famous LHV predictions are
+the **Bell inequality** and
+the [CHSH inequality](/know/concept/chsh-inequality/).
+
## Bell inequality
-If $$\vec{a} = \vec{b}$$, then we know that $$A$$ and $$B$$ always have opposite spins:
+We present Bell's original proof of his theorem.
+If $$\vec{a} = \vec{b}$$, then we know that
+measuring $$A$$ and $$B$$ gives them opposite spins,
+because they start in the entangled state $$\ket{\Psi^{-}}$$:
$$\begin{aligned}
A(\vec{a}, \lambda)
@@ -94,7 +102,7 @@ $$\begin{aligned}
The expectation value of the product can therefore be rewritten as follows:
$$\begin{aligned}
- \Expval{A_a B_b}
+ \expval{A_a B_b}
= - \int \rho(\lambda) \: A(\vec{a}, \lambda) \: A(\vec{b}, \lambda) \dd{\lambda}
\end{aligned}$$
@@ -102,7 +110,7 @@ Next, we introduce an arbitrary third direction $$\vec{c}$$,
and use the fact that $$( A(\vec{b}, \lambda) )^2 = 1$$:
$$\begin{aligned}
- \Expval{A_a B_b} - \Expval{A_a B_c}
+ \expval{A_a B_b} - \expval{A_a B_c}
&= - \int \rho(\lambda) \Big( A(\vec{a}, \lambda) \: A(\vec{b}, \lambda) - A(\vec{a}, \lambda) \: A(\vec{c}, \lambda) \Big) \dd{\lambda}
\\
&= - \int \rho(\lambda) \Big( 1 - A(\vec{b}, \lambda) \: A(\vec{c}, \lambda) \Big) A(\vec{a}, \lambda) \: A(\vec{b}, \lambda) \dd{\lambda}
@@ -114,7 +122,7 @@ Taking the absolute value of the whole left,
and of the integrand on the right, we thus get:
$$\begin{aligned}
- \Big| \Expval{A_a B_b} - \Expval{A_a B_c} \Big|
+ \Big| \expval{A_a B_b} - \expval{A_a B_c} \Big|
&\le \int \rho(\lambda) \Big( 1 - A(\vec{b}, \lambda) \: A(\vec{c}, \lambda) \Big)
\: \Big| A(\vec{a}, \lambda) \: A(\vec{b}, \lambda) \Big| \dd{\lambda}
\\
@@ -122,24 +130,24 @@ $$\begin{aligned}
\end{aligned}$$
Since $$\rho(\lambda)$$ is a normalized probability density function,
-we arrive at the **Bell inequality**:
+we arrive at the Bell inequality:
$$\begin{aligned}
\boxed{
- \Big| \Expval{A_a B_b} - \Expval{A_a B_c} \Big|
- \le 1 + \Expval{A_b B_c}
+ \Big| \expval{A_a B_b} - \expval{A_a B_c} \Big|
+ \le 1 + \expval{A_b B_c}
}
\end{aligned}$$
Any theory involving an LHV $$\lambda$$ must obey this inequality.
-The problem, however, is that quantum mechanics dictates the expectation values
-for the state $$\Ket{\Psi^{-}}$$:
+The problem, however, is that quantum mechanics dictates
+the expectation values for the state $$\ket{\Psi^{-}}$$:
$$\begin{aligned}
- \Expval{A_a B_b} = - \vec{a} \cdot \vec{b}
+ \expval{A_a B_b} = - \vec{a} \cdot \vec{b}
\end{aligned}$$
-Finding directions which violate the Bell inequality is easy:
+Finding directions that violate the Bell inequality is easy:
for example, if $$\vec{a}$$ and $$\vec{b}$$ are orthogonal,
and $$\vec{c}$$ is at a $$\pi/4$$ angle to both of them,
then the left becomes $$0.707$$ and the right $$0.293$$,
@@ -147,222 +155,6 @@ which clearly disagrees with the inequality,
meaning that LHVs are impossible.
-## CHSH inequality
-
-The **Clauser-Horne-Shimony-Holt** or simply **CHSH inequality**
-takes a slightly different approach, and is more useful in practice.
-
-Consider four spin directions, two for $$A$$ called $$\vec{a}_1$$ and $$\vec{a}_2$$,
-and two for $$B$$ called $$\vec{b}_1$$ and $$\vec{b}_2$$.
-Let us introduce the following abbreviations:
-
-$$\begin{aligned}
- A_1 &= A(\vec{a}_1, \lambda)
- \qquad \quad
- A_2 = A(\vec{a}_2, \lambda)
- \\
- B_1 &= B(\vec{b}_1, \lambda)
- \qquad \quad
- B_2 = B(\vec{b}_2, \lambda)
-\end{aligned}$$
-
-From the definition of the expectation value,
-we know that the difference is given by:
-
-$$\begin{aligned}
- \Expval{A_1 B_1} - \Expval{A_1 B_2}
- = \int \rho(\lambda) \Big( A_1 B_1 - A_1 B_2 \Big) \dd{\lambda}
-\end{aligned}$$
-
-We introduce some new terms and rearrange the resulting expression:
-
-$$\begin{aligned}
- \Expval{A_1 B_1} - \Expval{A_1 B_2}
- &= \int \rho(\lambda) \Big( A_1 B_1 - A_1 B_2 \pm A_1 B_1 A_2 B_2 \mp A_1 B_1 A_2 B_2 \Big) \dd{\lambda}
- \\
- &= \int \rho(\lambda) A_1 B_1 \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda}
- - \!\int \rho(\lambda) A_1 B_2 \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda}
-\end{aligned}$$
-
-Taking the absolute value of both sides
-and invoking the triangle inequality then yields:
-
-$$\begin{aligned}
- \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big|
- &= \bigg|\! \int \rho(\lambda) A_1 B_1 \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda}
- - \!\int \rho(\lambda) A_1 B_2 \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda} \!\bigg|
- \\
- &\le \bigg|\! \int \rho(\lambda) A_1 B_1 \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda} \!\bigg|
- + \bigg|\! \int \rho(\lambda) A_1 B_2 \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda} \!\bigg|
-\end{aligned}$$
-
-Using the fact that the product of $$A$$ and $$B$$ is always either $$-1$$ or $$+1$$,
-we can reduce this to:
-
-$$\begin{aligned}
- \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big|
- &\le \int \rho(\lambda) \Big| A_1 B_1 \Big| \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda}
- + \!\int \rho(\lambda) \Big| A_1 B_2 \Big| \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda}
- \\
- &\le \int \rho(\lambda) \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda}
- + \!\int \rho(\lambda) \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda}
-\end{aligned}$$
-
-Evaluating these integrals gives us the following inequality,
-which holds for both choices of $$\pm$$:
-
-$$\begin{aligned}
- \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big|
- &\le 2 \pm \Expval{A_2 B_2} \pm \Expval{A_2 B_1}
-\end{aligned}$$
-
-We should choose the signs such that the right-hand side is as small as possible, that is:
-
-$$\begin{aligned}
- \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big|
- &\le 2 \pm \Big( \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big)
- \\
- &\le 2 - \Big| \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big|
-\end{aligned}$$
-
-Rearranging this and once again using the triangle inequality,
-we get the CHSH inequality:
-
-$$\begin{aligned}
- 2
- &\ge \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big| + \Big| \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big|
- \\
- &\ge \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} + \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big|
-\end{aligned}$$
-
-The quantity on the right-hand side is sometimes called the **CHSH quantity** $$S$$,
-and measures the correlation between the spins of $$A$$ and $$B$$:
-
-$$\begin{aligned}
- \boxed{
- S \equiv \Expval{A_2 B_1} + \Expval{A_2 B_2} + \Expval{A_1 B_1} - \Expval{A_1 B_2}
- }
-\end{aligned}$$
-
-The CHSH inequality places an upper bound on the magnitude of $$S$$
-for LHV-based theories:
-
-$$\begin{aligned}
- \boxed{
- |S| \le 2
- }
-\end{aligned}$$
-
-
-## Tsirelson's bound
-
-Quantum physics can violate the CHSH inequality, but by how much?
-Consider the following two-particle operator,
-whose expectation value is the CHSH quantity, i.e. $$S = \expval{\hat{S}}$$:
-
-$$\begin{aligned}
- \hat{S}
- = \hat{A}_2 \otimes \hat{B}_1 + \hat{A}_2 \otimes \hat{B}_2 + \hat{A}_1 \otimes \hat{B}_1 - \hat{A}_1 \otimes \hat{B}_2
-\end{aligned}$$
-
-Where $$\otimes$$ is the tensor product,
-and e.g. $$\hat{A}_1$$ is the Pauli matrix for the $$\vec{a}_1$$-direction.
-The square of this operator is then given by:
-
-$$\begin{aligned}
- \hat{S}^2
- = \quad &\hat{A}_2^2 \otimes \hat{B}_1^2 + \hat{A}_2^2 \otimes \hat{B}_1 \hat{B}_2
- + \hat{A}_2 \hat{A}_1 \otimes \hat{B}_1^2 - \hat{A}_2 \hat{A}_1 \otimes \hat{B}_1 \hat{B}_2
- \\
- + &\hat{A}_2^2 \otimes \hat{B}_2 \hat{B}_1 + \hat{A}_2^2 \otimes \hat{B}_2^2
- + \hat{A}_2 \hat{A}_1 \otimes \hat{B}_2 \hat{B}_1 - \hat{A}_2 \hat{A}_1 \otimes \hat{B}_2^2
- \\
- + &\hat{A}_1 \hat{A}_2 \otimes \hat{B}_1^2 + \hat{A}_1 \hat{A}_2 \otimes \hat{B}_1 \hat{B}_2
- + \hat{A}_1^2 \otimes \hat{B}_1^2 - \hat{A}_1^2 \otimes \hat{B}_1 \hat{B}_2
- \\
- - &\hat{A}_1 \hat{A}_2 \otimes \hat{B}_2 \hat{B}_1 - \hat{A}_1 \hat{A}_2 \otimes \hat{B}_2^2
- - \hat{A}_1^2 \otimes \hat{B}_2 \hat{B}_1 + \hat{A}_1^2 \otimes \hat{B}_2^2
- \\
- = \quad &\hat{A}_2^2 \otimes \hat{B}_1^2 + \hat{A}_2^2 \otimes \hat{B}_2^2 + \hat{A}_1^2 \otimes \hat{B}_1^2 + \hat{A}_1^2 \otimes \hat{B}_2^2
- \\
- + &\hat{A}_2^2 \otimes \acomm{\hat{B}_1}{\hat{B}_2} - \hat{A}_1^2 \otimes \acomm{\hat{B}_1}{\hat{B}_2}
- + \acomm{\hat{A}_1}{\hat{A}_2} \otimes \hat{B}_1^2 - \acomm{\hat{A}_1}{\hat{A}_2} \otimes \hat{B}_2^2
- \\
- + &\hat{A}_1 \hat{A}_2 \otimes \comm{\hat{B}_1}{\hat{B}_2} - \hat{A}_2 \hat{A}_1 \otimes \comm{\hat{B}_1}{\hat{B}_2}
-\end{aligned}$$
-
-Spin operators are unitary, so their square is the identity,
-e.g. $$\hat{A}_1^2 = \hat{I}$$. Therefore $$\hat{S}^2$$ reduces to:
-
-$$\begin{aligned}
- \hat{S}^2
- &= 4 \: (\hat{I} \otimes \hat{I}) + \comm{\hat{A}_1}{\hat{A}_2} \otimes \comm{\hat{B}_1}{\hat{B}_2}
-\end{aligned}$$
-
-The *norm* $$\norm{\hat{S}^2}$$ of this operator
-is the largest possible expectation value $$\expval{\hat{S}^2}$$,
-which is the same as its largest eigenvalue.
-It is given by:
-
-$$\begin{aligned}
- \Norm{\hat{S}^2}
- &= 4 + \Norm{\comm{\hat{A}_1}{\hat{A}_2} \otimes \comm{\hat{B}_1}{\hat{B}_2}}
- \\
- &\le 4 + \Norm{\comm{\hat{A}_1}{\hat{A}_2}} \Norm{\comm{\hat{B}_1}{\hat{B}_2}}
-\end{aligned}$$
-
-We find a bound for the norm of the commutators by using the triangle inequality, such that:
-
-$$\begin{aligned}
- \Norm{\comm{\hat{A}_1}{\hat{A}_2}}
- = \Norm{\hat{A}_1 \hat{A}_2 - \hat{A}_2 \hat{A}_1}
- \le \Norm{\hat{A}_1 \hat{A}_2} + \Norm{\hat{A}_2 \hat{A}_1}
- \le 2 \Norm{\hat{A}_1 \hat{A}_2}
- \le 2
-\end{aligned}$$
-
-And $$\norm{\comm{\hat{B}_1}{\hat{B}_2}} \le 2$$ for the same reason.
-The norm is the largest eigenvalue, therefore:
-
-$$\begin{aligned}
- \Norm{\hat{S}^2}
- \le 4 + 2 \cdot 2
- = 8
- \quad \implies \quad
- \Norm{\hat{S}}
- \le \sqrt{8}
- = 2 \sqrt{2}
-\end{aligned}$$
-
-We thus arrive at **Tsirelson's bound**,
-which states that quantum mechanics can violate
-the CHSH inequality by a factor of $$\sqrt{2}$$:
-
-$$\begin{aligned}
- \boxed{
- |S|
- \le 2 \sqrt{2}
- }
-\end{aligned}$$
-
-Importantly, this is a *tight* bound,
-meaning that there exist certain spin measurement directions
-for which Tsirelson's bound becomes an equality, for example:
-
-$$\begin{aligned}
- \hat{A}_1 = \hat{\sigma}_z
- \qquad
- \hat{A}_2 = \hat{\sigma}_x
- \qquad
- \hat{B}_1 = \frac{\hat{\sigma}_z + \hat{\sigma}_x}{\sqrt{2}}
- \qquad
- \hat{B}_2 = \frac{\hat{\sigma}_z - \hat{\sigma}_x}{\sqrt{2}}
-\end{aligned}$$
-
-Using the fact that $$\Expval{A_a B_b} = - \vec{a} \cdot \vec{b}$$,
-it can then be shown that $$S = 2 \sqrt{2}$$ in this case.
-
-
## References
1. D.J. Griffiths, D.F. Schroeter,
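The violation quoted in the Bell inequality section of the hunk above can be checked numerically. The following is a minimal Python sketch, not part of the commit: the helper name `qm_expval` and the coordinates chosen for the three directions are assumptions made for illustration. It evaluates both sides of the inequality using the quantum prediction $$\expval{A_a B_b} = - \vec{a} \cdot \vec{b}$$.

```python
import math

def qm_expval(a, b):
    """Quantum prediction <A_a B_b> = -a.b for the singlet state (a, b unit vectors)."""
    return -sum(x * y for x, y in zip(a, b))

# Directions from the text: a and b orthogonal, c at an angle pi/4 to both.
# The specific coordinates are this sketch's own choice.
a = (1.0, 0.0, 0.0)
b = (0.0, 1.0, 0.0)
c = (math.cos(math.pi / 4), math.sin(math.pi / 4), 0.0)

lhs = abs(qm_expval(a, b) - qm_expval(a, c))  # |<A_a B_b> - <A_a B_c>|
rhs = 1.0 + qm_expval(b, c)                   # 1 + <A_b B_c>

print(f"left = {lhs:.3f}, right = {rhs:.3f}, inequality satisfied: {lhs <= rhs}")
```

The left-hand side comes out larger than the right-hand side, reproducing the values 0.707 and 0.293 mentioned in the text.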
diff --git a/source/know/concept/chsh-inequality/index.md b/source/know/concept/chsh-inequality/index.md
new file mode 100644
index 0000000..984bae6
--- /dev/null
+++ b/source/know/concept/chsh-inequality/index.md
@@ -0,0 +1,273 @@
+---
+title: "CHSH inequality"
+sort_title: "CHSH inequality"
+date: 2023-02-05
+categories:
+- Physics
+- Quantum mechanics
+- Quantum information
+layout: "concept"
+---
+
+The **Clauser-Horne-Shimony-Holt (CHSH) inequality**
+leads to an alternative proof of [Bell's theorem](/know/concept/bells-theorem/),
+which takes a slightly different approach
+and is more useful in practice.
+
+Suppose there is a local hidden variable (LHV) $$\lambda$$
+with an unknown probability density $$\rho$$:
+
+$$\begin{aligned}
+ \int \rho(\lambda) \dd{\lambda} = 1
+ \qquad \quad
+ \rho(\lambda) \ge 0
+\end{aligned}$$
+
+Given two spin-1/2 particles $$A$$ and $$B$$,
+measuring their spins along arbitrary directions $$\vec{a}$$ and $$\vec{b}$$
+would give each an eigenvalue $$\pm 1$$. We write this as:
+
+$$\begin{aligned}
+ A(\vec{a}, \lambda) = \pm 1
+ \qquad \quad
+ B(\vec{b}, \lambda) = \pm 1
+\end{aligned}$$
+
+If $$A$$ and $$B$$ start in an entangled [Bell state](/know/concept/bell-state/),
+e.g. $$\ket{\Psi^{-}}$$, then we expect a correlation between their measurement results.
+The product of the outcomes of $$A$$ and $$B$$ then has the following expectation value:
+
+$$\begin{aligned}
+ \Expval{A_a B_b}
+ \equiv \int \rho(\lambda) \: A(\vec{a}, \lambda) \: B(\vec{b}, \lambda) \dd{\lambda}
+\end{aligned}$$
+
+So far, we have taken the same path as for proving Bell's inequality,
+but for the CHSH inequality we must now diverge.
+
+
+
+## Deriving the inequality
+
+Consider four spin directions, two for $$A$$ called $$\vec{a}_1$$ and $$\vec{a}_2$$,
+and two for $$B$$ called $$\vec{b}_1$$ and $$\vec{b}_2$$.
+Let us introduce the following abbreviations:
+
+$$\begin{aligned}
+ A_1 \equiv A(\vec{a}_1, \lambda)
+ \qquad \quad
+ A_2 \equiv A(\vec{a}_2, \lambda)
+ \qquad \quad
+ B_1 \equiv B(\vec{b}_1, \lambda)
+ \qquad \quad
+ B_2 \equiv B(\vec{b}_2, \lambda)
+\end{aligned}$$
+
+From the definition of the expectation value,
+we know that the difference is given by:
+
+$$\begin{aligned}
+ \Expval{A_1 B_1} - \Expval{A_1 B_2}
+ = \int \rho(\lambda) \Big( A_1 B_1 - A_1 B_2 \Big) \dd{\lambda}
+\end{aligned}$$
+
+We introduce some new terms and rearrange the resulting expression:
+
+$$\begin{aligned}
+ \Expval{A_1 B_1} - \Expval{A_1 B_2}
+ &= \int \rho(\lambda) \Big( A_1 B_1 - A_1 B_2 \pm A_1 B_1 A_2 B_2 \mp A_1 B_1 A_2 B_2 \Big) \dd{\lambda}
+ \\
+ &= \int \rho(\lambda) A_1 B_1 \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda}
+ - \!\int \rho(\lambda) A_1 B_2 \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda}
+\end{aligned}$$
+
+Taking the absolute value of both sides
+and invoking the triangle inequality then yields:
+
+$$\begin{aligned}
+ \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big|
+ &= \bigg|\! \int \rho(\lambda) A_1 B_1 \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda}
+ - \!\int \rho(\lambda) A_1 B_2 \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda} \!\bigg|
+ \\
+ &\le \bigg|\! \int \rho(\lambda) A_1 B_1 \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda} \!\bigg|
+ + \bigg|\! \int \rho(\lambda) A_1 B_2 \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda} \!\bigg|
+\end{aligned}$$
+
+Using the fact that the product of the spin eigenvalues of $$A$$ and $$B$$
+is always either $$-1$$ or $$+1$$ for all directions,
+we can reduce this to:
+
+$$\begin{aligned}
+ \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big|
+ &\le \int \rho(\lambda) \Big| A_1 B_1 \Big| \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda}
+ + \!\int \rho(\lambda) \Big| A_1 B_2 \Big| \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda}
+ \\
+ &\le \int \rho(\lambda) \Big( 1 \pm A_2 B_2 \Big) \dd{\lambda}
+ + \!\int \rho(\lambda) \Big( 1 \pm A_2 B_1 \Big) \dd{\lambda}
+\end{aligned}$$
+
+Evaluating these integrals gives us the following inequality,
+which holds for both choices of $$\pm$$:
+
+$$\begin{aligned}
+ \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big|
+ &\le 2 \pm \Expval{A_2 B_2} \pm \Expval{A_2 B_1}
+\end{aligned}$$
+
+We should choose the signs such that the right-hand side is as small as possible, that is:
+
+$$\begin{aligned}
+ \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big|
+ &\le 2 \pm \Big( \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big)
+ \\
+ &\le 2 - \Big| \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big|
+\end{aligned}$$
+
+Rearranging this and once again using the triangle inequality,
+we get the CHSH inequality:
+
+$$\begin{aligned}
+ 2
+ &\ge \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} \Big| + \Big| \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big|
+ \\
+ &\ge \Big| \Expval{A_1 B_1} - \Expval{A_1 B_2} + \Expval{A_2 B_2} + \Expval{A_2 B_1} \Big|
+\end{aligned}$$
+
+The quantity on the right-hand side is sometimes called the **CHSH quantity** $$S$$,
+and measures the correlation between the spins of $$A$$ and $$B$$:
+
+$$\begin{aligned}
+ \boxed{
+ S \equiv \Expval{A_2 B_1} + \Expval{A_2 B_2} + \Expval{A_1 B_1} - \Expval{A_1 B_2}
+ }
+\end{aligned}$$
+
+The CHSH inequality places an upper bound on the magnitude of $$S$$
+for LHV-based theories:
+
+$$\begin{aligned}
+ \boxed{
+ |S| \le 2
+ }
+\end{aligned}$$
+
+
+
+## Tsirelson's bound
+
+Quantum physics can violate the CHSH inequality, but by how much?
+Consider the following two-particle operator,
+whose expectation value is the CHSH quantity, i.e. $$S = \expval{\hat{S}}$$:
+
+$$\begin{aligned}
+ \hat{S}
+ = \hat{A}_2 \otimes \hat{B}_1 + \hat{A}_2 \otimes \hat{B}_2 + \hat{A}_1 \otimes \hat{B}_1 - \hat{A}_1 \otimes \hat{B}_2
+\end{aligned}$$
+
+Where $$\otimes$$ is the tensor product,
+and e.g. $$\hat{A}_1$$ is the Pauli matrix for the $$\vec{a}_1$$-direction.
+The square of this operator is then given by:
+
+$$\begin{aligned}
+ \hat{S}^2
+ = \quad &\hat{A}_2^2 \otimes \hat{B}_1^2 + \hat{A}_2^2 \otimes \hat{B}_1 \hat{B}_2
+ + \hat{A}_2 \hat{A}_1 \otimes \hat{B}_1^2 - \hat{A}_2 \hat{A}_1 \otimes \hat{B}_1 \hat{B}_2
+ \\
+ + &\hat{A}_2^2 \otimes \hat{B}_2 \hat{B}_1 + \hat{A}_2^2 \otimes \hat{B}_2^2
+ + \hat{A}_2 \hat{A}_1 \otimes \hat{B}_2 \hat{B}_1 - \hat{A}_2 \hat{A}_1 \otimes \hat{B}_2^2
+ \\
+ + &\hat{A}_1 \hat{A}_2 \otimes \hat{B}_1^2 + \hat{A}_1 \hat{A}_2 \otimes \hat{B}_1 \hat{B}_2
+ + \hat{A}_1^2 \otimes \hat{B}_1^2 - \hat{A}_1^2 \otimes \hat{B}_1 \hat{B}_2
+ \\
+ - &\hat{A}_1 \hat{A}_2 \otimes \hat{B}_2 \hat{B}_1 - \hat{A}_1 \hat{A}_2 \otimes \hat{B}_2^2
+ - \hat{A}_1^2 \otimes \hat{B}_2 \hat{B}_1 + \hat{A}_1^2 \otimes \hat{B}_2^2
+ \\
+ = \quad &\hat{A}_2^2 \otimes \hat{B}_1^2 + \hat{A}_2^2 \otimes \hat{B}_2^2 + \hat{A}_1^2 \otimes \hat{B}_1^2 + \hat{A}_1^2 \otimes \hat{B}_2^2
+ \\
+ + &\hat{A}_2^2 \otimes \acomm{\hat{B}_1}{\hat{B}_2} - \hat{A}_1^2 \otimes \acomm{\hat{B}_1}{\hat{B}_2}
+ + \acomm{\hat{A}_1}{\hat{A}_2} \otimes \hat{B}_1^2 - \acomm{\hat{A}_1}{\hat{A}_2} \otimes \hat{B}_2^2
+ \\
+ + &\hat{A}_1 \hat{A}_2 \otimes \comm{\hat{B}_1}{\hat{B}_2} - \hat{A}_2 \hat{A}_1 \otimes \comm{\hat{B}_1}{\hat{B}_2}
+\end{aligned}$$
+
+Spin operators are both Hermitian and unitary, so their square is the identity,
+e.g. $$\hat{A}_1^2 = \hat{I}$$. Therefore $$\hat{S}^2$$ reduces to:
+
+$$\begin{aligned}
+ \hat{S}^2
+ &= 4 \: (\hat{I} \otimes \hat{I}) + \comm{\hat{A}_1}{\hat{A}_2} \otimes \comm{\hat{B}_1}{\hat{B}_2}
+\end{aligned}$$
+
+The *norm* $$\norm{\hat{S}^2}$$ of this operator
+is the largest possible expectation value $$\expval{\hat{S}^2}$$,
+which is the same as its largest eigenvalue.
+It is given by:
+
+$$\begin{aligned}
+ \Norm{\hat{S}^2}
+ &= 4 + \Norm{\comm{\hat{A}_1}{\hat{A}_2} \otimes \comm{\hat{B}_1}{\hat{B}_2}}
+ \\
+ &\le 4 + \Norm{\comm{\hat{A}_1}{\hat{A}_2}} \Norm{\comm{\hat{B}_1}{\hat{B}_2}}
+\end{aligned}$$
+
+We find a bound for the norm of the commutators by using the triangle inequality, such that:
+
+$$\begin{aligned}
+ \Norm{\comm{\hat{A}_1}{\hat{A}_2}}
+ = \Norm{\hat{A}_1 \hat{A}_2 - \hat{A}_2 \hat{A}_1}
+ \le \Norm{\hat{A}_1 \hat{A}_2} + \Norm{\hat{A}_2 \hat{A}_1}
+ \le 2 \Norm{\hat{A}_1 \hat{A}_2}
+ \le 2
+\end{aligned}$$
+
+And $$\norm{\comm{\hat{B}_1}{\hat{B}_2}} \le 2$$ for the same reason.
+The norm is the largest eigenvalue, therefore:
+
+$$\begin{aligned}
+ \Norm{\hat{S}^2}
+ \le 4 + 2 \cdot 2
+ = 8
+ \quad \implies \quad
+ \Norm{\hat{S}}
+ \le \sqrt{8}
+ = 2 \sqrt{2}
+\end{aligned}$$
+
+We thus arrive at **Tsirelson's bound**,
+which states that quantum mechanics can violate
+the CHSH inequality by at most a factor of $$\sqrt{2}$$:
+
+$$\begin{aligned}
+ \boxed{
+ |S|
+ \le 2 \sqrt{2}
+ }
+\end{aligned}$$
+
+Importantly, this is a *tight* bound,
+meaning that there exist certain spin measurement directions
+for which Tsirelson's bound becomes an equality, for example:
+
+$$\begin{aligned}
+ \hat{A}_1 = \hat{\sigma}_x
+ \qquad
+ \hat{A}_2 = \hat{\sigma}_z
+ \qquad
+ \hat{B}_1 = \frac{\hat{\sigma}_z + \hat{\sigma}_x}{\sqrt{2}}
+ \qquad
+ \hat{B}_2 = \frac{\hat{\sigma}_z - \hat{\sigma}_x}{\sqrt{2}}
+\end{aligned}$$
+
+For the state $$\ket{\Psi^{-}}$$, quantum mechanics says that
+$$\Expval{A_a B_b} = - \vec{a} \cdot \vec{b}$$,
+so $$S = - 2 \sqrt{2}$$ and hence $$|S| = 2 \sqrt{2}$$ in this case.
+
+
+
+## References
+1. D.J. Griffiths, D.F. Schroeter,
+ *Introduction to quantum mechanics*, 3rd edition,
+ Cambridge.
+2. J.B. Brask,
+ *Quantum information: lecture notes*,
+ 2021, unpublished.
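Both bounds in the new CHSH file can be checked with a short script. The following Python sketch is an illustration only and not part of the commit: the helpers `kron`, `matvec` and `chsh_norm`, the start vector, and the use of power iteration are all assumptions of this sketch. It first enumerates every deterministic $$\pm 1$$ assignment of $$A_1, A_2, B_1, B_2$$ to confirm $$|S| \le 2$$, and then estimates the operator norm $$\norm{\hat{S}}$$ for $$\hat{\sigma}_z$$ and $$\hat{\sigma}_x$$ on the $$A$$ side, in both possible labellings, together with the $$\hat{B}_1$$ and $$\hat{B}_2$$ given in the Tsirelson section.

```python
import itertools
import math

# Local hidden variables: each deterministic outcome assignment gives |S| <= 2,
# so any rho-weighted average of such assignments obeys the same bound.
lhv_max = max(
    abs(a2 * b1 + a2 * b2 + a1 * b1 - a1 * b2)
    for a1, a2, b1, b2 in itertools.product((+1, -1), repeat=4)
)

def kron(p, q):
    """Tensor (Kronecker) product of two square matrices stored as nested lists."""
    return [[pe * qe for pe in prow for qe in qrow] for prow in p for qrow in q]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def chsh_norm(A1, A2, B1, B2, steps=20):
    """Norm of S = A2(x)B1 + A2(x)B2 + A1(x)B1 - A1(x)B2 for real symmetric inputs."""
    S = [[0.0] * 4 for _ in range(4)]
    for sign, a, b in ((1, A2, B1), (1, A2, B2), (1, A1, B1), (-1, A1, B2)):
        term = kron(a, b)
        for i in range(4):
            for j in range(4):
                S[i][j] += sign * term[i][j]
    v = [1.0, 0.3, 0.2, 0.1]      # arbitrary start vector (assumption)
    for _ in range(steps):        # power iteration on S^2
        w = matvec(S, matvec(S, v))
        length = math.sqrt(sum(x * x for x in w))
        v = [x / length for x in w]
    Sv = matvec(S, v)
    return math.sqrt(sum(x * x for x in Sv))  # sqrt of largest eigenvalue of S^2

r = 1 / math.sqrt(2)
sz = [[1, 0], [0, -1]]
sx = [[0, 1], [1, 0]]
B1 = [[r, r], [r, -r]]     # (sigma_z + sigma_x) / sqrt(2)
B2 = [[r, -r], [-r, -r]]   # (sigma_z - sigma_x) / sqrt(2)

norm_S = chsh_norm(sz, sx, B1, B2)
norm_S_swapped = chsh_norm(sx, sz, B1, B2)
print(lhv_max, norm_S, norm_S_swapped, 2 * math.sqrt(2))
```

The operator norm is the largest value that $$|\expval{\hat{S}}|$$ can reach over all two-particle states, so it is a statement about the operator and does not by itself say which state attains it.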
diff --git a/source/know/concept/lagrange-multiplier/index.md b/source/know/concept/lagrange-multiplier/index.md
index ce5418f..9fb61a8 100644
--- a/source/know/concept/lagrange-multiplier/index.md
+++ b/source/know/concept/lagrange-multiplier/index.md
@@ -102,7 +102,7 @@ by demanding it is stationary:
$$\begin{aligned}
0
- = \nabla \mathcal{L}
+ = \nabla' \mathcal{L}
&= \bigg( \pdv{\mathcal{L}}{x}, \pdv{\mathcal{L}}{y}, \pdv{\mathcal{L}}{\lambda} \bigg)
\\
&= \bigg( \pdv{f}{x} + \lambda \pdv{g}{x}, \:\:\: \pdv{f}{y} + \lambda \pdv{g}{y}, \:\:\: g \bigg)
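To make the stationarity condition in this hunk concrete, here is a minimal Python sketch, not part of the commit. The functions $$f = x^2 + y^2$$ and $$g = x + y - 1$$ are hypothetical examples chosen for this illustration, and `grad` is a simple central-difference helper. Setting the three components $$(2 x + \lambda, \: 2 y + \lambda, \: g)$$ to zero by hand gives $$x = y = 1/2$$ and $$\lambda = -1$$, and the script checks that the gradient of $$\mathcal{L}$$ with respect to $$(x, y, \lambda)$$ indeed vanishes there.

```python
def f(x, y):
    """Hypothetical objective function."""
    return x**2 + y**2

def g(x, y):
    """Hypothetical constraint, written so that g(x, y) = 0."""
    return x + y - 1.0

def lagrangian(x, y, lam):
    return f(x, y) + lam * g(x, y)

def grad(func, point, h=1e-6):
    """Central-difference gradient with respect to every argument of func."""
    out = []
    for i in range(len(point)):
        up = list(point)
        dn = list(point)
        up[i] += h
        dn[i] -= h
        out.append((func(*up) - func(*dn)) / (2 * h))
    return out

# Solution of 2x + lam = 0, 2y + lam = 0, x + y = 1, found by hand.
x0, y0, lam0 = 0.5, 0.5, -1.0
gradient = grad(lagrangian, (x0, y0, lam0))
print(gradient)
```

Note that the $$\lambda$$-component of the gradient is simply $$g$$, so demanding that it vanishes enforces the constraint.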
diff --git a/source/know/concept/pulay-mixing/index.md b/source/know/concept/pulay-mixing/index.md
index 6e809dd..81051f1 100644
--- a/source/know/concept/pulay-mixing/index.md
+++ b/source/know/concept/pulay-mixing/index.md
@@ -8,68 +8,70 @@ layout: "concept"
---
Some numerical problems are most easily solved *iteratively*,
-by generating a series $$\rho_1$$, $$\rho_2$$, etc.
-converging towards the desired solution $$\rho_*$$.
+by generating a series of "solutions" $$f_1$$, $$f_2$$, etc.
+converging towards the true $$f_\infty$$.
**Pulay mixing**, also often called
**direct inversion in the iterative subspace** (DIIS),
can speed up the convergence for some types of problems,
and also helps to avoid periodic divergences.
The key concept it relies on is the **residual vector** $$R_n$$
-of the $$n$$th iteration, which in some way measures the error of the current $$\rho_n$$.
-Its exact definition varies,
-but is generally along the lines of the difference between
-the input of the iteration and the raw resulting output:
+of the $$n$$th iteration, which measures the error of the current $$f_n$$.
+Its exact definition can vary,
+but it is generally the difference between
+the input $$f_n$$ of the $$n$$th iteration
+and the raw resulting output $$f_n^\mathrm{new}$$:
$$\begin{aligned}
R_n
- = R[\rho_n]
- = \rho_n^\mathrm{new}[\rho_n] - \rho_n
+ \equiv R[f_n]
+ \equiv f_n^\mathrm{new}[f_n] - f_n
\end{aligned}$$
-It is not always clear what to do with $$\rho_n^\mathrm{new}$$.
-Directly using it as the next input ($$\rho_{n+1} = \rho_n^\mathrm{new}$$)
+It is not always clear what to do with $$f_n^\mathrm{new}$$.
+Directly using it as the next input ($$f_{n+1} = f_n^\mathrm{new}$$)
often leads to oscillation,
-and linear mixing ($$\rho_{n+1} = (1\!-\!f) \rho_n + f \rho_n^\mathrm{new}$$)
+and linear mixing ($$f_{n+1} = (1\!-\!c) f_n + c f_n^\mathrm{new}$$)
can take a very long time to converge properly.
Pulay mixing offers an improvement.
-The idea is to construct the next iteration's input $$\rho_{n+1}$$
-as a linear combination of the previous inputs $$\rho_1$$, $$\rho_2$$, ..., $$\rho_n$$,
-such that it is as close as possible to the optimal $$\rho_*$$:
+The idea is to construct the next iteration's input $$f_{n+1}$$
+as a linear combination of the previous inputs $$f_1$$, $$f_2$$, ..., $$f_n$$,
+such that it is as close as possible to the optimal $$f_\infty$$:
$$\begin{aligned}
\boxed{
- \rho_{n+1}
- = \sum_{m = 1}^n \alpha_m \rho_m
+ f_{n+1}
+ = \sum_{m = 1}^n \alpha_m f_m
}
\end{aligned}$$
To do so, we make two assumptions.
-Firstly, the current $$\rho_n$$ is already close to $$\rho_*$$,
-so that such a linear combination makes sense.
-Secondly, the iteration is linear,
-such that the raw output $$\rho_{n+1}^\mathrm{new}$$
-is also a linear combination with the *same coefficients*:
+First, that the current $$f_n$$ is already close to $$f_\infty$$,
+so such a linear combination makes sense.
+Second, that the iteration is linear,
+so the raw output $$f_{n+1}^\mathrm{new}$$
+is also a linear combination *with the same coefficients*:
$$\begin{aligned}
- \rho_{n+1}^\mathrm{new}
- = \sum_{m = 1}^n \alpha_m \rho_m^\mathrm{new}
+ f_{n+1}^\mathrm{new}
+ = \sum_{m = 1}^n \alpha_m f_m^\mathrm{new}
\end{aligned}$$
-We will return to these assumptions later.
-The point is that $$R_{n+1}$$ is also a linear combination:
+We will revisit these assumptions later.
+The point is that $$R_{n+1}$$ can now also be written
+as a linear combination of old residuals $$R_m$$:
$$\begin{aligned}
R_{n+1}
- = \rho_{n+1}^\mathrm{new} - \rho_{n+1}
- = \sum_{m = 1}^n \alpha_m \rho_m^\mathrm{new} - \sum_{m = 1}^n \alpha_m \rho_m
+ = f_{n+1}^\mathrm{new} - f_{n+1}
+ = \sum_{m = 1}^n \alpha_m f_m^\mathrm{new} - \sum_{m = 1}^n \alpha_m f_m
= \sum_{m = 1}^n \alpha_m R_m
\end{aligned}$$
The goal is to choose the coefficients $$\alpha_m$$ such that
the norm of the error $$|R_{n+1}| \approx 0$$,
-subject to the following constraint to preserve the normalization of $$\rho_{n+1}$$:
+subject to the following constraint to preserve the normalization of $$f_{n+1}$$:
$$\begin{aligned}
\sum_{m=1}^n \alpha_m = 1
@@ -79,20 +81,19 @@ We thus want to minimize the following quantity,
where $$\lambda$$ is a [Lagrange multiplier](/know/concept/lagrange-multiplier/):
$$\begin{aligned}
- \Inprod{R_{n+1}}{R_{n+1}} + \lambda \sum_{m = 1}^n \alpha_m^*
- = \sum_{m=1}^n \alpha_m^* \Big( \sum_{k=1}^n \alpha_k \Inprod{R_m}{R_k} + \lambda \Big)
+ \inprod{R_{n+1}}{R_{n+1}} + \lambda \sum_{m = 1}^n \alpha_m^*
+ = \sum_{m=1}^n \alpha_m^* \Big( \sum_{k=1}^n \alpha_k \inprod{R_m}{R_k} + \lambda \Big)
\end{aligned}$$
By differentiating the right-hand side with respect to $$\alpha_m^*$$
and demanding that the result is zero,
-we get a system of equations that we can write in matrix form,
-which is cheap to solve:
+we get a cheap-to-solve system of equations, in matrix form:
$$\begin{aligned}
\begin{bmatrix}
- \Inprod{R_1}{R_1} & \cdots & \Inprod{R_1}{R_n} & 1 \\
+ \inprod{R_1}{R_1} & \cdots & \inprod{R_1}{R_n} & 1 \\
\vdots & \ddots & \vdots & \vdots \\
- \Inprod{R_n}{R_1} & \cdots & \Inprod{R_n}{R_n} & 1 \\
+ \inprod{R_n}{R_1} & \cdots & \inprod{R_n}{R_n} & 1 \\
1 & \cdots & 1 & 0
\end{bmatrix}
\cdot
@@ -106,48 +107,50 @@ $$\begin{aligned}
\end{aligned}$$
From this, we can also see that the Lagrange multiplier
-$$\lambda = - \Inprod{R_{n+1}}{R_{n+1}}$$,
+$$\lambda = - \inprod{R_{n+1}}{R_{n+1}}$$,
where $$R_{n+1}$$ is the *predicted* residual of the next iteration,
subject to the two assumptions.
+This fact makes $$\lambda$$ a useful measure of convergence.
-However, in practice, the earlier inputs $$\rho_1$$, $$\rho_2$$, etc.
-are much further from $$\rho_*$$ than $$\rho_n$$,
-so usually only the most recent $$N\!+\!1$$ inputs $$\rho_{n - N}$$, ..., $$\rho_n$$ are used:
+In practice, the earlier inputs $$f_1$$, $$f_2$$, etc.
+are much further from $$f_\infty$$ than $$f_n$$,
+so usually only the most recent $$N\!+\!1$$ inputs $$f_{n - N}, ..., f_n$$ are used.
+This also keeps the matrix small:
$$\begin{aligned}
- \rho_{n+1}
- = \sum_{m = n-N}^n \alpha_m \rho_m
+ f_{n+1}
+ = \sum_{m = n-N}^n \alpha_m f_m
\end{aligned}$$
-You might be confused by the absence of any $$\rho_m^\mathrm{new}$$
-in the creation of $$\rho_{n+1}$$, as if the iteration's outputs are being ignored.
+You might be confused by the absence of any $$f_m^\mathrm{new}$$
+in the creation of $$f_{n+1}$$, as if the iteration's outputs are being ignored.
This is due to the first assumption,
-which states that $$\rho_n^\mathrm{new}$$ and $$\rho_n$$ are already similar,
+which states that $$f_n^\mathrm{new}$$ and $$f_n$$ are already similar,
such that they are basically interchangeable.
-Speaking of which, about those assumptions:
-while they will clearly become more accurate as $$\rho_n$$ approaches $$\rho_*$$,
-they might be very dubious in the beginning.
-A consequence of this is that the early iterations might get "trapped"
-in a suboptimal subspace spanned by $$\rho_1$$, $$\rho_2$$, etc.
-To say it another way, we would be varying $$n$$ coefficients $$\alpha_m$$
-to try to optimize a $$D$$-dimensional $$\rho_{n+1}$$,
-where in general $$D \gg n$$, at least in the beginning.
-
-There is an easy fix to this problem:
-add a small amount of the raw residual $$R_m$$
-to "nudge" $$\rho_{n+1}$$ towards the right subspace,
+Although those assumptions will clearly become more realistic as $$f_n \to f_\infty$$,
+they might be very dubious at first.
+Consequently, the early iterations may get "trapped"
+in a suboptimal subspace spanned by $$f_1$$, $$f_2$$, etc.
+Think of it like this:
+we would be varying up to $$n$$ coefficients $$\alpha_m$$
+to try to optimize a $$D$$-dimensional $$f_{n+1}$$, where usually $$D \gg n$$.
+It is almost impossible to find a decent optimum in this way!
+
+This problem is easy to fix,
+by mixing in a small amount of the raw residuals $$R_m$$
+to "nudge" $$f_{n+1}$$ towards the right subspace,
where $$\beta \in [0,1]$$ is a tunable parameter:
$$\begin{aligned}
\boxed{
- \rho_{n+1}
- = \sum_{m = N}^n \alpha_m (\rho_m + \beta R_m)
+ f_{n+1}
+ = \sum_{m = n-N}^n \alpha_m (f_m + \beta R_m)
}
\end{aligned}$$
-In other words, we end up introducing a small amount of the raw outputs $$\rho_m^\mathrm{new}$$,
-while still giving more weight to iterations with smaller residuals.
+In this way, the raw outputs $$f_m^\mathrm{new}$$ are (rightfully) included via $$R_m$$,
+but we still give more weight to iterations with smaller residuals.
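A single Pulay mixing step, as described in this hunk, can be sketched in a few lines of Python. This is an illustration under stated assumptions and not part of the commit: the vectors are real, so the inner product is a plain dot product; `solve` is a small Gaussian elimination written for this sketch; `iterate` is a hypothetical linear iteration with fixed point $$(2, 4)$$; and the caller is assumed to pass in only the stored inputs it wants to mix.

```python
def dot(u, v):
    """Real inner product <u|v> of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))

def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def pulay_step(inputs, residuals, beta):
    """Return (f_next, alphas, lam) from stored inputs f_m and residuals R_m."""
    n = len(inputs)
    # Gram matrix of the residuals, bordered by the constraint sum(alpha) = 1.
    matrix = [[dot(rm, rk) for rk in residuals] + [1.0] for rm in residuals]
    matrix.append([1.0] * n + [0.0])
    *alphas, lam = solve(matrix, [0.0] * n + [1.0])
    # f_next = sum_m alpha_m (f_m + beta R_m)
    f_next = [
        sum(a * (f[i] + beta * r[i]) for a, f, r in zip(alphas, inputs, residuals))
        for i in range(len(inputs[0]))
    ]
    return f_next, alphas, lam

def iterate(f):
    """Hypothetical linear iteration; stands in for a real solver step."""
    return [0.5 * f[0] + 1.0, 0.25 * f[1] + 3.0]

inputs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
residuals = [[new - old for new, old in zip(iterate(f), f)] for f in inputs]
f_next, alphas, lam = pulay_step(inputs, residuals, beta=0.1)
print(f_next, alphas, lam)
```

Because this toy iteration is exactly linear and three inputs span the two-dimensional space, the linearity assumption holds exactly: the predicted residual vanishes, $$\lambda \approx 0$$, and the step lands on the fixed point regardless of $$\beta$$. For a realistic nonlinear problem neither property should be expected.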
Pulay mixing is very effective for certain types of problems,
e.g. density functional theory,