From a8d31faecc733fa4d63fde58ab98a5e9d11029c2 Mon Sep 17 00:00:00 2001 From: Prefetch Date: Sun, 2 Apr 2023 16:57:12 +0200 Subject: Improve knowledge base --- .../know/concept/amplitude-rate-equations/index.md | 20 +++---- .../concept/bernstein-vazirani-algorithm/index.md | 5 +- source/know/concept/blochs-theorem/index.md | 46 ++++++---------- source/know/concept/boltzmann-relation/index.md | 16 +++--- .../concept/bose-einstein-distribution/index.md | 41 ++++++++------- .../know/concept/fermi-dirac-distribution/index.md | 41 ++++++++------- .../concept/hagen-poiseuille-equation/index.md | 45 ++++++++-------- source/know/concept/impulse-response/index.md | 61 ++++++++++++---------- source/know/concept/lagrange-multiplier/index.md | 42 +++++++-------- 9 files changed, 152 insertions(+), 165 deletions(-) (limited to 'source/know/concept') diff --git a/source/know/concept/amplitude-rate-equations/index.md b/source/know/concept/amplitude-rate-equations/index.md index 0ca3248..d5eeb0d 100644 --- a/source/know/concept/amplitude-rate-equations/index.md +++ b/source/know/concept/amplitude-rate-equations/index.md @@ -9,21 +9,17 @@ layout: "concept" --- In quantum mechanics, the **amplitude rate equations** give -the evolution of a quantum state's superposition coefficients through time. -They are known as the precursors for +the evolution of a quantum state in a time-varying potential. +Although best known as the precursors of [time-dependent perturbation theory](/know/concept/time-dependent-perturbation-theory/), -but by themselves they are exact and widely applicable. +by themselves they are exact and widely applicable. 
-Let $$\hat{H}_0$$ be a "simple" time-independent part -of the full Hamiltonian, -and $$\hat{H}_1$$ a time-varying other part, -whose contribution need not be small: +Let $$\hat{H}_0$$ be the time-independent part of the total Hamiltonian, +and $$\hat{H}_1$$ the time-varying part +(whose contribution need not be small), +so $$\hat{H}(t) = \hat{H}_0 + \hat{H}_1(t)$$. -$$\begin{aligned} - \hat{H}(t) = \hat{H}_0 + \hat{H}_1(t) -\end{aligned}$$ - -We assume that the time-independent problem +Suppose that the time-independent problem $$\hat{H}_0 \Ket{n} = E_n \Ket{n}$$ has already been solved, such that its general solution is a superposition as follows: diff --git a/source/know/concept/bernstein-vazirani-algorithm/index.md b/source/know/concept/bernstein-vazirani-algorithm/index.md index 5f224dc..884cca3 100644 --- a/source/know/concept/bernstein-vazirani-algorithm/index.md +++ b/source/know/concept/bernstein-vazirani-algorithm/index.md @@ -76,8 +76,9 @@ $$\begin{aligned} \frac{1}{\sqrt{2^N}} \sum_{x = 0}^{2^N - 1} (-1)^{s \cdot x} \Ket{x} \end{aligned}$$ -Then, thanks to the definition of the Hadamard transform, -a final set of $$H$$-gates leads us to: +Then, using the definition of the Hadamard transform +and the fact that it is its own inverse, +one final set of $$H$$-gates leads us to: $$\begin{aligned} \frac{1}{\sqrt{2^N}} \sum_{x = 0}^{2^N - 1} (-1)^{s \cdot x} \Ket{x} diff --git a/source/know/concept/blochs-theorem/index.md b/source/know/concept/blochs-theorem/index.md index 6f445f1..d7fcf90 100644 --- a/source/know/concept/blochs-theorem/index.md +++ b/source/know/concept/blochs-theorem/index.md @@ -17,85 +17,72 @@ take the following form, where the function $$u(\vb{r})$$ is periodic on the same lattice, i.e. 
$$u(\vb{r}) = u(\vb{r} + \vb{a})$$: -$$ -\begin{aligned} +$$\begin{aligned} \boxed{ \psi(\vb{r}) = u(\vb{r}) e^{i \vb{k} \cdot \vb{r}} } -\end{aligned} -$$ +\end{aligned}$$ In other words, in a periodic potential, the solutions are simply plane waves with a periodic modulation, known as **Bloch functions** or **Bloch states**. -This is suprisingly easy to prove: +This is surprisingly easy to prove: if the Hamiltonian $$\hat{H}$$ is lattice-periodic, then both $$\psi(\vb{r})$$ and $$\psi(\vb{r} + \vb{a})$$ are eigenstates with the same energy: -$$ -\begin{aligned} +$$\begin{aligned} \hat{H} \psi(\vb{r}) = E \psi(\vb{r}) \qquad \hat{H} \psi(\vb{r} + \vb{a}) = E \psi(\vb{r} + \vb{a}) -\end{aligned} -$$ +\end{aligned}$$ Now define the unitary translation operator $$\hat{T}(\vb{a})$$ such that $$\psi(\vb{r} + \vb{a}) = \hat{T}(\vb{a}) \psi(\vb{r})$$. From the previous equation, we then know that: -$$ -\begin{aligned} +$$\begin{aligned} \hat{H} \hat{T}(\vb{a}) \psi(\vb{r}) = E \hat{T}(\vb{a}) \psi(\vb{r}) = \hat{T}(\vb{a}) \big(E \psi(\vb{r})\big) = \hat{T}(\vb{a}) \hat{H} \psi(\vb{r}) -\end{aligned} -$$ +\end{aligned}$$ In other words, if $$\hat{H}$$ is lattice-periodic, then it will commute with $$\hat{T}(\vb{a})$$, i.e. $$[\hat{H}, \hat{T}(\vb{a})] = 0$$. Consequently, $$\hat{H}$$ and $$\hat{T}(\vb{a})$$ must share eigenstates $$\psi(\vb{r})$$: -$$ -\begin{aligned} +$$\begin{aligned} \hat{H} \:\psi(\vb{r}) = E \:\psi(\vb{r}) \qquad \qquad \hat{T}(\vb{a}) \:\psi(\vb{r}) = \tau \:\psi(\vb{r}) -\end{aligned} -$$ +\end{aligned}$$ Since $$\hat{T}$$ is unitary, its eigenvalues $$\tau$$ must have the form $$e^{i \theta}$$, with $$\theta$$ real. 
Therefore a translation by $$\vb{a}$$ causes a phase shift, for some vector $$\vb{k}$$: -$$ -\begin{aligned} +$$\begin{aligned} \psi(\vb{r} + \vb{a}) = \hat{T}(\vb{a}) \:\psi(\vb{r}) = e^{i \theta} \:\psi(\vb{r}) = e^{i \vb{k} \cdot \vb{a}} \:\psi(\vb{r}) -\end{aligned} -$$ +\end{aligned}$$ Let us now define the following function, keeping our arbitrary choice of $$\vb{k}$$: -$$ -\begin{aligned} +$$\begin{aligned} u(\vb{r}) - = e^{- i \vb{k} \cdot \vb{r}} \:\psi(\vb{r}) -\end{aligned} -$$ + \equiv e^{- i \vb{k} \cdot \vb{r}} \:\psi(\vb{r}) +\end{aligned}$$ As it turns out, this function is guaranteed to be lattice-periodic for any $$\vb{k}$$: -$$ -\begin{aligned} +$$\begin{aligned} u(\vb{r} + \vb{a}) &= e^{- i \vb{k} \cdot (\vb{r} + \vb{a})} \:\psi(\vb{r} + \vb{a}) \\ @@ -104,8 +91,7 @@ $$ &= e^{- i \vb{k} \cdot \vb{r}} \:\psi(\vb{r}) \\ &= u(\vb{r}) -\end{aligned} -$$ +\end{aligned}$$ Then Bloch's theorem follows from isolating the definition of $$u(\vb{r})$$ for $$\psi(\vb{r})$$. diff --git a/source/know/concept/boltzmann-relation/index.md b/source/know/concept/boltzmann-relation/index.md index b528adf..b3634f3 100644 --- a/source/know/concept/boltzmann-relation/index.md +++ b/source/know/concept/boltzmann-relation/index.md @@ -8,15 +8,16 @@ categories: layout: "concept" --- -In a plasma where the ions and electrons are both in thermal equilibrium, -and in the absence of short-lived induced electromagnetic fields, -their densities $$n_i$$ and $$n_e$$ can be predicted. +In a plasma where the ions and electrons are in thermal equilibrium, +in the absence of short-lived induced electromagnetic fields, +the densities $$n_i$$ and $$n_e$$ can be predicted. -By definition, a particle in an [electric field](/know/concept/electric-field/) $$\vb{E}$$ +By definition, a charged particle in +an [electric field](/know/concept/electric-field/) $$\vb{E} = - \nabla \phi$$ experiences a [Lorentz force](/know/concept/lorentz-force/) $$\vb{F}_e$$. 
This corresponds to a force density $$\vb{f}_e$$, such that $$\vb{F}_e = \vb{f}_e \dd{V}$$. -For the electrons, we thus have: +For electrons: $$\begin{aligned} \vb{f}_e @@ -74,10 +75,9 @@ $$\begin{aligned} But due to their large mass, ions respond much slower to fluctuations in the above equilibrium. Consequently, after a perturbation, -the ions spend more time in a transient non-equilibrium state +the ions spend more time in a non-equilibrium state than the electrons, so this formula for $$n_i$$ is only valid -if the perturbation is sufficiently slow, -such that the ions can keep up. +if the perturbation is sufficiently slow, such that the ions can keep up. Usually, electrons do not suffer the same issue, thanks to their small mass and hence fast response. diff --git a/source/know/concept/bose-einstein-distribution/index.md b/source/know/concept/bose-einstein-distribution/index.md index e420d7c..5640e69 100644 --- a/source/know/concept/bose-einstein-distribution/index.md +++ b/source/know/concept/bose-einstein-distribution/index.md @@ -11,21 +11,22 @@ layout: "concept" **Bose-Einstein statistics** describe how bosons, which do not obey the [Pauli exclusion principle](/know/concept/pauli-exclusion-principle/), -will distribute themselves across the available states +distribute themselves across the available states in a system at equilibrium. Consider a single-particle state $$s$$, which can contain any number of bosons. Since the occupation number $$N$$ is variable, -we turn to the [grand canonical ensemble](/know/concept/grand-canonical-ensemble/), -whose grand partition function $$\mathcal{Z}$$ is as follows, +we use the [grand canonical ensemble](/know/concept/grand-canonical-ensemble/), +whose grand partition function $$\mathcal{Z}$$ is as shown below, where $$\varepsilon$$ is the energy per particle, -and $$\mu$$ is the chemical potential: +and $$\mu$$ is the chemical potential. 
+We evaluate the sum in $$\mathcal{Z}$$ as a geometric series: $$\begin{aligned} \mathcal{Z} - = \sum_{N = 0}^\infty \Big( \exp(- \beta (\varepsilon - \mu)) \Big)^{N} - = \frac{1}{1 - \exp(- \beta (\varepsilon - \mu))} + = \sum_{N = 0}^\infty \Big( e^{-\beta (\varepsilon - \mu)} \Big)^{N} + = \frac{1}{1 - e^{-\beta (\varepsilon - \mu)}} \end{aligned}$$ The corresponding [thermodynamic potential](/know/concept/thermodynamic-potential/) @@ -34,41 +35,45 @@ is the Landau potential $$\Omega$$, given by: $$\begin{aligned} \Omega = - k T \ln{\mathcal{Z}} - = k T \ln\!\Big( 1 - \exp(- \beta (\varepsilon - \mu)) \Big) + = k T \ln\!\big( 1 - e^{-\beta (\varepsilon - \mu)} \big) \end{aligned}$$ -The average number of particles $$\Expval{N}$$ -is found by taking a derivative of $$\Omega$$: +The average number of particles $$\expval{N}$$ in $$s$$ +is then found by taking a derivative of $$\Omega$$: $$\begin{aligned} - \Expval{N} + \expval{N} = - \pdv{\Omega}{\mu} = k T \pdv{\ln{\mathcal{Z}}}{\mu} - = \frac{\exp(- \beta (\varepsilon - \mu))}{1 - \exp(- \beta (\varepsilon - \mu))} + = \frac{e^{-\beta (\varepsilon - \mu)}}{1 - e^{-\beta (\varepsilon - \mu)}} \end{aligned}$$ -By multitplying both the numerator and the denominator by $$\exp(\beta(\varepsilon \!-\! \mu))$$, +By multiplying both the numerator and the denominator by $$e^{\beta(\varepsilon \!-\! \mu)}$$, we arrive at the standard form of the **Bose-Einstein distribution** $$f_B$$: $$\begin{aligned} \boxed{ - \Expval{N} + \expval{N} = f_B(\varepsilon) - = \frac{1}{\exp(\beta (\varepsilon - \mu)) - 1} + = \frac{1}{e^{\beta (\varepsilon - \mu)} - 1} } \end{aligned}$$ -This tells the expected occupation number $$\Expval{N}$$ of state $$s$$, +This gives the expected occupation number $$\expval{N}$$ +of state $$s$$ with energy $$\varepsilon$$, given a temperature $$T$$ and chemical potential $$\mu$$. 
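The boxed result is easy to check numerically. The following sketch is an added illustration (not part of the referenced page), working in units where $$k = 1$$: it confirms that the direct grand-canonical average $$\sum_N N e^{-\beta N (\varepsilon - \mu)} / \mathcal{Z}$$, truncated at a large $$N$$, reproduces $$f_B$$.

```python
import math

def bose_einstein(eps, mu, T):
    """Bose-Einstein occupation f_B(eps) in units where k = 1."""
    return 1.0 / (math.exp((eps - mu) / T) - 1.0)

def occupation_from_ensemble(eps, mu, T, nmax=2000):
    """<N> computed directly from the grand canonical ensemble:
    weight of occupation N is x**N with x = exp(-beta*(eps - mu)),
    summed up to a large cutoff nmax (requires eps > mu to converge)."""
    beta = 1.0 / T
    x = math.exp(-beta * (eps - mu))
    weights = [x**n for n in range(nmax + 1)]
    Z = sum(weights)  # truncated geometric series
    return sum(n * w for n, w in enumerate(weights)) / Z

eps, mu, T = 1.0, 0.0, 0.5
print(bose_einstein(eps, mu, T))             # boxed formula
print(occupation_from_ensemble(eps, mu, T))  # direct ensemble average
```

Both printed values agree: the closed form is exactly the limit of the truncated sum.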
-The corresponding variance $$\sigma^2$$ of $$N$$ is found to be: + +{% comment %} +The corresponding variance $$\sigma^2 \equiv \expval{N^2} - \expval{N}^2$$ is found to be: $$\begin{aligned} \boxed{ \sigma^2 - = k T \pdv{\Expval{N}}{\mu} - = \Expval{N} \big(1 + \Expval{N}\big) + = k T \pdv{\expval{N}}{\mu} + = \expval{N} \big(1 + \expval{N}\!\big) } \end{aligned}$$ +{% endcomment %} diff --git a/source/know/concept/fermi-dirac-distribution/index.md b/source/know/concept/fermi-dirac-distribution/index.md index 09a3e76..2a38eb3 100644 --- a/source/know/concept/fermi-dirac-distribution/index.md +++ b/source/know/concept/fermi-dirac-distribution/index.md @@ -11,67 +11,68 @@ layout: "concept" **Fermi-Dirac statistics** describe how identical **fermions**, which obey the [Pauli exclusion principle](/know/concept/pauli-exclusion-principle/), -will distribute themselves across the available states in a system at equilibrium. +distribute themselves across the available states in a system at equilibrium. Consider one single-particle state $$s$$, which can contain $$0$$ or $$1$$ fermions. Because the occupation number $$N$$ is variable, we turn to the [grand canonical ensemble](/know/concept/grand-canonical-ensemble/), whose grand partition function $$\mathcal{Z}$$ is as follows, -where we sum over all microstates of $$s$$: +where $$\varepsilon$$ is the energy of $$s$$ +and $$\mu$$ is the chemical potential: $$\begin{aligned} \mathcal{Z} - = \sum_{N = 0}^1 \exp(- \beta N (\varepsilon - \mu)) - = 1 + \exp(- \beta (\varepsilon - \mu)) + = \sum_{N = 0}^1 \Big( e^{-\beta (\varepsilon - \mu)} \Big)^N + = 1 + e^{-\beta (\varepsilon - \mu)} \end{aligned}$$ -Where $$\mu$$ is the chemical potential, -and $$\varepsilon$$ is the energy contribution per particle in $$s$$, -i.e. the total energy of all particles $$E = \varepsilon N$$. 
- The corresponding [thermodynamic potential](/know/concept/thermodynamic-potential/) is the Landau potential $$\Omega$$, given by: $$\begin{aligned} \Omega = - k T \ln{\mathcal{Z}} - = - k T \ln\!\Big( 1 + \exp(- \beta (\varepsilon - \mu)) \Big) + = - k T \ln\!\Big( 1 + e^{-\beta (\varepsilon - \mu)} \Big) \end{aligned}$$ -The average number of particles $$\Expval{N}$$ -in state $$s$$ is then found to be as follows: +The average number of particles $$\expval{N}$$ +in $$s$$ is then found by taking a derivative of $$\Omega$$: $$\begin{aligned} - \Expval{N} + \expval{N} = - \pdv{\Omega}{\mu} = k T \pdv{\ln{\mathcal{Z}}}{\mu} - = \frac{\exp(- \beta (\varepsilon - \mu))}{1 + \exp(- \beta (\varepsilon - \mu))} + = \frac{e^{-\beta (\varepsilon - \mu)}}{1 + e^{-\beta (\varepsilon - \mu)}} \end{aligned}$$ -By multiplying both the numerator and the denominator by $$\exp(\beta (\varepsilon \!-\! \mu))$$, +By multiplying both the numerator and the denominator by $$e^{\beta (\varepsilon \!-\! \mu)}$$, we arrive at the standard form of the **Fermi-Dirac distribution** or **Fermi function** $$f_F$$: $$\begin{aligned} \boxed{ - \Expval{N} + \expval{N} = f_F(\varepsilon) - = \frac{1}{\exp(\beta (\varepsilon - \mu)) + 1} + = \frac{1}{e^{\beta (\varepsilon - \mu)} + 1} } \end{aligned}$$ -This tells the expected occupation number $$\Expval{N}$$ of state $$s$$, +This gives the expected occupation number $$\expval{N}$$ +of state $$s$$ with energy $$\varepsilon$$, given a temperature $$T$$ and chemical potential $$\mu$$. 
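As a quick numerical sanity check of the boxed formula (a sketch added here, not from the original page, in units where $$k = 1$$): because $$s$$ holds at most one fermion, $$\mathcal{Z}$$ has only two terms, and $$f_F$$ always lies strictly between $$0$$ and $$1$$, equaling $$1/2$$ at $$\varepsilon = \mu$$.

```python
import math

def fermi_dirac(eps, mu, T):
    """Fermi-Dirac occupation f_F(eps) in units where k = 1."""
    return 1.0 / (math.exp((eps - mu) / T) + 1.0)

def occupation_from_ensemble(eps, mu, T):
    """<N> from the two-microstate grand partition function Z = 1 + x."""
    x = math.exp(-(eps - mu) / T)  # Boltzmann weight of the N = 1 state
    return x / (1.0 + x)

mu, T = 1.0, 0.1
for eps in (0.5, 1.0, 1.5):
    f = fermi_dirac(eps, mu, T)
    assert 0.0 < f < 1.0  # Pauli exclusion: at most one fermion per state
    assert abs(f - occupation_from_ensemble(eps, mu, T)) < 1e-12

print(fermi_dirac(1.0, 1.0, 0.1))  # exactly 1/2 at eps = mu
```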
-The corresponding variance $$\sigma^2$$ of $$N$$ is found to be: + +{% comment %} +The corresponding variance $$\sigma^2 \equiv \expval{N^2} - \expval{N}^2$$ is found to be: $$\begin{aligned} \boxed{ \sigma^2 - = k T \pdv{\Expval{N}}{\mu} - = \Expval{N} \big(1 - \Expval{N}\big) + = k T \pdv{\expval{N}}{\mu} + = \expval{N} \big(1 - \expval{N}\big) } \end{aligned}$$ +{% endcomment %} diff --git a/source/know/concept/hagen-poiseuille-equation/index.md b/source/know/concept/hagen-poiseuille-equation/index.md index 6484631..52d3ce8 100644 --- a/source/know/concept/hagen-poiseuille-equation/index.md +++ b/source/know/concept/hagen-poiseuille-equation/index.md @@ -11,9 +11,8 @@ layout: "concept" The **Hagen-Poiseuille equation**, or simply the **Poiseuille equation**, describes the flow of a fluid with nonzero [viscosity](/know/concept/viscosity/) -through a cylindrical pipe. -Due to its viscosity, the fluid clings to the sides, -limiting the amount that can pass through, for a pipe with radius $$R$$. +through a cylindrical pipe: the fluid clings to the sides, +limiting the amount that can pass through per unit time. Consider the [Navier-Stokes equations](/know/concept/navier-stokes-equations/) of an incompressible fluid with spatially uniform density $$\rho$$. @@ -27,13 +26,12 @@ $$\begin{aligned} \nabla \cdot \va{v} = 0 \end{aligned}$$ -Into this, we insert the ansatz $$\va{v} = \vu{e}_z \: v_z(r)$$, -where $$\vu{e}_z$$ is the $$z$$-axis' unit vector. -In other words, we assume that the flow velocity depends only on $$r$$; -not on $$\phi$$ or $$z$$. -Plugging this into the Navier-Stokes equations, -$$\nabla \cdot \va{v}$$ is trivially zero, -and in the other equation we multiply out $$\rho$$, yielding this, +Let the pipe have radius $$R$$, and be infinitely long and parallel to the $$z$$-axis. 
+We insert the ansatz $$\va{v} = \vu{e}_z \: v_z(r)$$, +where $$\vu{e}_z$$ is the $$z$$-axis' unit vector, +and we are assuming that the flow depends only on $$r$$, not on $$\phi$$ or $$z$$. +With this, $$\nabla \cdot \va{v}$$ trivially vanishes, +and in the main equation multiplying out $$\rho$$ yields this, where $$\eta = \rho \nu$$ is the dynamic viscosity: $$\begin{aligned} @@ -56,7 +54,7 @@ $$\begin{aligned} = - G \end{aligned}$$ -The former equation, for $$p(z)$$, is easy to solve. +The former equation for $$p(z)$$ is easy to solve. We get an integration constant $$p(0)$$: $$\begin{aligned} @@ -64,13 +62,12 @@ $$\begin{aligned} = p(0) - G z \end{aligned}$$ -This gives meaning to the **pressure gradient** $$G$$: -for a pipe of length $$L$$, -it describes the pressure difference $$\Delta p = p(0) - p(L)$$ -that is driving the fluid, -i.e. $$G = \Delta p / L$$ +This gives meaning to $$G$$: it is the **pressure gradient**, +which for a pipe of length $$L$$ +describes the pressure difference $$\Delta p = p(0) - p(L)$$ +that is driving the fluid, i.e. $$G = \Delta p / L$$. -As for the latter equation, for $$v_z(r)$$, +As for the latter equation for $$v_z(r)$$, we start by integrating it once, introducing a constant $$A$$: $$\begin{aligned} @@ -148,8 +145,8 @@ $$\begin{aligned} = \pi R^2 L G \end{aligned}$$ -We would like to get rid of $$G$$ for being impractical, -so we substitute $$R^2 G = 8 \eta \Expval{v_z}$$, yielding: +$$G$$ is an inconvenient quantity here, so we remove it +by substituting $$R^2 G = 8 \eta \Expval{v_z}$$: $$\begin{aligned} \boxed{ @@ -159,8 +156,8 @@ $$\begin{aligned} \end{aligned}$$ Due to this drag, the pressure difference $$\Delta p = p(0) - p(L)$$ -does work on the fluid, at a rate $$P$$, -since power equals force (i.e. pressure times area) times velocity: +does work on the fluid at a rate $$P$$. +Since power equals force (i.e. 
pressure times area) times velocity: $$\begin{aligned} P @@ -179,14 +176,14 @@ $$\begin{aligned} = D \Expval{v_z} \end{aligned}$$ -In conclusion, the power $$P$$, -needed to drive a fluid through the pipe at a rate $$Q$$, -is given by: +In conclusion, the power $$P$$ needed to drive a fluid +through the pipe at a rate $$Q$$ is given by: $$\begin{aligned} \boxed{ P = 8 \pi \eta L \Expval{v_z}^2 + = \frac{8 \eta L}{\pi R^4} Q^2 } \end{aligned}$$ diff --git a/source/know/concept/impulse-response/index.md b/source/know/concept/impulse-response/index.md index 661ed3f..8210f5c 100644 --- a/source/know/concept/impulse-response/index.md +++ b/source/know/concept/impulse-response/index.md @@ -8,68 +8,75 @@ categories: layout: "concept" --- -The **impulse response** $$u_p(t)$$ of a system whose behaviour is described -by a linear operator $$\hat{L}$$, is defined as the reponse of the system +Given a system whose behaviour is described by a linear operator $$\hat{L}$$, +its **impulse response** $$u_\delta(t)$$ is defined as the system's response when forced by the [Dirac delta function](/know/concept/dirac-delta-function/) $$\delta(t)$$: $$\begin{aligned} \boxed{ - \hat{L} \{ u_p(t) \} = \delta(t) + \hat{L} \{ u_\delta(t) \} + = \delta(t) } \end{aligned}$$ -This can be used to find the response $$u(t)$$ of $$\hat{L}$$ to -*any* forcing function $$f(t)$$, i.e. 
not only $$\delta(t)$$, -by simply taking the convolution with $$u_p(t)$$: +This can be used to find the response $$u(t)$$ of $$\hat{L}$$ +to *any* forcing function $$f(t)$$, +by simply taking the convolution with $$u_\delta(t)$$: $$\begin{aligned} - \hat{L} \{ u(t) \} = f(t) + \hat{L} \{ u(t) \} + = f(t) \quad \implies \quad \boxed{ - u(t) = (f * u_p)(t) + u(t) + = (f * u_\delta)(t) } \end{aligned}$$ {% include proof/start.html id="proof-theorem" -%} -Starting from the definition of $$u_p(t)$$, +Starting from the definition of $$u_\delta(t)$$, we shift the argument by some constant $$\tau$$, -and multiply both sides by the constant $$f(\tau)$$: +and multiply both sides by $$f(\tau)$$: $$\begin{aligned} - \hat{L} \{ u_p(t - \tau) \} &= \delta(t - \tau) + \hat{L} \{ u_\delta(t - \tau) \} + &= \delta(t - \tau) \\ - \hat{L} \{ f(\tau) \: u_p(t - \tau) \} &= f(\tau) \: \delta(t - \tau) + \hat{L} \{ f(\tau) \: u_\delta(t - \tau) \} + &= f(\tau) \: \delta(t - \tau) \end{aligned}$$ -Where $$f(\tau)$$ can be moved inside using the -linearity of $$\hat{L}$$. Integrating over $$\tau$$ then gives us: +Where $$f(\tau)$$ was moved inside thanks to the linearity of $$\hat{L}$$. 
+Integrating over $$\tau$$ gives us: $$\begin{aligned} - \int_0^\infty \hat{L} \{ f(\tau) \: u_p(t - \tau) \} \dd{\tau} + \int_0^\infty \hat{L} \{ f(\tau) \: u_\delta(t - \tau) \} \dd{\tau} &= \int_0^\infty f(\tau) \: \delta(t - \tau) \dd{\tau} = f(t) \end{aligned}$$ -The integral and $$\hat{L}$$ are operators of different variables, so we reorder them: +The integral and $$\hat{L}$$ are operators of different variables, so we reorder them, +and recognize that the resulting integral is a convolution: $$\begin{aligned} - \hat{L} \int_0^\infty f(\tau) \: u_p(t - \tau) \dd{\tau} - &= (f * u_p)(t) = \hat{L}\{ u(t) \} = f(t) + f(t) + &= \hat{L} \int_0^\infty f(\tau) \: u_\delta(t - \tau) \dd{\tau} + = \hat{L} \Big\{ (f * u_\delta)(t) \Big\} \end{aligned}$$ + +Because $$\hat{L} \{ u(t) \} = f(t)$$ by definition, +we then see that $$(f * u_\delta)(t) = u(t)$$. {% include proof/end.html id="proof-theorem" %} This is useful for solving initial value problems, -because any initial condition can be satisfied -due to the linearity of $$\hat{L}$$, -by choosing the initial values of the homogeneous solution $$\hat{L}\{ u_h(t) \} = 0$$ -such that the total solution $$(f * u_p)(t) + u_h(t)$$ -has the desired values. - -Meanwhile, for boundary value problems, -the related [fundamental solution](/know/concept/fundamental-solution/) -is preferable. +because any initial condition can be satisfied thanks to linearity, +by choosing the initial values of the homogeneous solution $$\hat{L}\{ u_0(t) \} = 0$$ +such that the total solution $$(f * u_\delta)(t) + u_0(t)$$ has the desired values. + +For boundary value problems, there is the related concept of +a [fundamental solution](/know/concept/fundamental-solution/). 
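The convolution property can be made concrete with a small numerical sketch (my own illustration, not from the page): take the hypothetical first-order system $$\hat{L}\{u\} = u' + a u$$ with $$a = 2$$, whose impulse response is the known decay $$u_\delta(t) = e^{-a t}$$ for $$t \ge 0$$. Convolving $$u_\delta$$ with a unit-step forcing should reproduce the textbook solution $$u(t) = (1 - e^{-a t}) / a$$.

```python
import math

# Hypothetical example system: L{u} = u'(t) + a*u(t), with a = 2,
# whose impulse response is u_delta(t) = exp(-a*t) for t >= 0.
a = 2.0
dt = 1e-4
n = 20000  # simulate t in [0, 2]
t = [i * dt for i in range(n)]

u_delta = [math.exp(-a * ti) for ti in t]  # impulse response samples
f = [1.0] * n                              # forcing: unit step H(t)

def convolve_at(k):
    """Discrete version of u(t) = integral of f(tau) u_delta(t - tau) dtau."""
    return sum(f[j] * u_delta[k - j] for j in range(k + 1)) * dt

t_check = 1.0
numeric = convolve_at(int(t_check / dt))
exact = (1.0 - math.exp(-a * t_check)) / a  # known solution of u' + a*u = H(t)
print(numeric, exact)  # these should agree up to discretization error ~dt
```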
diff --git a/source/know/concept/lagrange-multiplier/index.md b/source/know/concept/lagrange-multiplier/index.md index 9fb61a8..6b5e3fc 100644 --- a/source/know/concept/lagrange-multiplier/index.md +++ b/source/know/concept/lagrange-multiplier/index.md @@ -14,18 +14,18 @@ a function $$f$$ subject to **equality constraints**. For example, in 2D, the goal is to maximize/minimize $$f(x, y)$$ while satisfying $$g(x, y) = 0$$. We assume that $$f$$ and $$g$$ are both continuous -and have continuous first derivatives, -and that their domain is all of $$\mathbb{R}$$. +and have continuous first derivatives +on all of $$\mathbb{R}^2$$. -Side note: many authors write that Lagrange multipliers +Note: many authors write that Lagrange multipliers can be used for constraints of the form $$g(x, y) = c$$ for a constant $$c$$. -However, this method technically requires $$c = 0$$. -This issue is easy to solve: given $$g = c$$, +Actually, the method requires $$c = 0$$, +but this issue is easy to solve: given $$g = c$$, simply define $$\tilde{g} \equiv g - c = 0$$ and use that as constraint instead. -Before introducing $$g$$, -optimizing $$f$$ comes down to finding its stationary points: +So, we want to optimize $$f$$. +If we ignore $$g$$, that just means finding its stationary points: $$\begin{aligned} 0 @@ -36,20 +36,18 @@ $$\begin{aligned} This problem is easy: the two dimensions can be handled independently, so all we need to do is find the roots of the partial derivatives. -However, adding $$g$$ makes the problem much more complicated: +However, a constraint $$g = 0$$ makes the problem much more complicated: points with $$\nabla f = 0$$ might not satisfy $$g = 0$$, and points where $$g = 0$$ might not have $$\nabla f = 0$$. The dimensions also cannot be handled independently anymore, -since they are implicitly related by $$g$$. +since they are implicitly related via $$g$$. Imagine a contour plot of $$g(x, y)$$. 
The trick is this: if we follow a contour of $$g = 0$$, the highest and lowest values of $$f$$ along the way are the desired local extrema. -Recall our assumption that $$\nabla f$$ is continuous: -hence *along our contour* $$f$$ is slowly-varying -in the close vicinity of each such point, -and stationary at the point itself. +At each such extremum, $$f$$ must be stationary from the contour's point of view, +and slowly-varying in its close vicinity since $$\nabla f$$ is continuous. We thus have two categories of extrema: 1. $$\nabla f = 0$$ there, @@ -57,7 +55,7 @@ We thus have two categories of extrema: In other words, a stationary point of $$f$$ coincidentally lies on a contour of $$g = 0$$. -2. The contours of $$f$$ and $$g$$ are parallel around the point. +2. The contours of $$f$$ and $$g$$ are parallel at the point. By definition, $$f$$ is stationary along each of its contours, so when we find that $$f$$ is stationary at a point on our $$g = 0$$ path, it means we touched a contour of $$f$$. @@ -83,7 +81,7 @@ $$\begin{aligned} Where $$\lambda$$ is the **Lagrange multiplier** that quantifies the difference in magnitude between the gradients. By setting $$\lambda = 0$$, this equation also handles the 1st category $$\nabla f = 0$$. -Some authors define $$\lambda$$ with the opposite sign. +Note that some authors define $$\lambda$$ with the opposite sign. The method of Lagrange multipliers uses these facts to rewrite a constrained $$N$$-dimensional optimization problem @@ -97,8 +95,7 @@ $$\begin{aligned} } \end{aligned}$$ -Let us do an unconstrained optimization of $$\mathcal{L}$$ as usual, -by demanding it is stationary: +Look what happens when we do an unconstrained optimization of $$\mathcal{L}$$ in the usual way: $$\begin{aligned} 0 @@ -110,14 +107,11 @@ $$\begin{aligned} The last item in this vector represents $$g = 0$$, and the others $$\nabla f = -\lambda \nabla g$$ as discussed earlier. 
-To solve this equation,
-we assign $$\lambda$$ a value that agrees with it
-(such a value exists for each local extremum
-according to our above discussion of the two categories),
-and then find the locations $$(x, y)$$ that satisfy it.
-However, as usual for optimization problems,
+When this unconstrained problem is solved using standard methods,
+the resulting solutions also satisfy the constrained problem.
+However, as usual in the field of optimization,
 this method only finds *local* extrema *and* saddle points;
-it is a necessary condition for optimality, but not sufficient.
+it represents a necessary condition for optimality, but not a sufficient one.
 
 We often assign $$\lambda$$ an algebraic expression rather than a value,
 usually without even bothering to calculate its final actual value.
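As a concrete illustration of the machinery (a hand-made example, not taken from the page): maximize $$f(x, y) = x y$$ subject to $$g(x, y) = x + y - 1 = 0$$. The stationarity conditions of $$\mathcal{L} = f + \lambda g$$ form a small linear system that can be solved by hand and verified in a few lines.

```python
# Maximize f(x, y) = x*y subject to g(x, y) = x + y - 1 = 0,
# using the Lagrangian L = f + lam*g = x*y + lam*(x + y - 1).

def grad_L(x, y, lam):
    """Gradient of the Lagrangian with respect to (x, y, lam)."""
    return (y + lam,      # dL/dx   = df/dx + lam * dg/dx
            x + lam,      # dL/dy   = df/dy + lam * dg/dy
            x + y - 1.0)  # dL/dlam = g, i.e. the constraint itself

# The first two equations give x = y = -lam; substituting into the
# constraint yields -2*lam = 1, so the stationary point is:
x, y, lam = 0.5, 0.5, -0.5
assert grad_L(x, y, lam) == (0.0, 0.0, 0.0)
print(x, y, lam)
```

Note how the constraint reappears automatically as the $$\lambda$$-component of $$\nabla \mathcal{L} = 0$$, and that at the solution $$\nabla f = (1/2, 1/2)$$ is indeed parallel to $$\nabla g = (1, 1)$$, with $$\lambda = -1/2$$ quantifying the ratio.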