From a8d31faecc733fa4d63fde58ab98a5e9d11029c2 Mon Sep 17 00:00:00 2001 From: Prefetch Date: Sun, 2 Apr 2023 16:57:12 +0200 Subject: Improve knowledge base --- .../know/concept/amplitude-rate-equations/index.md | 20 +++---- .../concept/bernstein-vazirani-algorithm/index.md | 5 +- source/know/concept/blochs-theorem/index.md | 46 ++++++---------- source/know/concept/boltzmann-relation/index.md | 16 +++--- .../concept/bose-einstein-distribution/index.md | 41 ++++++++------- .../know/concept/fermi-dirac-distribution/index.md | 41 ++++++++------- .../concept/hagen-poiseuille-equation/index.md | 45 ++++++++-------- source/know/concept/impulse-response/index.md | 61 ++++++++++++---------- source/know/concept/lagrange-multiplier/index.md | 42 +++++++-------- 9 files changed, 152 insertions(+), 165 deletions(-) (limited to 'source/know/concept') diff --git a/source/know/concept/amplitude-rate-equations/index.md b/source/know/concept/amplitude-rate-equations/index.md index 0ca3248..d5eeb0d 100644 --- a/source/know/concept/amplitude-rate-equations/index.md +++ b/source/know/concept/amplitude-rate-equations/index.md @@ -9,21 +9,17 @@ layout: "concept" --- In quantum mechanics, the **amplitude rate equations** give -the evolution of a quantum state's superposition coefficients through time. -They are known as the precursors for +the evolution of a quantum state in a time-varying potential. +Although best known as the precursors of [time-dependent perturbation theory](/know/concept/time-dependent-perturbation-theory/), -but by themselves they are exact and widely applicable. +by themselves they are exact and widely applicable. 
-Let $$\hat{H}_0$$ be a "simple" time-independent part -of the full Hamiltonian, -and $$\hat{H}_1$$ a time-varying other part, -whose contribution need not be small: +Let $$\hat{H}_0$$ be the time-independent part of the total Hamiltonian, +and $$\hat{H}_1$$ the time-varying part +(whose contribution need not be small), +so $$\hat{H}(t) = \hat{H}_0 + \hat{H}_1(t)$$. -$$\begin{aligned} - \hat{H}(t) = \hat{H}_0 + \hat{H}_1(t) -\end{aligned}$$ - -We assume that the time-independent problem +Suppose that the time-independent problem $$\hat{H}_0 \Ket{n} = E_n \Ket{n}$$ has already been solved, such that its general solution is a superposition as follows: diff --git a/source/know/concept/bernstein-vazirani-algorithm/index.md b/source/know/concept/bernstein-vazirani-algorithm/index.md index 5f224dc..884cca3 100644 --- a/source/know/concept/bernstein-vazirani-algorithm/index.md +++ b/source/know/concept/bernstein-vazirani-algorithm/index.md @@ -76,8 +76,9 @@ $$\begin{aligned} \frac{1}{\sqrt{2^N}} \sum_{x = 0}^{2^N - 1} (-1)^{s \cdot x} \Ket{x} \end{aligned}$$ -Then, thanks to the definition of the Hadamard transform, -a final set of $$H$$-gates leads us to: +Then, using the definition of the Hadamard transform +and the fact that it is its own inverse, +one final set of $$H$$-gates leads us to: $$\begin{aligned} \frac{1}{\sqrt{2^N}} \sum_{x = 0}^{2^N - 1} (-1)^{s \cdot x} \Ket{x} diff --git a/source/know/concept/blochs-theorem/index.md b/source/know/concept/blochs-theorem/index.md index 6f445f1..d7fcf90 100644 --- a/source/know/concept/blochs-theorem/index.md +++ b/source/know/concept/blochs-theorem/index.md @@ -17,85 +17,72 @@ take the following form, where the function $$u(\vb{r})$$ is periodic on the same lattice, i.e. 
$$u(\vb{r}) = u(\vb{r} + \vb{a})$$: -$$ -\begin{aligned} +$$\begin{aligned} \boxed{ \psi(\vb{r}) = u(\vb{r}) e^{i \vb{k} \cdot \vb{r}} } -\end{aligned} -$$ +\end{aligned}$$ In other words, in a periodic potential, the solutions are simply plane waves with a periodic modulation, known as **Bloch functions** or **Bloch states**. -This is suprisingly easy to prove: +This is surprisingly easy to prove: if the Hamiltonian $$\hat{H}$$ is lattice-periodic, then both $$\psi(\vb{r})$$ and $$\psi(\vb{r} + \vb{a})$$ are eigenstates with the same energy: -$$ -\begin{aligned} +$$\begin{aligned} \hat{H} \psi(\vb{r}) = E \psi(\vb{r}) \qquad \hat{H} \psi(\vb{r} + \vb{a}) = E \psi(\vb{r} + \vb{a}) -\end{aligned} -$$ +\end{aligned}$$ Now define the unitary translation operator $$\hat{T}(\vb{a})$$ such that $$\psi(\vb{r} + \vb{a}) = \hat{T}(\vb{a}) \psi(\vb{r})$$. From the previous equation, we then know that: -$$ -\begin{aligned} +$$\begin{aligned} \hat{H} \hat{T}(\vb{a}) \psi(\vb{r}) = E \hat{T}(\vb{a}) \psi(\vb{r}) = \hat{T}(\vb{a}) \big(E \psi(\vb{r})\big) = \hat{T}(\vb{a}) \hat{H} \psi(\vb{r}) -\end{aligned} -$$ +\end{aligned}$$ In other words, if $$\hat{H}$$ is lattice-periodic, then it will commute with $$\hat{T}(\vb{a})$$, i.e. $$[\hat{H}, \hat{T}(\vb{a})] = 0$$. Consequently, $$\hat{H}$$ and $$\hat{T}(\vb{a})$$ must share eigenstates $$\psi(\vb{r})$$: -$$ -\begin{aligned} +$$\begin{aligned} \hat{H} \:\psi(\vb{r}) = E \:\psi(\vb{r}) \qquad \qquad \hat{T}(\vb{a}) \:\psi(\vb{r}) = \tau \:\psi(\vb{r}) -\end{aligned} -$$ +\end{aligned}$$ Since $$\hat{T}$$ is unitary, its eigenvalues $$\tau$$ must have the form $$e^{i \theta}$$, with $$\theta$$ real. 
Therefore a translation by $$\vb{a}$$ causes a phase shift, for some vector $$\vb{k}$$: -$$ -\begin{aligned} +$$\begin{aligned} \psi(\vb{r} + \vb{a}) = \hat{T}(\vb{a}) \:\psi(\vb{r}) = e^{i \theta} \:\psi(\vb{r}) = e^{i \vb{k} \cdot \vb{a}} \:\psi(\vb{r}) -\end{aligned} -$$ +\end{aligned}$$ Let us now define the following function, keeping our arbitrary choice of $$\vb{k}$$: -$$ -\begin{aligned} +$$\begin{aligned} u(\vb{r}) - = e^{- i \vb{k} \cdot \vb{r}} \:\psi(\vb{r}) -\end{aligned} -$$ + \equiv e^{- i \vb{k} \cdot \vb{r}} \:\psi(\vb{r}) +\end{aligned}$$ As it turns out, this function is guaranteed to be lattice-periodic for any $$\vb{k}$$: -$$ -\begin{aligned} +$$\begin{aligned} u(\vb{r} + \vb{a}) &= e^{- i \vb{k} \cdot (\vb{r} + \vb{a})} \:\psi(\vb{r} + \vb{a}) \\ @@ -104,8 +91,7 @@ $$ &= e^{- i \vb{k} \cdot \vb{r}} \:\psi(\vb{r}) \\ &= u(\vb{r}) -\end{aligned} -$$ +\end{aligned}$$ Then Bloch's theorem follows from isolating the definition of $$u(\vb{r})$$ for $$\psi(\vb{r})$$. diff --git a/source/know/concept/boltzmann-relation/index.md b/source/know/concept/boltzmann-relation/index.md index b528adf..b3634f3 100644 --- a/source/know/concept/boltzmann-relation/index.md +++ b/source/know/concept/boltzmann-relation/index.md @@ -8,15 +8,16 @@ categories: layout: "concept" --- -In a plasma where the ions and electrons are both in thermal equilibrium, -and in the absence of short-lived induced electromagnetic fields, -their densities $$n_i$$ and $$n_e$$ can be predicted. +In a plasma where the ions and electrons are in thermal equilibrium, +in the absence of short-lived induced electromagnetic fields, +the densities $$n_i$$ and $$n_e$$ can be predicted. -By definition, a particle in an [electric field](/know/concept/electric-field/) $$\vb{E}$$ +By definition, a charged particle in +an [electric field](/know/concept/electric-field/) $$\vb{E} = - \nabla \phi$$ experiences a [Lorentz force](/know/concept/lorentz-force/) $$\vb{F}_e$$. 
This corresponds to a force density $$\vb{f}_e$$, such that $$\vb{F}_e = \vb{f}_e \dd{V}$$. -For the electrons, we thus have: +For electrons: $$\begin{aligned} \vb{f}_e @@ -74,10 +75,9 @@ $$\begin{aligned} But due to their large mass, ions respond much slower to fluctuations in the above equilibrium. Consequently, after a perturbation, -the ions spend more time in a transient non-equilibrium state +the ions spend more time in a non-equilibrium state than the electrons, so this formula for $$n_i$$ is only valid -if the perturbation is sufficiently slow, -such that the ions can keep up. +if the perturbation is sufficiently slow, such that the ions can keep up. Usually, electrons do not suffer the same issue, thanks to their small mass and hence fast response. diff --git a/source/know/concept/bose-einstein-distribution/index.md b/source/know/concept/bose-einstein-distribution/index.md index e420d7c..5640e69 100644 --- a/source/know/concept/bose-einstein-distribution/index.md +++ b/source/know/concept/bose-einstein-distribution/index.md @@ -11,21 +11,22 @@ layout: "concept" **Bose-Einstein statistics** describe how bosons, which do not obey the [Pauli exclusion principle](/know/concept/pauli-exclusion-principle/), -will distribute themselves across the available states +distribute themselves across the available states in a system at equilibrium. Consider a single-particle state $$s$$, which can contain any number of bosons. Since the occupation number $$N$$ is variable, -we turn to the [grand canonical ensemble](/know/concept/grand-canonical-ensemble/), -whose grand partition function $$\mathcal{Z}$$ is as follows, +we use the [grand canonical ensemble](/know/concept/grand-canonical-ensemble/), +whose grand partition function $$\mathcal{Z}$$ is as shown below, where $$\varepsilon$$ is the energy per particle, -and $$\mu$$ is the chemical potential: +and $$\mu$$ is the chemical potential. 
+We evaluate the sum in $$\mathcal{Z}$$ as a geometric series: $$\begin{aligned} \mathcal{Z} - = \sum_{N = 0}^\infty \Big( \exp(- \beta (\varepsilon - \mu)) \Big)^{N} - = \frac{1}{1 - \exp(- \beta (\varepsilon - \mu))} + = \sum_{N = 0}^\infty \Big( e^{-\beta (\varepsilon - \mu)} \Big)^{N} + = \frac{1}{1 - e^{-\beta (\varepsilon - \mu)}} \end{aligned}$$ The corresponding [thermodynamic potential](/know/concept/thermodynamic-potential/) @@ -34,41 +35,45 @@ is the Landau potential $$\Omega$$, given by: $$\begin{aligned} \Omega = - k T \ln{\mathcal{Z}} - = k T \ln\!\Big( 1 - \exp(- \beta (\varepsilon - \mu)) \Big) + = k T \ln\!\big( 1 - e^{-\beta (\varepsilon - \mu)} \big) \end{aligned}$$ -The average number of particles $$\Expval{N}$$ -is found by taking a derivative of $$\Omega$$: +The average number of particles $$\expval{N}$$ in $$s$$ +is then found by taking a derivative of $$\Omega$$: $$\begin{aligned} - \Expval{N} + \expval{N} = - \pdv{\Omega}{\mu} = k T \pdv{\ln{\mathcal{Z}}}{\mu} - = \frac{\exp(- \beta (\varepsilon - \mu))}{1 - \exp(- \beta (\varepsilon - \mu))} + = \frac{e^{-\beta (\varepsilon - \mu)}}{1 - e^{-\beta (\varepsilon - \mu)}} \end{aligned}$$ -By multitplying both the numerator and the denominator by $$\exp(\beta(\varepsilon \!-\! \mu))$$, +By multiplying both the numerator and the denominator by $$e^{\beta(\varepsilon \!-\! \mu)}$$, we arrive at the standard form of the **Bose-Einstein distribution** $$f_B$$: $$\begin{aligned} \boxed{ - \Expval{N} + \expval{N} = f_B(\varepsilon) - = \frac{1}{\exp(\beta (\varepsilon - \mu)) - 1} + = \frac{1}{e^{\beta (\varepsilon - \mu)} - 1} } \end{aligned}$$ -This tells the expected occupation number $$\Expval{N}$$ of state $$s$$, +This gives the expected occupation number $$\expval{N}$$ +of state $$s$$ with energy $$\varepsilon$$, given a temperature $$T$$ and chemical potential $$\mu$$. 
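The boxed result is easy to check numerically. The following sketch is an added illustration (not part of the referenced page), working in units where $$k = 1$$: it confirms that the direct grand-canonical average $$\sum_N N e^{-\beta N (\varepsilon - \mu)} / \mathcal{Z}$$, truncated at a large $$N$$, reproduces $$f_B$$.

```python
import math

def bose_einstein(eps, mu, T):
    """Bose-Einstein occupation f_B(eps) in units where k = 1."""
    return 1.0 / (math.exp((eps - mu) / T) - 1.0)

def occupation_from_ensemble(eps, mu, T, nmax=2000):
    """<N> computed directly from the grand canonical ensemble:
    weight of occupation N is x**N with x = exp(-beta*(eps - mu)),
    summed up to a large cutoff nmax (requires eps > mu to converge)."""
    beta = 1.0 / T
    x = math.exp(-beta * (eps - mu))
    weights = [x**n for n in range(nmax + 1)]
    Z = sum(weights)  # truncated geometric series
    return sum(n * w for n, w in enumerate(weights)) / Z

eps, mu, T = 1.0, 0.0, 0.5
print(bose_einstein(eps, mu, T))             # boxed formula
print(occupation_from_ensemble(eps, mu, T))  # direct ensemble average
```

Both printed values agree: the closed form is exactly the limit of the truncated sum.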
-The corresponding variance $$\sigma^2$$ of $$N$$ is found to be: + +{% comment %} +The corresponding variance $$\sigma^2 \equiv \expval{N^2} - \expval{N}^2$$ is found to be: $$\begin{aligned} \boxed{ \sigma^2 - = k T \pdv{\Expval{N}}{\mu} - = \Expval{N} \big(1 + \Expval{N}\big) + = k T \pdv{\expval{N}}{\mu} + = \expval{N} \big(1 + \expval{N}\!\big) } \end{aligned}$$ +{% endcomment %} diff --git a/source/know/concept/fermi-dirac-distribution/index.md b/source/know/concept/fermi-dirac-distribution/index.md index 09a3e76..2a38eb3 100644 --- a/source/know/concept/fermi-dirac-distribution/index.md +++ b/source/know/concept/fermi-dirac-distribution/index.md @@ -11,67 +11,68 @@ layout: "concept" **Fermi-Dirac statistics** describe how identical **fermions**, which obey the [Pauli exclusion principle](/know/concept/pauli-exclusion-principle/), -will distribute themselves across the available states in a system at equilibrium. +distribute themselves across the available states in a system at equilibrium. Consider one single-particle state $$s$$, which can contain $$0$$ or $$1$$ fermions. Because the occupation number $$N$$ is variable, we turn to the [grand canonical ensemble](/know/concept/grand-canonical-ensemble/), whose grand partition function $$\mathcal{Z}$$ is as follows, -where we sum over all microstates of $$s$$: +where $$\varepsilon$$ is the energy of $$s$$ +and $$\mu$$ is the chemical potential: $$\begin{aligned} \mathcal{Z} - = \sum_{N = 0}^1 \exp(- \beta N (\varepsilon - \mu)) - = 1 + \exp(- \beta (\varepsilon - \mu)) + = \sum_{N = 0}^1 \Big( e^{-\beta (\varepsilon - \mu)} \Big)^N + = 1 + e^{-\beta (\varepsilon - \mu)} \end{aligned}$$ -Where $$\mu$$ is the chemical potential, -and $$\varepsilon$$ is the energy contribution per particle in $$s$$, -i.e. the total energy of all particles $$E = \varepsilon N$$. 
- The corresponding [thermodynamic potential](/know/concept/thermodynamic-potential/) is the Landau potential $$\Omega$$, given by: $$\begin{aligned} \Omega = - k T \ln{\mathcal{Z}} - = - k T \ln\!\Big( 1 + \exp(- \beta (\varepsilon - \mu)) \Big) + = - k T \ln\!\Big( 1 + e^{-\beta (\varepsilon - \mu)} \Big) \end{aligned}$$ -The average number of particles $$\Expval{N}$$ -in state $$s$$ is then found to be as follows: +The average number of particles $$\expval{N}$$ +in $$s$$ is then found by taking a derivative of $$\Omega$$: $$\begin{aligned} - \Expval{N} + \expval{N} = - \pdv{\Omega}{\mu} = k T \pdv{\ln{\mathcal{Z}}}{\mu} - = \frac{\exp(- \beta (\varepsilon - \mu))}{1 + \exp(- \beta (\varepsilon - \mu))} + = \frac{e^{-\beta (\varepsilon - \mu)}}{1 + e^{-\beta (\varepsilon - \mu)}} \end{aligned}$$ -By multiplying both the numerator and the denominator by $$\exp(\beta (\varepsilon \!-\! \mu))$$, +By multiplying both the numerator and the denominator by $$e^{\beta (\varepsilon \!-\! \mu)}$$, we arrive at the standard form of the **Fermi-Dirac distribution** or **Fermi function** $$f_F$$: $$\begin{aligned} \boxed{ - \Expval{N} + \expval{N} = f_F(\varepsilon) - = \frac{1}{\exp(\beta (\varepsilon - \mu)) + 1} + = \frac{1}{e^{\beta (\varepsilon - \mu)} + 1} } \end{aligned}$$ -This tells the expected occupation number $$\Expval{N}$$ of state $$s$$, +This gives the expected occupation number $$\expval{N}$$ +of state $$s$$ with energy $$\varepsilon$$, given a temperature $$T$$ and chemical potential $$\mu$$. 
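As a quick numerical sanity check of the boxed formula (a sketch added here, not from the original page, in units where $$k = 1$$): because $$s$$ holds at most one fermion, $$\mathcal{Z}$$ has only two terms, and $$f_F$$ always lies strictly between $$0$$ and $$1$$, equaling $$1/2$$ at $$\varepsilon = \mu$$.

```python
import math

def fermi_dirac(eps, mu, T):
    """Fermi-Dirac occupation f_F(eps) in units where k = 1."""
    return 1.0 / (math.exp((eps - mu) / T) + 1.0)

def occupation_from_ensemble(eps, mu, T):
    """<N> from the two-microstate grand partition function Z = 1 + x."""
    x = math.exp(-(eps - mu) / T)  # Boltzmann weight of the N = 1 state
    return x / (1.0 + x)

mu, T = 1.0, 0.1
for eps in (0.5, 1.0, 1.5):
    f = fermi_dirac(eps, mu, T)
    assert 0.0 < f < 1.0  # Pauli exclusion: at most one fermion per state
    assert abs(f - occupation_from_ensemble(eps, mu, T)) < 1e-12

print(fermi_dirac(1.0, 1.0, 0.1))  # exactly 1/2 at eps = mu
```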
-The corresponding variance $$\sigma^2$$ of $$N$$ is found to be: + +{% comment %} +The corresponding variance $$\sigma^2 \equiv \expval{N^2} - \expval{N}^2$$ is found to be: $$\begin{aligned} \boxed{ \sigma^2 - = k T \pdv{\Expval{N}}{\mu} - = \Expval{N} \big(1 - \Expval{N}\big) + = k T \pdv{\expval{N}}{\mu} + = \expval{N} \big(1 - \expval{N}\big) } \end{aligned}$$ +{% endcomment %} diff --git a/source/know/concept/hagen-poiseuille-equation/index.md b/source/know/concept/hagen-poiseuille-equation/index.md index 6484631..52d3ce8 100644 --- a/source/know/concept/hagen-poiseuille-equation/index.md +++ b/source/know/concept/hagen-poiseuille-equation/index.md @@ -11,9 +11,8 @@ layout: "concept" The **Hagen-Poiseuille equation**, or simply the **Poiseuille equation**, describes the flow of a fluid with nonzero [viscosity](/know/concept/viscosity/) -through a cylindrical pipe. -Due to its viscosity, the fluid clings to the sides, -limiting the amount that can pass through, for a pipe with radius $$R$$. +through a cylindrical pipe: the fluid clings to the sides, +limiting the amount that can pass through per unit time. Consider the [Navier-Stokes equations](/know/concept/navier-stokes-equations/) of an incompressible fluid with spatially uniform density $$\rho$$. @@ -27,13 +26,12 @@ $$\begin{aligned} \nabla \cdot \va{v} = 0 \end{aligned}$$ -Into this, we insert the ansatz $$\va{v} = \vu{e}_z \: v_z(r)$$, -where $$\vu{e}_z$$ is the $$z$$-axis' unit vector. -In other words, we assume that the flow velocity depends only on $$r$$; -not on $$\phi$$ or $$z$$. -Plugging this into the Navier-Stokes equations, -$$\nabla \cdot \va{v}$$ is trivially zero, -and in the other equation we multiply out $$\rho$$, yielding this, +Let the pipe have radius $$R$$, and be infinitely long and parallel to the $$z$$-axis. 
+We insert the ansatz $$\va{v} = \vu{e}_z \: v_z(r)$$, +where $$\vu{e}_z$$ is the $$z$$-axis' unit vector, +and we are assuming that the flow depends only on $$r$$, not on $$\phi$$ or $$z$$. +With this, $$\nabla \cdot \va{v}$$ trivially vanishes, +and in the main equation multiplying out $$\rho$$ yields this, where $$\eta = \rho \nu$$ is the dynamic viscosity: $$\begin{aligned} @@ -56,7 +54,7 @@ $$\begin{aligned} = - G \end{aligned}$$ -The former equation, for $$p(z)$$, is easy to solve. +The former equation for $$p(z)$$ is easy to solve. We get an integration constant $$p(0)$$: $$\begin{aligned} @@ -64,13 +62,12 @@ $$\begin{aligned} = p(0) - G z \end{aligned}$$ -This gives meaning to the **pressure gradient** $$G$$: -for a pipe of length $$L$$, -it describes the pressure difference $$\Delta p = p(0) - p(L)$$ -that is driving the fluid, -i.e. $$G = \Delta p / L$$ +This gives meaning to $$G$$: it is the **pressure gradient**, +which for a pipe of length $$L$$ +describes the pressure difference $$\Delta p = p(0) - p(L)$$ +that is driving the fluid, i.e. $$G = \Delta p / L$$. -As for the latter equation, for $$v_z(r)$$, +As for the latter equation for $$v_z(r)$$, we start by integrating it once, introducing a constant $$A$$: $$\begin{aligned} @@ -148,8 +145,8 @@ $$\begin{aligned} = \pi R^2 L G \end{aligned}$$ -We would like to get rid of $$G$$ for being impractical, -so we substitute $$R^2 G = 8 \eta \Expval{v_z}$$, yielding: +$$G$$ is an inconvenient quantity here, so we remove it +by substituting $$R^2 G = 8 \eta \Expval{v_z}$$: $$\begin{aligned} \boxed{ @@ -159,8 +156,8 @@ $$\begin{aligned} \end{aligned}$$ Due to this drag, the pressure difference $$\Delta p = p(0) - p(L)$$ -does work on the fluid, at a rate $$P$$, -since power equals force (i.e. pressure times area) times velocity: +does work on the fluid at a rate $$P$$. +Since power equals force (i.e. 
pressure times area) times velocity: $$\begin{aligned} P @@ -179,14 +176,14 @@ $$\begin{aligned} = D \Expval{v_z} \end{aligned}$$ -In conclusion, the power $$P$$, -needed to drive a fluid through the pipe at a rate $$Q$$, -is given by: +In conclusion, the power $$P$$ needed to drive a fluid +through the pipe at a rate $$Q$$ is given by: $$\begin{aligned} \boxed{ P = 8 \pi \eta L \Expval{v_z}^2 + = \frac{8 \eta L}{\pi R^4} Q^2 } \end{aligned}$$ diff --git a/source/know/concept/impulse-response/index.md b/source/know/concept/impulse-response/index.md index 661ed3f..8210f5c 100644 --- a/source/know/concept/impulse-response/index.md +++ b/source/know/concept/impulse-response/index.md @@ -8,68 +8,75 @@ categories: layout: "concept" --- -The **impulse response** $$u_p(t)$$ of a system whose behaviour is described -by a linear operator $$\hat{L}$$, is defined as the reponse of the system +Given a system whose behaviour is described by a linear operator $$\hat{L}$$, +its **impulse response** $$u_\delta(t)$$ is defined as the system's response when forced by the [Dirac delta function](/know/concept/dirac-delta-function/) $$\delta(t)$$: $$\begin{aligned} \boxed{ - \hat{L} \{ u_p(t) \} = \delta(t) + \hat{L} \{ u_\delta(t) \} + = \delta(t) } \end{aligned}$$ -This can be used to find the response $$u(t)$$ of $$\hat{L}$$ to -*any* forcing function $$f(t)$$, i.e. 
not only $$\delta(t)$$, -by simply taking the convolution with $$u_p(t)$$: +This can be used to find the response $$u(t)$$ of $$\hat{L}$$ +to *any* forcing function $$f(t)$$, +by simply taking the convolution with $$u_\delta(t)$$: $$\begin{aligned} - \hat{L} \{ u(t) \} = f(t) + \hat{L} \{ u(t) \} + = f(t) \quad \implies \quad \boxed{ - u(t) = (f * u_p)(t) + u(t) + = (f * u_\delta)(t) } \end{aligned}$$ {% include proof/start.html id="proof-theorem" -%} -Starting from the definition of $$u_p(t)$$, +Starting from the definition of $$u_\delta(t)$$, we shift the argument by some constant $$\tau$$, -and multiply both sides by the constant $$f(\tau)$$: +and multiply both sides by $$f(\tau)$$: $$\begin{aligned} - \hat{L} \{ u_p(t - \tau) \} &= \delta(t - \tau) + \hat{L} \{ u_\delta(t - \tau) \} + &= \delta(t - \tau) \\ - \hat{L} \{ f(\tau) \: u_p(t - \tau) \} &= f(\tau) \: \delta(t - \tau) + \hat{L} \{ f(\tau) \: u_\delta(t - \tau) \} + &= f(\tau) \: \delta(t - \tau) \end{aligned}$$ -Where $$f(\tau)$$ can be moved inside using the -linearity of $$\hat{L}$$. Integrating over $$\tau$$ then gives us: +Where $$f(\tau)$$ was moved inside thanks to the linearity of $$\hat{L}$$. 
+Integrating over $$\tau$$ gives us: $$\begin{aligned} - \int_0^\infty \hat{L} \{ f(\tau) \: u_p(t - \tau) \} \dd{\tau} + \int_0^\infty \hat{L} \{ f(\tau) \: u_\delta(t - \tau) \} \dd{\tau} &= \int_0^\infty f(\tau) \: \delta(t - \tau) \dd{\tau} = f(t) \end{aligned}$$ -The integral and $$\hat{L}$$ are operators of different variables, so we reorder them: +The integral and $$\hat{L}$$ are operators of different variables, so we reorder them, +and recognize that the resulting integral is a convolution: $$\begin{aligned} - \hat{L} \int_0^\infty f(\tau) \: u_p(t - \tau) \dd{\tau} - &= (f * u_p)(t) = \hat{L}\{ u(t) \} = f(t) + f(t) + &= \hat{L} \int_0^\infty f(\tau) \: u_\delta(t - \tau) \dd{\tau} + = \hat{L} \Big\{ (f * u_\delta)(t) \Big\} \end{aligned}$$ + +Because $$\hat{L} \{ u(t) \} = f(t)$$ by definition, +we then see that $$(f * u_\delta)(t) = u(t)$$. {% include proof/end.html id="proof-theorem" %} This is useful for solving initial value problems, -because any initial condition can be satisfied -due to the linearity of $$\hat{L}$$, -by choosing the initial values of the homogeneous solution $$\hat{L}\{ u_h(t) \} = 0$$ -such that the total solution $$(f * u_p)(t) + u_h(t)$$ -has the desired values. - -Meanwhile, for boundary value problems, -the related [fundamental solution](/know/concept/fundamental-solution/) -is preferable. +because any initial condition can be satisfied thanks to linearity, +by choosing the initial values of the homogeneous solution $$\hat{L}\{ u_0(t) \} = 0$$ +such that the total solution $$(f * u_\delta)(t) + u_0(t)$$ has the desired values. + +For boundary value problems, there is the related concept of +a [fundamental solution](/know/concept/fundamental-solution/). 
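The convolution property can be made concrete with a small numerical sketch (my own illustration, not from the page): take the hypothetical first-order system $$\hat{L}\{u\} = u' + a u$$ with $$a = 2$$, whose impulse response is the known decay $$u_\delta(t) = e^{-a t}$$ for $$t \ge 0$$. Convolving $$u_\delta$$ with a unit-step forcing should reproduce the textbook solution $$u(t) = (1 - e^{-a t}) / a$$.

```python
import math

# Hypothetical example system: L{u} = u'(t) + a*u(t), with a = 2,
# whose impulse response is u_delta(t) = exp(-a*t) for t >= 0.
a = 2.0
dt = 1e-4
n = 20000  # simulate t in [0, 2]
t = [i * dt for i in range(n)]

u_delta = [math.exp(-a * ti) for ti in t]  # impulse response samples
f = [1.0] * n                              # forcing: unit step H(t)

def convolve_at(k):
    """Discrete version of u(t) = integral of f(tau) u_delta(t - tau) dtau."""
    return sum(f[j] * u_delta[k - j] for j in range(k + 1)) * dt

t_check = 1.0
numeric = convolve_at(int(t_check / dt))
exact = (1.0 - math.exp(-a * t_check)) / a  # known solution of u' + a*u = H(t)
print(numeric, exact)  # these should agree up to discretization error ~dt
```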
diff --git a/source/know/concept/lagrange-multiplier/index.md b/source/know/concept/lagrange-multiplier/index.md index 9fb61a8..6b5e3fc 100644 --- a/source/know/concept/lagrange-multiplier/index.md +++ b/source/know/concept/lagrange-multiplier/index.md @@ -14,18 +14,18 @@ a function $$f$$ subject to **equality constraints**. For example, in 2D, the goal is to maximize/minimize $$f(x, y)$$ while satisfying $$g(x, y) = 0$$. We assume that $$f$$ and $$g$$ are both continuous -and have continuous first derivatives, -and that their domain is all of $$\mathbb{R}$$. +and have continuous first derivatives +on all of $$\mathbb{R}^2$$. -Side note: many authors write that Lagrange multipliers +Note: many authors write that Lagrange multipliers can be used for constraints of the form $$g(x, y) = c$$ for a constant $$c$$. -However, this method technically requires $$c = 0$$. -This issue is easy to solve: given $$g = c$$, +Actually, the method requires $$c = 0$$, +but this issue is easy to solve: given $$g = c$$, simply define $$\tilde{g} \equiv g - c = 0$$ and use that as constraint instead. -Before introducing $$g$$, -optimizing $$f$$ comes down to finding its stationary points: +So, we want to optimize $$f$$. +If we ignore $$g$$, that just means finding its stationary points: $$\begin{aligned} 0 @@ -36,20 +36,18 @@ $$\begin{aligned} This problem is easy: the two dimensions can be handled independently, so all we need to do is find the roots of the partial derivatives. -However, adding $$g$$ makes the problem much more complicated: +However, a constraint $$g = 0$$ makes the problem much more complicated: points with $$\nabla f = 0$$ might not satisfy $$g = 0$$, and points where $$g = 0$$ might not have $$\nabla f = 0$$. The dimensions also cannot be handled independently anymore, -since they are implicitly related by $$g$$. +since they are implicitly related via $$g$$. Imagine a contour plot of $$g(x, y)$$. 
The trick is this: if we follow a contour of $$g = 0$$, the highest and lowest values of $$f$$ along the way are the desired local extrema. -Recall our assumption that $$\nabla f$$ is continuous: -hence *along our contour* $$f$$ is slowly-varying -in the close vicinity of each such point, -and stationary at the point itself. +At each such extremum, $$f$$ must be stationary from the contour's point of view, +and slowly-varying in its close vicinity since $$\nabla f$$ is continuous. We thus have two categories of extrema: 1. $$\nabla f = 0$$ there, @@ -57,7 +55,7 @@ We thus have two categories of extrema: In other words, a stationary point of $$f$$ coincidentally lies on a contour of $$g = 0$$. -2. The contours of $$f$$ and $$g$$ are parallel around the point. +2. The contours of $$f$$ and $$g$$ are parallel at the point. By definition, $$f$$ is stationary along each of its contours, so when we find that $$f$$ is stationary at a point on our $$g = 0$$ path, it means we touched a contour of $$f$$. @@ -83,7 +81,7 @@ $$\begin{aligned} Where $$\lambda$$ is the **Lagrange multiplier** that quantifies the difference in magnitude between the gradients. By setting $$\lambda = 0$$, this equation also handles the 1st category $$\nabla f = 0$$. -Some authors define $$\lambda$$ with the opposite sign. +Note that some authors define $$\lambda$$ with the opposite sign. The method of Lagrange multipliers uses these facts to rewrite a constrained $$N$$-dimensional optimization problem @@ -97,8 +95,7 @@ $$\begin{aligned} } \end{aligned}$$ -Let us do an unconstrained optimization of $$\mathcal{L}$$ as usual, -by demanding it is stationary: +Look what happens when we do an unconstrained optimization of $$\mathcal{L}$$ in the usual way: $$\begin{aligned} 0 @@ -110,14 +107,11 @@ $$\begin{aligned} The last item in this vector represents $$g = 0$$, and the others $$\nabla f = -\lambda \nabla g$$ as discussed earlier. 
-To solve this equation,
-we assign $$\lambda$$ a value that agrees with it
-(such a value exists for each local extremum
-according to our above discussion of the two categories),
-and then find the locations $$(x, y)$$ that satisfy it.
-However, as usual for optimization problems,
+When this unconstrained problem is solved using standard methods,
+the resulting solutions also satisfy the constrained problem.
+However, as usual in the field of optimization,
 this method only finds *local* extrema *and* saddle points;
-it is a necessary condition for optimality, but not sufficient.
+it represents a necessary condition for optimality, but not a sufficient one.
 
 We often assign $$\lambda$$ an algebraic expression rather than a value,
 usually without even bothering to calculate its final actual value.
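As a concrete illustration of the machinery (a hand-made example, not taken from the page): maximize $$f(x, y) = x y$$ subject to $$g(x, y) = x + y - 1 = 0$$. The stationarity conditions of $$\mathcal{L} = f + \lambda g$$ form a small linear system that can be solved by hand and verified in a few lines.

```python
# Maximize f(x, y) = x*y subject to g(x, y) = x + y - 1 = 0,
# using the Lagrangian L = f + lam*g = x*y + lam*(x + y - 1).

def grad_L(x, y, lam):
    """Gradient of the Lagrangian with respect to (x, y, lam)."""
    return (y + lam,      # dL/dx   = df/dx + lam * dg/dx
            x + lam,      # dL/dy   = df/dy + lam * dg/dy
            x + y - 1.0)  # dL/dlam = g, i.e. the constraint itself

# The first two equations give x = y = -lam; substituting into the
# constraint yields -2*lam = 1, so the stationary point is:
x, y, lam = 0.5, 0.5, -0.5
assert grad_L(x, y, lam) == (0.0, 0.0, 0.0)
print(x, y, lam)
```

Note how the constraint reappears automatically as the $$\lambda$$-component of $$\nabla \mathcal{L} = 0$$, and that at the solution $$\nabla f = (1/2, 1/2)$$ is indeed parallel to $$\nabla g = (1, 1)$$, with $$\lambda = -1/2$$ quantifying the ratio.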