From 16555851b6514a736c5c9d8e73de7da7fc9b6288 Mon Sep 17 00:00:00 2001 From: Prefetch Date: Thu, 20 Oct 2022 18:25:31 +0200 Subject: Migrate from 'jekyll-katex' to 'kramdown-math-sskatex' --- source/know/concept/ritz-method/index.md | 138 +++++++++++++++---------------- 1 file changed, 69 insertions(+), 69 deletions(-) (limited to 'source/know/concept/ritz-method/index.md') diff --git a/source/know/concept/ritz-method/index.md b/source/know/concept/ritz-method/index.md index ff1faae..902b7cf 100644 --- a/source/know/concept/ritz-method/index.md +++ b/source/know/concept/ritz-method/index.md @@ -28,10 +28,10 @@ $$\begin{aligned} = \frac{1}{S} \int_a^b p(x) \big|u_x(x)\big|^2 - q(x) \big|u(x)\big|^2 \dd{x} \end{aligned}$$ -Where $u(x) \in \mathbb{C}$ is the unknown function, -and $p(x), q(x) \in \mathbb{R}$ are given. -In addition, $S$ is the norm of $u$, which we demand be constant -with respect to a weight function $w(x) \in \mathbb{R}$: +Where $$u(x) \in \mathbb{C}$$ is the unknown function, +and $$p(x), q(x) \in \mathbb{R}$$ are given. +In addition, $$S$$ is the norm of $$u$$, which we demand be constant +with respect to a weight function $$w(x) \in \mathbb{R}$$: $$\begin{aligned} S @@ -39,8 +39,8 @@ $$\begin{aligned} \end{aligned}$$ To handle this normalization requirement, -we introduce a [Lagrange multiplier](/know/concept/lagrange-multiplier/) $\lambda$, -and define the Lagrangian $\Lambda$ for the full constrained optimization problem as: +we introduce a [Lagrange multiplier](/know/concept/lagrange-multiplier/) $$\lambda$$, +and define the Lagrangian $$\Lambda$$ for the full constrained optimization problem as: $$\begin{aligned} \Lambda @@ -64,13 +64,13 @@ $$\begin{aligned} \end{aligned}$$ This has the familiar form of a [Sturm-Liouville problem](/know/concept/sturm-liouville-theory/) (SLP), -with $\lambda$ representing an eigenvalue. +with $$\lambda$$ representing an eigenvalue. 
SLPs have useful properties, but before we can take advantage of those, -we need to handle an important detail: the boundary conditions (BCs) on $u$. +we need to handle an important detail: the boundary conditions (BCs) on $$u$$. The above equation is only a valid SLP for certain BCs, as seen in the derivation of Sturm-Liouville theory. -Let us return to the definition of $R[u]$, +Let us return to the definition of $$R[u]$$, and integrate it by parts: $$\begin{aligned} @@ -81,7 +81,7 @@ $$\begin{aligned} \end{aligned}$$ The boundary term vanishes for a subset of the BCs that make a valid SLP, -including Dirichlet BCs $u(a) = u(b) = 0$, Neumann BCs $u_x(a) = u_x(b) = 0$, and periodic BCs. +including Dirichlet BCs $$u(a) = u(b) = 0$$, Neumann BCs $$u_x(a) = u_x(b) = 0$$, and periodic BCs. Therefore, we assume that this term does indeed vanish, such that we can use Sturm-Liouville theory later: @@ -91,14 +91,14 @@ $$\begin{aligned} \equiv - \frac{1}{S} \int_a^b u^* \hat{H} u \dd{x} \end{aligned}$$ -Where $\hat{H}$ is the self-adjoint Sturm-Liouville operator. +Where $$\hat{H}$$ is the self-adjoint Sturm-Liouville operator. Because the constrained Euler-Lagrange equation is now an SLP, -we know that it has an infinite number of real discrete eigenvalues $\lambda_n$ with a lower bound, -corresponding to mutually orthogonal eigenfunctions $u_n(x)$. +we know that it has an infinite number of real discrete eigenvalues $$\lambda_n$$ with a lower bound, +corresponding to mutually orthogonal eigenfunctions $$u_n(x)$$. To understand the significance of this result, suppose we have solved the SLP, -and now insert one of the eigenfunctions $u_n$ into $R$: +and now insert one of the eigenfunctions $$u_n$$ into $$R$$: $$\begin{aligned} R[u_n] @@ -109,9 +109,9 @@ $$\begin{aligned} = \frac{S_n}{S_n} \lambda_n \end{aligned}$$ -Where $S_n$ is the normalization of $u_n$. 
-In other words, when given $u_n$, -the functional $R$ yields the corresponding eigenvalue $\lambda_n$: +Where $$S_n$$ is the normalization of $$u_n$$. +In other words, when given $$u_n$$, +the functional $$R$$ yields the corresponding eigenvalue $$\lambda_n$$: $$\begin{aligned} \boxed{ @@ -120,15 +120,15 @@ $$\begin{aligned} } \end{aligned}$$ -This powerful result was not at all clear from $R$'s initial definition. +This powerful result was not at all clear from $$R$$'s initial definition. ## Justification -But what if we do not know the eigenfunctions? Is $R$ still useful? -Yes, as we shall see. Suppose we make an educated guess $u(x)$ -for the ground state (i.e. lowest-eigenvalue) solution $u_0(x)$: +But what if we do not know the eigenfunctions? Is $$R$$ still useful? +Yes, as we shall see. Suppose we make an educated guess $$u(x)$$ +for the ground state (i.e. lowest-eigenvalue) solution $$u_0(x)$$: $$\begin{aligned} u(x) @@ -136,10 +136,10 @@ $$\begin{aligned} \end{aligned}$$ Here, we are using the fact that the eigenfunctions of an SLP form a complete set, -so our (known) guess $u$ can be expanded in the true (unknown) eigenfunctions $u_n$. -We are assuming that $u$ is already quite close to its target $u_0$, -such that the (unknown) expansion coefficients $c_n$ are small; -specifically $|c_n|^2 \ll 1$. +so our (known) guess $$u$$ can be expanded in the true (unknown) eigenfunctions $$u_n$$. +We are assuming that $$u$$ is already quite close to its target $$u_0$$, +such that the (unknown) expansion coefficients $$c_n$$ are small; +specifically $$|c_n|^2 \ll 1$$. Let us start from what we know: $$\begin{aligned} @@ -150,8 +150,8 @@ $$\begin{aligned} \end{aligned}$$ This quantity is known as the **Rayleigh quotient**. 
-Inserting our ansatz $u$, -and using that the true $u_n$ have corresponding eigenvalues $\lambda_n$: +Inserting our ansatz $$u$$, +and using that the true $$u_n$$ have corresponding eigenvalues $$\lambda_n$$: $$\begin{aligned} R[u] @@ -176,8 +176,8 @@ $$\begin{aligned} + \sum_{n = 1}^\infty c_n \Inprod{u_0}{w u_n} + \sum_{m n} c_n c_m^* \Inprod{u_m}{w u_n}} \end{aligned}$$ -Using orthogonality $\Inprod{u_m}{w u_n} = S_n \delta_{mn}$, -and the fact that $n \neq 0$ by definition, we find: +Using orthogonality $$\Inprod{u_m}{w u_n} = S_n \delta_{mn}$$, +and the fact that $$n \neq 0$$ by definition, we find: $$\begin{aligned} R @@ -191,7 +191,7 @@ $$\begin{aligned} {\displaystyle S_0 + \sum_{n} |c_n|^2 S_n} \end{aligned}$$ -It is always possible to choose our normalizations such that $S_n = S$ for all $u_n$, leaving: +It is always possible to choose our normalizations such that $$S_n = S$$ for all $$u_n$$, leaving: $$\begin{aligned} R @@ -211,9 +211,9 @@ $$\begin{aligned} {\displaystyle 1 + \sum_{n} |c_n|^2} \end{aligned}$$ -Thus, if we improve our guess $u$, -then $R[u]$ approaches the true eigenvalue $\lambda_0$. -For numerically finding $u_0$ and $\lambda_0$, this gives us a clear goal: minimize $R$, because: +Thus, if we improve our guess $$u$$, +then $$R[u]$$ approaches the true eigenvalue $$\lambda_0$$. +For numerically finding $$u_0$$ and $$\lambda_0$$, this gives us a clear goal: minimize $$R$$, because: $$\begin{aligned} \boxed{ @@ -228,19 +228,19 @@ In the context of quantum mechanics, this is not surprising, since any superposition of multiple states is guaranteed to have a higher energy than the ground state. -Note that the convergence to $\lambda_0$ goes as $|c_n|^2$, -while $u$ converges to $u_0$ as $|c_n|$ by definition, -so even a fairly bad guess $u$ will give a decent estimate for $\lambda_0$. 
+Note that the convergence to $$\lambda_0$$ goes as $$|c_n|^2$$, +while $$u$$ converges to $$u_0$$ as $$|c_n|$$ by definition, +so even a fairly bad guess $$u$$ will give a decent estimate for $$\lambda_0$$. ## The method In the following, we stick to Dirac notation, -since the results hold for both continuous functions $u(x)$ and discrete vectors $\vb{u}$, -as long as the operator $\hat{H}$ is self-adjoint. -Suppose we express our guess $\Ket{u}$ as a linear combination -of *known* basis vectors $\Ket{f_n}$ with weights $a_n \in \mathbb{C}$: +since the results hold for both continuous functions $$u(x)$$ and discrete vectors $$\vb{u}$$, +as long as the operator $$\hat{H}$$ is self-adjoint. +Suppose we express our guess $$\Ket{u}$$ as a linear combination +of *known* basis vectors $$\Ket{f_n}$$ with weights $$a_n \in \mathbb{C}$$: $$\begin{aligned} \Ket{u} @@ -249,16 +249,16 @@ $$\begin{aligned} \approx \sum_{n = 0}^{N - 1} a_n \Ket{f_n} \end{aligned}$$ -For numerical tractability, we truncate the sum at $N$ terms, -and for generality, we allow $\Ket{f_n}$ to be non-orthogonal, -as described by an *overlap matrix* with elements $S_{mn}$: +For numerical tractability, we truncate the sum at $$N$$ terms, +and for generality, we allow $$\Ket{f_n}$$ to be non-orthogonal, +as described by an *overlap matrix* with elements $$S_{mn}$$: $$\begin{aligned} \Inprod{f_m}{w f_n} = S_{m n} \end{aligned}$$ From the discussion above, -we know that the ground-state eigenvalue $\lambda_0$ is estimated by: +we know that the ground-state eigenvalue $$\lambda_0$$ is estimated by: $$\begin{aligned} \lambda_0 @@ -269,8 +269,8 @@ $$\begin{aligned} \equiv \frac{\displaystyle \sum_{m n} a_m^* a_n H_{m n}}{\displaystyle \sum_{m n} a_m^* a_n S_{mn}} \end{aligned}$$ -And we also know that our goal is to minimize $R[u]$, -so we vary $a_k^*$ to find its extremum: +And we also know that our goal is to minimize $$R[u]$$, +so we vary $$a_k^*$$ to find its extremum: $$\begin{aligned} 0 @@ -283,7 
+283,7 @@ $$\begin{aligned} = \frac{\displaystyle \sum_{n} a_n \big(H_{k n} - \lambda S_{k n}\big)}{\Inprod{u}{w u}} \end{aligned}$$ -Clearly, this is only satisfied if the following holds for all $k = 0, 1, ..., N\!-\!1$: +Clearly, this is only satisfied if the following holds for all $$k = 0, 1, ..., N\!-\!1$$: $$\begin{aligned} 0 @@ -292,7 +292,7 @@ $$\begin{aligned} For illustrative purposes, we can write this as a matrix equation -with $M_{k n} \equiv H_{k n} - \lambda S_{k n}$: +with $$M_{k n} \equiv H_{k n} - \lambda S_{k n}$$: $$\begin{aligned} \begin{bmatrix} @@ -311,10 +311,10 @@ $$\begin{aligned} \end{bmatrix} \end{aligned}$$ -Note that this looks like an eigenvalue problem for $\lambda$. -Indeed, demanding that $\overline{M}$ cannot simply be inverted +Note that this looks like an eigenvalue problem for $$\lambda$$. +Indeed, demanding that $$\overline{M}$$ cannot simply be inverted (i.e. the solution is non-trivial) -yields a characteristic polynomial for $\lambda$: +yields a characteristic polynomial for $$\lambda$$: $$\begin{aligned} 0 @@ -322,39 +322,39 @@ $$\begin{aligned} = \det\!\Big[ \overline{H} - \lambda \overline{S} \Big] \end{aligned}$$ -This gives a set of $\lambda$, -which are the exact eigenvalues of $\overline{H}$, -and the estimated eigenvalues of $\hat{H}$ -(recall that $\overline{H}$ is $\hat{H}$ expressed in a truncated basis). -The eigenvector $\big[ a_0, a_1, ..., a_{N-1} \big]$ of the lowest $\lambda$ -gives the optimal weights to approximate $\Ket{u_0}$ in the basis $\{\Ket{f_n}\}$. -Likewise, the higher $\lambda$'s eigenvectors approximate -excited (i.e. non-ground) eigenstates of $\hat{H}$, +This gives a set of $$\lambda$$, +which are the exact eigenvalues of $$\overline{H}$$, +and the estimated eigenvalues of $$\hat{H}$$ +(recall that $$\overline{H}$$ is $$\hat{H}$$ expressed in a truncated basis). 
+The eigenvector $$\big[ a_0, a_1, ..., a_{N-1} \big]$$ of the lowest $$\lambda$$ +gives the optimal weights to approximate $$\Ket{u_0}$$ in the basis $$\{\Ket{f_n}\}$$. +Likewise, the higher $$\lambda$$'s eigenvectors approximate +excited (i.e. non-ground) eigenstates of $$\hat{H}$$, although in practice the results are less accurate the higher we go. The overall accuracy is determined by how good our truncated basis is, i.e. how large a subspace it spans -of the [Hilbert space](/know/concept/hilbert-space/) in which the true $\Ket{u_0}$ resides. +of the [Hilbert space](/know/concept/hilbert-space/) in which the true $$\Ket{u_0}$$ resides. Clearly, adding more basis vectors will improve the results, at the cost of computation. -For example, if $\hat{H}$ represents a helium atom, -a good choice for $\{\Ket{f_n}\}$ would be hydrogen orbitals, +For example, if $$\hat{H}$$ represents a helium atom, +a good choice for $$\{\Ket{f_n}\}$$ would be hydrogen orbitals, since those are qualitatively similar. You may find this result unsurprising; -it makes some intuitive sense that approximating $\hat{H}$ -in a limited basis would yield a matrix $\overline{H}$ giving rough eigenvalues. +it makes some intuitive sense that approximating $$\hat{H}$$ +in a limited basis would yield a matrix $$\overline{H}$$ giving rough eigenvalues. The point of this discussion is to rigorously show the validity of this approach. If we only care about the ground state, -then we already know $\lambda$ from $R[u]$, -so all we need to do is solve the above matrix equation for $a_n$. -Keep in mind that $\overline{M}$ is singular, -and $a_n$ are only defined up to a constant factor. +then we already know $$\lambda$$ from $$R[u]$$, +so all we need to do is solve the above matrix equation for $$a_n$$. +Keep in mind that $$\overline{M}$$ is singular, +and $$a_n$$ are only defined up to a constant factor. 
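The whole procedure can be sketched numerically with a generalized eigensolver. The toy operator and basis here are my own assumptions (not from the text): $$\hat{H} = -\mathrm{d}^2/\mathrm{d}x^2$$ on $$[0, \pi]$$ with Dirichlet BCs and $$w = 1$$, whose true eigenvalues are $$1, 4, 9, ...$$, expanded in the non-orthogonal basis $$f_n(x) = x^{n+1} (\pi - x)$$:

```python
import numpy as np
from scipy.linalg import eigh

# Toy setup (an assumption for illustration): H = -d^2/dx^2 on [0, pi]
# with u(0) = u(pi) = 0 and w = 1; true eigenvalues are 1, 4, 9, ...
# Non-orthogonal basis f_n(x) = x^(n+1) (pi - x), which satisfies the BCs.
N = 4
x = np.linspace(0.0, np.pi, 20_001)
dx = x[1] - x[0]
trapz = lambda y: (np.sum(y) - 0.5 * (y[0] + y[-1])) * dx  # trapezoid rule

f  = [x**(n + 1) * (np.pi - x) for n in range(N)]
fp = [x**n * ((n + 1) * np.pi - (n + 2) * x) for n in range(N)]  # f_n'

# H_{mn} = <f_m| H f_n> = int f_m' f_n' dx  (after integration by parts),
# S_{mn} = <f_m| w f_n> = int f_m f_n dx    (the overlap matrix).
H = np.array([[trapz(fp[m] * fp[n]) for n in range(N)] for m in range(N)])
S = np.array([[trapz(f[m] * f[n]) for n in range(N)] for m in range(N)])

# Generalized eigenvalue problem det(H - lambda S) = 0:
evals, evecs = eigh(H, S)
print(evals[:2])   # first two Ritz estimates, close to the true 1 and 4
```

As the text predicts, the lowest eigenvalue is an excellent estimate, the next is decent, and the highest values (here the estimates of 9 and 16) are noticeably less accurate.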
Nowadays, there exist many other methods to calculate eigenvalues -of complicated operators $\hat{H}$, +of complicated operators $$\hat{H}$$, but an attractive feature of the Ritz method is that it is single-step, whereas its competitors tend to be iterative. That said, the Ritz method cannot recover from a poorly chosen basis.
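The ground-state back-solve mentioned above (solving the singular system $$\overline{M} \vb{a} = 0$$ for a known $$\lambda$$) amounts to a null-space computation. A minimal sketch with hypothetical 2x2 matrices, purely for illustration:

```python
import numpy as np
from scipy.linalg import null_space

# Hypothetical toy matrices (an assumption, not from the text): with an
# orthonormal basis the overlap matrix S is the identity, and the lowest
# eigenvalue of this H is lambda = 1.
H = np.array([[2.0, 1.0],
              [1.0, 2.0]])
S = np.eye(2)
lam = 1.0                      # known lowest eigenvalue, e.g. from R[u]
M = H - lam * S                # singular by construction
a = null_space(M)[:, 0]        # one vector spanning the null space of M
print(a)                       # proportional to [1, -1]; sign/scale arbitrary
```

The weights `a` come out normalized by `null_space`, but as the text notes, they are only defined up to a constant factor.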