author      Prefetch    2022-10-20 18:25:31 +0200
committer   Prefetch    2022-10-20 18:25:31 +0200
commit      16555851b6514a736c5c9d8e73de7da7fc9b6288 (patch)
tree        76b8bfd30f8941d0d85365990bcdbc5d0643cabc /source/know/concept/ritz-method/index.md
parent      e5b9bce79b68a68ddd2e51daa16d2fea73b84fdb (diff)
Migrate from 'jekyll-katex' to 'kramdown-math-sskatex'
Diffstat (limited to 'source/know/concept/ritz-method/index.md')
-rw-r--r--  source/know/concept/ritz-method/index.md | 138
1 file changed, 69 insertions(+), 69 deletions(-)
diff --git a/source/know/concept/ritz-method/index.md b/source/know/concept/ritz-method/index.md
index ff1faae..902b7cf 100644
--- a/source/know/concept/ritz-method/index.md
+++ b/source/know/concept/ritz-method/index.md
@@ -28,10 +28,10 @@ $$\begin{aligned}
= \frac{1}{S} \int_a^b p(x) \big|u_x(x)\big|^2 - q(x) \big|u(x)\big|^2 \dd{x}
\end{aligned}$$
-Where $u(x) \in \mathbb{C}$ is the unknown function,
-and $p(x), q(x) \in \mathbb{R}$ are given.
-In addition, $S$ is the norm of $u$, which we demand be constant
-with respect to a weight function $w(x) \in \mathbb{R}$:
+Where $$u(x) \in \mathbb{C}$$ is the unknown function,
+and $$p(x), q(x) \in \mathbb{R}$$ are given.
+In addition, $$S$$ is the norm of $$u$$, which we demand be constant
+with respect to a weight function $$w(x) \in \mathbb{R}$$:
$$\begin{aligned}
S
@@ -39,8 +39,8 @@ $$\begin{aligned}
\end{aligned}$$
To handle this normalization requirement,
-we introduce a [Lagrange multiplier](/know/concept/lagrange-multiplier/) $\lambda$,
-and define the Lagrangian $\Lambda$ for the full constrained optimization problem as:
+we introduce a [Lagrange multiplier](/know/concept/lagrange-multiplier/) $$\lambda$$,
+and define the Lagrangian $$\Lambda$$ for the full constrained optimization problem as:
$$\begin{aligned}
\Lambda
@@ -64,13 +64,13 @@ $$\begin{aligned}
\end{aligned}$$
This has the familiar form of a [Sturm-Liouville problem](/know/concept/sturm-liouville-theory/) (SLP),
-with $\lambda$ representing an eigenvalue.
+with $$\lambda$$ representing an eigenvalue.
SLPs have useful properties, but before we can take advantage of those,
-we need to handle an important detail: the boundary conditions (BCs) on $u$.
+we need to handle an important detail: the boundary conditions (BCs) on $$u$$.
The above equation is only a valid SLP for certain BCs,
as seen in the derivation of Sturm-Liouville theory.
-Let us return to the definition of $R[u]$,
+Let us return to the definition of $$R[u]$$,
and integrate it by parts:
$$\begin{aligned}
@@ -81,7 +81,7 @@ $$\begin{aligned}
\end{aligned}$$
The boundary term vanishes for a subset of the BCs that make a valid SLP,
-including Dirichlet BCs $u(a) = u(b) = 0$, Neumann BCs $u_x(a) = u_x(b) = 0$, and periodic BCs.
+including Dirichlet BCs $$u(a) = u(b) = 0$$, Neumann BCs $$u_x(a) = u_x(b) = 0$$, and periodic BCs.
Therefore, we assume that this term does indeed vanish,
such that we can use Sturm-Liouville theory later:
@@ -91,14 +91,14 @@ $$\begin{aligned}
\equiv - \frac{1}{S} \int_a^b u^* \hat{H} u \dd{x}
\end{aligned}$$
-Where $\hat{H}$ is the self-adjoint Sturm-Liouville operator.
+Where $$\hat{H}$$ is the self-adjoint Sturm-Liouville operator.
Because the constrained Euler-Lagrange equation is now an SLP,
-we know that it has an infinite number of real discrete eigenvalues $\lambda_n$ with a lower bound,
-corresponding to mutually orthogonal eigenfunctions $u_n(x)$.
+we know that it has an infinite number of real discrete eigenvalues $$\lambda_n$$ with a lower bound,
+corresponding to mutually orthogonal eigenfunctions $$u_n(x)$$.
To understand the significance of this result,
suppose we have solved the SLP,
-and now insert one of the eigenfunctions $u_n$ into $R$:
+and now insert one of the eigenfunctions $$u_n$$ into $$R$$:
$$\begin{aligned}
R[u_n]
@@ -109,9 +109,9 @@ $$\begin{aligned}
= \frac{S_n}{S_n} \lambda_n
\end{aligned}$$
-Where $S_n$ is the normalization of $u_n$.
-In other words, when given $u_n$,
-the functional $R$ yields the corresponding eigenvalue $\lambda_n$:
+Where $$S_n$$ is the normalization of $$u_n$$.
+In other words, when given $$u_n$$,
+the functional $$R$$ yields the corresponding eigenvalue $$\lambda_n$$:
$$\begin{aligned}
\boxed{
@@ -120,15 +120,15 @@ $$\begin{aligned}
}
\end{aligned}$$
-This powerful result was not at all clear from $R$'s initial definition.
+This powerful result was not at all clear from $$R$$'s initial definition.
## Justification
-But what if we do not know the eigenfunctions? Is $R$ still useful?
-Yes, as we shall see. Suppose we make an educated guess $u(x)$
-for the ground state (i.e. lowest-eigenvalue) solution $u_0(x)$:
+But what if we do not know the eigenfunctions? Is $$R$$ still useful?
+Yes, as we shall see. Suppose we make an educated guess $$u(x)$$
+for the ground state (i.e. lowest-eigenvalue) solution $$u_0(x)$$:
$$\begin{aligned}
u(x)
@@ -136,10 +136,10 @@ $$\begin{aligned}
\end{aligned}$$
Here, we are using the fact that the eigenfunctions of an SLP form a complete set,
-so our (known) guess $u$ can be expanded in the true (unknown) eigenfunctions $u_n$.
-We are assuming that $u$ is already quite close to its target $u_0$,
-such that the (unknown) expansion coefficients $c_n$ are small;
-specifically $|c_n|^2 \ll 1$.
+so our (known) guess $$u$$ can be expanded in the true (unknown) eigenfunctions $$u_n$$.
+We are assuming that $$u$$ is already quite close to its target $$u_0$$,
+such that the (unknown) expansion coefficients $$c_n$$ are small;
+specifically $$|c_n|^2 \ll 1$$.
Let us start from what we know:
$$\begin{aligned}
@@ -150,8 +150,8 @@ $$\begin{aligned}
\end{aligned}$$
This quantity is known as the **Rayleigh quotient**.
-Inserting our ansatz $u$,
-and using that the true $u_n$ have corresponding eigenvalues $\lambda_n$:
+Inserting our ansatz $$u$$,
+and using that the true $$u_n$$ have corresponding eigenvalues $$\lambda_n$$:
$$\begin{aligned}
R[u]
@@ -176,8 +176,8 @@ $$\begin{aligned}
+ \sum_{n = 1}^\infty c_n \Inprod{u_0}{w u_n} + \sum_{m n} c_n c_m^* \Inprod{u_m}{w u_n}}
\end{aligned}$$
-Using orthogonality $\Inprod{u_m}{w u_n} = S_n \delta_{mn}$,
-and the fact that $n \neq 0$ by definition, we find:
+Using orthogonality $$\Inprod{u_m}{w u_n} = S_n \delta_{mn}$$,
+and the fact that $$n \neq 0$$ by definition, we find:
$$\begin{aligned}
R
@@ -191,7 +191,7 @@ $$\begin{aligned}
{\displaystyle S_0 + \sum_{n} |c_n|^2 S_n}
\end{aligned}$$
-It is always possible to choose our normalizations such that $S_n = S$ for all $u_n$, leaving:
+It is always possible to choose our normalizations such that $$S_n = S$$ for all $$u_n$$, leaving:
$$\begin{aligned}
R
@@ -211,9 +211,9 @@ $$\begin{aligned}
{\displaystyle 1 + \sum_{n} |c_n|^2}
\end{aligned}$$
-Thus, if we improve our guess $u$,
-then $R[u]$ approaches the true eigenvalue $\lambda_0$.
-For numerically finding $u_0$ and $\lambda_0$, this gives us a clear goal: minimize $R$, because:
+Thus, if we improve our guess $$u$$,
+then $$R[u]$$ approaches the true eigenvalue $$\lambda_0$$.
+For numerically finding $$u_0$$ and $$\lambda_0$$, this gives us a clear goal: minimize $$R$$, because:
$$\begin{aligned}
\boxed{
@@ -228,19 +228,19 @@ In the context of quantum mechanics, this is not surprising,
since any superposition of multiple states
is guaranteed to have a higher energy than the ground state.
-Note that the convergence to $\lambda_0$ goes as $|c_n|^2$,
-while $u$ converges to $u_0$ as $|c_n|$ by definition,
-so even a fairly bad guess $u$ will give a decent estimate for $\lambda_0$.
+Note that the convergence to $$\lambda_0$$ goes as $$|c_n|^2$$,
+while $$u$$ converges to $$u_0$$ as $$|c_n|$$ by definition,
+so even a fairly bad guess $$u$$ will give a decent estimate for $$\lambda_0$$.
## The method
In the following, we stick to Dirac notation,
-since the results hold for both continuous functions $u(x)$ and discrete vectors $\vb{u}$,
-as long as the operator $\hat{H}$ is self-adjoint.
-Suppose we express our guess $\Ket{u}$ as a linear combination
-of *known* basis vectors $\Ket{f_n}$ with weights $a_n \in \mathbb{C}$:
+since the results hold for both continuous functions $$u(x)$$ and discrete vectors $$\vb{u}$$,
+as long as the operator $$\hat{H}$$ is self-adjoint.
+Suppose we express our guess $$\Ket{u}$$ as a linear combination
+of *known* basis vectors $$\Ket{f_n}$$ with weights $$a_n \in \mathbb{C}$$:
$$\begin{aligned}
\Ket{u}
@@ -249,16 +249,16 @@ $$\begin{aligned}
\approx \sum_{n = 0}^{N - 1} a_n \Ket{f_n}
\end{aligned}$$
-For numerical tractability, we truncate the sum at $N$ terms,
-and for generality, we allow $\Ket{f_n}$ to be non-orthogonal,
-as described by an *overlap matrix* with elements $S_{mn}$:
+For numerical tractability, we truncate the sum at $$N$$ terms,
+and for generality, we allow $$\Ket{f_n}$$ to be non-orthogonal,
+as described by an *overlap matrix* with elements $$S_{mn}$$:
$$\begin{aligned}
\Inprod{f_m}{w f_n} = S_{m n}
\end{aligned}$$
From the discussion above,
-we know that the ground-state eigenvalue $\lambda_0$ is estimated by:
+we know that the ground-state eigenvalue $$\lambda_0$$ is estimated by:
$$\begin{aligned}
\lambda_0
@@ -269,8 +269,8 @@ $$\begin{aligned}
\equiv \frac{\displaystyle \sum_{m n} a_m^* a_n H_{m n}}{\displaystyle \sum_{m n} a_m^* a_n S_{mn}}
\end{aligned}$$
-And we also know that our goal is to minimize $R[u]$,
-so we vary $a_k^*$ to find its extremum:
+And we also know that our goal is to minimize $$R[u]$$,
+so we vary $$a_k^*$$ to find its extremum:
$$\begin{aligned}
0
@@ -283,7 +283,7 @@ $$\begin{aligned}
= \frac{\displaystyle \sum_{n} a_n \big(H_{k n} - \lambda S_{k n}\big)}{\Inprod{u}{w u}}
\end{aligned}$$
-Clearly, this is only satisfied if the following holds for all $k = 0, 1, ..., N\!-\!1$:
+Clearly, this is only satisfied if the following holds for all $$k = 0, 1, ..., N\!-\!1$$:
$$\begin{aligned}
0
@@ -292,7 +292,7 @@ $$\begin{aligned}
For illustrative purposes,
we can write this as a matrix equation
-with $M_{k n} \equiv H_{k n} - \lambda S_{k n}$:
+with $$M_{k n} \equiv H_{k n} - \lambda S_{k n}$$:
$$\begin{aligned}
\begin{bmatrix}
@@ -311,10 +311,10 @@ $$\begin{aligned}
\end{bmatrix}
\end{aligned}$$
-Note that this looks like an eigenvalue problem for $\lambda$.
-Indeed, demanding that $\overline{M}$ cannot simply be inverted
+Note that this looks like an eigenvalue problem for $$\lambda$$.
+Indeed, demanding that $$\overline{M}$$ cannot simply be inverted
(i.e. the solution is non-trivial)
-yields a characteristic polynomial for $\lambda$:
+yields a characteristic polynomial for $$\lambda$$:
$$\begin{aligned}
0
@@ -322,39 +322,39 @@ $$\begin{aligned}
= \det\!\Big[ \overline{H} - \lambda \overline{S} \Big]
\end{aligned}$$
-This gives a set of $\lambda$,
-which are the exact eigenvalues of $\overline{H}$,
-and the estimated eigenvalues of $\hat{H}$
-(recall that $\overline{H}$ is $\hat{H}$ expressed in a truncated basis).
-The eigenvector $\big[ a_0, a_1, ..., a_{N-1} \big]$ of the lowest $\lambda$
-gives the optimal weights to approximate $\Ket{u_0}$ in the basis $\{\Ket{f_n}\}$.
-Likewise, the higher $\lambda$'s eigenvectors approximate
-excited (i.e. non-ground) eigenstates of $\hat{H}$,
+This gives a set of $$\lambda$$,
+which are the exact eigenvalues of $$\overline{H}$$,
+and the estimated eigenvalues of $$\hat{H}$$
+(recall that $$\overline{H}$$ is $$\hat{H}$$ expressed in a truncated basis).
+The eigenvector $$\big[ a_0, a_1, ..., a_{N-1} \big]$$ of the lowest $$\lambda$$
+gives the optimal weights to approximate $$\Ket{u_0}$$ in the basis $$\{\Ket{f_n}\}$$.
+Likewise, the higher $$\lambda$$'s eigenvectors approximate
+excited (i.e. non-ground) eigenstates of $$\hat{H}$$,
although in practice the results are less accurate the higher we go.
The overall accuracy is determined by how good our truncated basis is,
i.e. how large a subspace it spans
-of the [Hilbert space](/know/concept/hilbert-space/) in which the true $\Ket{u_0}$ resides.
+of the [Hilbert space](/know/concept/hilbert-space/) in which the true $$\Ket{u_0}$$ resides.
Clearly, adding more basis vectors will improve the results,
at the cost of computation.
-For example, if $\hat{H}$ represents a helium atom,
-a good choice for $\{\Ket{f_n}\}$ would be hydrogen orbitals,
+For example, if $$\hat{H}$$ represents a helium atom,
+a good choice for $$\{\Ket{f_n}\}$$ would be hydrogen orbitals,
since those are qualitatively similar.
You may find this result unsurprising;
-it makes some intuitive sense that approximating $\hat{H}$
-in a limited basis would yield a matrix $\overline{H}$ giving rough eigenvalues.
+it makes some intuitive sense that approximating $$\hat{H}$$
+in a limited basis would yield a matrix $$\overline{H}$$ giving rough eigenvalues.
The point of this discussion is to rigorously show
the validity of this approach.
If we only care about the ground state,
-then we already know $\lambda$ from $R[u]$,
-so all we need to do is solve the above matrix equation for $a_n$.
-Keep in mind that $\overline{M}$ is singular,
-and $a_n$ are only defined up to a constant factor.
+then we already know $$\lambda$$ from $$R[u]$$,
+so all we need to do is solve the above matrix equation for $$a_n$$.
+Keep in mind that $$\overline{M}$$ is singular,
+and $$a_n$$ are only defined up to a constant factor.
Nowadays, there exist many other methods to calculate eigenvalues
-of complicated operators $\hat{H}$,
+of complicated operators $$\hat{H}$$,
but an attractive feature of the Ritz method is that it is single-step,
whereas its competitors tend to be iterative.
That said, the Ritz method cannot recover from a poorly chosen basis.
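
As a concrete illustration of the final result in the diffed article, below is a minimal numerical sketch of the generalized eigenvalue problem $$\det\big[\overline{H} - \lambda \overline{S}\big] = 0$$, applied to the textbook SLP $$-u_{xx} = \lambda u$$ on $$[0, 1]$$ with Dirichlet BCs (i.e. $$p = w = 1$$ and $$q = 0$$, exact eigenvalues $$(n \pi)^2$$). The non-orthogonal polynomial basis, the quadrature grid, and the use of SciPy's `eigh` are illustrative assumptions, not taken from the article:

```python
# Minimal sketch (not from the article): Ritz estimates for the lowest
# eigenvalues of -u_xx = lambda u on [0, 1] with u(0) = u(1) = 0,
# i.e. p = w = 1 and q = 0, whose exact eigenvalues are (n pi)^2.
import numpy as np
from scipy.integrate import trapezoid
from scipy.linalg import eigh

N = 6                                # basis truncation
x = np.linspace(0.0, 1.0, 20001)     # quadrature grid

# Hypothetical non-orthogonal basis satisfying the Dirichlet BCs:
# f_n(x) = x^(n+1) (1 - x), together with its derivative f_n'(x).
f  = np.array([x**(n + 1) * (1 - x) for n in range(N)])
fx = np.array([(n + 1) * x**n - (n + 2) * x**(n + 1) for n in range(N)])

# Matrix elements in this basis (with p = 1, q = 0, w = 1):
#   H_mn = int p f_m' f_n' - q f_m f_n dx,   S_mn = int w f_m f_n dx
H = np.array([[trapezoid(fx[m] * fx[n], x) for n in range(N)] for m in range(N)])
S = np.array([[trapezoid(f[m]  * f[n],  x) for n in range(N)] for m in range(N)])

# Generalized eigenvalue problem H a = lambda S a,
# equivalent to demanding det[H - lambda S] = 0.
lam, a = eigh(H, S)

print(lam[:3])                       # lowest three Ritz estimates
print((np.arange(1, 4) * np.pi)**2)  # exact values pi^2, 4 pi^2, 9 pi^2
```

With only $$N = 6$$ basis functions, the lowest estimate should land very close to $$\pi^2 \approx 9.87$$, the next ones near $$4\pi^2$$ and $$9\pi^2$$, with accuracy degrading for higher eigenvalues, as the article notes.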