Categories: Mathematics, Physics.

# Sturm-Liouville theory

Sturm-Liouville theory extends the concept of Hermitian matrix eigenvalue problems to linear second-order ordinary differential equations.

It states that, given suitable boundary conditions, any such equation can be rewritten using the Sturm-Liouville operator, and that the corresponding eigenvalue problem, known as a Sturm-Liouville problem, will give real eigenvalues and a complete set of eigenfunctions.

## General operator

Consider the most general form of a second-order linear differential operator $\hat{L}$, where $p_0(x)$, $p_1(x)$, and $p_2(x)$ are real functions of $x \in [a,b]$ and are nonzero for all $x \in \,\,]a, b[$:

\begin{aligned} \hat{L} \{u(x)\} \equiv p_2(x) \: u''(x) + p_1(x) \: u'(x) + p_0(x) \: u(x) \end{aligned}

Analogously to matrices, we now define its adjoint operator $\hat{L}^\dagger$ as follows:

\begin{aligned} \inprod{\hat{L}^\dagger f}{g} \equiv \inprod{f}{\hat{L} g} \end{aligned}

What is $\hat{L}^\dagger$, given the above definition of $\hat{L}$? We start from the inner product $\inprod{f}{\hat{L} g}$:

\begin{aligned} \inprod{f}{\hat{L} g} &= \int_a^b f^*(x) \hat{L}\{g(x)\} \dd{x} = \int_a^b (f^* p_2) g'' + (f^* p_1) g' + (f^* p_0) g \dd{x} \\ &= \Big[ (f^* p_2) g' + (f^* p_1) g \Big]_a^b - \int_a^b (f^* p_2)' g' + (f^* p_1)' g - (f^* p_0) g \dd{x} \\ &= \Big[ f^* (p_2 g' + p_1 g) - (f^* p_2)' g \Big]_a^b + \int_a^b \! \Big( (f p_2)'' - (f p_1)' + (f p_0) \Big)^* g \dd{x} \\ &= \Big[ f^* \big( p_2 g' + (p_1 - p_2') g \big) - (f^*)' p_2 g \Big]_a^b + \int_a^b \Big( \hat{L}^\dagger\{f\} \Big)^* g \dd{x} \end{aligned}

The newly-formed operator on $f$ must be $\hat{L}^\dagger$, but there is an additional boundary term. To fix this, we demand that $p_1(x) = p_2'(x)$ and that $\big[ p_2 (f^* g' - (f^*)' g) \big]_a^b = 0$, leaving:

\begin{aligned} \inprod{f}{\hat{L} g} &= \Big[ f^* \big( p_2 g' + (p_1 - p_2') g \big) - (f^*)' p_2 g \Big]_a^b + \inprod{\hat{L}^\dagger f}{g} \\ &= \Big[ p_2 \big( f^* g' - (f^*)' g \big) \Big]_a^b + \inprod{\hat{L}^\dagger f}{g} \\ &= \inprod{\hat{L}^\dagger f}{g} \end{aligned}

Let us look at the expression for $\hat{L}^\dagger$ we just found, with the restriction $p_1 = p_2'$ in mind:

\begin{aligned} \hat{L}^\dagger \{f\} &= (p_2 f)'' - (p_1 f)' + (p_0 f) \\ &= (p_2'' f + 2 p_2' f' + p_2 f'') - (p_1' f + p_1 f') + (p_0 f) \\ &= p_2 f'' + (2 p_2' - p_1) f' + (p_2'' - p_1' + p_0) f \\ &= p_2 f'' + p_1 f' + p_0 f \\ &= \hat{L}\{f\} \end{aligned}

So $\hat{L}$ is self-adjoint, i.e. $\hat{L}^\dagger$ is the same as $\hat{L}$! Indeed, every such second-order linear operator is self-adjoint if it satisfies the constraints $p_1 = p_2'$ and $\big[ p_2 (f^* g' - (f^*)' g) \big]_a^b = 0$.
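This can be sanity-checked numerically. Below is a minimal Python sketch (standard library only; the helper names are ours) taking $p_2 = 1$, $p_1 = p_2' = 0$, $p_0 = 0$, i.e. $\hat{L}\{u\} = u''$, on $[0, \pi]$, with test functions that vanish at both endpoints so the boundary term drops out:

```python
import math

def d2(f, x, h=1e-4):
    # Central-difference approximation of f''(x).
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

def inner(f, g, a=0.0, b=math.pi, n=4000):
    # Midpoint-rule approximation of the inner product <f, g> (real-valued case).
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h) for i in range(n)) * h

# L{u} = u'' (p2 = 1, p1 = p2' = 0, p0 = 0); both test functions
# vanish at x = 0 and x = pi, so the boundary term is zero.
f = math.sin                                 # f(x) = sin(x)
g = lambda x: math.sin(x) + math.sin(2 * x)  # g(x) = sin(x) + sin(2x)

lhs = inner(f, lambda x: d2(g, x))  # <f, L g>
rhs = inner(lambda x: d2(f, x), g)  # <L f, g>
print(abs(lhs - rhs) < 1e-4)        # the two sides agree: L is self-adjoint here
```

Both inner products come out as $-\pi/2$, as expected from $\int_0^\pi \sin^2(x) \dd{x} = \pi/2$.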

But what if $p_1 \neq p_2'$? Let us multiply $\hat{L}$ by an unknown $p(x) \neq 0$ and divide by $p_2(x) \neq 0$:

\begin{aligned} \frac{p}{p_2} \hat{L} \{u\} = p u'' + p \frac{p_1}{p_2} u' + p \frac{p_0}{p_2} u \end{aligned}

We now demand that the derivative $p'(x)$ of the unknown $p(x)$ satisfies:

\begin{aligned} p'(x) = p(x) \frac{p_1(x)}{p_2(x)} \quad \implies \quad \frac{p_1(x)}{p_2(x)} \dd{x} = \frac{1}{p(x)} \dd{p} \end{aligned}

Taking the indefinite integral of this differential equation yields an expression for $p(x)$:

\begin{aligned} \int \frac{p_1(x)}{p_2(x)} \dd{x} = \int \frac{1}{p} \dd{p} = \ln\!\big( p(x) \big) \quad \implies \quad \boxed{ p(x) = \exp\!\bigg( \int \frac{p_1(x)}{p_2(x)} \dd{x} \bigg) } \end{aligned}
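For example, Laguerre's equation has $p_2(x) = x$ and $p_1(x) = 1 - x$, for which the boxed formula gives $p(x) = \exp(\ln(x) - x) = x e^{-x}$. The Python sketch below (standard library only) verifies that this $p$ indeed satisfies the defining relation $p' = p \: p_1 / p_2$:

```python
import math

# Laguerre's operator has p2(x) = x and p1(x) = 1 - x, so the boxed
# formula gives p(x) = exp(∫ (1 - x)/x dx) = exp(ln(x) - x) = x e^{-x}.
def p(x):
    return x * math.exp(-x)

# Check that p satisfies p' = p * p1/p2 at a few sample points,
# approximating p' by a central finite difference.
h = 1e-6
for x in (0.5, 1.0, 2.0, 5.0):
    dp = (p(x + h) - p(x - h)) / (2 * h)
    assert abs(dp - p(x) * (1 - x) / x) < 1e-6
print("p' = p * p1/p2 holds at all sample points")
```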

We define an additional function $q(x)$ based on the last term of $(p / p_2) \hat{L}$ shown above:

\begin{aligned} \boxed{ q(x) \equiv p(x) \frac{p_0(x)}{p_2(x)} } = \frac{p_0(x)}{p_2(x)} \exp\!\bigg( \int \frac{p_1(x)}{p_2(x)} \dd{x} \bigg) \end{aligned}

When rewritten using $p$ and $q$, the modified operator $(p / p_2) \hat{L}$ looks like this:

\begin{aligned} \frac{p}{p_2} \hat{L} \{u\} = p u'' + p' u' + q u = (p u')' + q u \end{aligned}

This is the self-adjoint form from earlier! So even if $p_1 \neq p_2'$, any second-order linear operator with $p_2(x) \neq 0$ can easily be made self-adjoint. The resulting general form is called the Sturm-Liouville operator $\hat{L}_\mathrm{SL}$, for nonzero $p(x)$:

\begin{aligned} \boxed{ \begin{aligned} \hat{L}_\mathrm{SL} \{u(x)\} &= \Big( p(x) \: u'(x) \Big)' + q(x) \: u(x) \\ &= \hat{L}_\mathrm{SL}^\dagger \{u(x)\} \end{aligned} } \end{aligned}

Still subject to the constraint $\big[ p (f^* g' - (f^*)' g) \big]_a^b = 0$ such that $\inprod{f}{\hat{L}_\mathrm{SL} g} = \inprod{\hat{L}_\mathrm{SL}^\dagger f}{g}$.
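As a concrete check of this rewriting procedure, the Python sketch below (standard library only; the coefficients are illustrative, Laguerre-like with $p_0 = 1$ added so that $q \neq 0$) confirms that $(p u')' + q u$ agrees with $(p / p_2) \hat{L}\{u\}$ for the sample function $u = x^2$:

```python
import math

# Illustrative coefficients: p2 = x, p1 = 1 - x, p0 = 1 on ]0, inf[.
p2 = lambda x: x
p1 = lambda x: 1.0 - x
p0 = lambda x: 1.0
p  = lambda x: x * math.exp(-x)  # exp(∫ p1/p2 dx) = x e^{-x}
q  = lambda x: math.exp(-x)      # p * p0/p2

u   = lambda x: x**2             # arbitrary smooth test function
du  = lambda x: 2 * x
ddu = lambda x: 2.0

def L(x):
    # Original operator: p2 u'' + p1 u' + p0 u.
    return p2(x) * ddu(x) + p1(x) * du(x) + p0(x) * u(x)

def L_SL(x, h=1e-5):
    # Sturm-Liouville form (p u')' + q u, with (p u')' by central difference.
    pu = lambda y: p(y) * du(y)
    return (pu(x + h) - pu(x - h)) / (2 * h) + q(x) * u(x)

for x in (0.5, 1.0, 3.0):
    assert abs(L_SL(x) - p(x) / p2(x) * L(x)) < 1e-8
print("(p u')' + q u matches (p/p2) L{u}")
```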

## Eigenvalue problem

An eigenvalue problem of $\hat{L}_\mathrm{SL}$ is called a Sturm-Liouville problem (SLP). The goal is to find the eigenvalues $\lambda$ and corresponding eigenfunctions $u(x)$ that fulfill:

\begin{aligned} \boxed{ \hat{L}_\mathrm{SL}\{u(x)\} = - \lambda \: w(x) \: u(x) } \end{aligned}

Where $w(x)$ is a real weight function satisfying $w(x) > 0$ for $x \in \,\,]a, b[$. By convention, the trivial solution $u = 0$ is not valid. Some authors have the opposite sign for $\lambda$ and/or $w$.

In our derivation of $\hat{L}_\mathrm{SL}$ above, we imposed the constraint $\big[ p (f^* g' - (f')^* g) \big]_a^b = 0$ to ensure that $\inprod{\hat{L}_\mathrm{SL}^\dagger f}{g} = \inprod{f}{\hat{L}_\mathrm{SL} g}$. Consequently, to have a valid SLP, the boundary conditions (BCs) on $u$ must be such that, for any two (possibly identical) eigenfunctions $u_m$ and $u_n$, we have:

\begin{aligned} \Big[ p(x) \big( u_m^*(x) \: u_n'(x) - \big(u_m'(x)\big)^* u_n(x) \big) \Big]_a^b = 0 \end{aligned}
Where the asterisks denote complex conjugation, as before.

There are many boundary conditions that satisfy this requirement. Some notable ones are listed non-exhaustively below. Verify for yourself that these work:

• Dirichlet BCs: $u(a) = u(b) = 0$
• Neumann BCs: $u'(a) = u'(b) = 0$
• Robin BCs: $\alpha_1 u(a) + \beta_1 u'(a) = \alpha_2 u(b) + \beta_2 u'(b) = 0$ with $\alpha_{1,2}, \beta_{1,2} \in \mathbb{R}$
• Periodic BCs: $p(a) = p(b)$, $u(a) = u(b)$, and $u'(a) = u'(b)$
• Legendre “BCs”: $p(a) = p(b) = 0$
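The bracket can be evaluated directly for candidate functions. The Python sketch below (standard library only) does so for real-valued examples: Dirichlet BCs with $u_m = \sin(x)$ and $u_n = \sin(2x)$ on $[0, \pi]$, and periodic BCs with $u_m = \sin(x)$ and $u_n = \cos(x)$ on $[0, 2\pi]$, taking $p = 1$ in both cases:

```python
import math

def bracket(p, um, dum, un, dun, a, b):
    # Evaluates [p (um un' - um' un)] from a to b for real-valued um, un.
    expr = lambda x: p(x) * (um(x) * dun(x) - dum(x) * un(x))
    return expr(b) - expr(a)

p_one = lambda x: 1.0

# Dirichlet BCs on [0, pi]: u_m = sin(x) and u_n = sin(2x) vanish at both ends.
b1 = bracket(p_one, math.sin, math.cos,
             lambda x: math.sin(2 * x), lambda x: 2 * math.cos(2 * x),
             0.0, math.pi)

# Periodic BCs on [0, 2 pi]: sin and cos repeat, and p(a) = p(b).
b2 = bracket(p_one, math.sin, math.cos,
             math.cos, lambda x: -math.sin(x),
             0.0, 2 * math.pi)

print(abs(b1) < 1e-12, abs(b2) < 1e-12)  # both brackets vanish
```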

If this is fulfilled, Sturm-Liouville theory gives us useful information about $\lambda$ and $u$. By definition, the following must be satisfied for two arbitrary eigenfunctions $u_m$ and $u_n$:

\begin{aligned} 0 &= \hat{L}_\mathrm{SL}\{u_m^*\} + \lambda_m^* w u_m^* \\ &= \hat{L}_\mathrm{SL}\{u_n\} + \lambda_n w u_n \end{aligned}

We multiply each by the other eigenfunction, subtract the results, and integrate:

\begin{aligned} 0 &= \int_a^b u_m^* \big(\hat{L}_\mathrm{SL}\{u_n\} + \lambda_n w u_n\big) - u_n \big(\hat{L}_\mathrm{SL}\{u_m^*\} + \lambda_m^* w u_m^*\big) \dd{x} \\ &= \int_a^b u_m^* \hat{L}_\mathrm{SL}\{u_n\} - u_n \hat{L}_\mathrm{SL}\{u_m^*\} + (\lambda_n - \lambda_m^*) u_m^* w u_n \dd{x} \\ &= \inprod{u_m}{\hat{L}_\mathrm{SL} u_n} - \inprod{\hat{L}_\mathrm{SL} u_m}{u_n} + (\lambda_n - \lambda_m^*) \inprod{u_m}{w u_n} \end{aligned}

The operator $\hat{L}_\mathrm{SL}$ is of course self-adjoint, so the first two terms cancel each other, leaving us with:

\begin{aligned} 0 &= (\lambda_n - \lambda_m^*) \inprod{u_m}{w u_n} \end{aligned}

When $m = n$, we have $\inprod{u_n}{w u_n} > 0$, so the equation can only hold if $\lambda_n^* = \lambda_n$: every eigenvalue $\lambda_n$ is real. When $m \neq n$ (knowing now that all eigenvalues are real), $\lambda_n - \lambda_m$ may or may not be zero, depending on the degeneracy. If there is no degeneracy, then $\lambda_n - \lambda_m \neq 0$, so $\inprod{u_m}{w u_n} = 0$: the eigenfunctions are orthogonal. In case of degeneracy, manual orthogonalization is needed, which can always be done using the Gram-Schmidt method.

In conclusion, an SLP has real eigenvalues and orthogonal eigenfunctions: for all $m$, $n$:

\begin{aligned} \boxed{ \lambda_n \in \mathbb{R} } \qquad\qquad \boxed{ \inprod{u_m}{w u_n} = A_n \delta_{nm} } \end{aligned}
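These results can be illustrated with the classic SLP $u'' = -\lambda u$ on $[0, \pi]$ with Dirichlet BCs ($p = w = 1$, $q = 0$), whose eigenfunctions are $u_n = \sin(n x)$ with real eigenvalues $\lambda_n = n^2$. A quick numerical check of the orthogonality relation (Python, standard library only):

```python
import math

# Eigenfunctions of the SLP u'' = -lambda u on [0, pi] with Dirichlet BCs
# (p = w = 1, q = 0): u_n(x) = sin(n x), with lambda_n = n^2.
def inner(m, n, N=4000):
    # Midpoint rule for <u_m, w u_n> with w = 1.
    h = math.pi / N
    return sum(math.sin(m * (i + 0.5) * h) * math.sin(n * (i + 0.5) * h)
               for i in range(N)) * h

for m in range(1, 4):
    for n in range(1, 4):
        expected = math.pi / 2 if m == n else 0.0  # A_n = pi/2
        assert abs(inner(m, n) - expected) < 1e-6
print("eigenfunctions are orthogonal, with A_n = pi/2")
```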

When solving a differential eigenvalue problem, knowing that all eigenvalues are real is a huge simplification, so it is always worth checking whether you are dealing with an SLP.

Another useful fact: it turns out that SLPs always have an infinite number of discrete eigenvalues. Furthermore, there always exists a lowest eigenvalue $\lambda_0 > -\infty$, called the ground state.

## Complete basis

Not only are an SLP’s eigenfunctions orthogonal, they also form a complete basis, meaning any well-behaved $f(x)$ can be expanded as a generalized Fourier series with coefficients $a_n$:

\begin{aligned} \boxed{ f(x) = \sum_{n = 0}^\infty a_n u_n(x) \quad \mathrm{for} \: x \in \,\,]a, b[ } \end{aligned}

This series converges faster if $f$ satisfies the same BCs as $u_n$; in that case the expansion is also valid for the inclusive interval $x \in [a, b]$.

To find an expression for the coefficients $a_n$, we multiply the above generalized Fourier series by $u_m^* w$ and integrate it to get inner products on both sides:

\begin{aligned} u_m^* w f &= \sum_{n = 0}^\infty a_n u_m^* w u_n \\ \int_a^b u_m^* w f \dd{x} &= \int_a^b \bigg( \sum_{n = 0}^\infty a_n u_m^* w u_n \bigg) \dd{x} \\ \inprod{u_m}{w f} &= \sum_{n = 0}^\infty a_n \inprod{u_m}{w u_n} \end{aligned}

Because the eigenfunctions of an SLP are mutually orthogonal, the summation disappears:

\begin{aligned} \inprod{u_m}{w f} &= \sum_{n = 0}^\infty a_n \inprod{u_m}{w u_n} = \sum_{n = 0}^\infty a_n A_n \delta_{nm} = a_m A_m \end{aligned}

After isolating this for $a_m$, we see that each coefficient is given by projecting the target function $f$ onto the eigenfunction $u_m$ and dividing by its squared norm $A_m$:

\begin{aligned} \boxed{ a_n = \frac{\inprod{u_n}{w f}}{A_n} = \frac{\inprod{u_n}{w f}}{\inprod{u_n}{w u_n}} } \end{aligned}
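As a worked example (Python sketch, standard library only), take the Dirichlet eigenbasis $u_n = \sin(n x)$ on $[0, \pi]$ with $w = 1$ and $A_n = \pi/2$, and expand $f(x) = x$; the formula then yields $a_n = 2 (-1)^{n+1} / n$:

```python
import math

# Expand f(x) = x on ]0, pi[ in the Dirichlet eigenfunctions u_n = sin(n x),
# with w = 1 and A_n = pi/2. Coefficients via midpoint-rule inner products.
def inner(g1, g2, N=4000):
    h = math.pi / N
    return sum(g1((i + 0.5) * h) * g2((i + 0.5) * h) for i in range(N)) * h

f = lambda x: x
A = math.pi / 2

coeffs = []
for n in range(1, 8):
    a_n = inner(lambda x, n=n: math.sin(n * x), f) / A
    coeffs.append(a_n)
    # Closed form: a_n = 2 (-1)^(n+1) / n.
    assert abs(a_n - 2 * (-1) ** (n + 1) / n) < 1e-4

# Partial sums of the series slowly approach f on the open interval.
x0 = 1.0
partial = sum(a * math.sin((n + 1) * x0) for n, a in enumerate(coeffs))
print(f"f({x0}) = {f(x0)}, 7-term series = {partial:.3f}")
```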

As a final remark, we can see something interesting by rearranging the generalized Fourier series after inserting the expression for $a_n$:

\begin{aligned} f(x) &= \sum_{n = 0}^\infty \frac{1}{A_n} \inprod{u_n}{w f} u_n(x) \\ &= \int_a^b \bigg(\sum_{n = 0}^\infty \frac{1}{A_n} u_n^*(\xi) \: w(\xi) \: f(\xi) \: u_n(x) \bigg) \dd{\xi} \\ &= \int_a^b f(\xi) \bigg(\sum_{n = 0}^\infty \frac{1}{A_n} u_n^*(\xi) \: w(\xi) \: u_n(x) \bigg) \dd{\xi} \end{aligned}

Upon closer inspection, the parenthesized summation must be the Dirac delta function $\delta(x - \xi)$ for the integral to work out. In fact, this is the underlying requirement for completeness:

\begin{aligned} \boxed{ \sum_{n = 0}^\infty \frac{1}{A_n} u_n^*(\xi) \: w(\xi) \: u_n(x) = \delta(x - \xi) } \end{aligned}

## References
1. O. Bang, *Applied mathematics for physicists: lecture notes*, 2019, unpublished.