Categories: Mathematics, Physics.

Sturm-Liouville theory

Sturm-Liouville theory extends the concept of Hermitian matrix eigenvalue problems to linear second-order ordinary differential equations.

It states that, given suitable boundary conditions, any such equation can be rewritten using the Sturm-Liouville operator, and that the corresponding eigenvalue problem, known as a Sturm-Liouville problem, will give real eigenvalues and a complete set of eigenfunctions.

General operator

Consider the most general form of a second-order linear differential operator L^\hat{L}, where p0(x)p_0(x), p1(x)p_1(x), and p2(x)p_2(x) are real functions of x[a,b]x \in [a,b] and are nonzero for all x]a,b[x \in \,\,]a, b[:

L^{u(x)}p2(x)u(x)+p1(x)u(x)+p0(x)u(x)\begin{aligned} \hat{L} \{u(x)\} \equiv p_2(x) \: u''(x) + p_1(x) \: u'(x) + p_0(x) \: u(x) \end{aligned}

Analogously to matrices, we now define its adjoint operator L^\hat{L}^\dagger as follows:

L^fgfL^g\begin{aligned} \inprod{\hat{L}^\dagger f}{g} \equiv \inprod{f}{\hat{L} g} \end{aligned}

What is L^\hat{L}^\dagger, given the above definition of L^\hat{L}? We start from the inner product fL^g\inprod{f}{\hat{L} g}:

fL^g=abf(x)L^{g(x)}dx=ab(fp2)g+(fp1)g+(fp0)gdx=[(fp2)g+(fp1)g]abab(fp2)g+(fp1)g(fp0)gdx=[f(p2g+p1g)(fp2)g]ab+ab ⁣((fp2)(fp1)+(fp0))gdx=[f(p2g+(p1p2)g)(f)p2g]ab+ab(L^{f})gdx\begin{aligned} \inprod{f}{\hat{L} g} &= \int_a^b f^*(x) \hat{L}\{g(x)\} \dd{x} = \int_a^b (f^* p_2) g'' + (f^* p_1) g' + (f^* p_0) g \dd{x} \\ &= \Big[ (f^* p_2) g' + (f^* p_1) g \Big]_a^b - \int_a^b (f^* p_2)' g' + (f^* p_1)' g - (f^* p_0) g \dd{x} \\ &= \Big[ f^* (p_2 g' + p_1 g) - (f^* p_2)' g \Big]_a^b + \int_a^b \! \Big( (f p_2)'' - (f p_1)' + (f p_0) \Big)^* g \dd{x} \\ &= \Big[ f^* \big( p_2 g' + (p_1 - p_2') g \big) - (f^*)' p_2 g \Big]_a^b + \int_a^b \Big( \hat{L}^\dagger\{f\} \Big)^* g \dd{x} \end{aligned}

The newly-formed operator on ff must be L^\hat{L}^\dagger, but there is an additional boundary term. To fix this, we demand that p1(x)=p2(x)p_1(x) = p_2'(x) and that [p2(fg(f)g)]ab=0\big[ p_2 (f^* g' - (f^*)' g) \big]_a^b = 0, leaving:

fL^g=[f(p2g+(p1p2)g)(f)p2g]ab+L^fg=[p2(fg(f)g)]ab+L^fg=L^fg\begin{aligned} \inprod{f}{\hat{L} g} &= \Big[ f^* \big( p_2 g' + (p_1 - p_2') g \big) - (f^*)' p_2 g \Big]_a^b + \inprod{\hat{L}^\dagger f}{g} \\ &= \Big[ p_2 \big( f^* g' - (f^*)' g \big) \Big]_a^b + \inprod{\hat{L}^\dagger f}{g} \\ &= \inprod{\hat{L}^\dagger f}{g} \end{aligned}

Let us look at the expression for L^\hat{L}^\dagger we just found, with the restriction p1=p2p_1 = p_2' in mind:

L^{f}=(p2f)(p1f)+(p0f)=(p2f+2p2f+p2f)(p1f+p1f)+(p0f)=p2f+(2p2p1)f+(p2p1+p0)f=p2f+p1f+p0f=L^{f}\begin{aligned} \hat{L}^\dagger \{f\} &= (p_2 f)'' - (p_1 f)' + (p_0 f) \\ &= (p_2'' f + 2 p_2' f' + p_2 f'') - (p_1' f + p_1 f') + (p_0 f) \\ &= p_2 f'' + (2 p_2' - p_1) f' + (p_2'' - p_1' + p_0) f \\ &= p_2 f'' + p_1 f' + p_0 f \\ &= \hat{L}\{f\} \end{aligned}

So L^\hat{L} is self-adjoint, i.e. L^\hat{L}^\dagger is the same as L^\hat{L}! Indeed, every such second-order linear operator is self-adjoint if it satisfies the constraints p1=p2p_1 = p_2' and [p2(fg(f)g)]ab=0\big[ p_2 (f^* g' - (f^*)' g) \big]_a^b = 0.

But what if p1p2p_1 \neq p_2'? Let us multiply L^\hat{L} by an unknown p(x)0p(x) \neq 0 and divide by p2(x)0p_2(x) \neq 0:

pp2L^{u}=pu+pp1p2u+pp0p2u\begin{aligned} \frac{p}{p_2} \hat{L} \{u\} = p u'' + p \frac{p_1}{p_2} u' + p \frac{p_0}{p_2} u \end{aligned}

We now demand that the derivative p(x)p'(x) of the unknown p(x)p(x) satisfies:

p(x)=p(x)p1(x)p2(x)    p1(x)p2(x)dx=1p(x)dp\begin{aligned} p'(x) = p(x) \frac{p_1(x)}{p_2(x)} \quad \implies \quad \frac{p_1(x)}{p_2(x)} \dd{x} = \frac{1}{p(x)} \dd{p} \end{aligned}

Taking the indefinite integral of this differential equation yields an expression for p(x)p(x):

p1(x)p2(x)dx=1pdp=ln ⁣(p(x))    p(x)=exp ⁣(p1(x)p2(x)dx)\begin{aligned} \int \frac{p_1(x)}{p_2(x)} \dd{x} = \int \frac{1}{p} \dd{p} = \ln\!\big( p(x) \big) \quad \implies \quad \boxed{ p(x) = \exp\!\bigg( \int \frac{p_1(x)}{p_2(x)} \dd{x} \bigg) } \end{aligned}

We define an additional function q(x)q(x) based on the last term of (p/p2)L^(p / p_2) \hat{L} shown above:

q(x)p(x)p0(x)p2(x)=p0(x)p2(x)exp ⁣(p1(x)p2(x)dx)\begin{aligned} \boxed{ q(x) \equiv p(x) \frac{p_0(x)}{p_2(x)} } = \frac{p_0(x)}{p_2(x)} \exp\!\bigg( \int \frac{p_1(x)}{p_2(x)} \dd{x} \bigg) \end{aligned}

When rewritten using pp and qq, the modified operator (p/p2)L^(p / p_2) \hat{L} looks like this:

pp2L^{u}=pu+pu+qu=(pu)+qu\begin{aligned} \frac{p}{p_2} \hat{L} \{u\} = p u'' + p' u' + q u = (p u')' + q u \end{aligned}

This is the self-adjoint form from earlier! So even if p1p2p_1 \neq p_2', any second-order linear operator with p2(x)0p_2(x) \neq 0 can easily be made self-adjoint. The resulting general form is called the Sturm-Liouville operator L^SL\hat{L}_\mathrm{SL}, for nonzero p(x)p(x):

L^SL{u(x)}=(p(x)u(x))+q(x)u(x)=L^SL{u(x)}\begin{aligned} \boxed{ \begin{aligned} \hat{L}_\mathrm{SL} \{u(x)\} &= \Big( p(x) \: u'(x) \Big)' + q(x) \: u(x) \\ &= \hat{L}_\mathrm{SL}^\dagger \{u(x)\} \end{aligned} } \end{aligned}

Still subject to the constraint [p(fg(f)g)]ab=0\big[ p (f^* g' - (f^*)' g) \big]_a^b = 0 such that fL^SLg=L^SLfg\inprod{f}{\hat{L}_\mathrm{SL} g} = \inprod{\hat{L}_\mathrm{SL}^\dagger f}{g}.

Eigenvalue problem

An eigenvalue problem of L^SL\hat{L}_\mathrm{SL} is called a Sturm-Liouville problem (SLP). The goal is to find the eigenvalues λ\lambda and corresponding eigenfunctions u(x)u(x) that fulfill:

L^SL{u(x)}=λw(x)u(x)\begin{aligned} \boxed{ \hat{L}_\mathrm{SL}\{u(x)\} = - \lambda \: w(x) \: u(x) } \end{aligned}

Where w(x)w(x) is a real weight function satisfying w(x)>0w(x) > 0 for x]a,b[x \in \,\,]a, b[. By convention, the trivial solution u=0u = 0 is not valid. Some authors have the opposite sign for λ\lambda and/or ww.

In our derivation of L^SL\hat{L}_\mathrm{SL} above, we imposed the constraint [p(fg(f)g)]ab=0\big[ p (f^* g' - (f')^* g) \big]_a^b = 0 to ensure that L^SLfg=fL^SLg\inprod{\hat{L}_\mathrm{SL}^\dagger f}{g} = \inprod{f}{\hat{L}_\mathrm{SL} g}. Consequently, to have a valid SLP, the boundary conditions (BCs) on uu must be such that, for any two (possibly identical) eigenfunctions umu_m and unu_n, we have:

[p(x)(um(x)un(x)(um(x))un(x))]ab=0\begin{aligned} \Big[ p(x) \big( u_m^*(x) \: u_n'(x) - \big(u_m'(x)\big)^* u_n(x) \big) \Big]_a^b = 0 \end{aligned}

There are many boundary conditions that satisfy this requirement. Some notable ones are listed non-exhaustively below. Verify for yourself that these work:

If this is fulfilled, Sturm-Liouville theory gives us useful information about λ\lambda and uu. By definition, the following must be satisfied for two arbitrary eigenfunctions umu_m and unu_n:

0=L^SL{um}+λmwum=L^SL{un}+λnwun\begin{aligned} 0 &= \hat{L}_\mathrm{SL}\{u_m^*\} + \lambda_m^* w u_m^* \\ &= \hat{L}_\mathrm{SL}\{u_n\} + \lambda_n w u_n \end{aligned}

We multiply each by the other eigenfunction, subtract the results, and integrate:

0=abum(L^SL{un}+λnwun)un(L^SL{um}+λmwum)dx=abumL^SL{un}unL^SL{um}+(λnλm)umwundx=umL^SLunL^SLumun+(λnλm)umwun\begin{aligned} 0 &= \int_a^b u_m^* \big(\hat{L}_\mathrm{SL}\{u_n\} + \lambda_n w u_n\big) - u_n \big(\hat{L}_\mathrm{SL}\{u_m^*\} + \lambda_m^* w u_m^*\big) \dd{x} \\ &= \int_a^b u_m^* \hat{L}_\mathrm{SL}\{u_n\} - u_n \hat{L}_\mathrm{SL}\{u_m^*\} + (\lambda_n - \lambda_m^*) u_m^* w u_n \dd{x} \\ &= \inprod{u_m}{\hat{L}_\mathrm{SL} u_n} - \inprod{\hat{L}_\mathrm{SL} u_m}{u_n} + (\lambda_n - \lambda_m^*) \inprod{u_m}{w u_n} \end{aligned}

The operator L^SL\hat{L}_\mathrm{SL} is self-adjoint of course, so the first two terms vanish, leaving us with:

0=(λnλm)umwun\begin{aligned} 0 &= (\lambda_n - \lambda_m^*) \inprod{u_m}{w u_n} \end{aligned}

When m=nm = n, we get unwun>0\inprod{u_n}{w u_n} > 0, so the equation is only satisfied if λn=λn\lambda_n^* = \lambda_n, meaning the eigenvalue λn\lambda_n is real for any nn. When mnm \neq n, then λnλm\lambda_n - \lambda_m^* may or may not be zero depending on the degeneracy. If there is no degeneracy, then λnλm0\lambda_n - \lambda_m^* \neq 0, meaning umwun=0\inprod{u_m}{w u_n} = 0, i.e. the eigenfunctions are orthogonal. In case of degeneracy, manual orthogonalization is needed, which is guaranteed to be doable using the Gram-Schmidt method.

In conclusion, an SLP has real eigenvalues and orthogonal eigenfunctions: for all mm, nn:

λnRumwun=Anδnm\begin{aligned} \boxed{ \lambda_n \in \mathbb{R} } \qquad\qquad \boxed{ \inprod{u_m}{w u_n} = A_n \delta_{nm} } \end{aligned}

When solving a differential eigenvalue problem, knowing that all eigenvalues are real is a huge simplification, so it is always worth checking whether you are dealing with an SLP.

Another useful fact: it turns out that SLPs always have an infinite number of discrete eigenvalues. Furthermore, there always exists a lowest eigenvalue λ0>\lambda_0 > -\infty, called the ground state.

Complete basis

Not only are an SLP’s eigenfunctions orthogonal, they also form a complete basis, meaning any well-behaved f(x)f(x) can be expanded as a generalized Fourier series with coefficients ana_n:

f(x)=n=0anun(x)forx]a,b[\begin{aligned} \boxed{ f(x) = \sum_{n = 0}^\infty a_n u_n(x) \quad \mathrm{for} \: x \in \,\,]a, b[ } \end{aligned}

This series converges faster if ff satisfies the same BCs as unu_n; in that case the expansion is also valid for the inclusive interval x[a,b]x \in [a, b].

To find an expression for the coefficients ana_n, we multiply the above generalized Fourier series by umwu_m^* w and integrate it to get inner products on both sides:

umwf=n=0anumwunabumwfdx=ab(n=0anumwun)dxumwf=n=0anumwun\begin{aligned} u_m^* w f &= \sum_{n = 0}^\infty a_n u_m^* w u_n \\ \int_a^b u_m^* w f \dd{x} &= \int_a^b \bigg( \sum_{n = 0}^\infty a_n u_m^* w u_n \bigg) \dd{x} \\ \inprod{u_m}{w f} &= \sum_{n = 0}^\infty a_n \inprod{u_m}{w u_n} \end{aligned}

Because the eigenfunctions of an SLP are mutually orthogonal, the summation disappears:

umwf=n=0anumwun=n=0anAnδnm=amAm\begin{aligned} \inprod{u_m}{w f} &= \sum_{n = 0}^\infty a_n \inprod{u_m}{w u_n} = \sum_{n = 0}^\infty a_n A_n \delta_{nm} = a_m A_m \end{aligned}

After isolating this for ama_m, we see that the coefficients are given by the projection of the target function ff onto the normalized eigenfunctions um/Amu_m / A_m:

an=unwfAn=unwfunwun\begin{aligned} \boxed{ a_n = \frac{\inprod{u_n}{w f}}{A_n} = \frac{\inprod{u_n}{w f}}{\inprod{u_n}{w u_n}} } \end{aligned}

As a final remark, we can see something interesting by rearranging the generalized Fourier series after inserting the expression for ana_n:

f(x)=n=01Anunwfun(x)=ab(n=01Anun(ξ)w(ξ)f(ξ)un(x))dξ=abf(ξ)(n=01Anun(ξ)w(ξ)un(x))dξ\begin{aligned} f(x) &= \sum_{n = 0}^\infty \frac{1}{A_n} \inprod{u_n}{w f} u_n(x) \\ &= \int_a^b \bigg(\sum_{n = 0}^\infty \frac{1}{A_n} u_n^*(\xi) \: w(\xi) \: f(\xi) \: u_n(x) \bigg) \dd{\xi} \\ &= \int_a^b f(\xi) \bigg(\sum_{n = 0}^\infty \frac{1}{A_n} u_n^*(\xi) \: w(\xi) \: u_n(x) \bigg) \dd{\xi} \end{aligned}

Upon closer inspection, the parenthesized summation must be the Dirac delta function δ(x)\delta(x) for the integral to work out. In fact, this is the underlying requirement for completeness:

n=01Anun(ξ)w(ξ)un(x)=δ(xξ)\begin{aligned} \boxed{ \sum_{n = 0}^\infty \frac{1}{A_n} u_n^*(\xi) \: w(\xi) \: u_n(x) = \delta(x - \xi) } \end{aligned}


  1. O. Bang, Applied mathematics for physicists: lecture notes, 2019, unpublished.