Categories: Mathematics, Physics.

Calculus of variations

The calculus of variations lays the mathematical groundwork for Lagrangian mechanics.

Consider a functional $J$, mapping a function $f(x)$ to a scalar value by integrating over the so-called Lagrangian $L$, which represents an expression involving $x$, $f$ and the derivative $f'$:

$$\begin{aligned} J[f] = \int_{x_0}^{x_1} L(f, f', x) \dd{x} \end{aligned}$$

If $J$ in some way measures the physical “cost” (e.g. energy) of the path $f(x)$ taken by a physical system, the principle of least action states that $f$ will be a minimum of $J[f]$, so for example the expended energy will be minimized. In practice, various cost metrics may be used, so maxima of $J[f]$ are also interesting to us.

If $f(x, \varepsilon\!=\!0)$ is the optimal route, then a slightly different (and therefore worse) path between the same two points can be expressed using the parameter $\varepsilon$:

$$\begin{aligned} f(x, \varepsilon) = f(x, 0) + \varepsilon \eta(x) \qquad \mathrm{or} \qquad \delta f = \varepsilon \eta(x) \end{aligned}$$

Where $\eta(x)$ is an arbitrary differentiable deviation. Since $f(x, \varepsilon)$ must start and end at the same points as $f(x, 0)$, we have the boundary conditions:

$$\begin{aligned} \eta(x_0) = \eta(x_1) = 0 \end{aligned}$$

Given $L$, the goal is to find an equation for the optimal path $f(x, 0)$. Just like when finding the minimum of a real function, the minimum $f$ of a functional $J[f]$ is a stationary point with respect to the deviation weight $\varepsilon$, a condition often written as $\delta J = 0$. In the following, the integration limits have been omitted:

$$\begin{aligned} 0 &= \delta J = \pdv{J}{\varepsilon} \Big|_{\varepsilon = 0} = \int \pdv{L}{\varepsilon} \dd{x} = \int \pdv{L}{f} \pdv{f}{\varepsilon} + \pdv{L}{f'} \pdv{f'}{\varepsilon} \dd{x} \\ &= \int \pdv{L}{f} \eta + \pdv{L}{f'} \eta' \dd{x} = \Big[ \pdv{L}{f'} \eta \Big]_{x_0}^{x_1} + \int \pdv{L}{f} \eta - \dv{}{x}\Big( \pdv{L}{f'} \Big) \eta \dd{x} \end{aligned}$$

The boundary term from partial integration vanishes due to the boundary conditions for $\eta(x)$. We are thus left with:

$$\begin{aligned} 0 = \int \eta \bigg( \pdv{L}{f} - \dv{}{x}\Big( \pdv{L}{f'} \Big) \bigg) \dd{x} \end{aligned}$$

This must hold for every valid $\eta$, but $\eta$ is arbitrary, so by the fundamental lemma of the calculus of variations the parenthesized expression must vanish on its own:

$$\begin{aligned} \boxed{ 0 = \pdv{L}{f} - \dv{}{x}\Big( \pdv{L}{f'} \Big) } \end{aligned}$$

This is known as the Euler-Lagrange equation of the Lagrangian $L$, and its solutions represent the optimal paths $f(x, 0)$.
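As a quick sanity check, the Euler-Lagrange equation can be applied symbolically. The sketch below assumes SymPy is available and uses the example Lagrangian $L = (f')^2/2 - f^2/2$ (a harmonic oscillator, chosen for illustration and not taken from the text), for which the equation reduces to $f'' + f = 0$:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

x = sp.symbols('x')
f = sp.Function('f')(x)

# Example Lagrangian: L = f'^2/2 - f^2/2 (harmonic oscillator, with x as "time")
L = f.diff(x)**2 / 2 - f**2 / 2

# euler_equations applies the boxed formula symbolically
eqs = euler_equations(L, f, x)
print(eqs[0])  # equivalent to f'' + f = 0
```

SymPy's `euler_equations` implements the same formula derived above, so this mainly confirms the result on a concrete case.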

Multiple functions

Suppose that the Lagrangian $L$ depends on multiple independent functions $f_1, f_2, ..., f_N$:

$$\begin{aligned} J[f_1, ..., f_N] = \int_{x_0}^{x_1} L(f_1, ..., f_N, f_1', ..., f_N', x) \dd{x} \end{aligned}$$

In this case, every $f_n(x)$ has its own deviation $\eta_n(x)$, satisfying $\eta_n(x_0) = \eta_n(x_1) = 0$:

$$\begin{aligned} f_n(x, \varepsilon) = f_n(x, 0) + \varepsilon \eta_n(x) \end{aligned}$$

The derivation procedure is identical to the case $N = 1$ from earlier:

$$\begin{aligned} 0 &= \pdv{J}{\varepsilon} \Big|_{\varepsilon = 0} = \int \pdv{L}{\varepsilon} \dd{x} = \int \sum_{n} \Big( \pdv{L}{f_n} \pdv{f_n}{\varepsilon} + \pdv{L}{f_n'} \pdv{f_n'}{\varepsilon} \Big) \dd{x} \\ &= \int \sum_{n} \Big( \pdv{L}{f_n} \eta_n + \pdv{L}{f_n'} \eta_n' \Big) \dd{x} \\ &= \Big[ \sum_{n} \pdv{L}{f_n'} \eta_n \Big]_{x_0}^{x_1} + \int \sum_{n} \eta_n \bigg( \pdv{L}{f_n} - \dv{}{x}\Big( \pdv{L}{f_n'} \Big) \bigg) \dd{x} \end{aligned}$$

Once again, $\eta_n(x)$ is arbitrary and disappears at the boundaries, so we end up with $N$ equations of the same form as for a single function:

$$\begin{aligned} \boxed{ 0 = \pdv{L}{f_1} - \dv{}{x}\Big( \pdv{L}{f_1'} \Big) \quad \cdots \quad 0 = \pdv{L}{f_N} - \dv{}{x}\Big( \pdv{L}{f_N'} \Big) } \end{aligned}$$
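As an illustration, here is a SymPy sketch with two functions; the coupled Lagrangian $L = (f_1'^2 + f_2'^2)/2 - f_1 f_2$ is an assumed example, not from the text. Each function indeed gets its own equation:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

x = sp.symbols('x')
f1 = sp.Function('f1')(x)
f2 = sp.Function('f2')(x)

# Example Lagrangian coupling two functions: L = (f1'^2 + f2'^2)/2 - f1*f2
L = f1.diff(x)**2 / 2 + f2.diff(x)**2 / 2 - f1 * f2

# One Euler-Lagrange equation per function: f1'' + f2 = 0 and f2'' + f1 = 0
eqs = euler_equations(L, [f1, f2], x)
for eq in eqs:
    print(eq)
```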

Higher-order derivatives

Suppose that the Lagrangian $L$ depends on multiple derivatives of $f(x)$:

$$\begin{aligned} J[f] = \int_{x_0}^{x_1} L(f, f', f'', ..., f^{(N)}, x) \dd{x} \end{aligned}$$

Once again, the derivation procedure is the same as before:

$$\begin{aligned} 0 &= \pdv{J}{\varepsilon} \Big|_{\varepsilon = 0} = \int \pdv{L}{\varepsilon} \dd{x} = \int \pdv{L}{f} \pdv{f}{\varepsilon} + \sum_{n} \pdv{L}{f^{(n)}} \pdv{f^{(n)}}{\varepsilon} \dd{x} \\ &= \int \pdv{L}{f} \eta + \sum_{n} \pdv{L}{f^{(n)}} \eta^{(n)} \dd{x} \end{aligned}$$

The goal is to turn each $\eta^{(n)}(x)$ into $\eta(x)$, so we need to partially integrate the $n$th term of the sum $n$ times. In this case, we will need some additional boundary conditions for $\eta(x)$:

$$\begin{aligned} \eta'(x_0) = \eta'(x_1) = 0 \qquad \cdots \qquad \eta^{(N-1)}(x_0) = \eta^{(N-1)}(x_1) = 0 \end{aligned}$$

This eliminates the boundary terms from partial integration, leaving:

$$\begin{aligned} 0 &= \int \eta \bigg( \pdv{L}{f} + \sum_{n} (-1)^n \dvn{n}{}{x}\Big( \pdv{L}{f^{(n)}} \Big) \bigg) \dd{x} \end{aligned}$$

Once again, because $\eta(x)$ is arbitrary, the Euler-Lagrange equation becomes:

$$\begin{aligned} \boxed{ 0 = \pdv{L}{f} + \sum_{n} (-1)^n \dvn{n}{}{x}\Big( \pdv{L}{f^{(n)}} \Big) } \end{aligned}$$
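For example, with the assumed Lagrangian $L = (f'')^2/2$ (the bending energy of an elastic beam, not an example from the text), only the $n = 2$ term survives and the equation reduces to $f'''' = 0$. A SymPy sketch:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

x = sp.symbols('x')
f = sp.Function('f')(x)

# Example with a second derivative: L = f''^2/2
L = f.diff(x, 2)**2 / 2

# The sum collapses to a single term: (-1)^2 d^2/dx^2 (dL/df'') = f''''
eqs = euler_equations(L, f, x)
print(eqs[0])  # equivalent to f'''' = 0
```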

Multiple coordinates

Suppose now that $f$ is a function of multiple variables. For brevity, we only consider two variables $x$ and $y$, but the results generalize effortlessly to more. The Lagrangian now depends on all the partial derivatives of $f(x, y)$:

$$\begin{aligned} J[f] = \iint_{(x_0, y_0)}^{(x_1, y_1)} L(f, f_x, f_y, x, y) \dd{x} \dd{y} \end{aligned}$$

The arbitrary deviation $\eta$ is then also a function of multiple variables:

$$\begin{aligned} f(x, y; \varepsilon) = f(x, y; 0) + \varepsilon \eta(x, y) \end{aligned}$$

The derivation procedure starts in the exact same way as before:

$$\begin{aligned} 0 &= \pdv{J}{\varepsilon} \Big|_{\varepsilon = 0} = \iint \pdv{L}{\varepsilon} \dd{x} \dd{y} \\ &= \iint \pdv{L}{f} \pdv{f}{\varepsilon} + \pdv{L}{f_x} \pdv{f_x}{\varepsilon} + \pdv{L}{f_y} \pdv{f_y}{\varepsilon} \dd{x} \dd{y} \\ &= \iint \pdv{L}{f} \eta + \pdv{L}{f_x} \eta_x + \pdv{L}{f_y} \eta_y \dd{x} \dd{y} \end{aligned}$$

We partially integrate for both $\eta_x$ and $\eta_y$, yielding:

$$\begin{aligned} 0 &= \int \Big[ \pdv{L}{f_x} \eta \Big]_{x_0}^{x_1} \dd{y} + \int \Big[ \pdv{L}{f_y} \eta \Big]_{y_0}^{y_1} \dd{x} \\ &\quad + \iint \eta \bigg( \pdv{L}{f} - \dv{}{x}\Big( \pdv{L}{f_x} \Big) - \dv{}{y}\Big( \pdv{L}{f_y} \Big) \bigg) \dd{x} \dd{y} \end{aligned}$$

But now, to eliminate these boundary terms, we need extra conditions for $\eta$:

$$\begin{aligned} \forall y: \eta(x_0, y) = \eta(x_1, y) = 0 \qquad \forall x: \eta(x, y_0) = \eta(x, y_1) = 0 \end{aligned}$$

In other words, the deviation $\eta$ must be zero on the whole boundary of the “box”. Again relying on the fact that $\eta$ is arbitrary, the Euler-Lagrange equation is:

$$\begin{aligned} 0 = \pdv{L}{f} - \dv{}{x}\Big( \pdv{L}{f_x} \Big) - \dv{}{y}\Big( \pdv{L}{f_y} \Big) \end{aligned}$$

This generalizes nicely to functions of even more variables $x_1, x_2, ..., x_N$:

$$\begin{aligned} \boxed{ 0 = \pdv{L}{f} - \sum_{n} \dv{}{x_n}\Big( \pdv{L}{f_{x_n}} \Big) } \end{aligned}$$
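For instance, the assumed field Lagrangian $L = (f_x^2 + f_y^2)/2$ (chosen for illustration) turns this formula into the Laplace equation $f_{xx} + f_{yy} = 0$, which a SymPy sketch can confirm:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

x, y = sp.symbols('x y')
f = sp.Function('f')(x, y)

# Example field Lagrangian: L = (f_x^2 + f_y^2)/2
L = f.diff(x)**2 / 2 + f.diff(y)**2 / 2

# Passing both coordinates sums d/dx_n(dL/df_xn) over n, giving Laplace's equation
eqs = euler_equations(L, f, [x, y])
print(eqs[0])  # equivalent to f_xx + f_yy = 0
```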

Constraints

So far, for multiple functions $f_1, ..., f_N$, we have been assuming that all $f_n$ are independent, and by extension all $\eta_n$. Suppose that we now have $M < N$ constraints $\phi_m$ that all $f_n$ need to obey, introducing implicit dependencies between them.

Let us consider constraints $\phi_m$ of the two forms below. It is important that they are holonomic, meaning they do not depend on any derivatives of any $f_n(x)$:

$$\begin{aligned} \phi_m(f_1, ..., f_N, x) = 0 \qquad \mathrm{or} \qquad \int_{x_0}^{x_1} \phi_m(f_1, ..., f_N, x) \dd{x} = C_m \end{aligned}$$

Where $C_m$ is a constant. Note that the first form can also be used for $\phi_m = C_m \neq 0$, by simply redefining the constraint as $\phi_m^0 = \phi_m - C_m = 0$.

To solve this constrained optimization problem for $f_n(x)$, we introduce Lagrange multipliers $\lambda_m$. In the former case $\lambda_m(x)$ is a function of $x$, while in the latter case $\lambda_m$ is constant:

$$\begin{aligned} \int \lambda_m(x) \: \phi_m(\{f_n\}, x) \dd{x} = 0 \qquad \mathrm{or} \qquad \lambda_m \int \phi_m(\{f_n\}, x) \dd{x} = \lambda_m C_m \end{aligned}$$

The reason for this distinction in $\lambda_m$ is that we need to find the stationary points with respect to $\varepsilon$ of both constraint types. Written in the variational form, this is:

$$\begin{aligned} \delta \int \lambda_m \: \phi_m \dd{x} = 0 \end{aligned}$$

From this, we define a new Lagrangian $\Lambda$ for the functional $J$, with the constraints built in:

$$\begin{aligned} J[f_n] &= \int \Lambda(f_1, ..., f_N; f_1', ..., f_N'; \lambda_1, ..., \lambda_M; x) \dd{x} \\ &= \int L + \sum_{m} \lambda_m \phi_m \dd{x} \end{aligned}$$

Then we derive the Euler-Lagrange equation as usual for $\Lambda$ instead of $L$:

$$\begin{aligned} 0 &= \delta \int \Lambda \dd{x} = \int \pdv{\Lambda}{\varepsilon} \dd{x} = \int \sum_n \Big( \pdv{\Lambda}{f_n} \pdv{f_n}{\varepsilon} + \pdv{\Lambda}{f_n'} \pdv{f_n'}{\varepsilon} \Big) \dd{x} \\ &= \int \sum_n \Big( \pdv{\Lambda}{f_n} \eta_n + \pdv{\Lambda}{f_n'} \eta_n' \Big) \dd{x} \\ &= \Big[ \sum_n \pdv{\Lambda}{f_n'} \eta_n \Big]_{x_0}^{x_1} + \int \sum_n \eta_n \bigg( \pdv{\Lambda}{f_n} - \dv{}{x}\Big( \pdv{\Lambda}{f_n'} \Big) \bigg) \dd{x} \end{aligned}$$

Using the same logic as before, we end up with a set of Euler-Lagrange equations with $\Lambda$:

$$\begin{aligned} 0 = \pdv{\Lambda}{f_n} - \dv{}{x}\Big( \pdv{\Lambda}{f_n'} \Big) \end{aligned}$$

By inserting the definition of $\Lambda$, we then get the following. Recall that $\phi_m$ is holonomic, and thus independent of all derivatives $f_n'$:

$$\begin{aligned} \boxed{ 0 = \pdv{L}{f_n} - \dv{}{x}\Big( \pdv{L}{f_n'} \Big) + \sum_{m} \lambda_m \pdv{\phi_m}{f_n} } \end{aligned}$$

These are Lagrange’s equations of the first kind, with their second-kind counterparts being the earlier Euler-Lagrange equations. Note that there are $N$ separate equations, one for each $f_n$.

Due to the constraints $\phi_m$, the functions $f_n$ are not independent. This is solved by choosing $\lambda_m$ such that $M$ of the $N$ equations hold, i.e. solving a system of $M$ equations for $\lambda_m$:

$$\begin{aligned} \dv{}{x}\Big( \pdv{L}{f_n'} \Big) - \pdv{L}{f_n} = \sum_{m} \lambda_m \pdv{\phi_m}{f_n} \end{aligned}$$

And then the remaining $N - M$ equations can be solved in the normal unconstrained way.
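The first-kind equations can also be obtained by treating each $\lambda_m(x)$ as one more unknown function in $\Lambda$: since $\Lambda$ does not contain $\lambda_m'$, the Euler-Lagrange equation for $\lambda_m$ is just the constraint $\phi_m = 0$ itself. Below is a SymPy sketch with the assumed toy constraint $\phi = f_1 + f_2 = 0$ (an illustrative example, not from the text):

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

x = sp.symbols('x')
f1 = sp.Function('f1')(x)
f2 = sp.Function('f2')(x)
lam = sp.Function('lam')(x)  # Lagrange multiplier lambda(x)

# Free Lagrangian plus an assumed holonomic constraint phi = f1 + f2 = 0
L = f1.diff(x)**2 / 2 + f2.diff(x)**2 / 2
phi = f1 + f2
Lam = L + lam * phi  # the augmented Lagrangian

# Three equations: lam - f1'' = 0, lam - f2'' = 0, and the constraint f1 + f2 = 0
eqs = euler_equations(Lam, [f1, f2, lam], x)
for eq in eqs:
    print(eq)
```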
