---
title: "Calculus of variations"
sort_title: "Calculus of variations"
date: 2021-02-24
categories:
- Mathematics
- Physics
layout: "concept"
---

The **calculus of variations** lays the mathematical groundwork
for [Lagrangian mechanics](/know/concept/lagrangian-mechanics/).

Consider a **functional** $$J$$, mapping a function $$f(x)$$ to a scalar value
by integrating over the so-called **Lagrangian** $$L$$,
which is an expression involving $$x$$, $$f$$ and the derivative $$f'$$:

$$\begin{aligned}
    J[f] = \int_{x_0}^{x_1} L(f, f', x) \dd{x}
\end{aligned}$$

If $$J$$ in some way measures the physical "cost" (e.g. energy)
of the path $$f(x)$$ taken by a physical system,
the **principle of least action** states that $$f$$ will be a minimum of $$J[f]$$,
so for example the expended energy will be minimized.
In practice, various cost metrics may be used,
so maxima of $$J[f]$$ are also interesting to us.

If $$f(x, \varepsilon\!=\!0)$$ is the optimal route,
then a slightly different (and therefore worse) path
between the same two points
can be expressed using the parameter $$\varepsilon$$:

$$\begin{aligned}
    f(x, \varepsilon) = f(x, 0) + \varepsilon \eta(x)
    \qquad \mathrm{or} \qquad
    \delta f = \varepsilon \eta(x)
\end{aligned}$$

Where $$\eta(x)$$ is an arbitrary differentiable deviation.
Since $$f(x, \varepsilon)$$ must start and end
in the same points as $$f(x, 0)$$,
we have the boundary conditions:

$$\begin{aligned}
    \eta(x_0) = \eta(x_1) = 0
\end{aligned}$$

Given $$L$$, the goal is to find an equation for the optimal path $$f(x, 0)$$.
Just like when finding the minimum of a real function,
the minimum $$f$$ of a functional $$J[f]$$ is a stationary point
with respect to the deviation weight $$\varepsilon$$,
a condition often written as $$\delta J = 0$$.
In the following, the integration limits have been omitted:

$$\begin{aligned}
    0
    &= \delta J
    = \pdv{J}{\varepsilon} \Big|_{\varepsilon = 0}
    = \int \pdv{L}{\varepsilon} \dd{x}
    = \int \pdv{L}{f} \pdv{f}{\varepsilon} + \pdv{L}{f'} \pdv{f'}{\varepsilon} \dd{x}
    \\
    &= \int \pdv{L}{f} \eta + \pdv{L}{f'} \eta' \dd{x}
    = \Big[ \pdv{L}{f'} \eta \Big]_{x_0}^{x_1}
    + \int \pdv{L}{f} \eta - \dv{}{x}\Big( \pdv{L}{f'} \Big) \eta \dd{x}
\end{aligned}$$

The boundary term from partial integration vanishes
due to the boundary conditions for $$\eta(x)$$.
We are thus left with:

$$\begin{aligned}
    0 = \int \eta \bigg( \pdv{L}{f} - \dv{}{x}\Big( \pdv{L}{f'} \Big) \bigg) \dd{x}
\end{aligned}$$

This must hold for every admissible deviation $$\eta$$,
which is only possible if the parenthesized expression is zero everywhere,
a result known as the fundamental lemma of the calculus of variations:

$$\begin{aligned}
    \boxed{
        0 = \pdv{L}{f} - \dv{}{x}\Big( \pdv{L}{f'} \Big)
    }
\end{aligned}$$

This is known as the **Euler-Lagrange equation** of the Lagrangian $$L$$,
and its solutions represent the optimal paths $$f(x, 0)$$.
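As a simple example, consider the arc length of a path $$f(x)$$ in a plane,
which measures the distance travelled between the two endpoints.
Since this $$L$$ does not contain $$f$$ itself,
the Euler-Lagrange equation reduces to:

$$\begin{aligned}
    L = \sqrt{1 + (f')^2}
    \quad \implies \quad
    0 = - \dv{}{x}\bigg( \frac{f'}{\sqrt{1 + (f')^2}} \bigg)
\end{aligned}$$

So $$f' / \sqrt{1 + (f')^2}$$ is a constant, and hence $$f'$$ is too:
the shortest path between two points is a straight line, as expected.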
## Multiple functions

Suppose that the Lagrangian $$L$$ depends on
multiple independent functions $$f_1, f_2, ..., f_N$$:

$$\begin{aligned}
    J[f_1, ..., f_N] = \int_{x_0}^{x_1} L(f_1, ..., f_N, f_1', ..., f_N', x) \dd{x}
\end{aligned}$$

In this case, every $$f_n(x)$$ has its own deviation $$\eta_n(x)$$,
satisfying $$\eta_n(x_0) = \eta_n(x_1) = 0$$:

$$\begin{aligned}
    f_n(x, \varepsilon) = f_n(x, 0) + \varepsilon \eta_n(x)
\end{aligned}$$

The derivation procedure is identical to the case $$N = 1$$ from earlier:

$$\begin{aligned}
    0
    &= \pdv{J}{\varepsilon} \Big|_{\varepsilon = 0}
    = \int \pdv{L}{\varepsilon} \dd{x}
    = \int \sum_{n} \Big( \pdv{L}{f_n} \pdv{f_n}{\varepsilon} + \pdv{L}{f_n'} \pdv{f_n'}{\varepsilon} \Big) \dd{x}
    \\
    &= \int \sum_{n} \Big( \pdv{L}{f_n} \eta_n + \pdv{L}{f_n'} \eta_n' \Big) \dd{x}
    \\
    &= \Big[ \sum_{n} \pdv{L}{f_n'} \eta_n \Big]_{x_0}^{x_1}
    + \int \sum_{n} \eta_n \bigg( \pdv{L}{f_n} - \dv{}{x}\Big( \pdv{L}{f_n'} \Big) \bigg) \dd{x}
\end{aligned}$$

Once again, each $$\eta_n(x)$$ is arbitrary and disappears at the boundaries,
so we end up with $$N$$ equations of the same form as for a single function:

$$\begin{aligned}
    \boxed{
        0 = \pdv{L}{f_1} - \dv{}{x}\Big( \pdv{L}{f_1'} \Big)
        \quad \cdots \quad
        0 = \pdv{L}{f_N} - \dv{}{x}\Big( \pdv{L}{f_N'} \Big)
    }
\end{aligned}$$

## Higher-order derivatives

Suppose that the Lagrangian $$L$$ depends on multiple derivatives of $$f(x)$$:

$$\begin{aligned}
    J[f] = \int_{x_0}^{x_1} L(f, f', f'', ..., f^{(N)}, x) \dd{x}
\end{aligned}$$

Once again, the derivation procedure is the same as before:

$$\begin{aligned}
    0
    &= \pdv{J}{\varepsilon} \Big|_{\varepsilon = 0}
    = \int \pdv{L}{\varepsilon} \dd{x}
    = \int \pdv{L}{f} \pdv{f}{\varepsilon} + \sum_{n} \pdv{L}{f^{(n)}} \pdv{f^{(n)}}{\varepsilon} \dd{x}
    \\
    &= \int \pdv{L}{f} \eta + \sum_{n} \pdv{L}{f^{(n)}} \eta^{(n)} \dd{x}
\end{aligned}$$

The goal is to turn each $$\eta^{(n)}(x)$$ into $$\eta(x)$$,
so we need to partially integrate the $$n$$th term of the sum $$n$$ times.
In this case, we will need some additional boundary conditions for $$\eta(x)$$:

$$\begin{aligned}
    \eta'(x_0) = \eta'(x_1) = 0
    \qquad \cdots \qquad
    \eta^{(N-1)}(x_0) = \eta^{(N-1)}(x_1) = 0
\end{aligned}$$

This eliminates the boundary terms from partial integration, leaving:

$$\begin{aligned}
    0
    = \int \eta \bigg( \pdv{L}{f} + \sum_{n} (-1)^n \dvn{n}{}{x}\Big( \pdv{L}{f^{(n)}} \Big) \bigg) \dd{x}
\end{aligned}$$

Once again, because $$\eta(x)$$ is arbitrary, the Euler-Lagrange equation becomes:

$$\begin{aligned}
    \boxed{
        0 = \pdv{L}{f} + \sum_{n} (-1)^n \dvn{n}{}{x}\Big( \pdv{L}{f^{(n)}} \Big)
    }
\end{aligned}$$
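For example, in a simplified model,
the elastic bending energy of a beam is proportional to
the square of its second derivative.
Since this $$L$$ contains only $$f''$$,
all terms except $$n = 2$$ vanish from the above equation:

$$\begin{aligned}
    L = \frac{1}{2} (f'')^2
    \quad \implies \quad
    0 = \dvn{2}{}{x}\Big( \pdv{L}{f''} \Big) = f''''
\end{aligned}$$

So an unloaded beam settles into the shape of a cubic polynomial,
whose coefficients are fixed by the boundary conditions at its two ends.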
## Multiple coordinates

Suppose now that $$f$$ is a function of multiple variables.
For brevity, we only consider two variables $$x$$ and $$y$$,
but the results generalize effortlessly to any number of variables.

The Lagrangian now depends on all the partial derivatives of $$f(x, y)$$:

$$\begin{aligned}
    J[f] = \iint_{(x_0, y_0)}^{(x_1, y_1)} L(f, f_x, f_y, x, y) \dd{x} \dd{y}
\end{aligned}$$

The arbitrary deviation $$\eta$$ is then also a function of multiple variables:

$$\begin{aligned}
    f(x, y; \varepsilon) = f(x, y; 0) + \varepsilon \eta(x, y)
\end{aligned}$$

The derivation procedure starts in the exact same way as before:

$$\begin{aligned}
    0
    &= \pdv{J}{\varepsilon} \Big|_{\varepsilon = 0}
    = \iint \pdv{L}{\varepsilon} \dd{x} \dd{y}
    \\
    &= \iint \pdv{L}{f} \pdv{f}{\varepsilon} + \pdv{L}{f_x} \pdv{f_x}{\varepsilon} + \pdv{L}{f_y} \pdv{f_y}{\varepsilon} \dd{x} \dd{y}
    \\
    &= \iint \pdv{L}{f} \eta + \pdv{L}{f_x} \eta_x + \pdv{L}{f_y} \eta_y \dd{x} \dd{y}
\end{aligned}$$

We partially integrate for both $$\eta_x$$ and $$\eta_y$$, yielding:

$$\begin{aligned}
    0
    &= \int \Big[ \pdv{L}{f_x} \eta \Big]_{x_0}^{x_1} \dd{y}
    + \int \Big[ \pdv{L}{f_y} \eta \Big]_{y_0}^{y_1} \dd{x}
    \\
    &\quad + \iint \eta \bigg( \pdv{L}{f} - \dv{}{x}\Big( \pdv{L}{f_x} \Big) - \dv{}{y}\Big( \pdv{L}{f_y} \Big) \bigg) \dd{x} \dd{y}
\end{aligned}$$

But now, to eliminate these boundary terms, we need extra conditions for $$\eta$$:

$$\begin{aligned}
    \forall y: \eta(x_0, y) = \eta(x_1, y) = 0
    \qquad
    \forall x: \eta(x, y_0) = \eta(x, y_1) = 0
\end{aligned}$$

In other words, the deviation $$\eta$$ must vanish
on the entire boundary of the rectangular integration region.
Again relying on the fact that $$\eta$$ is arbitrary,
the Euler-Lagrange equation is:

$$\begin{aligned}
    0 = \pdv{L}{f} - \dv{}{x}\Big( \pdv{L}{f_x} \Big) - \dv{}{y}\Big( \pdv{L}{f_y} \Big)
\end{aligned}$$

This generalizes nicely to functions of even more variables $$x_1, x_2, ..., x_N$$:

$$\begin{aligned}
    \boxed{
        0 = \pdv{L}{f} - \sum_{n} \dv{}{x_n}\Big( \pdv{L}{f_{x_n}} \Big)
    }
\end{aligned}$$
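A well-known example is the **Dirichlet energy**,
which measures how strongly $$f$$ varies across the region.
Minimizing it for fixed boundary values yields Laplace's equation:

$$\begin{aligned}
    L = \frac{1}{2} \big( f_x^2 + f_y^2 \big)
    \quad \implies \quad
    0 = - \dv{}{x}(f_x) - \dv{}{y}(f_y) = - f_{xx} - f_{yy}
\end{aligned}$$

In other words, the minimizer is a harmonic function,
e.g. the equilibrium temperature distribution in a plate
whose edges are held at given temperatures.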
## Constraints

So far, for multiple functions $$f_1, ..., f_N$$,
we have been assuming that all $$f_n$$ are independent,
and by extension all $$\eta_n$$.
Suppose that we now have $$M < N$$ constraints $$\phi_m$$
that all $$f_n$$ need to obey,
introducing implicit dependencies between them.

Let us consider constraints $$\phi_m$$ of the two forms below.
It is important that they are **holonomic**,
meaning they do not depend on any derivatives of any $$f_n(x)$$:

$$\begin{aligned}
    \phi_m(f_1, ..., f_N, x) = 0
    \qquad \mathrm{or} \qquad
    \int_{x_0}^{x_1} \phi_m(f_1, ..., f_N, x) \dd{x} = C_m
\end{aligned}$$

Where $$C_m$$ is a constant.
Note that the first form can also be used for $$\phi_m = C_m \neq 0$$,
by simply redefining the constraint as $$\phi_m^0 = \phi_m - C_m = 0$$.

To solve this constrained optimization problem for $$f_n(x)$$,
we introduce [Lagrange multipliers](/know/concept/lagrange-multiplier/) $$\lambda_m$$.
In the former case $$\lambda_m(x)$$ is a function of $$x$$,
while in the latter case $$\lambda_m$$ is a constant:

$$\begin{aligned}
    \int \lambda_m(x) \: \phi_m(\{f_n\}, x) \dd{x} = 0
    \qquad \mathrm{or} \qquad
    \lambda_m \int \phi_m(\{f_n\}, x) \dd{x} = \lambda_m C_m
\end{aligned}$$

The reason for this distinction in $$\lambda_m$$ is that we need to find
the stationary points with respect to $$\varepsilon$$ of both constraint types.
Written in the variational form, this is:

$$\begin{aligned}
    \delta \int \lambda_m \: \phi_m \dd{x} = 0
\end{aligned}$$

From this, we define a new Lagrangian $$\Lambda$$ for the functional $$J$$,
with the constraints built in:

$$\begin{aligned}
    J[f_n]
    &= \int \Lambda(f_1, ..., f_N; f_1', ..., f_N'; \lambda_1, ..., \lambda_M; x) \dd{x}
    \\
    &= \int L + \sum_{m} \lambda_m \phi_m \dd{x}
\end{aligned}$$

Then we derive the Euler-Lagrange equation as usual for $$\Lambda$$ instead of $$L$$:

$$\begin{aligned}
    0
    &= \delta \int \Lambda \dd{x}
    = \int \pdv{\Lambda}{\varepsilon} \dd{x}
    = \int \sum_n \Big( \pdv{\Lambda}{f_n} \pdv{f_n}{\varepsilon} + \pdv{\Lambda}{f_n'} \pdv{f_n'}{\varepsilon} \Big) \dd{x}
    \\
    &= \int \sum_n \Big( \pdv{\Lambda}{f_n} \eta_n + \pdv{\Lambda}{f_n'} \eta_n' \Big) \dd{x}
    \\
    &= \Big[ \sum_n \pdv{\Lambda}{f_n'} \eta_n \Big]_{x_0}^{x_1}
    + \int \sum_n \eta_n \bigg( \pdv{\Lambda}{f_n} - \dv{}{x}\Big( \pdv{\Lambda}{f_n'} \Big) \bigg) \dd{x}
\end{aligned}$$

Using the same logic as before,
we end up with a set of Euler-Lagrange equations with $$\Lambda$$:

$$\begin{aligned}
    0 = \pdv{\Lambda}{f_n} - \dv{}{x}\Big( \pdv{\Lambda}{f_n'} \Big)
\end{aligned}$$

By inserting the definition of $$\Lambda$$, we then get the following.
Recall that $$\phi_m$$ is holonomic,
and thus independent of all derivatives $$f_n'$$:

$$\begin{aligned}
    \boxed{
        0 = \pdv{L}{f_n} - \dv{}{x}\Big( \pdv{L}{f_n'} \Big) + \sum_{m} \lambda_m \pdv{\phi_m}{f_n}
    }
\end{aligned}$$

These are **Lagrange's equations of the first kind**,
with their second-kind counterparts being the earlier Euler-Lagrange equations.
Note that there are $$N$$ separate equations, one for each $$f_n$$.

Due to the constraints $$\phi_m$$, the functions $$f_n$$ are not independent.
This is solved by choosing $$\lambda_m$$ such that $$M$$ of the $$N$$ equations hold,
i.e. by solving a system of $$M$$ equations for the multipliers $$\lambda_m$$:

$$\begin{aligned}
    \dv{}{x}\Big( \pdv{L}{f_n'} \Big) - \pdv{L}{f_n} = \sum_{m} \lambda_m \pdv{\phi_m}{f_n}
\end{aligned}$$

The remaining $$N - M$$ equations can then be solved in the normal unconstrained way.
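As a classic example of the first constraint form, consider a pendulum:
a mass $$m$$ at position $$\big( f_1(t), f_2(t) \big) = (x, y)$$,
hanging from a rod of fixed length $$\ell$$ in a gravitational field $$g$$.
Here, the independent variable is the time $$t$$,
and the Lagrangian is the kinetic energy minus the potential energy:

$$\begin{aligned}
    L = \frac{m}{2} \big( (x')^2 + (y')^2 \big) - m g y
    \qquad \quad
    \phi(x, y) = x^2 + y^2 - \ell^2 = 0
\end{aligned}$$

Lagrange's equations of the first kind then give the equations of motion:

$$\begin{aligned}
    0 = - m x'' + 2 \lambda x
    \qquad \quad
    0 = - m g - m y'' + 2 \lambda y
\end{aligned}$$

Comparing this to Newton's second law reveals that $$(2 \lambda x, 2 \lambda y)$$
is simply the tension force exerted by the rod:
the multiplier $$\lambda(t)$$ directly measures the constraint force.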