Lagrange multiplier

The method of Lagrange multipliers or undetermined multipliers is a technique for optimizing (i.e. finding the extrema of) a function \(f(x, y, z)\), subject to a given constraint \(\phi(x, y, z) = C\), where \(C\) is a constant.

If we ignore the constraint \(\phi\), optimizing \(f\) simply comes down to finding stationary points:

\[\begin{aligned} 0 &= \dd{f} = f_x \dd{x} + f_y \dd{y} + f_z \dd{z} \end{aligned}\]

This problem is easy: \(\dd{x}\), \(\dd{y}\), and \(\dd{z}\) are independent and arbitrary, so all we need to do is find the point \((x_0, y_0, z_0)\) where the partial derivatives \(f_x\), \(f_y\) and \(f_z\) vanish simultaneously. That stationary point is then the candidate extremum.

But the constraint \(\phi\), over which we have no control, adds a relation between \(\dd{x}\), \(\dd{y}\), and \(\dd{z}\), so if two are known, the third is fixed by \(\phi = C\). Since \(\phi\) must remain constant, \(\dd{\phi} = 0\), and the problem becomes a system of equations:

\[\begin{aligned} 0 &= \dd{f} = f_x \dd{x} + f_y \dd{y} + f_z \dd{z} \\ 0 &= \dd{\phi} = \phi_x \dd{x} + \phi_y \dd{y} + \phi_z \dd{z} \end{aligned}\]

Solving this directly would be a delicate balancing act of all the partial derivatives.

To help us solve this, we introduce a “dummy” parameter \(\lambda\), the so-called Lagrange multiplier, and construct a new function \(L\) given by:

\[\begin{aligned} L(x, y, z) = f(x, y, z) + \lambda \phi(x, y, z) \end{aligned}\]

At the extremum, \(\dd{L} = \dd{f} + \lambda \dd{\phi} = 0\), so now the problem is a “single” equation again:

\[\begin{aligned} 0 = \dd{L} = (f_x + \lambda \phi_x) \dd{x} + (f_y + \lambda \phi_y) \dd{y} + (f_z + \lambda \phi_z) \dd{z} \end{aligned}\]

Assuming \(\phi_z \neq 0\), we now choose \(\lambda\) such that \(f_z + \lambda \phi_z = 0\). This choice accounts for the constraint by removing the dependent differential \(\dd{z}\) from the equation, so the remaining \(\dd{x}\) and \(\dd{y}\) are independent again, and we simply have to find the roots of \(f_x + \lambda \phi_x\) and \(f_y + \lambda \phi_y\).
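
To see why this choice works, it may help to compare it with direct elimination. As a sketch, still assuming \(\phi_z \neq 0\), we can solve \(\dd{\phi} = 0\) for \(\dd{z}\) and insert the result into \(\dd{f} = 0\):

\[\begin{aligned} \dd{z} = - \frac{\phi_x \dd{x} + \phi_y \dd{y}}{\phi_z} \quad \implies \quad 0 = \dd{f} = \Big( f_x - \frac{f_z}{\phi_z} \phi_x \Big) \dd{x} + \Big( f_y - \frac{f_z}{\phi_z} \phi_y \Big) \dd{y} \end{aligned}\]

Since \(\dd{x}\) and \(\dd{y}\) are independent here, both parenthesized expressions must vanish. Writing \(\lambda = - f_z / \phi_z\), these conditions are exactly \(f_x + \lambda \phi_x = 0\) and \(f_y + \lambda \phi_y = 0\), so the multiplier can be read as bookkeeping for this elimination.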

In effect, after introducing \(\lambda\), we have four unknowns \((x, y, z, \lambda)\), but also four equations:

\[\begin{aligned} L_x = L_y = L_z = 0 \qquad \quad \phi = C \end{aligned}\]
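
As a small worked example (the function and constraint are chosen purely for illustration), take \(f = x^2 + y^2 + z^2\) with the constraint \(\phi = x + y + z = C\), so that \(L = x^2 + y^2 + z^2 + \lambda (x + y + z)\). The four equations then read:

\[\begin{aligned} 2 x + \lambda = 0 \qquad 2 y + \lambda = 0 \qquad 2 z + \lambda = 0 \qquad x + y + z = C \end{aligned}\]

The first three give \(x = y = z = -\lambda / 2\), and inserting this into the constraint yields \(\lambda = -2 C / 3\). The only stationary point is therefore \(x = y = z = C / 3\), which is the point on the plane closest to the origin.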

We are only really interested in the first three unknowns \((x, y, z)\), so \(\lambda\) is sometimes called the undetermined multiplier, since it is just an algebraic helper whose value is irrelevant.
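
The irrelevance of the multiplier's value can be checked on a toy problem. The sketch below is only an illustration under stated assumptions: it takes \(f = x^2 + y^2 + z^2\) with the constraint written as \(s (x + y + z) = s C\) for an arbitrary scale factor \(s\), in which case the four equations are linear and can be solved exactly. The helper names `solve_linear` and `stationary_point` are hypothetical and not part of any library.

```python
from fractions import Fraction


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with exact fractions.

    Assumes the matrix is square and non-singular; a singular matrix
    makes the pivot search below fail with StopIteration.
    """
    n = len(b)
    # Augmented matrix [a | b] with exact rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)]
         for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Normalize the pivot row, then clear this column in other rows.
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[-1] for row in m]


def stationary_point(s, c):
    """Extremize f = x^2 + y^2 + z^2 subject to s*(x + y + z) = s*c.

    Unknowns are ordered (x, y, z, lam). The first three rows are
    L_x = L_y = L_z = 0 for L = f + lam * phi; the last row is the
    constraint itself.
    """
    a = [
        [2, 0, 0, s],  # L_x = 2x + lam * s = 0
        [0, 2, 0, s],  # L_y = 2y + lam * s = 0
        [0, 0, 2, s],  # L_z = 2z + lam * s = 0
        [s, s, s, 0],  # phi = s * (x + y + z) = s * c
    ]
    b = [0, 0, 0, s * c]
    x, y, z, lam = solve_linear(a, b)
    return (x, y, z), lam


point_a, lam_a = stationary_point(1, 3)
point_b, lam_b = stationary_point(2, 3)
print(point_a == point_b, lam_a, lam_b)
```

Rescaling the constraint describes the same surface, so the stationary point is unchanged, while the multiplier is rescaled to compensate. This is one way to see that \(\lambda\) carries no information about where the extremum lies.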

This method generalizes nicely to multiple constraints or more variables: suppose that we want to find the extrema of \(f(x_1, ..., x_N)\) subject to \(M < N\) conditions:

\[\begin{aligned} \phi_1(x_1, ..., x_N) = C_1 \qquad \cdots \qquad \phi_M(x_1, ..., x_N) = C_M \end{aligned}\]

This once again turns into a delicate system of \(M+1\) equations to solve:

\[\begin{aligned} 0 &= \dd{f} = f_{x_1} \dd{x_1} + ... + f_{x_N} \dd{x_N} \\ 0 &= \dd{\phi_1} = \phi_{1, x_1} \dd{x_1} + ... + \phi_{1, x_N} \dd{x_N} \\ &\vdots \\ 0 &= \dd{\phi_M} = \phi_{M, x_1} \dd{x_1} + ... + \phi_{M, x_N} \dd{x_N} \end{aligned}\]

Then we introduce \(M\) Lagrange multipliers \(\lambda_1, ..., \lambda_M\) and define \(L(x_1, ..., x_N)\):

\[\begin{aligned} L = f + \sum_{m = 1}^M \lambda_m \phi_m \end{aligned}\]

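As before, we set \(\dd{L} = 0\) and choose the multipliers \(\lambda_1, ..., \lambda_M\) to eliminate \(M\) of its \(N\) terms, so that the remaining \(N - M\) differentials are independent and their coefficients must vanish as well: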

\[\begin{aligned} 0 = \dd{L} = \sum_{n = 1}^N \Big( f_{x_n} + \sum_{m = 1}^M \lambda_m \phi_{m, x_n} \Big) \dd{x_n} \end{aligned}\]
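
In practice, every parenthesized coefficient must vanish, which together with the \(M\) constraints gives \(N + M\) equations for the \(N + M\) unknowns. The sketch below checks these conditions numerically for an illustrative case with \(N = 3\) and \(M = 2\). The function \(f = x^2 + y^2 + z^2\), the constraints \(x + y + z = 3\) and \(x - y = 2\), and the candidate solution are all assumptions made for this example, and the helper names `gradient` and `residuals` are hypothetical.

```python
def f(p):
    x, y, z = p
    return x**2 + y**2 + z**2


def phi1(p):
    x, y, z = p
    return x + y + z  # constrained to C_1 = 3


def phi2(p):
    x, y, z = p
    return x - y  # constrained to C_2 = 2


def gradient(func, p, h=1e-5):
    """Central-difference estimate of the gradient of func at point p."""
    grad = []
    for n in range(len(p)):
        up = list(p)
        down = list(p)
        up[n] += h
        down[n] -= h
        grad.append((func(up) - func(down)) / (2 * h))
    return grad


def residuals(p, lams, constraints, targets):
    """Return the N coefficients of dL and the M values phi_m - C_m."""
    # Start from the gradient of f, then add lam_m times each
    # constraint gradient to form the gradient of L.
    grad_l = gradient(f, p)
    for lam, phi in zip(lams, constraints):
        grad_phi = gradient(phi, p)
        grad_l = [g + lam * d for g, d in zip(grad_l, grad_phi)]
    gaps = [phi(p) - c for phi, c in zip(constraints, targets)]
    return grad_l, gaps


# Candidate found by hand for this example: (x, y, z) = (2, 0, 1)
# with multipliers lam_1 = lam_2 = -2.
point = (2.0, 0.0, 1.0)
lams = (-2.0, -2.0)
grad_l, gaps = residuals(point, lams, [phi1, phi2], [3.0, 2.0])
print(grad_l, gaps)
```

All five residuals should vanish up to rounding error at a valid solution. Since the check uses finite differences, it only tests a given candidate and does not find one.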

© "Prefetch". Licensed under CC BY-SA 4.0.