Lagrange multiplier
The method of Lagrange multipliers or undetermined multipliers is a technique for optimizing (i.e. finding extrema of) a function $f$ subject to equality constraints. For example, in 2D, the goal is to maximize/minimize $f(x, y)$ while satisfying $g(x, y) = 0$. We assume that $f$ and $g$ are both continuous and have continuous first derivatives on all of $\mathbb{R}^2$.
Note: many authors write that Lagrange multipliers can be used for constraints of the form $g(x, y) = c$ for a constant $c$. Actually, the method requires $c = 0$, but this issue is easy to solve: given $g = c$, simply define $\tilde{g} \equiv g - c$ and use $\tilde{g} = 0$ as the constraint instead.
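So, we want to optimize $f$. If we ignore $g$, that just means finding its stationary points:
$$0 = \nabla f = \bigg( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \bigg)$$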
This problem is relatively easy: no condition ties $x$ and $y$ together, so all we need to do is find the points where both partial derivatives vanish.
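As a minimal numerical sketch of this unconstrained case, the Python snippet below checks a stationary point with finite differences. The example function $f(x, y) = (x - 1)^2 + (y + 2)^2$ and the helper name `grad` are illustrative assumptions, not taken from the text.

```python
def f(x, y):
    # Hypothetical example function with a single minimum at (1, -2)
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


def grad(func, x, y, h=1e-6):
    """Approximate (df/dx, df/dy) with central finite differences."""
    dfdx = (func(x + h, y) - func(x - h, y)) / (2.0 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2.0 * h)
    return dfdx, dfdy


# Both partial derivatives vanish at the stationary point...
gx, gy = grad(f, 1.0, -2.0)
# ...but not at a generic point such as the origin
hx, hy = grad(f, 0.0, 0.0)
print(gx, gy, hx, hy)
```

Here the two partial derivatives are approximately zero at $(1, -2)$, while at the origin they are approximately $-2$ and $4$.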
However, a constraint $g = 0$ makes the problem much more complicated: points with $\nabla f = 0$ might not satisfy $g = 0$, and points where $g = 0$ might not have $\nabla f = 0$. The dimensions also cannot be handled independently anymore, since they are implicitly related via $g$.
Imagine a contour plot of $g(x, y)$. The trick is this: if we follow the contour $g = 0$, the highest and lowest values of $f$ along the way are the desired local extrema. At each such extremum, $f$ must be stationary from the contour’s point of view, and slowly-varying in its close vicinity since $\nabla f$ is continuous. We thus have two categories of extrema:
1. $\nabla f = 0$ there, i.e. $f$ is slowly-varying along all directions around the point. In other words, a stationary point of $f$ coincidentally lies on the contour $g = 0$.
2. The contours of $f$ and $g$ are parallel at the point. By definition, $f$ is constant along each of its contours, so when we find that $f$ is stationary at a point on our $g = 0$ path, it means we touched a contour of $f$. Obviously, each point of the path lies on some contour of $f$, but if the two contours are not parallel there, then $f$ is increasing or decreasing along our path, so this is not an extremum and we must continue our search.
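The idea of walking along the constraint and watching $f$ can be sketched numerically. The Python snippet below is a minimal illustration under assumed choices that are not from the text: $f(x, y) = x + y$, the constraint $g(x, y) = x^2 + y^2 - 1 = 0$ (the unit circle), and a uniform sampling of that circle by an angle `theta`.

```python
import math


def f(x, y):
    # Hypothetical objective function
    return x + y


# Walk along the contour g = 0 (the unit circle), parametrized by theta
n = 3600
thetas = [2.0 * math.pi * k / n for k in range(n)]
values = [f(math.cos(t), math.sin(t)) for t in thetas]

# Highest and lowest values of f met along the way
k_max = max(range(n), key=lambda k: values[k])
k_min = min(range(n), key=lambda k: values[k])
theta_max, theta_min = thetas[k_max], thetas[k_min]
print(theta_max, values[k_max], theta_min, values[k_min])
```

Under these assumptions the scan finds the maximum near `theta = pi/4` and the minimum near `theta = 5*pi/4`, which are the two points where a line $x + y = \mathrm{const}$ just touches the circle.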
What about the edge case that $g = 0$ and $\nabla g = 0$ in the same point, i.e. we locally have no contour to follow? Do we just take whatever value $f$ has there? No, by convention, we do not, because this does not really count as optimizing $f$.
Now, in the 2nd category, parallel contours imply parallel gradients, i.e. $\nabla f$ and $\nabla g$ differ only in magnitude (and possibly sign), not direction. Formally:
$$\nabla f = -\lambda \nabla g$$
where $\lambda$ is the Lagrange multiplier that quantifies the difference in magnitude between the gradients. By setting $\lambda = 0$, this equation also handles the 1st category $\nabla f = 0$. Note that some authors define $\lambda$ with the opposite sign.
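The parallel-gradient condition can be checked by hand on a small example. The sketch below assumes $f(x, y) = x + y$ and $g(x, y) = x^2 + y^2 - 1$ (neither is from the text), takes the sign convention $\nabla f = -\lambda \nabla g$, and uses a hypothetical helper `cross` to test whether two 2D vectors are parallel.

```python
import math


def grad_f(x, y):
    # Gradient of the assumed objective f(x, y) = x + y
    return (1.0, 1.0)


def grad_g(x, y):
    # Gradient of the assumed constraint g(x, y) = x^2 + y^2 - 1
    return (2.0 * x, 2.0 * y)


def cross(u, v):
    """z-component of the 2D cross product; zero when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]


s = 1.0 / math.sqrt(2.0)
extremum = (s, s)      # constrained maximum of f on the circle
generic = (1.0, 0.0)   # another point on the circle, not an extremum

cross_ext = cross(grad_f(*extremum), grad_g(*extremum))
cross_gen = cross(grad_f(*generic), grad_g(*generic))

# Multiplier from grad f = -lambda * grad g, read off the x-components
lam = -grad_f(*extremum)[0] / grad_g(*extremum)[0]
print(cross_ext, cross_gen, lam)
```

At the extremum the cross product vanishes and $\lambda = -1/\sqrt{2}$, whereas at the generic point the gradients are not parallel, so $f$ is still changing along the constraint there.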
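The method of Lagrange multipliers uses these facts to rewrite a constrained $N$-dimensional optimization problem as an unconstrained $(N+1)$-dimensional optimization problem by defining the Lagrangian function $\mathcal{L}$ as follows:
$$\mathcal{L}(x, y, \lambda) \equiv f(x, y) + \lambda g(x, y)$$
Look what happens when we do an unconstrained optimization of $\mathcal{L}$ in the usual way:
$$0 = \nabla \mathcal{L} = \bigg( \frac{\partial \mathcal{L}}{\partial x}, \frac{\partial \mathcal{L}}{\partial y}, \frac{\partial \mathcal{L}}{\partial \lambda} \bigg) = \bigg( \frac{\partial f}{\partial x} + \lambda \frac{\partial g}{\partial x}, \frac{\partial f}{\partial y} + \lambda \frac{\partial g}{\partial y}, g \bigg)$$
The last item in this vector represents $g = 0$, and the others $\nabla f = -\lambda \nabla g$ as discussed earlier. When this unconstrained problem is solved using standard methods, the resulting solutions also satisfy the constrained problem. However, as usual in the field of optimization, this method only finds local extrema and saddle points; it represents a necessary condition for optimality, but not a sufficient one.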
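To make the unconstrained problem concrete, the sketch below solves $\nabla \mathcal{L} = 0$ with Newton's method for an assumed example: $f(x, y) = x + y$, $g(x, y) = x^2 + y^2 - 1$ and $\mathcal{L} = f + \lambda g$. The choice of Newton's method, the starting guesses and all function names are assumptions made for illustration; any root finder for the three equations would do.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are left untouched
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def grad_lagrangian(x, y, lam):
    # (dL/dx, dL/dy, dL/dlambda) for L = x + y + lam * (x^2 + y^2 - 1)
    return [1.0 + 2.0 * lam * x, 1.0 + 2.0 * lam * y, x * x + y * y - 1.0]


def jacobian(x, y, lam):
    # Matrix of second derivatives of L with respect to (x, y, lambda)
    return [[2.0 * lam, 0.0, 2.0 * x],
            [0.0, 2.0 * lam, 2.0 * y],
            [2.0 * x, 2.0 * y, 0.0]]


def newton(x, y, lam, steps=20):
    """Iterate Newton's method on grad L = 0 from a starting guess."""
    for _ in range(steps):
        rhs = [-v for v in grad_lagrangian(x, y, lam)]
        dx, dy, dlam = solve_linear(jacobian(x, y, lam), rhs)
        x, y, lam = x + dx, y + dy, lam + dlam
    return x, y, lam


x_max, y_max, lam_max = newton(0.8, 0.6, -0.7)
x_min, y_min, lam_min = newton(-0.8, -0.6, 0.7)
print(x_max, y_max, lam_max)
print(x_min, y_min, lam_min)
```

From these two starting guesses the iteration converges to $(x, y, \lambda) = \pm (1/\sqrt{2}, 1/\sqrt{2}, -1/\sqrt{2})$. The solver itself does not say which one is the maximum; that still has to be decided by comparing the values of $f$.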
We often assign $\lambda$ an algebraic expression rather than a value, usually without even bothering to calculate its final actual value. In fact, in some cases, $\lambda$’s only function is to help us reason about the interdependence of a system of equations (see Wikipedia’s entropy example); then $\lambda$ is not even given an expression! Hence it is sometimes also called an undetermined multiplier.
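This does not imply that $\lambda$ is meaningless; it often represents a quantity of interest. In general, defining $h \equiv g + c$ so that the constraint is $h(x, y) = c$, we see that the Lagrange multiplier represents the rate of change of $\mathcal{L}$ with respect to the value being constrained (up to a sign set by the convention for $\lambda$):
$$\mathcal{L}(x, y, \lambda) = f(x, y) + \lambda \big( h(x, y) - c \big) \implies -\frac{\partial \mathcal{L}}{\partial c} = \lambda$$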
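A small numerical check can make this interpretation tangible. The sketch below goes one step beyond the statement above and assumes the standard envelope-theorem result that, at the optimum, the optimal value $f^*$ changes with $c$ at the same rate as $\mathcal{L}$, so that $\mathrm{d}f^*/\mathrm{d}c = -\lambda$ in the convention $\mathcal{L} = f + \lambda (h - c)$. The example ($f = x + y$ on the circle $h = x^2 + y^2 = c$) and the function name `optimum` are illustrative assumptions.

```python
import math


def optimum(c):
    """Constrained maximum of f = x + y on the circle x^2 + y^2 = c (c > 0)."""
    x = y = math.sqrt(c / 2.0)
    # From grad f = -lambda * grad h, i.e. 1 = -lambda * 2x
    lam = -1.0 / (2.0 * x)
    return x + y, lam


c, dc = 2.0, 1e-6
f_star, lam = optimum(c)
f_plus, _ = optimum(c + dc)
f_minus, _ = optimum(c - dc)

# Central finite difference of the optimal value with respect to c
dfdc = (f_plus - f_minus) / (2.0 * dc)
print(dfdc, -lam)
```

For $c = 2$ both printed numbers are approximately $0.5$: loosening the constraint slightly raises the attainable maximum at a rate set by the multiplier.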
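The method of Lagrange multipliers generalizes nicely to more constraints or more variables. Suppose we want to find extrema of $f(x_1, \ldots, x_N)$ subject to $M < N$ conditions:
$$g_1(x_1, \ldots, x_N) = 0 \qquad \cdots \qquad g_M(x_1, \ldots, x_N) = 0$$
Then we introduce $M$ Lagrange multipliers $\lambda_1, \ldots, \lambda_M$ and define $\mathcal{L}(x_1, \ldots, x_N, \lambda_1, \ldots, \lambda_M)$:
$$\mathcal{L} \equiv f + \sum_{m = 1}^{M} \lambda_m g_m$$
As before, we set $\nabla \mathcal{L} = 0$ and choose the multipliers $\lambda_1, \ldots, \lambda_M$ to satisfy the resulting system of $(N + M)$ 1D equations, and then find the coordinates of the extrema.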
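As a sketch of the multi-constraint case, consider an assumed example with $N = 3$ variables and $M = 2$ constraints: $f = x^2 + y^2 + z^2$ with $g_1 = x + y + z - 1 = 0$ and $g_2 = x - y = 0$, using $\mathcal{L} = f + \lambda_1 g_1 + \lambda_2 g_2$. For this particular choice the $N + M = 5$ stationarity equations happen to be linear, so the snippet solves them exactly with a small Gauss-Jordan routine; the example and the helper name `solve_exact` are illustrative and not taken from the text.

```python
from fractions import Fraction


def solve_exact(a, b):
    """Solve a*x = b exactly with Gauss-Jordan elimination on Fractions."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it up
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


# Unknowns ordered as (x, y, z, lambda_1, lambda_2); rows are grad L = 0:
#   dL/dx   = 2x + lambda_1 + lambda_2 = 0
#   dL/dy   = 2y + lambda_1 - lambda_2 = 0
#   dL/dz   = 2z + lambda_1            = 0
#   dL/dl_1 = x + y + z - 1            = 0
#   dL/dl_2 = x - y                    = 0
a = [[2, 0, 0, 1, 1],
     [0, 2, 0, 1, -1],
     [0, 0, 2, 1, 0],
     [1, 1, 1, 0, 0],
     [1, -1, 0, 0, 0]]
b = [0, 0, 0, 1, 0]

x, y, z, lam1, lam2 = solve_exact(a, b)
print(x, y, z, lam1, lam2)
```

The solution is $x = y = z = 1/3$ with $\lambda_1 = -2/3$ and $\lambda_2 = 0$. The vanishing $\lambda_2$ reflects that the second constraint is already satisfied at the extremum of the problem with only the first constraint.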
References
- G.B. Arfken, H.J. Weber, Mathematical methods for physicists, 6th edition, 2005, Elsevier.
- O. Bang, Applied mathematics for physicists: lecture notes, 2019, unpublished.