1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
|
---
title: "Lagrange multiplier"
firstLetter: "L"
publishDate: 2021-03-02
categories:
- Mathematics
- Physics
date: 2021-03-02T16:28:42+01:00
draft: false
markup: pandoc
---
# Lagrange multiplier
The method of **Lagrange multipliers** or **undetermined multipliers**
is a technique for optimizing (i.e. finding the extrema of)
a function $f(x, y, z)$,
subject to a given constraint $\phi(x, y, z) = C$,
where $C$ is a constant.
If we ignore the constraint $\phi$,
optimizing $f$ simply comes down to finding stationary points:
$$\begin{aligned}
0 &= \dd{f} = f_x \dd{x} + f_y \dd{y} + f_z \dd{z}
\end{aligned}$$
This problem is easy:
$\dd{x}$, $\dd{y}$, and $\dd{z}$ are independent and arbitrary,
so all we need to do is find the roots of
the partial derivatives $f_x$, $f_y$ and $f_z$,
which we respectively call $x_0$, $y_0$ and $z_0$,
and then the extremum is simply $(x_0, y_0, z_0)$.
But the constraint $\phi$, over which we have no control,
adds a relation between $\dd{x}$, $\dd{y}$, and $\dd{z}$,
so if two are known, the third is given by $\phi = C$.
The problem is then a system of equations:
$$\begin{aligned}
0 &= \dd{f} = f_x \dd{x} + f_y \dd{y} + f_z \dd{z}
\\
0 &= \dd{\phi} = \phi_x \dd{x} + \phi_y \dd{y} + \phi_z \dd{z}
\end{aligned}$$
Solving this directly would be a delicate balancing act
of all the partial derivatives.
To help us solve this, we introduce a "dummy" parameter $\lambda$,
the so-called **Lagrange multiplier**,
which need not be constant,
and contruct a new function $L$ given by:
$$\begin{aligned}
L(x, y, z) = f(x, y, z) + \lambda \phi(x, y, z)
\end{aligned}$$
Clearly, $\dd{L} = \dd{f} + \lambda \dd{\phi} = 0$,
so now the problem is a single equation again:
$$\begin{aligned}
0 = \dd{L}
= (f_x + \lambda \phi_x) \dd{x} + (f_y + \lambda \phi_y) \dd{y} + (f_z + \lambda \phi_z) \dd{z}
\end{aligned}$$
Assuming $\phi_z \neq 0$, we now choose $\lambda$ such that $f_z + \lambda \phi_z = 0$.
This choice represents satisfying the constraint,
so now the remaining $\dd{x}$ and $\dd{y}$ are independent again,
and we simply have to find the roots of $f_x + \lambda \phi_x$ and $f_y + \lambda \phi_y$.
This generalizes nicely to multiple constraints or more variables:
suppose that we want to find the extrema of $f(x_1, ..., x_N)$
subject to $M < N$ conditions:
$$\begin{aligned}
\phi_1(x_1, ..., x_N) = C_1 \qquad \cdots \qquad \phi_M(x_1, ..., x_N) = C_M
\end{aligned}$$
This once again turns into a delicate system of $M+1$ equations to solve:
$$\begin{aligned}
0 &= \dd{f} = f_{x_1} \dd{x_1} + ... + f_{x_N} \dd{x_N}
\\
0 &= \dd{\phi_1} = \phi_{1, x_1} \dd{x_1} + ... + \phi_{1, x_N} \dd{x_N}
\\
&\vdots
\\
0 &= \dd{\phi_M} = \phi_{M, x_1} \dd{x_1} + ... + \phi_{M, x_N} \dd{x_N}
\end{aligned}$$
Then we introduce $M$ Lagrange multipliers $\lambda_1, ..., \lambda_M$
and define $L(x_1, ..., x_N)$:
$$\begin{aligned}
L = f + \sum_{m = 1}^M \lambda_m \phi_m
\end{aligned}$$
As before, we set $\dd{L} = 0$ and choose the multipliers $\lambda_1, ..., \lambda_M$
to eliminate $M$ of its $N$ terms:
$$\begin{aligned}
0 = \dd{L}
= \sum_{n = 1}^N \Big( f_{x_n} + \sum_{m = 1}^M \lambda_m \phi_{x_n} \Big) \dd{x_n}
\end{aligned}$$
|