summaryrefslogtreecommitdiff
path: root/source/know/concept/lagrange-multiplier/index.md
blob: 8ee1054d8c8d4866f515053b09bdba5e09082137 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
---
title: "Lagrange multiplier"
sort_title: "Lagrange multiplier"
date: 2021-03-02
categories:
- Mathematics
- Physics
layout: "concept"
---

The method of **Lagrange multipliers** or **undetermined multipliers**
is a technique for optimizing (i.e. finding the extrema of)
a function $$f(x, y, z)$$,
subject to a given constraint $$\phi(x, y, z) = C$$,
where $$C$$ is a constant.

If we ignore the constraint $$\phi$$,
optimizing $$f$$ simply comes down to finding stationary points:

$$\begin{aligned}
    0 &= \dd{f} = f_x \dd{x} + f_y \dd{y} + f_z \dd{z}
\end{aligned}$$

This problem is easy:
$$\dd{x}$$, $$\dd{y}$$, and $$\dd{z}$$ are independent and arbitrary,
so all we need to do is find the roots of
the partial derivatives $$f_x$$, $$f_y$$ and $$f_z$$,
which we respectively call $$x_0$$, $$y_0$$ and $$z_0$$,
and then the extremum is simply $$(x_0, y_0, z_0)$$.

But the constraint $$\phi$$, over which we have no control,
adds a relation between $$\dd{x}$$, $$\dd{y}$$, and $$\dd{z}$$,
so if two are known, the third is given by $$\phi = C$$.
The problem is then a system of equations:

$$\begin{aligned}
    0 &= \dd{f} = f_x \dd{x} + f_y \dd{y} + f_z \dd{z}
    \\
    0 &= \dd{\phi} = \phi_x \dd{x} + \phi_y \dd{y} + \phi_z \dd{z}
\end{aligned}$$

Solving this directly would be a delicate balancing act
of all the partial derivatives.

To help us solve this, we introduce a "dummy" parameter $$\lambda$$,
the so-called **Lagrange multiplier**,
and contruct a new function $$L$$ given by:

$$\begin{aligned}
    L(x, y, z) = f(x, y, z) + \lambda \phi(x, y, z)
\end{aligned}$$

At the extremum, $$\dd{L} = \dd{f} + \lambda \dd{\phi} = 0$$,
so now the problem is a "single" equation again:

$$\begin{aligned}
    0 = \dd{L}
    = (f_x + \lambda \phi_x) \dd{x} + (f_y + \lambda \phi_y) \dd{y} + (f_z + \lambda \phi_z) \dd{z}
\end{aligned}$$

Assuming $$\phi_z \neq 0$$, we now choose $$\lambda$$ such that $$f_z + \lambda \phi_z = 0$$.
This choice represents satisfying the constraint,
so now the remaining $$\dd{x}$$ and $$\dd{y}$$ are independent again,
and we simply have to find the roots of $$f_x + \lambda \phi_x$$ and $$f_y + \lambda \phi_y$$.

In effect, after introducing $$\lambda$$,
we have four unknowns $$(x, y, z, \lambda)$$,
but also four equations:

$$\begin{aligned}
    L_x = L_y = L_z = 0
    \qquad \quad
    \phi = C
\end{aligned}$$

We are only really interested in the first three unknowns $$(x, y, z)$$,
so $$\lambda$$ is sometimes called the **undetermined multiplier**,
since it is just an algebraic helper whose value is irrelevant.

This method generalizes nicely to multiple constraints or more variables:
suppose that we want to find the extrema of $$f(x_1, ..., x_N)$$
subject to $$M < N$$ conditions:

$$\begin{aligned}
    \phi_1(x_1, ..., x_N) = C_1 \qquad \cdots \qquad \phi_M(x_1, ..., x_N) = C_M
\end{aligned}$$

This once again turns into a delicate system of $$M+1$$ equations to solve:

$$\begin{aligned}
    0 &= \dd{f} = f_{x_1} \dd{x_1} + ... + f_{x_N} \dd{x_N}
    \\
    0 &= \dd{\phi_1} = \phi_{1, x_1} \dd{x_1} + ... + \phi_{1, x_N} \dd{x_N}
    \\
    &\vdots
    \\
    0 &= \dd{\phi_M} = \phi_{M, x_1} \dd{x_1} + ... + \phi_{M, x_N} \dd{x_N}
\end{aligned}$$

Then we introduce $$M$$ Lagrange multipliers $$\lambda_1, ..., \lambda_M$$
and define $$L(x_1, ..., x_N)$$:

$$\begin{aligned}
    L = f + \sum_{m = 1}^M \lambda_m \phi_m
\end{aligned}$$

As before, we set $$\dd{L} = 0$$ and choose the multipliers $$\lambda_1, ..., \lambda_M$$
to eliminate $$M$$ of its $$N$$ terms:

$$\begin{aligned}
    0 = \dd{L}
    = \sum_{n = 1}^N \Big( f_{x_n} + \sum_{m = 1}^M \lambda_m \phi_{x_n} \Big) \dd{x_n}
\end{aligned}$$


## References
1.  G.B. Arfken, H.J. Weber,
    *Mathematical methods for physicists*, 6th edition, 2005,
    Elsevier.
2.  O. Bang,
    *Applied mathematics for physicists: lecture notes*, 2019,
    unpublished.