Diffstat (limited to 'source/know/concept/lagrange-multiplier')
-rw-r--r--  source/know/concept/lagrange-multiplier/index.md  |  42
1 file changed, 18 insertions, 24 deletions
diff --git a/source/know/concept/lagrange-multiplier/index.md b/source/know/concept/lagrange-multiplier/index.md
index 9fb61a8..6b5e3fc 100644
--- a/source/know/concept/lagrange-multiplier/index.md
+++ b/source/know/concept/lagrange-multiplier/index.md
@@ -14,18 +14,18 @@ a function $$f$$ subject to **equality constraints**.
For example, in 2D, the goal is to maximize/minimize $$f(x, y)$$
while satisfying $$g(x, y) = 0$$.
We assume that $$f$$ and $$g$$ are both continuous
-and have continuous first derivatives,
-and that their domain is all of $$\mathbb{R}$$.
+and have continuous first derivatives
+on all of $$\mathbb{R}^2$$.
-Side note: many authors write that Lagrange multipliers
+Note: many authors write that Lagrange multipliers
can be used for constraints of the form $$g(x, y) = c$$ for a constant $$c$$.
-However, this method technically requires $$c = 0$$.
-This issue is easy to solve: given $$g = c$$,
+Actually, the method requires $$c = 0$$,
+but this issue is easy to solve: given $$g = c$$,
simply define $$\tilde{g} \equiv g - c = 0$$
and use that as constraint instead.
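For example, a constraint $$x^2 + y^2 = 1$$ would simply become $$\tilde{g}(x, y) \equiv x^2 + y^2 - 1 = 0$$.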
-Before introducing $$g$$,
-optimizing $$f$$ comes down to finding its stationary points:
+So, we want to optimize $$f$$.
+If we ignore $$g$$, that just means finding its stationary points:
$$\begin{aligned}
0
@@ -36,20 +36,18 @@ $$\begin{aligned}
This problem is easy: the two dimensions can be handled independently,
so all we need to do is find the roots of the partial derivatives.
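For instance, for $$f(x, y) = x^2 + y^2$$ the partial derivatives are $$2x$$ and $$2y$$, whose only shared root is the single stationary point $$(0, 0)$$.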
-However, adding $$g$$ makes the problem much more complicated:
+However, a constraint $$g = 0$$ makes the problem much more complicated:
points with $$\nabla f = 0$$ might not satisfy $$g = 0$$,
and points where $$g = 0$$ might not have $$\nabla f = 0$$.
The dimensions also cannot be handled independently anymore,
-since they are implicitly related by $$g$$.
+since they are implicitly related via $$g$$.
Imagine a contour plot of $$g(x, y)$$.
The trick is this: if we follow a contour of $$g = 0$$,
the highest and lowest values of $$f$$ along the way
are the desired local extrema.
-Recall our assumption that $$\nabla f$$ is continuous:
-hence *along our contour* $$f$$ is slowly-varying
-in the close vicinity of each such point,
-and stationary at the point itself.
+At each such extremum, $$f$$ must be stationary from the contour's point of view,
+and slowly-varying in its close vicinity since $$\nabla f$$ is continuous.
We thus have two categories of extrema:
1. $$\nabla f = 0$$ there,
@@ -57,7 +55,7 @@ We thus have two categories of extrema:
In other words, a stationary point of $$f$$
coincidentally lies on a contour of $$g = 0$$.
-2. The contours of $$f$$ and $$g$$ are parallel around the point.
+2. The contours of $$f$$ and $$g$$ are parallel at the point.
By definition, $$f$$ is stationary along each of its contours,
so when we find that $$f$$ is stationary at a point on our $$g = 0$$ path,
it means we touched a contour of $$f$$.
@@ -83,7 +81,7 @@ $$\begin{aligned}
Where $$\lambda$$ is the **Lagrange multiplier**
that quantifies the difference in magnitude between the gradients.
By setting $$\lambda = 0$$, this equation also handles the 1st category $$\nabla f = 0$$.
-Some authors define $$\lambda$$ with the opposite sign.
+Note that some authors define $$\lambda$$ with the opposite sign.
The method of Lagrange multipliers uses these facts
to rewrite a constrained $$N$$-dimensional optimization problem
@@ -97,8 +95,7 @@ $$\begin{aligned}
}
\end{aligned}$$
-Let us do an unconstrained optimization of $$\mathcal{L}$$ as usual,
-by demanding it is stationary:
+Look what happens when we do an unconstrained optimization of $$\mathcal{L}$$ in the usual way:
$$\begin{aligned}
0
@@ -110,14 +107,11 @@ $$\begin{aligned}
The last item in this vector represents $$g = 0$$,
and the others $$\nabla f = -\lambda \nabla g$$ as discussed earlier.
-To solve this equation,
-we assign $$\lambda$$ a value that agrees with it
-(such a value exists for each local extremum
-according to our above discussion of the two categories),
-and then find the locations $$(x, y)$$ that satisfy it.
-However, as usual for optimization problems,
+When this unconstrained problem is solved using standard methods,
+the resulting solutions also satisfy the constrained problem.
+However, as usual in the field of optimization,
this method only finds *local* extrema *and* saddle points;
-it is a necessary condition for optimality, but not sufficient.
+it represents a necessary condition for optimality, but not a sufficient one.
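As a simple illustration (with functions chosen here purely as an example), suppose we want to optimize $$f(x, y) = x + y$$ on the unit circle, i.e. subject to $$g(x, y) = x^2 + y^2 - 1 = 0$$:

$$\begin{aligned}
\mathcal{L}(x, y, \lambda)
= x + y + \lambda (x^2 + y^2 - 1)
\qquad \implies \qquad
0
= \begin{bmatrix} 1 + 2 \lambda x \\ 1 + 2 \lambda y \\ x^2 + y^2 - 1 \end{bmatrix}
\end{aligned}$$

The first two components give $$x = y = -1 / (2 \lambda)$$, and inserting that into the third yields $$\lambda = \mp 1 / \sqrt{2}$$, so the extrema are $$\pm (1/\sqrt{2}, 1/\sqrt{2})$$ with $$f = \pm \sqrt{2}$$; note that $$\lambda$$ itself drops out of the final answer.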
We often assign $$\lambda$$ an algebraic expression rather than a numeric value,
usually without even bothering to calculate that value explicitly.
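For those who want to experiment, the whole procedure can be sketched in a few lines of SymPy (a minimal sketch; the problem below is just the illustrative example from above, not a general recipe):

```python
# Minimal SymPy sketch of the method of Lagrange multipliers,
# applied to the illustrative problem f = x + y with constraint x^2 + y^2 = 1.
import sympy as sp

x, y, lam = sp.symbols("x y lambda", real=True)

f = x + y                # objective function
g = x**2 + y**2 - 1      # constraint, already rewritten so that g = 0
L = f + lam * g          # the Lagrangian defined above

# Demand stationarity of L with respect to x, y and lambda:
# the lambda-equation is just g = 0, the others give grad(f) = -lambda grad(g).
equations = [sp.diff(L, v) for v in (x, y, lam)]
candidates = sp.solve(equations, [x, y, lam], dict=True)
print(candidates)  # two candidate extrema; classifying them is a separate step
```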