From a8d31faecc733fa4d63fde58ab98a5e9d11029c2 Mon Sep 17 00:00:00 2001
From: Prefetch
Date: Sun, 2 Apr 2023 16:57:12 +0200
Subject: Improve knowledge base

---
 source/know/concept/lagrange-multiplier/index.md | 42 ++++++++++--------------
 1 file changed, 18 insertions(+), 24 deletions(-)
(limited to 'source/know/concept/lagrange-multiplier')

diff --git a/source/know/concept/lagrange-multiplier/index.md b/source/know/concept/lagrange-multiplier/index.md
index 9fb61a8..6b5e3fc 100644
--- a/source/know/concept/lagrange-multiplier/index.md
+++ b/source/know/concept/lagrange-multiplier/index.md
@@ -14,18 +14,18 @@ a function $$f$$ subject to **equality constraints**.
 For example, in 2D, the goal is to
 maximize/minimize $$f(x, y)$$ while satisfying $$g(x, y) = 0$$.
 We assume that $$f$$ and $$g$$ are both continuous
-and have continuous first derivatives,
-and that their domain is all of $$\mathbb{R}$$.
+and have continuous first derivatives
+on all of $$\mathbb{R}^2$$.
 
-Side note: many authors write that Lagrange multipliers
+Note: many authors write that Lagrange multipliers
 can be used for constraints of the form $$g(x, y) = c$$
 for a constant $$c$$.
-However, this method technically requires $$c = 0$$.
-This issue is easy to solve: given $$g = c$$,
+Actually, the method requires $$c = 0$$,
+but this issue is easy to solve: given $$g = c$$,
 simply define $$\tilde{g} \equiv g - c = 0$$
 and use that as constraint instead.
 
-Before introducing $$g$$,
-optimizing $$f$$ comes down to finding its stationary points:
+So, we want to optimize $$f$$.
+If we ignore $$g$$, that just means finding its stationary points:
 
 $$\begin{aligned}
 0
@@ -36,20 +36,18 @@
 This problem is easy: the two dimensions can be handled independently,
 so all we need to do is find the roots of the partial derivatives.
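The unconstrained case described above is easy to sanity-check numerically: a stationary point is wherever both partial derivatives vanish. Below is a minimal Python sketch (not part of the patch); the function $$f(x, y) = (x - 1)^2 + y^2$$ is a made-up example.

```python
# Numerical illustration of unconstrained stationarity.
# f is a hypothetical example function, not from the article.

def f(x, y):
    return (x - 1)**2 + y**2

def grad(fn, x, y, h=1e-6):
    """Central-difference approximation of the gradient (f_x, f_y)."""
    fx = (fn(x + h, y) - fn(x - h, y)) / (2 * h)
    fy = (fn(x, y + h) - fn(x, y - h)) / (2 * h)
    return fx, fy

# Setting the partial derivatives to zero: 2(x - 1) = 0 and 2y = 0,
# so the only stationary point is (1, 0); both components vanish there.
fx, fy = grad(f, 1.0, 0.0)
assert abs(fx) < 1e-6 and abs(fy) < 1e-6
```

Each dimension is handled independently here, exactly as the text says: each root comes from its own one-variable equation.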
-However, adding $$g$$ makes the problem much more complicated:
+However, a constraint $$g = 0$$ makes the problem much more complicated:
 points with $$\nabla f = 0$$ might not satisfy $$g = 0$$,
 and points where $$g = 0$$ might not have $$\nabla f = 0$$.
 The dimensions also cannot be handled independently anymore,
-since they are implicitly related by $$g$$.
+since they are implicitly related via $$g$$.
 
 Imagine a contour plot of $$g(x, y)$$.
 The trick is this: if we follow a contour of $$g = 0$$,
 the highest and lowest values of $$f$$ along the way
 are the desired local extrema.
-Recall our assumption that $$\nabla f$$ is continuous:
-hence *along our contour* $$f$$ is slowly-varying
-in the close vicinity of each such point,
-and stationary at the point itself.
+At each such extremum, $$f$$ must be stationary from the contour's point of view,
+and slowly-varying in its close vicinity since $$\nabla f$$ is continuous.
 
 We thus have two categories of extrema:
 
 1. $$\nabla f = 0$$ there,
@@ -57,7 +55,7 @@ We thus have two categories of extrema:
    In other words, a stationary point of $$f$$
    coincidentally lies on a contour of $$g = 0$$.
 
-2. The contours of $$f$$ and $$g$$ are parallel around the point.
+2. The contours of $$f$$ and $$g$$ are parallel at the point.
    By definition, $$f$$ is stationary along each of its contours,
    so when we find that $$f$$ is stationary at a point on our $$g = 0$$ path,
    it means we touched a contour of $$f$$.
@@ -83,7 +81,7 @@
 Where $$\lambda$$ is the **Lagrange multiplier**
 that quantifies the difference in magnitude between the gradients.
 By setting $$\lambda = 0$$, this equation also handles the 1st category $$\nabla f = 0$$.
-Some authors define $$\lambda$$ with the opposite sign.
+Note that some authors define $$\lambda$$ with the opposite sign.
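The parallel-gradients condition $$\nabla f = -\lambda \nabla g$$ can be checked concretely on a small sketch. The example below is hypothetical (not from the article): $$f(x, y) = x y$$ constrained by $$g(x, y) = x + y - 1 = 0$$, whose constrained maximum lies at $$(1/2, 1/2)$$.

```python
# Verify that the gradients of f and g are parallel at a constrained
# extremum, and extract the Lagrange multiplier. f and g are made up.

def grad_f(x, y):   # f(x, y) = x*y, so grad f = (y, x)
    return (y, x)

def grad_g(x, y):   # g(x, y) = x + y - 1, so grad g = (1, 1)
    return (1.0, 1.0)

x, y = 0.5, 0.5          # the constrained maximum of f on the g = 0 contour
fx, fy = grad_f(x, y)
gx, gy = grad_g(x, y)

# Parallel vectors in 2D: their "cross product" vanishes.
assert abs(fx * gy - fy * gx) < 1e-12

# Solve grad f = -lambda * grad g for the multiplier: here lambda = -1/2.
lam = -fx / gx
assert abs(fy + lam * gy) < 1e-12
```

At this point the contours of $$f$$ (hyperbolas $$x y = \text{const}$$) and of $$g$$ (parallel lines) touch tangentially, which is exactly the 2nd category above.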
 The method of Lagrange multipliers uses these facts
 to rewrite a constrained $$N$$-dimensional optimization problem
@@ -97,8 +95,7 @@ $$\begin{aligned}
 }
 \end{aligned}$$
 
-Let us do an unconstrained optimization of $$\mathcal{L}$$ as usual,
-by demanding it is stationary:
+Look what happens when we do an unconstrained optimization of $$\mathcal{L}$$ in the usual way:
 
 $$\begin{aligned}
 0
@@ -110,14 +107,11 @@ $$\begin{aligned}
 The last item in this vector represents $$g = 0$$,
 and the others $$\nabla f = -\lambda \nabla g$$ as discussed earlier.
 
-To solve this equation,
-we assign $$\lambda$$ a value that agrees with it
-(such a value exists for each local extremum
-according to our above discussion of the two categories),
-and then find the locations $$(x, y)$$ that satisfy it.
-However, as usual for optimization problems,
+When this unconstrained problem is solved using standard methods,
+the resulting solutions also satisfy the constrained problem.
+However, as usual in the field of optimization,
 this method only finds *local* extrema *and* saddle points;
-it is a necessary condition for optimality, but not sufficient.
+it represents a necessary condition for optimality, but not a sufficient one.
 
 We often assign $$\lambda$$ an algebraic expression rather than a value,
 usually without even bothering to calculate its actual value.
-- 
cgit v1.2.3
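To make the rewritten procedure concrete, here is the full Lagrangian method applied to a toy problem (a hypothetical example, not taken from the patch): optimize $$f(x, y) = x y$$ subject to $$g(x, y) = x + y - 1 = 0$$, via $$\mathcal{L}(x, y, \lambda) = x y + \lambda (x + y - 1)$$.

```python
# Unconstrained optimization of the Lagrangian L(x, y, lam) = x*y + lam*(x + y - 1).
# Demanding that L is stationary gives three equations:
#   dL/dx   = y + lam       = 0
#   dL/dy   = x + lam       = 0
#   dL/dlam = x + y - 1     = 0   (the last one recovers the constraint g = 0)
# Subtracting the first two equations gives x = y; the constraint then
# forces x = y = 1/2, and lam = -x = -1/2.

x = y = 0.5
lam = -0.5

assert y + lam == 0          # dL/dx = 0
assert x + lam == 0          # dL/dy = 0
assert x + y - 1 == 0        # the constraint is satisfied automatically

# Note: lam was found as an algebraic expression (lam = -x) along the way;
# its numeric value only drops out at the end, as the text remarks.
```

As the text warns, stationarity of $$\mathcal{L}$$ alone does not distinguish maxima, minima, and saddle points; here one checks separately that $$(1/2, 1/2)$$ is the constrained maximum of $$x y$$.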