From a8d31faecc733fa4d63fde58ab98a5e9d11029c2 Mon Sep 17 00:00:00 2001
From: Prefetch
Date: Sun, 2 Apr 2023 16:57:12 +0200
Subject: Improve knowledge base

---
 source/know/concept/lagrange-multiplier/index.md | 42 ++++++++++--------------
 1 file changed, 18 insertions(+), 24 deletions(-)
(limited to 'source/know/concept/lagrange-multiplier')

diff --git a/source/know/concept/lagrange-multiplier/index.md b/source/know/concept/lagrange-multiplier/index.md
index 9fb61a8..6b5e3fc 100644
--- a/source/know/concept/lagrange-multiplier/index.md
+++ b/source/know/concept/lagrange-multiplier/index.md
@@ -14,18 +14,18 @@ a function $$f$$ subject to **equality constraints**.
 For example, in 2D, the goal is to
 maximize/minimize $$f(x, y)$$ while satisfying $$g(x, y) = 0$$.
 We assume that $$f$$ and $$g$$ are both continuous
-and have continuous first derivatives,
-and that their domain is all of $$\mathbb{R}$$.
+and have continuous first derivatives
+on all of $$\mathbb{R}^2$$.
 
-Side note: many authors write that Lagrange multipliers
+Note: many authors write that Lagrange multipliers
 can be used for constraints of the form $$g(x, y) = c$$
 for a constant $$c$$.
-However, this method technically requires $$c = 0$$.
-This issue is easy to solve: given $$g = c$$,
+Actually, the method requires $$c = 0$$,
+but this issue is easy to solve: given $$g = c$$,
 simply define $$\tilde{g} \equiv g - c = 0$$
 and use that as constraint instead.
 
-Before introducing $$g$$,
-optimizing $$f$$ comes down to finding its stationary points:
+So, we want to optimize $$f$$.
+If we ignore $$g$$, that just means finding its stationary points:
 
 $$\begin{aligned}
 0
@@ -36,20 +36,18 @@
 This problem is easy: the two dimensions can be handled independently,
 so all we need to do is find the roots of the partial derivatives.
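The unconstrained case described above is easy to sanity-check numerically: a stationary point is wherever both partial derivatives vanish. Below is a minimal Python sketch (not part of the patch); the function $$f(x, y) = (x - 1)^2 + y^2$$ is a made-up example.

```python
# Numerical illustration of unconstrained stationarity.
# f is a hypothetical example function, not from the article.

def f(x, y):
    return (x - 1)**2 + y**2

def grad(fn, x, y, h=1e-6):
    """Central-difference approximation of the gradient (f_x, f_y)."""
    fx = (fn(x + h, y) - fn(x - h, y)) / (2 * h)
    fy = (fn(x, y + h) - fn(x, y - h)) / (2 * h)
    return fx, fy

# Setting the partial derivatives to zero: 2(x - 1) = 0 and 2y = 0,
# so the only stationary point is (1, 0); both components vanish there.
fx, fy = grad(f, 1.0, 0.0)
assert abs(fx) < 1e-6 and abs(fy) < 1e-6
```

Each dimension is handled independently here, exactly as the text says: each root comes from its own one-variable equation.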
-However, adding $$g$$ makes the problem much more complicated:
+However, a constraint $$g = 0$$ makes the problem much more complicated:
 points with $$\nabla f = 0$$ might not satisfy $$g = 0$$,
 and points where $$g = 0$$ might not have $$\nabla f = 0$$.
 The dimensions also cannot be handled independently anymore,
-since they are implicitly related by $$g$$.
+since they are implicitly related via $$g$$.
 
 Imagine a contour plot of $$g(x, y)$$.
 The trick is this: if we follow a contour of $$g = 0$$,
 the highest and lowest values of $$f$$ along the way
 are the desired local extrema.
-Recall our assumption that $$\nabla f$$ is continuous:
-hence *along our contour* $$f$$ is slowly-varying
-in the close vicinity of each such point,
-and stationary at the point itself.
+At each such extremum, $$f$$ must be stationary from the contour's point of view,
+and slowly-varying in its close vicinity since $$\nabla f$$ is continuous.
 
 We thus have two categories of extrema:
 
 1. $$\nabla f = 0$$ there,
@@ -57,7 +55,7 @@ We thus have two categories of extrema:
    In other words, a stationary point of $$f$$
    coincidentally lies on a contour of $$g = 0$$.
 
-2. The contours of $$f$$ and $$g$$ are parallel around the point.
+2. The contours of $$f$$ and $$g$$ are parallel at the point.
    By definition, $$f$$ is stationary along each of its contours,
    so when we find that $$f$$ is stationary at a point on our $$g = 0$$ path,
    it means we touched a contour of $$f$$.
@@ -83,7 +81,7 @@
 Where $$\lambda$$ is the **Lagrange multiplier**
 that quantifies the difference in magnitude between the gradients.
 By setting $$\lambda = 0$$, this equation also handles the 1st category $$\nabla f = 0$$.
-Some authors define $$\lambda$$ with the opposite sign.
+Note that some authors define $$\lambda$$ with the opposite sign.
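The parallel-gradients condition $$\nabla f = -\lambda \nabla g$$ can be checked concretely on a small sketch. The example below is hypothetical (not from the article): $$f(x, y) = x y$$ constrained by $$g(x, y) = x + y - 1 = 0$$, whose constrained maximum lies at $$(1/2, 1/2)$$.

```python
# Verify that the gradients of f and g are parallel at a constrained
# extremum, and extract the Lagrange multiplier. f and g are made up.

def grad_f(x, y):   # f(x, y) = x*y, so grad f = (y, x)
    return (y, x)

def grad_g(x, y):   # g(x, y) = x + y - 1, so grad g = (1, 1)
    return (1.0, 1.0)

x, y = 0.5, 0.5          # the constrained maximum of f on the g = 0 contour
fx, fy = grad_f(x, y)
gx, gy = grad_g(x, y)

# Parallel vectors in 2D: their "cross product" vanishes.
assert abs(fx * gy - fy * gx) < 1e-12

# Solve grad f = -lambda * grad g for the multiplier: here lambda = -1/2.
lam = -fx / gx
assert abs(fy + lam * gy) < 1e-12
```

At this point the contours of $$f$$ (hyperbolas $$x y = \text{const}$$) and of $$g$$ (parallel lines) touch tangentially, which is exactly the 2nd category above.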
 The method of Lagrange multipliers uses these facts
 to rewrite a constrained $$N$$-dimensional optimization problem
@@ -97,8 +95,7 @@ $$\begin{aligned}
 }
 \end{aligned}$$
 
-Let us do an unconstrained optimization of $$\mathcal{L}$$ as usual,
-by demanding it is stationary:
+Look what happens when we do an unconstrained optimization of $$\mathcal{L}$$ in the usual way:
 
 $$\begin{aligned}
 0
@@ -110,14 +107,11 @@ $$\begin{aligned}
 The last item in this vector represents $$g = 0$$,
 and the others $$\nabla f = -\lambda \nabla g$$ as discussed earlier.
 
-To solve this equation,
-we assign $$\lambda$$ a value that agrees with it
-(such a value exists for each local extremum
-according to our above discussion of the two categories),
-and then find the locations $$(x, y)$$ that satisfy it.
-However, as usual for optimization problems,
+When this unconstrained problem is solved using standard methods,
+the resulting solutions also satisfy the constrained problem.
+However, as usual in the field of optimization,
 this method only finds *local* extrema *and* saddle points;
-it is a necessary condition for optimality, but not sufficient.
+it represents a necessary condition for optimality, but not a sufficient one.
 
 We often assign $$\lambda$$ an algebraic expression rather than a value,
 usually without even bothering to calculate its actual value.
-- 
cgit v1.2.3
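To make the rewritten procedure concrete, here is the full Lagrangian method applied to a toy problem (a hypothetical example, not taken from the patch): optimize $$f(x, y) = x y$$ subject to $$g(x, y) = x + y - 1 = 0$$, via $$\mathcal{L}(x, y, \lambda) = x y + \lambda (x + y - 1)$$.

```python
# Unconstrained optimization of the Lagrangian L(x, y, lam) = x*y + lam*(x + y - 1).
# Demanding that L is stationary gives three equations:
#   dL/dx   = y + lam       = 0
#   dL/dy   = x + lam       = 0
#   dL/dlam = x + y - 1     = 0   (the last one recovers the constraint g = 0)
# Subtracting the first two equations gives x = y; the constraint then
# forces x = y = 1/2, and lam = -x = -1/2.

x = y = 0.5
lam = -0.5

assert y + lam == 0          # dL/dx = 0
assert x + lam == 0          # dL/dy = 0
assert x + y - 1 == 0        # the constraint is satisfied automatically

# Note: lam was found as an algebraic expression (lam = -x) along the way;
# its numeric value only drops out at the end, as the text remarks.
```

As the text warns, stationarity of $$\mathcal{L}$$ alone does not distinguish maxima, minima, and saddle points; here one checks separately that $$(1/2, 1/2)$$ is the constrained maximum of $$x y$$.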