2.8: Optimization

\(\newcommand{\R}{\mathbb R }\)


  1. Lagrange multipliers for constrained optimization
  2. More optimization problems
  3. Problems


Lagrange multipliers for constrained optimization

If I had been rich, I probably would not have devoted myself to mathematics. - Joseph-Louis Lagrange

Consider the problem \[\begin{equation} \left\{\begin{array}{r} \text{minimize/maximize }\ \ \ f(\mathbf x)\qquad \\ \text{ subject to the constraint: }\ \ g(\mathbf x)=0. \end{array}\right. \label{con1}\end{equation}\]

A point \(\mathbf x\) is said to be a local maximum of \(f\) subject to the constraint \(g= 0\) if \[ \exists r>0\text{ such that }f(\mathbf x)\ge f(\mathbf x')\text{ for all }\mathbf x' \text{ such that } |\mathbf x'-\mathbf x| < r \text{ and }g(\mathbf x')=0. \] The definition of a local minimum subject to the constraint is the same, with \(\ge\) changed to \(\le\).

This is the same as saying that \[ \exists \text{ an open set }U\text{ containing }\mathbf x,\text{ and such that }f(\mathbf x)\ge f(\mathbf x')\text{ for all }\mathbf x'\in U \cap g^{-1}(0). \]

Suppose that \(f\) and \(g\) are functions \(S\to \R\) of class \(C^1\), where \(S\) is an open subset of \(\R^n\). If \(\mathbf x\) is a local minimum point or local maximum point of \(f\) subject to the constraint \(g=0\), and if \(\nabla g(\mathbf x) \ne {\bf 0}\), then there exists \(\lambda\in \R\) such that the following system of equations is satisfied by \(\mathbf x\) and \(\lambda\): \[\begin{equation} \left\{ \begin{array} {r} \nabla f(\mathbf x) +\lambda \nabla g(\mathbf x) = \bf0,\\ g(\mathbf x) = 0. \end{array} \right. \label{lm1}\end{equation}\]

These equations say that at any local min/max point \(\mathbf x\) satisfying \(g(\mathbf x)=0\), the gradients of \(f\) and \(g\) must be linearly dependent vectors, and the real number \(\lambda\) expresses the dependence. This variable \(\lambda\) is called the Lagrange multiplier, after Joseph-Louis Lagrange (Italy-France, 1736-1813). Lagrange began studying mathematics for its applications in physics, and then proved results like this one in calculus to extend his understanding. He also helped develop the metric system, and proved that every positive integer is a sum of \(4\) squares of integers.

If \(\mathbf x\) is any solution of \(\eqref{con1}\) at which \(\nabla g(\mathbf x)\neq \mathbf 0\), then the equations \(\eqref{lm1}\) hold. Note that if \(\nabla g(\mathbf x)={\bf 0}\), then the gradients are automatically linearly dependent, but the equations \(\eqref{lm1}\) cannot hold unless \(\nabla f(\mathbf x)=\mathbf 0\) as well. Try to give an example of this.

You will prove this theorem in a later homework assignment.

For many constrained optimization problems, it is the case that \[\begin{equation}\label{nondeg} \nabla g \ne {\bf 0} \text{ on the set } \{ \mathbf x\in \R^n : g(\mathbf x) = 0\}. \end{equation}\] If this holds, then every local minimum or maximum point satisfies \(\eqref{lm1}\).

Note that \(\eqref{lm1}\) is a system of \(n+1\) equations \[ \left\{ \begin{array} {r} \partial_1 f(\mathbf x) +\lambda \partial_1 g(\mathbf x) = 0,\\ \partial_2 f(\mathbf x) +\lambda \partial_2 g(\mathbf x) = 0,\\ \vdots\qquad\\ \partial_n f(\mathbf x) +\lambda \partial_n g(\mathbf x) = 0,\\ g(\mathbf x) = 0. \end{array} \right. \] with \(n+1\) unknowns, the \(n\) components \(x_1, \ldots, x_n\) of \(\mathbf x\) and \(\lambda\).
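
To see how one might set this system up on a computer (anticipating the Remark later in this section), here is a minimal symbolic sketch in Python using the sympy library. The choice \(f(x,y)=x+y\) with the unit-circle constraint is our own illustration, not an example from the text.

```python
import sympy as sp

# Hypothetical example (our own choice): f(x, y) = x + y on the unit circle.
x, y, lam = sp.symbols('x y lam', real=True)
f = x + y
g = x**2 + y**2 - 1

# The n + 1 Lagrange multiplier equations: grad f + lam * grad g = 0, and g = 0.
variables = [x, y]
equations = [sp.diff(f, v) + lam * sp.diff(g, v) for v in variables] + [g]

# Solve for the n components of x together with lam.
for sol in sp.solve(equations, variables + [lam], dict=True):
    print(sol, '  f =', f.subs(sol))
# Expected output: x = y = ±1/sqrt(2), with lam = ∓1/sqrt(2).
```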

Example 1.

Consider the problem \[ \left. \begin{array}{rl} \text{minimize/maximize}\ \ &f(x,y) = y\\ \text{ subject to the constraint:}\ \ &g(x,y) = 0 \end{array}\right. \] where the set \(S = \{ (x,y) : g(x,y) = 0\}\) is the curve shown in the picture below. We suppose that \(g\) satisfies condition \(\eqref{nondeg}\).

[Figure: the constraint curve \(S = \{ (x,y) : g(x,y) = 0\}\)]

We can informally think of this problem as asking us to find the “northernmost” point on the curve (maximum of \(f(x,y) = y\)) and the “southernmost” point on the curve (minimum of \(f(x,y)= y\).) It is easy to see where these are.

What do the Lagrange multiplier equations say? Clearly, \(\nabla f = \binom 0 1\) everywhere. These vectors are pictured in green below.

[Figure: the curve \(S\), with the constant vector field \(\nabla f = \binom 01\) shown in green]

So the Lagrange multiplier equations say \[\begin{align*} \binom 0 1 + \lambda \nabla g(\mathbf x) &=0 \\ g(\mathbf x) &= 0. \end{align*}\] The first equation cannot be satisfied if \(\lambda = 0\). Therefore we can divide by \(\lambda\), and the first equation says that \(\nabla g(\mathbf x)\) is a multiple of the vector \(\binom 01\).

To understand what this means, recall the important fact that \(\nabla g(\mathbf x)\) is orthogonal to the level set of \(g\) passing through \(\mathbf x.\) The directions of \(\nabla g\) at points on the curve \(S\) are shown in red in the picture below:

[Figure: the curve \(S\), with the directions of \(\nabla g\) along the curve shown in red]

In the picture, \(g\) increases as one moves from the interior to the exterior of the curve – otherwise the red arrows would all point in the opposite direction. This does not affect the points at which \(\nabla g\) is a scalar multiple of \(\binom 01\).

So the Lagrange multiplier equations are satisfied at points where the red vector (the normal to the curve) is parallel to the green vector. This is the same as saying that the level set of \(g\) (i.e. the curve where the constraint is satisfied) is orthogonal to \(\binom 01\). These points are shown in yellow in the picture below. The northernmost and southernmost points are the maximum and minimum, respectively.

[Figure: the curve \(S\), with the points where \(\nabla g\) is parallel to \(\binom 01\) marked in yellow]

From this example, we can understand more generally the “meaning” of the Lagrange multiplier equations, and we can also understand why the theorem makes sense. The equations imply for example that at the northernmost point on the curve, the curve is oriented exactly in an east-west direction. This is intuitively clear, since at any point where the curve is not oriented exactly east-west, we could get a little farther to the north by moving in one or the other direction along the curve.

Generally, the same considerations apply to any \(C^1\) functions \(f\) and \(g\). That is,
\[ \boxed{\begin{array}{l} \text{the Lagrange multiplier equations}\ \ \Longrightarrow \\ \hspace{5em} \nabla f(\mathbf x) \text{ is orthogonal at }\mathbf x\text{ to the set of points satisfying the constraint.} \end{array}} \] You may be able to convince yourself that if \(\nabla f\) is not orthogonal to the level set of \(g\), then it should be possible to move a little bit, in a way that makes \(f\) larger or smaller but continues to satisfy the constraint. The underlying idea is exactly the same as in Example 1.

Example 2.

Solve the problem \[ \left. \begin{array}{rl} \text{minimize/maximize}\ \ &f(x,y) = x(1-y^2)\\ \text{ subject to the constraint:}\ \ &g(x,y) = x^2+ y^2 - 1 = 0. \end{array}\right. \]

Solution. First, note that \(f\) is continuous, and the set defined by the constraint is compact. Thus, the Extreme Value Theorem guarantees that the min and max are achieved. Also, \(\nabla g(x,y) = (2x, 2y)\). Thus \(\nabla g = {\bf 0}\) only at the origin, which does not satisfy the constraint, so condition \(\eqref{nondeg}\) holds. Thus the Lagrange multiplier equations are guaranteed to be satisfied at the points where the min and max occur.

The Lagrange multiplier equations are \[\begin{align} (1-y^2) + \lambda2x &= 0 \nonumber \\ -2xy+ \lambda 2y &=0 \nonumber \\ x^2+y^2&=1.\nonumber \end{align}\] The middle equation states that \(2 y (\lambda-x) = 0\). This can only hold if \(y=0,\) or \(x=\lambda\). We consider both cases:

Case 1. \(x = \lambda\). Then the first equation implies that \(1-y^2 +2x^2=0\). Since the constraint gives \(y^2 = 1-x^2\), this reduces to \(3x^2=0\), so \(x=0\). \[ \text{solutions of the Lagrange multiplier equations: } \lambda = 0, \mathbf x = (0,\pm 1). \]

Case 2. \(y=0\). Then the constraint implies that \(x=\pm 1\). We can then solve the first equation for \(\lambda\) (if we care). This leads to \[ \text{solutions of the Lagrange multiplier equations: } \mathbf x = (\pm 1, 0), \ \lambda = \mp \frac 12. \]

Overall, the candidate solutions of the extreme value problem are \((\pm1, 0)\) and \((0,\pm 1)\), and this set of points must include the points where the min and max are attained. By evaluating \(f\) at these points, we find that the minimum is \(-1\), and the only global minimum point is \((-1,0)\), while the maximum is \(1\), and the only global maximum point is \((1,0)\).
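
As a sanity check, here is a short sketch (Python, assuming the sympy library is available) that solves the same Lagrange multiplier system symbolically and recovers the four candidate points.

```python
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f = x * (1 - y**2)
g = x**2 + y**2 - 1

eqs = [sp.diff(f, x) + lam * sp.diff(g, x),  # (1 - y^2) + 2*lam*x = 0
       sp.diff(f, y) + lam * sp.diff(g, y),  # -2*x*y + 2*lam*y = 0
       g]                                    # x^2 + y^2 - 1 = 0

for sol in sp.solve(eqs, [x, y, lam], dict=True):
    print(sol, '  f =', f.subs(sol))
# Expected candidates: (0, ±1) with f = 0, and (±1, 0) with f = ±1.
```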

Example 3.

Solve the problem \[ \left. \begin{array}{rl} \text{minimize/maximize}\ \ &x(1-y^2)\\ \text{ subject to the constraint:}\ \ &(x^2+ y^2 - 1)^2 = 0. \end{array}\right. \]

Solution? Note that this is exactly the same problem as in Example 2, since we are considering the same function \(f\), and in both cases the constraint says that we are minimizing/maximizing over the unit circle \[ \{ (x,y)\in \R^2 : x^2+y^2=1 \}. \] The only difference is that in this version of the problem, the constraint is given in a complicated way. Let’s see what happens when we try to solve it: The Lagrange multiplier equations are \[\begin{align*} (1-y^2) + \lambda 4x(x^2+y^2-1) &= 0 \\ -2xy+ \lambda 4y(x^2+y^2-1) &=0 \\ x^2+y^2&=1. \end{align*}\]

Now, using the last equation, we can simplify the first two equations, rewriting them as \[\begin{align*} (1-y^2) &= 0 \\ -2xy&=0. \end{align*}\] The only solutions are \(\mathbf x = (0, \pm 1)\), and \(\lambda\) can be any real number. But \(f(\mathbf x) = 0\) at \(\mathbf x = (0,\pm 1)\), and \(0\) is neither the maximum nor the minimum value of \(f\) on the unit circle.

Thus, in this example, the Lagrange multiplier method does not work. The problem is that we have written the constraint in a silly way. Indeed, for the function \(g(\mathbf x) = (x^2+y^2-1)^2\), we can check that \(\nabla g(\mathbf x) = 0\) at every point where \(g(\mathbf x) = 0\), so condition \(\eqref{nondeg}\) is violated.

The next example is challenging.

Example 4.

Consider the problem \[ \left\{\begin{array}{r} \text{minimize }\ \ \ \frac{x+y}{1+x^2+y^2} \qquad \\ \text{ subject to the constraint: }\ \ x^2 + y^2 - R^2 = 0. \end{array}\right. \] This problem depends on \(R\); we will solve it for every possible choice of \(R\).

Solution. Let’s call the function we are minimizing “\(f\)”. We have already seen this function in Example 2 of Section 2.7, where we computed its derivatives and found its critical points. There we found that \[ \partial_x f(x,y) = \frac{1 - x^2 - 2xy +y^2 }{(1+x^2+y^2)^2}, \qquad \partial_y f(x,y) = \frac{1 + x^2 - 2xy -y^2 }{(1+x^2+y^2)^2}. \] Using this, the Lagrange multiplier equations are \[\begin{align*} \frac{1 - x^2 - 2xy +y^2 }{(1+x^2+y^2)^2}+ 2\lambda x &=0 \\ \frac{1 + x^2 - 2xy -y^2 }{(1+x^2+y^2)^2}+2\lambda y &= 0 \\ x^2+y^2 &= R^2 \end{align*}\] Using the third equation, we can rewrite the first two equations as \[\begin{align} 1 - x^2 - 2xy +y^2 + 2\lambda(1+R^2)^2 x &=0 \nonumber \\ 1 + x^2 - 2xy -y^2 +2\lambda (1+R^2)^2 y &= 0. \nonumber \end{align}\] Let’s simplify things by writing \(K = \lambda (1+R^2)^2\). Then by subtracting the second equation above from the first and rearranging, we get \[\begin{equation}\label{h1}0 = y^2 - x^2 - K(y-x) = (y+x-K)(y-x). \end{equation}\] On the other hand, by adding the two equations and rearranging, we get \[\begin{equation}\label{h2} K(x+y) - 2xy + 1 = 0. \end{equation}\] Now we can solve the system \(\eqref{h1}\), \(\eqref{h2}\): clearly, \(\eqref{h1}\) is satisfied if either \(x=y\) or \(x+y=K\).

If \(x=y\), then the constraint implies that \((x,y) = \pm\left(\frac R{\sqrt 2},\frac R{\sqrt 2}\right)\). If instead \(x+y=K\), then \(\eqref{h2}\) becomes \(K^2 - 2xy + 1 = 0\); but \(K^2 = (x+y)^2 = R^2 + 2xy\), so substituting gives \(R^2 + 1 = 0\), which is impossible. Thus the only candidates are \(\pm\left(\frac R{\sqrt 2},\frac R{\sqrt 2}\right)\). By evaluating \(f\) at both points, we conclude that \((x,y) = \left(-\frac R{\sqrt 2},-\frac R{\sqrt 2}\right)\) minimizes \(f\) with the constraint \(g(\mathbf x)=0\), and \(f\left(-\frac R{\sqrt 2},-\frac R{\sqrt 2}\right) = \frac {-\sqrt 2 R}{1+R^2}\).
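
Since the algebra above is delicate, it is reassuring to check the answer numerically. The sketch below uses Python with scipy (our choice of tool); the sample value \(R=2\) and the starting point are arbitrary.

```python
import numpy as np
from scipy.optimize import minimize

R = 2.0  # arbitrary sample value

def f(p):
    x, y = p
    return (x + y) / (1 + x**2 + y**2)

# Equality constraint: x^2 + y^2 - R^2 = 0.
cons = [{'type': 'eq', 'fun': lambda p: p[0]**2 + p[1]**2 - R**2}]

# Starting point chosen on the constraint circle; for harder problems,
# one should try several starting points.
res = minimize(f, x0=[R, 0.0], method='SLSQP', constraints=cons)
print(res.x)                                  # approx (-R/sqrt 2, -R/sqrt 2)
print(res.fun, -np.sqrt(2) * R / (1 + R**2))  # both approx -0.5657
```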

Rewriting constraints as a domain of definition

Sometimes constrained minimization problems can be reduced to unconstrained problems by parametrizing the set of points where the constraint is satisfied. Instead of minimizing the function \(f:S\to \R\) with a constraint \(g(\mathbf x)=0\), we define the set \(T=\{\mathbf x : g(\mathbf x)=0\}\), and minimize the restriction of \(f\) to the subset \(T\subseteq S\). We illustrate this with an example.

Example 4 revisited.

Solution! Note that a point satisfies \(x^2+y^2 = R^2\) if and only if it can be written in the form \((R\cos\theta, R\sin \theta)\) for some \(\theta\in [0,2\pi)\).

So the problem of minimizing \(f\) subject to the constraint \(x^2+y^2 = R^2\) is equivalent to the problem of minimizing \(f(R\cos\theta, R\sin \theta)\) over all \(\theta\in [0,2\pi)\). Also, from the formula for \(f\), it is clear that \[ f(R\cos\theta, R\sin \theta) = \frac{R}{1+R^2} (\cos\theta+\sin \theta). \] Now we can apply single-variable optimization: the derivative with respect to \(\theta\) is \(0\) exactly when \(\sin\theta=\cos\theta\), that is, at \(\theta=\frac \pi 4\) and \(\theta=\frac{5\pi}4\). By finding the values of \(x,y\) and \(f\) at these two angles, we recover the same global minimum as in the previous solution.
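
This one-variable reduction is also easy to check numerically; the following Python sketch (again with \(R=2\) as an arbitrary sample value) scans \(\theta\) on a fine grid.

```python
import numpy as np

R = 2.0  # arbitrary sample value
theta = np.linspace(0.0, 2.0 * np.pi, 100001)
values = R / (1 + R**2) * (np.cos(theta) + np.sin(theta))

i = values.argmin()
print(theta[i], 5 * np.pi / 4)                  # minimizing angle, approx 5*pi/4
print(values[i], -np.sqrt(2) * R / (1 + R**2))  # both approx -0.5657
```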

Remark

Applications often involve high-dimensional problems, and the set of points satisfying the constraint may be very difficult to parametrize. Hence, the Lagrange multiplier technique is used more often. If you are programming a computer to solve the problem for you, Lagrange multipliers are straightforward to program.

Even if you are solving a problem with pencil and paper, for problems in \(3\) or more dimensions, it can be awkward to parametrize the constraint set, and therefore easier to use Lagrange multipliers.

Lagrange multipliers have a lot of theoretical power. For example, they can be used to prove that real symmetric matrices are diagonalizable.

Problems with multiple constraints

One can also use the Lagrange multiplier method to address problems with more than one constraint. We will write it down for problems with \(2\) constraints, which have the form \[\begin{equation}\label{mt1c} \left\{\begin{array}{r} \text{minimize/maximize} \ \ f(\mathbf x),\qquad \\ \text{ subject to the constraints: }\ \ g_1(\mathbf x)=0\ \\ \text{ and }\ \ g_2(\mathbf x)=0. \end{array}\right. \end{equation}\] (It will be clear how to generalize it to problems with \(k\) constraints, if one wishes to do so.)

As with \(\eqref{nondeg}\) in the case of a single constraint, there is an annoying condition that limits the applicability of the method, see \(\eqref{nd2}\) below. In many problems one does not need to worry about it.

Suppose that \(f, g_1\) and \(g_2\) are functions \(\R^n\to \R\) of class \(C^1\). Suppose also that \[\begin{equation}\label{nd2} \{\nabla g_1(\mathbf x), \nabla g_2(\mathbf x)\}\text{ are linearly independent at all $\mathbf x$ where $g_1(\mathbf x) = g_2(\mathbf x) = 0$.} \end{equation}\] Then if \(\mathbf x\) is any solution of \(\eqref{mt1c}\), there exist \(\lambda_1,\lambda_2\in \R\) such that the following system of equations is satisfied by \(\mathbf x, \lambda_1\) and \(\lambda_2\): \[\begin{equation} \left\{ \begin{array} {r} \nabla f(\mathbf x) +\lambda_1 \nabla g_1(\mathbf x)+\lambda_2 \nabla g_2(\mathbf x) = \bf0,\\ g_1(\mathbf x) = 0\ \\ g_2(\mathbf x) = 0. \end{array} \right. \label{lm1a}\end{equation}\]

Note that the Lagrange multiplier equations \(\eqref{lm1a}\) are now a system of \(n+2\) equations, with \(n+2\) unknowns, \(x_1,\ldots, x_n\) and \(\lambda_1,\lambda_2\).

Example 5.

\[ \left\{\begin{array}{r} \text{minimize } \ \ \ \quad \ xy+ xz+yz \\ \text{ subject to the constraints: }\ \ x^2+y^2+z^2 = 2\\ \text{ and }\qquad\qquad\ \ \ \ \ z = 1. \end{array}\right. \]

Solution. The set defined by the constraints is compact, and \(\{ \nabla g_1(\mathbf x), \nabla g_2(\mathbf x)\}\) are linearly independent except when \(x=y=0\). No point with \(x=y=0\) satisfies both constraints (it would require \(z^2=2\) and \(z=1\)), so the gradients are linearly independent at every point where the constraints are satisfied. Thus a minimum point is guaranteed to exist, and the Lagrange multiplier equations are guaranteed to hold at the minimum point.

The Lagrange multiplier equations are: \[\begin{align} \left(\begin{array}{c} y+z \\ x+z \\ x+y \end{array} \right) +\lambda_1 \left(\begin{array}{c} 2x\\2y\\2z\end{array} \right) +\lambda_2 \left(\begin{array}{c} 0\\0\\1\end{array} \right) &= \left(\begin{array}{c} 0\\0\\0\end{array} \right) \\ x^2+y^2+z^2 &=2\\ z&=1. \end{align}\] This is a system of 5 equations with 5 unknowns. Fortunately, the equation \(z=1\) is trivial and allows us to eliminate \(z\) from the other equations, which then become a system of 4 equations and 4 unknowns: \[\begin{align} y+1+2\lambda_1 x &=0,\nonumber \\ x+1+2\lambda_1 y &=0.\nonumber \\ x+y+2\lambda_1+\lambda_2 &=0.\nonumber \\ x^2+y^2&=1\nonumber \end{align}\] Note that \(\lambda_2\) appears only in the third equation, so if the other three equations determine the other three unknowns \(x,y,\lambda_1\), then the third equation can always be satisfied by choosing \(\lambda_2 = -x-y-2\lambda_1\). So we can focus on the system \[\begin{align} y+1+2\lambda_1 x &=0,\nonumber \\ x+1+2\lambda_1 y &=0.\nonumber \\ x^2+y^2&=1\nonumber \end{align}\]

We multiply the second equation by \(2\lambda_1\) and use the first equation to eliminate \(x\), yielding \[ -1-y + 4\lambda_1^2 y = -2\lambda_1, \] which we can rewrite as \[ 0 = (4\lambda_1^2-1)y + (2\lambda_1-1) = (2\lambda_1-1)\bigg( (2\lambda_1+1)y+1\bigg). \]

This says that \[ \text{EITHER }\ \ \lambda_1=\frac 12 \qquad\text{ OR}\ \ (2\lambda_1+1)y = -1\qquad \] We consider both cases. If \(\lambda_1= \frac 12\), then the equations become \[ x+y+1 = 0, \qquad x^2+y^2 = 1. \] This has two solutions, \((x,y) = (-1,0)\) or \((x,y) = (0,-1)\).

If \(\lambda_1\ne \frac 12\), then \(y= -1/(2\lambda_1+1)\). Also, one could go through exactly the same argument as above, with the roles of \(x\) and \(y\) reversed, to find that \[ x=-1/(2\lambda_1+1) \qquad\text{ also}. \] Thus \(x=y\) in this case, and it follows from the constraint that the only solutions of the Lagrange multiplier equations are \((x,y) = \pm \left(\frac 1{\sqrt 2}, \frac 1 {\sqrt 2}\right)\).

In summary, the solutions of the Lagrange multiplier equations are:

  • \((x,y,z) = (-1,0,1)\) and \((0,-1,1)\), where \(f = -1\);
  • \((x,y,z) = \left(\tfrac 1{\sqrt 2}, \tfrac 1{\sqrt 2}, 1\right)\), where \(f = \tfrac 12 + \sqrt 2\);
  • \((x,y,z) = \left(-\tfrac 1{\sqrt 2}, -\tfrac 1{\sqrt 2}, 1\right)\), where \(f = \tfrac 12 - \sqrt 2\).

So we can see that the minimum value is \(-1\), and it occurs at \((-1,0,1)\) and \((0,-1,1)\).
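
For comparison, here is a sympy sketch (Python; our own verification, not part of the original solution) that solves the full system of \(5\) equations in \(5\) unknowns directly.

```python
import sympy as sp

x, y, z, l1, l2 = sp.symbols('x y z l1 l2', real=True)
f = x*y + x*z + y*z
g1 = x**2 + y**2 + z**2 - 2
g2 = z - 1

# grad f + l1 * grad g1 + l2 * grad g2 = 0, plus the two constraints:
eqs = [sp.diff(f, v) + l1 * sp.diff(g1, v) + l2 * sp.diff(g2, v)
       for v in (x, y, z)]
eqs += [g1, g2]

for sol in sp.solve(eqs, [x, y, z, l1, l2], dict=True):
    print({x: sol[x], y: sol[y], z: sol[z]}, '  f =', sp.simplify(f.subs(sol)))
# Expected candidates: (-1,0,1) and (0,-1,1) with f = -1,
# and (±1/sqrt 2, ±1/sqrt 2, 1) with f = 1/2 ± sqrt 2.
```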

More optimization problems

Minimizing or maximizing a function in an open set

Suppose that \(S\) is an open subset of \(\R^n\), and that \(f:S\to \R\) is continuous, and maybe \(C^1\) or \(C^2\). Consider the problem of minimizing \(f\) in \(S\).

  1. There may be no solution, depending on \(f\) and \(S\).

  2. We can guarantee there is a solution by showing the minimum must be contained in a compact set. See Example 3 and the problems in Section 1.4.

We know that a maximum/minimum point in an open set is always a critical point, so we can find all candidates by finding all of the critical points in the open set. We can also try to classify the critical points (determine which are local max/local mins) to get more information.
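
For instance, the following Python sketch (using sympy; the function \(f(x,y)=x^4+y^4-4xy\) is an arbitrary illustration, not an example from the text) finds all critical points and classifies them by the eigenvalues of the Hessian.

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**4 + y**4 - 4*x*y  # arbitrary illustrative function

grad = [sp.diff(f, v) for v in (x, y)]
H = sp.hessian(f, (x, y))

for sol in sp.solve(grad, [x, y], dict=True):
    eigs = list(H.subs(sol).eigenvals())
    if all(e > 0 for e in eigs):
        kind = 'local minimum'
    elif all(e < 0 for e in eigs):
        kind = 'local maximum'
    else:
        kind = 'saddle or degenerate'
    print(sol, kind, '  f =', f.subs(sol))
# Expected: (0,0) is a saddle; (1,1) and (-1,-1) are local minima with f = -2.
```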

Inequality constraints

Finally, consider the problem \[\begin{equation}\label{ineqc} \left\{\begin{array}{r} \text{minimize/maximize }\ \ \ f(\mathbf x)\qquad \\ \text{ subject to the constraint: }\ \ g(\mathbf x)\le 0. \end{array}\right. \end{equation}\] where we suppose that \(g\) is \(C^2\), say, and that \(\nabla g(\mathbf x) \ne {\bf 0}\) on the set \(\{ \mathbf x\in \R^n : g(\mathbf x)=0\}\).

We can reduce this to problems we already know how to solve.

For problems such as \(\eqref{ineqc}\), the Extreme Value Theorem may guarantee that the problem has a solution (this depends on \(f\) and \(g\)). How can we find the minimum and maximum points? There are exactly 2 cases:

Case 1. The max or min occurs in the set \(\{ \mathbf x \in \R^n : g(\mathbf x)<0\}\).

Then it is a critical point, which we know how to find and classify.

Case 2. The max or min occurs in the set \(\{ \mathbf x \in \R^n : g(\mathbf x)= 0\}\).

Then we can find it by the Lagrange multipler technique.

Since one of these two cases must hold if the min or max is attained, we can:

  1. Find all critical points of \(f\) in \(\{ \mathbf x \in \R^n : g(\mathbf x)<0\}\). Find the max or min of \(f\) among these critical points.
  2. Use the Lagrange multiplier technique to find the max or min of \(f\) with the constraint \(g(\mathbf x)= 0\).
  3. Choose the smallest / largest value of \(f\) (and the point where that value is attained) from among all the candidates found in steps 1 and 2. (A sketch implementing these steps appears below.)
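
Here is a sketch of the three-step procedure in Python with sympy. The objective \(f(x,y) = (x-1)^2+y^2\) and constraint \(g(x,y) = x^2+y^2-4 \le 0\) are our own illustrative choices.

```python
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f = (x - 1)**2 + y**2   # illustrative objective (our own choice)
g = x**2 + y**2 - 4     # illustrative constraint g <= 0 (a disk of radius 2)

# Step 1: critical points of f in the open set {g < 0}.
interior = [s for s in sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)
            if g.subs(s) < 0]

# Step 2: Lagrange multiplier candidates on the boundary {g = 0}.
boundary = sp.solve([sp.diff(f, x) + lam * sp.diff(g, x),
                     sp.diff(f, y) + lam * sp.diff(g, y),
                     g], [x, y, lam], dict=True)

# Step 3: compare the values of f at all candidates.
for s in interior + boundary:
    print({x: s[x], y: s[y]}, '  f =', f.subs({x: s[x], y: s[y]}))
# Expected: the interior critical point (1, 0) gives the minimum f = 0;
# the boundary candidates (±2, 0) give f = 1 and f = 9.
```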

This situation is analogous to minimization problems in single variable calculus, with a function defined on a closed interval \([a,b]\). You had to consider two cases:

Case 1. The min occurs in the interior. To address this possibility, you can find all critical points in the interior \((a,b)\), and if you like you can also use a second derivative test to get more information.

Case 2. The minimum occurs at the boundary, i.e. in the set \(\{a,b\}\). As in the multi-variable case, this requires separate consideration. It is easier for functions of a single variable, because we only have to worry about the two points \(a\) and \(b\). For functions of several variables this is where we need Lagrange multipliers (or some other technique).

There are more sophisticated ways of solving problems of this type, or problems with more than one inequality constraint, but we will not discuss them in this class.

Example 6.

Modify Example 1 (minimize/maximize \(f(x,y) = y\), with \(g\) described by the picture above), but with the constraint changed to \(g(x,y)\le 0\), where we suppose that \(g<0\) in the region enclosed by the curve, and \(g>0\) outside the curve.

Solution. Here \(\nabla f = (0,1)\) everywhere, so there are no critical points of \(f\) in the set where \(g<0\). Thus this problem has the same solutions as Example 1.

Example 7.

Consider the modification of Example 4: \[ \left\{\begin{array}{r} \text{minimize }\ \ \ \frac{x+y}{1+x^2+y^2} \qquad \\ \text{ subject to the constraint: }\ \ x^2 + y^2 - R^2 \le 0. \end{array}\right. \] This problem depends on \(R\); we will solve it for every possible choice of \(R\).

Solution. We follow the procedure discussed above.

Step 1. Find all critical points \((x,y)\) such that \(x^2+y^2 < R^2\).

In fact, we already found all critical points (on all of \(\R^2\)) for this function in Example 2 of Section 2.7. There we found that the only critical points are \[ (x,y) = \pm \left( \sqrt{1/2}, \sqrt{1/2} \right). \] These satisfy \(x^2+y^2 < R^2\) if and only if \(R>1\).

Step 2. Solve the Lagrange multiplier equations for the equality constraint. We have already done this in Example 4 above. There we found that the solutions are \(\left(\frac R{\sqrt 2},\frac R{\sqrt 2}\right)\) (the maximum point) and \(\left(-\frac R{\sqrt 2},-\frac R{\sqrt 2}\right)\) (the minimum point).

Step 3. Combine the candidate solutions from Steps 1 and 2. We see that there are two cases (a numerical cross-check appears after this list):

  1. If \(R\le 1\), then there are no interior critical points, and so the minimizer with the inequality constraint \(g\le 0\) must be the same as the minimizer with the equality constraint, \(g=0\). Thus

    • if \(R\le 1\), then the minimum occurs at \((x,y) = \left(-\frac R{\sqrt 2},-\frac R{\sqrt 2}\right)\).
  2. If \(R>1\), then there are both interior critical points and solutions of the Lagrange multiplier problem. By evaluating \(f\) at the various points, we conclude that

    • if \(R > 1\), then the minimum occurs at \((x,y) = \left(-\frac 1{\sqrt 2},-\frac 1{\sqrt 2} \right)\).
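
As a cross-check, the following Python sketch (using scipy, with sample values \(R=0.5\) and \(R=2\) of our choosing) solves the inequality-constrained problem directly. Note that scipy's convention for 'ineq' constraints is \(\text{fun} \ge 0\), so the constraint \(g \le 0\) must be negated.

```python
import numpy as np
from scipy.optimize import minimize

def f(p):
    x, y = p
    return (x + y) / (1 + x**2 + y**2)

for R in (0.5, 2.0):  # sample values of our choosing
    # scipy requires fun >= 0, so g <= 0 becomes R^2 - x^2 - y^2 >= 0.
    cons = [{'type': 'ineq', 'fun': lambda p, R=R: R**2 - p[0]**2 - p[1]**2}]
    res = minimize(f, x0=[-0.1, -0.1], method='SLSQP', constraints=cons)
    print(R, res.x)
# Expected: for R = 0.5 the minimizer is on the boundary, approx -(R/sqrt 2, R/sqrt 2);
# for R = 2 it is the interior critical point, approx -(1/sqrt 2, 1/sqrt 2).
```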

Problems

Basic

You will need to solve problems of the form \[\begin{equation} \left\{\begin{array}{r} \text{minimize/maximize }\ \ \ f(\mathbf x)\qquad \\ \text{ subject to the constraint: }\ \ g(\mathbf x)=0. \end{array}\right. \label{con1a}\end{equation}\]

You might be specifically asked to use the Lagrange multiplier technique to solve problems of the form \(\eqref{con1a}\). This gives you the opportunity to demonstrate that you can apply this tool to solve a problem, without additionally having to select the tool.

Techniques such as Lagrange multipliers are particularly useful when the set defined by the constraint is compact. Here are some sample problems.

  1. Minimize \(f(x,y,z) = x^2+2y^2+4z^2\), subject to the constraint \[ x^2+y^2+z^2 = 1 \]

  2. Given \(\mathbf y\in \R^n\) and \(\mathbf v\in \R^n,b\in \R\), minimize \(f(\mathbf x) = |\mathbf x-\mathbf y|^2\) subject to the constraint \[ \mathbf v \cdot \mathbf x - b = 0 \] What is the minimum value? (This says: find the closest point to \(\mathbf y\) in the “hyperplane” \(\{ \mathbf x\in \R^n : \mathbf v \cdot\mathbf x = b\}\). This can be solved by geometric reasoning, but it is instructive to solve with Lagrange multipliers.)

  3. For points \(\mathbf x = (x,y)\) and \(\mathbf u = (u,v)\) in \(\R^2\), minimize \(f(\mathbf x, \mathbf u) = |\mathbf x - \mathbf u|\) among all pairs of points such that \(\mathbf x\) belongs to the line \(\{ \mathbf x\in \R^2 : \mathbf x \cdot (1,2) = -10\}\) and \(\mathbf u\) belongs to the parabola \(\{ \mathbf u \in \R^2 : v = u^2\}\).

    Hints. Try to get started on your own before reading these. It is easier to minimize \(|\mathbf x - \mathbf u|^2\) than \(|\mathbf x - \mathbf u|\).
    Also, you may be able to simplify this problem by using the fact that you know a formula for the distance from a point to a hyperplane, if you have solved the previous problem.

  4. Find the maximum volume of a rectangular box in \(\R^3\), with sides parallel to the coordinate axes, whose vertices are all a distance \(R\) from the origin.

  5. Redo the same question for an \(n\)-dimensional box in \(\R^n\), where the volume is of course the product of the lengths of the sides.

  6. Heron’s Formula for the area of a triangle with sides of length \(x,y,z\) is \[ \text{ area } = \sqrt{ s(s-x)(s-y)(s-z)},\quad\text{ where }s = \frac 12(x+y+z). \] Prove that a triangle that maximizes area for given perimeter is equilateral.

  7. Find the largest and smallest values of \(f(x,y,z) = x+2y+3z\) on the set of points such that \(x^2+y^2 = 1\) and \(y-z=2\).

  8. Find the largest and smallest values of \(x^3+y^3+z^3\) on the set of points where \(x^2+y^2+z^2=1\).

  9. Find the largest and smallest values of \(x^3+y^3+z^3\) on the set of points where \(x^4+y^4+z^4=1\).

You will also be asked to show that the function \(f = ...\) has an absolute minimum/maximum on the noncompact set \(S = ...\) (see Section 1.4) and find it. This involves combining ideas from Section 1.4 and techniques for finding and classifying critical points, as in Section 2.7.

  1. Consider the problem \[ \left\{\begin{array}{r} \text{minimize/maximize }\ \ \ 2x^2+y^2+z^2\qquad \\ \text{ subject to the constraint: }\ \ xyz-16=0. \end{array}\right. \] with the additional constraint that \(x,y,z\) are all positive. Prove that a minimizer exists, and find it.

  2. Some of the sample problems above involve optimization on noncompact sets. Determine which problems have this character, and prove that the optimization problem has a solution – that the minimum or maximum (whatever the problem asks about) is attained. Then solve the problem, if you didn’t already do it.

  3. Given points \((x_1,y_1),\ldots, (x_k, y_k)\in \R^2\) with \(x_i\ne x_j\) for \(i\ne j\), consider the function \[ f(a,b) = \sum_{j=1}^k (y_j - ax_j - b)^2 \]
    This is a measure of how close the line \(\ell(x) = ax+b\) comes to passing through the points \((x_j, y_j)\). Show that the absolute minimum of \(f\) is attained on \(\R^2\), and prove that it is given by \[ a = \frac { k^{-1}\sum_{j=1}^k x_jy_j - \bar x \bar y} { k^{-1}\sum_{j=1}^k x_j^2 - \bar x^2}, \quad b = \bar y - a \bar x \] where \[ \bar x = \frac 1 k \sum_{j=1}^k x_j, \qquad \bar y = \frac 1 k \sum_{j=1}^k y_j. \]

We may ask you to solve an optimization problem of the form \[ \left\{\begin{array}{r} \text{minimize/maximize }\ \ \ f(\mathbf x)\qquad \\ \text{ subject to the constraint: }\ \ g(\mathbf x)\le 0. \end{array}\right. \] However, these problems typically take a while to solve, so in this class we are not very likely to ask such a question in a situation where you are faced with rigid time constraints.

  4. Change \(g = 0\) to \(g \le 0\) on some of the above practice problems involving equality constraints.

Advanced

  1. Referring to the picture below, suppose that \(g:\mathbb R^2 \to \mathbb R\) is a differentiable function such that \(\{(x,y):g(x,y)=0\}\) is the curve, and \(\nabla g \ne {\bf 0}\) on the curve.

     [Figure: the curve \(\{(x,y) : g(x,y)=0\}\) and the point \(\bf P\)]

     Consider the problem \[ \left\{\begin{array}{r} \text{minimize/maximize }\ \ \ f(\mathbf x) = |\mathbf x - {\bf P}|^2 \\ \text{ subject to the constraint: }\ \ g(\mathbf x)=0.\qquad\ \ \end{array}\right. \]


This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Canada License.