\(\renewcommand{\R}{\mathbb R }\)
The Mean Value Theorem of single variable calculus tells us that if \(\ell\) is the straight line connecting two points \((a, f(a))\) and \((b, f(b))\) on the graph of a differentiable function \(f\), then there is a point \(c\in (a,b)\) where the tangent line is parallel to \(\ell\), i.e. \[ f'(c)=\frac{f(b)-f(a)}{b-a}.\] To generalize this to higher dimensions, we must rewrite it as \((b-a)f'(c)=f(b)-f(a)\) so that we don’t divide by vectors. Then, after replacing the derivative with the gradient and multiplication with the dot product, we have \[ (\mathbf b - \mathbf a)\cdot \nabla f(\mathbf c) = f(\mathbf b) - f(\mathbf a) \] for some point \(\mathbf c\) on the line segment \(L_{\mathbf a, \mathbf b}\) between \(\mathbf a\) and \(\mathbf b\).
Note that we require the line \(L_{\mathbf a, \mathbf b}\) to be contained in the domain of \(f\), and that it must be a line segment, not an arbitrary path. Try to explain where the fact that it is a line segment is used.
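As a numerical illustration, the following sketch restricts a function to the line segment from \(\mathbf a\) to \(\mathbf b\) and searches for a point \(\mathbf c\) on it with \(\nabla f(\mathbf c)\cdot(\mathbf b-\mathbf a) = f(\mathbf b)-f(\mathbf a)\). The function \(f(x,y)=x^3+xy\), the endpoints, and the helper name `mean_value_point` are all choices made for this example. The search uses bisection, which needs a sign change at the endpoints of the segment; that happens for this particular \(f\), but the theorem itself only guarantees that some such \(\mathbf c\) exists.

```python
def f(x, y):
    # Example function chosen for illustration only.
    return x**3 + x * y

def grad_f(x, y):
    # Gradient of f computed by hand: (df/dx, df/dy).
    return (3 * x**2 + y, x)

def mean_value_point(a, b, tol=1e-12):
    """Find t in [0, 1] with grad f(a + t(b - a)) . (b - a) = f(b) - f(a)."""
    d = (b[0] - a[0], b[1] - a[1])
    target = f(*b) - f(*a)

    def h(t):
        gx, gy = grad_f(a[0] + t * d[0], a[1] + t * d[1])
        return gx * d[0] + gy * d[1] - target

    lo, hi = 0.0, 1.0
    if h(lo) * h(hi) > 0:
        raise ValueError("no sign change at the endpoints; bisection does not apply")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(lo) * h(mid) <= 0:
            hi = mid
        else:
            lo = mid
    t = (lo + hi) / 2
    return t, (a[0] + t * d[0], a[1] + t * d[1])

a, b = (0.0, 0.0), (1.0, 1.0)
t, c = mean_value_point(a, b)
gx, gy = grad_f(*c)
lhs = f(*b) - f(*a)
rhs = gx * (b[0] - a[0]) + gy * (b[1] - a[1])
print(t, c, lhs, rhs)
```

Here the restriction of \(f\) to the segment is \(g(t)=t^3+t^2\), so the point found corresponds to the root of \(3t^2+2t=2\) in \((0,1)\).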
The single variable Mean Value Theorem is used to prove facts such as the following:
if \(|f'(t)|\le M\) for all \(t\) in an interval \((a,b)\), then the slope of the line between any two points on the graph of \(f\) is at most \(M\) in absolute value, that is, \[ |f(t) - f(s)| \le M|t-s| \text{ for all }s,t\in (a,b). \]
if \(f'(t)=0\) for all \(t\) in an interval \((a,b)\), then \(f\) is constant on \((a,b)\).
In this section we will show how the Mean Value Theorem can be used to prove similar facts in higher dimensions.
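Since it was important that the domain of \(f\) contain the entire line segment between \(\mathbf a\) and \(\mathbf b\), we give a name to those sets for which this holds for any two points: a set \(S\subseteq \R^n\) is called convex if, for every \(\mathbf a, \mathbf b\in S\), the line segment \(L_{\mathbf a, \mathbf b} = \{ (1-t)\mathbf a + t\mathbf b : t\in [0,1]\}\) is contained in \(S\).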
In other words, if \(S\) is convex, then the geometric assumption in the Mean Value Theorem is satisfied for every pair of points \(\mathbf a\) and \(\mathbf b\) in \(S\).
A ball \(B(\mathbf p; r)\) is convex.
The proof is in Section 1.5, where we proved that \(B(\mathbf p; r)\) is path-connected. The path constructed there was the line segment between the two points, so the same argument shows that the ball is convex.
Here are a number of other examples of convex sets. The proofs are exercises.
A solid ellipsoid is convex. This is a set of the form \[ S = \{ \mathbf x \in \R^n : (x_1/a_1)^2+ \cdots + (x_n/a_n)^2 \le 1\} \] where \(a_1,\ldots, a_n\) are nonzero constants.
An intersection of convex sets is convex. A union of convex sets may not be convex; try drawing an example of this in \(\R^2\).
Any subspace of \(\R^n\) is convex. Recall that a subspace is nonempty, and closed under vector addition and scalar multiplication. In particular, the range and the nullspace of a linear transformation are both convex.
If \(L:\R^n \to \R^m\) is an affine function, i.e., a function of the form \[ L(\mathbf x) = A\mathbf x + \mathbf b \] where \(A\) is an \(m\times n\) matrix and \(\mathbf b\in \R^m\), and if \(S\) is a convex subset of \(\R^n\), then the image \(L(S)\) is convex.
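The condition that the segment between two points stays inside a set can be explored numerically. The sketch below samples points \((1-t)\mathbf a + t\mathbf b\) for \(t\in[0,1]\) and tests membership; the helper `segment_stays_inside` and the two membership functions are names introduced for this example. Sampling can only expose a failure of convexity for a particular pair of points; it is not a proof that a set is convex.

```python
def segment_stays_inside(member, a, b, steps=200):
    """Return True if every sampled point (1-t)a + tb, t in [0,1], satisfies member."""
    for k in range(steps + 1):
        t = k / steps
        point = tuple((1 - t) * ai + t * bi for ai, bi in zip(a, b))
        if not member(point):
            return False
    return True

def in_ellipse(p):
    # Solid ellipse (x/2)^2 + (y/3)^2 <= 1 in the plane.
    x, y = p
    return (x / 2) ** 2 + (y / 3) ** 2 <= 1

def in_two_discs(p):
    # Union of two closed unit discs centred at (-2, 0) and (2, 0).
    x, y = p
    return (x + 2) ** 2 + y ** 2 <= 1 or (x - 2) ** 2 + y ** 2 <= 1

ellipse_ok = segment_stays_inside(in_ellipse, (1.5, 0.0), (0.0, 2.5))
discs_ok = segment_stays_inside(in_two_discs, (-2.0, 0.0), (2.0, 0.0))
print(ellipse_ok, discs_ok)
```

The second check fails because the segment joining the two centres leaves both discs, which is one way to see that a union of convex sets need not be convex.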
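The first single variable application that we gave generalizes as follows: if \(S\) is a convex open set, \(f:S\to\R\) is differentiable, and \(|\nabla f(\mathbf x)|\le M\) for all \(\mathbf x\in S\), then \(|f(\mathbf a)-f(\mathbf b)|\le M|\mathbf a-\mathbf b|\) for all \(\mathbf a,\mathbf b\in S\).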
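A bound of the form \(|f(\mathbf a)-f(\mathbf b)|\le M|\mathbf a-\mathbf b|\), valid when \(|\nabla f|\le M\) on a convex set, can be sanity-checked on sample points. The sketch below uses \(f(x,y)=\sin x+\cos y\) on \(\R^2\), for which \(|\nabla f|^2=\cos^2 x+\sin^2 y\le 2\), so \(M=\sqrt 2\) works; the function and the grid are choices made for this example, and a finite check like this illustrates the inequality without proving it.

```python
import math
from itertools import product

def f(x, y):
    # Example function chosen for illustration only.
    return math.sin(x) + math.cos(y)

# |grad f|^2 = cos(x)^2 + sin(y)^2 <= 2, so M = sqrt(2) bounds the gradient on R^2.
M = math.sqrt(2)

# A 13 x 13 grid of sample points in the square [-3, 3] x [-3, 3].
grid = [(-3 + 0.5 * i, -3 + 0.5 * j) for i in range(13) for j in range(13)]

# Largest difference quotient |f(a) - f(b)| / |a - b| over all pairs of distinct points.
worst_ratio = 0.0
for a, b in product(grid, repeat=2):
    dist = math.dist(a, b)
    if dist > 0:
        worst_ratio = max(worst_ratio, abs(f(*a) - f(*b)) / dist)

print(worst_ratio <= M)
```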
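The second application generalizes in the same way (Theorem 3): if \(S\) is a convex open set and \(f:S\to\R\) is differentiable with \(\nabla f = \mathbf 0\) everywhere in \(S\), then \(f\) is constant on \(S\).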
In this result, the hypothesis that \(S\) is convex is much stronger than necessary, and can be replaced by a weaker geometric condition: it suffices that \(S\) be open and path-connected.
The proof uses an \((\varepsilon, \delta)\) argument to apply Theorem 3 to a small ball around each point on a path.
We need to show that if \(\mathbf a, \mathbf b\) are any two points in \(S\), then \(f(\mathbf a) = f(\mathbf b)\). So, fix any \(\mathbf a,\mathbf b\). By the hypothesis of path-connectedness, there exists \(\gamma:[0,1]\to S\) that is continuous such that \(\gamma(0)=\mathbf a\) and \(\gamma(1)=\mathbf b\).
Define \(\phi(s) = f(\gamma(s))\). We will show that \(\phi(s)\) is constant, and hence \(\phi(0)=\phi(1)\). Note that we cannot use the chain rule, since we only know that \(\gamma\) is continuous, not differentiable, and even if it were differentiable, we do not know its derivative.
Fix \(s\in [0,1]\). Since \(S\) is open, there exists \(\varepsilon>0\) such that \(B(\gamma(s); \varepsilon)\subseteq S\). Since \(\gamma\) is continuous, there exists \(\delta>0\) such that if \(|h|<\delta\) and \(s+h\in [0,1]\), then \(\gamma(s+h) \in B(\gamma(s);\varepsilon)\).
Now we can apply Theorem 3 to \(B(\gamma(s); \varepsilon)\), since it is a convex open set on which \(\nabla f = \mathbf 0\) everywhere, hence \(f(\mathbf x) = f(\gamma(s))\) for every \(\mathbf x\in B(\gamma(s); \varepsilon)\). In particular, for all \(h\) with \(|h|<\delta\) and \(s+h\in [0,1]\), \[ \phi(s+h) = f(\gamma(s+h)) = f(\gamma(s)) = \phi(s). \] Thus \(\phi\) is locally constant, and it follows from the compactness of the interval \([0,1]\) that \(\phi\) is constant: finitely many of the intervals \((s-\delta, s+\delta)\) cover \([0,1]\), and \(\phi\) takes the same value on any two that overlap.
You will need to recognize and apply the Mean Value Theorem, and apply the definition of a convex set.
Suppose that \(f:\R^n\to \R\) is a \(C^1\) function and that there exists a vector \({\bf v}\in \R^n\) such that \[ {\bf v}\cdot \nabla f(\mathbf x) = 0\qquad\text{ for all }\mathbf x\in \R^n. \] Prove that for every \(\mathbf x \in \R^n\) and every \(t\in \R\), \[ f(\mathbf x + t{\bf v}) = f(\mathbf x). \] That is, moving along a vector that is orthogonal to the gradient does not change the value of the function (like walking in a circle around a hill).
Prove that every convex set is path-connected.
Draw a picture of each of the following sets and determine whether it is convex:
\(S = \{ (x,y)\in \R^2 : (x/2)^2+ (y/3)^2 \le 1\}\).
\(S = \{ (x,y)\in \R^2 : (x/2)^2- (y/3)^2 \le 1\}\).
\(S = \{ (x,y)\in \R^2 : y \ge e^{x} \}\).
\(S = \{ (x,y)\in \R^2 : x < e^{-y^2} \}\).
\(S = \{ (x,y)\in \R^2 : xy <1 \}\).
\(S = \{ (x,y)\in \R^2 : y> k - x/k^2 \text{ for all }k\in \mathbb N \}\).
Assume that \(S\) is an open subset of \(\R^2\), and that \(f:S\to \R\) is a differentiable function such that \(\partial_1 f = 0\) everywhere in \(S\).
If \(S\) is convex, is it true that \(f\) depends only on the \(y\) variable, in other words, that \(f(x,y)= f(x', y)\) whenever \((x,y)\) and \((x',y)\) belong to \(S\)?
Same question if \(S\) is not convex. As an example, try \[ S = \{ (x,y)\in \R^2 : 2x^2< y <1+x^2 \}. \]
Assume that \(S\) is a convex subset of \(\R^n\) and that \(f:\R^n\to \R^m\) is an affine function, i.e., of the form \[ f(\mathbf x) = A \mathbf x + \mathbf b \] for some \(m\times n\) matrix \(A\) and some \(\mathbf b\in \R^m\). Prove that \(f(S) = \{ f(\mathbf x) : \mathbf x \in S\}\) is convex.
Prove that if \(S_1, S_2, \ldots\) are convex sets, then their intersection \(\bigcap_{j} S_j\) is convex.
Prove that a set \(S\) of the form \[ S = \{ \mathbf x \in \R^n : (x_1/a_1)^2+ \cdots + (x_n/a_n)^2 \le 1\} \] is convex, where \(a_1,\ldots, a_n\) are nonzero constants.
Hint: Combine one of the exercises above with the fact that the unit ball in \(\R^n\) is convex.
Let \(g:\R^n\to [0,\infty)\) be a function that is homogeneous of degree \(1\), and such that \(g(\mathbf x+\mathbf y) \le g(\mathbf x) + g(\mathbf y)\) for all \(\mathbf x, \mathbf y\in \R^n\). Prove that \[ \{ \mathbf x\in \R^n : g(\mathbf x) < 1 \} \]is convex.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Canada License.