2.4: The Mean Value Theorem

$\newcommand{\R}{\mathbb R }$ $\newcommand{\N}{\mathbb N }$ $\newcommand{\Z}{\mathbb Z }$ $\newcommand{\bfa}{\mathbf a}$ $\newcommand{\bfb}{\mathbf b}$ $\newcommand{\bfc}{\mathbf c}$ $\newcommand{\bff}{\mathbf f}$ $\newcommand{\bfg}{\mathbf g}$ $\newcommand{\bfG}{\mathbf G}$ $\newcommand{\bfh}{\mathbf h}$ $\newcommand{\bfu}{\mathbf u}$ $\newcommand{\bfx}{\mathbf x}$ $\newcommand{\bfp}{\mathbf p}$ $\newcommand{\bfy}{\mathbf y}$ $\newcommand{\ep}{\varepsilon}$


The Mean Value Theorem

Some consequences

Problems

The Mean Value Theorem

Theorem 1. The Mean Value Theorem. Assume that $f$ is a real-valued function of class $C^1$ defined on an open set $S\subset\R^n$. For two points $\bfa,\bfb\in S$, let $L_{\bfa, \bfb}$ denote the line segment that connects them. If $L_{\bfa,\bfb}\subset S$, then there exists $\bfc\in L_{\bfa,\bfb}$ such that $$ f(\bfb)-f(\bfa) = (\bfb-\bfa)\cdot \nabla f(\bfc). $$

Proof.

For $s\in [0,1]$, let $\gamma(s) := s\bfb+(1-s)\bfa$. Note that $$ L_{\bfa,\bfb} = \{ \gamma(s) : 0\le s \le 1 \}. $$ Next, define $$ \phi(s) := f(\gamma(s)). $$ According to the single-variable Mean Value Theorem from MAT137, there exists $\sigma\in (0,1)$ such that $$ \frac{\phi(1) - \phi(0)}{1-0} = \phi'(\sigma). $$ Also, the Chain Rule implies that $\phi'(\sigma) = \nabla f(\gamma(\sigma))\cdot (\bfb-\bfa)$. So if we define $\bfc := \gamma(\sigma)$, then $\bfc\in L_{\bfa,\bfb}$, and since $\phi(0)=f(\bfa)$ and $\phi(1) = f(\bfb)$, the above identity becomes $$ f(\bfb) - f(\bfa) = \nabla f(\bfc)\cdot(\bfb-\bfa). \qquad \qquad\Box $$
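
To see the theorem in a concrete case, here is a short worked example; the function $f(x,y) = x^2y$ and the points $\bfa=(0,0)$, $\bfb=(1,2)$ are chosen purely for illustration. With $\gamma(s) = (s,2s)$ we get $$ \phi(s) = f(s,2s) = 2s^3, \qquad \phi'(s) = 6s^2, \qquad \phi(1)-\phi(0) = 2. $$ Solving $6\sigma^2 = 2$ gives $\sigma = 1/\sqrt 3\in(0,1)$, so $\bfc = \gamma(\sigma) = (1/\sqrt3,\, 2/\sqrt3)$. As a check, $\nabla f(x,y) = (2xy, x^2)$, so $\nabla f(\bfc) = (4/3,\, 1/3)$ and $$ (\bfb-\bfa)\cdot\nabla f(\bfc) = 1\cdot\tfrac43 + 2\cdot \tfrac13 = 2 = f(\bfb)-f(\bfa). $$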

Some consequences

The $1d$ Mean Value Theorem, familiar from MAT137, is used to prove things like

if a function $f$ has the property that $f'(t)=0$ for all $t$ in an interval $(a,b)$, then $f$ is constant on $(a,b)$.
if a function $f$ has the property that $|f'(t)|\le M$ for all $t$ in an interval $(a,b)$, then the slope of $f$ between any two points is at most $M$. More precisely, $$ |f'(t)|\le M\mbox{ for all }t\in (a,b) \quad\Rightarrow \quad |f(t) - f(s)| \le M|t-s| \mbox{ for all }s,t\in (a,b). $$
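
For instance (an illustrative choice, with $f(t)=\sin t$ and $M=1$): since $|\cos t|\le 1$ for all $t$, the second fact gives $$ |\sin t - \sin s| \le |t-s| \quad\mbox{ for all }s,t\in \R. $$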

In this section we will show how the Mean Value Theorem can be used to prove similar facts in higher dimensions.

First, we introduce a class of sets on which the Mean Value Theorem is particularly useful.

Definition 1: A set $S\subset \R^n$ is said to be convex if, for every $\bfa, \bfb\in S$, the line segment $L_{\bfa,\bfb}$ is contained in $S$. That is, $$ \forall \bfa,\bfb\in S,\forall s\in [0,1], \qquad s\bfb+ (1-s)\bfa \in S. $$

In other words, if $S$ is convex, then the geometric assumption in the Mean Value Theorem is satisfied for every pair of points $\bfa$ and $\bfb$ in $S$.
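
By contrast, here is a set that is not convex (an illustrative example): the punctured plane $S = \{\bfx\in\R^2 : \bfx\ne {\bf 0}\}$. Taking $\bfa = (-1,0)$ and $\bfb = (1,0)$, both in $S$, the point with $s = 1/2$ is $$ \tfrac12\bfb + \tfrac12\bfa = (0,0)\notin S, $$ so $L_{\bfa,\bfb}\not\subset S$, and the Mean Value Theorem as stated does not apply to this pair of points.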

Example 1. A ball $B(r,\bfp)$ is convex.

The proof below is essentially copied from Section 1.5, where we proved that $B(r,\bfp)$ is path-connected. As you can see, the proof we gave there actually shows that it is convex.

Proof.

Fix $\bfa,\bfb\in B(r,\bfp)$ and write $$ \gamma(s) = \bfa + s(\bfb - \bfa) = (1-s) \bfa + s\bfb . $$ We have to show that $|\gamma(s) - \bfp|<r$ for all $s\in [0,1]$. In fact this is the case, because for $s\in [0,1]$, \begin{align*} |\gamma(s) - \bfp| &= |(1-s) \bfa + s\bfb - \bfp| &\mbox{(definition of $\gamma(s)$)}\\ &= |(1-s) (\bfa-\bfp) + s(\bfb - \bfp)| &\mbox{(rewrite)}\\ &\le |(1-s) (\bfa-\bfp)| + |s(\bfb - \bfp)| &\mbox{(triangle ineq.)}\\ &= (1-s)\, |\bfa-\bfp| + s\, |\bfb - \bfp| &\mbox{(since $0\le s\le 1$)}\\ &< (1-s) r + s r = r &\mbox{(since $\bfa, \bfb \in B(r,\bfp)$)}. \end{align*}

Thus $B(r,\bfp)$ is convex. $\qquad\qquad \Box$

Examples 2. Here are a number of other examples of convex sets. The proofs are exercises.

A solid ellipsoid is convex. By this we mean a set of the form $$ S = \{ \bfx \in \R^n : (x_1/a_1)^2+ \cdots + (x_n/a_n)^2 \le 1\} $$ where $a_1,\ldots, a_n$ are nonzero constants.
An intersection of convex sets is convex. (A union of convex sets need not be convex; you can easily convince yourself of this by drawing a picture.)
A subspace of $\R^n$ is convex. In particular, the range and the nullspace of a matrix are both convex.
If $L:\R^n \to \R^m$ is a function of the form $$ L(\bfx) = A\bfx + \bfb\qquad\mbox{ where }A\mbox{ is an }m\times n\mbox{ matrix, and }\bfb\in \R^m $$ and if $S$ is a convex subset of $\R^n$, then $$ L(S) := \{L(\bfx) : \bfx\in S\}\quad\mbox{ is convex}. $$

Theorem 2. Assume that $S$ is an open, convex subset of $\R^n$ and that $f:\R^n\to \R$ is a function that is differentiable in $S$, and moreover that there exists $M\ge 0$ such that $|\nabla f(\bfx)|\le M$ for all $\bfx\in S$. Then for every $\bfa, \bfb\in S$, $$ |f(\bfb)- f(\bfa)| \le M |\bfb - \bfa|. $$

The proof is very similar to a standard application of the $1d$ Mean Value Theorem.

Proof.

Fix any $\bfa,\bfb\in S$. The Mean Value Theorem implies that there exists some $\bfc\in L_{\bfa,\bfb}\subset S$ such that $$ f(\bfb) - f(\bfa) = \nabla f(\bfc)\cdot (\bfb - \bfa). $$ (Theorem 1 is stated for functions of class $C^1$, but its proof uses only the differentiability of $f$.) Then Cauchy's inequality implies that $$ |f(\bfb) - f(\bfa)| = |\nabla f(\bfc)\cdot (\bfb - \bfa)| \le |\nabla f(\bfc)| \ |\bfb - \bfa|. $$ Our hypotheses imply that $|\nabla f(\bfc)|\le M$, so the conclusion of the theorem follows. $\qquad\qquad \Box$
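
As a quick illustration of Theorem 2 (the function below is an arbitrary choice, not one singled out by the theorem), take $S=\R^n$, which is open and convex, and $f(\bfx) = \sqrt{1+|\bfx|^2}$. Then $$ \nabla f(\bfx) = \frac{\bfx}{\sqrt{1+|\bfx|^2}}, \qquad |\nabla f(\bfx)| = \frac{|\bfx|}{\sqrt{1+|\bfx|^2}} < 1 \quad\mbox{ for all }\bfx\in \R^n, $$ so the theorem applies with $M=1$ and yields $$ \left| \sqrt{1+|\bfb|^2} - \sqrt{1+|\bfa|^2}\right| \le |\bfb-\bfa| \quad\mbox{ for all }\bfa,\bfb\in \R^n. $$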

Theorem 3. Assume that $S$ is an open, convex subset of $\R^n$ and that $f:\R^n\to \R$ is a function that is differentiable in $S$. If $\nabla f(\bfx )={\bf 0}$ for every $\bfx\in S$, then $f$ is constant on $S$.

This is the multi-variable version of a familiar theorem from first-year calculus: if $f'=0$ everywhere on an interval, then $f$ is constant on that interval. (Recall, the proof of that theorem uses the $1d$ version of the Mean Value Theorem.) Theorem 3 follows at once from Theorem 2 with $M=0$, since then $|f(\bfb)-f(\bfa)|\le 0$ for all $\bfa,\bfb\in S$.

In fact, the hypothesis of convexity is much stronger than necessary, and it can be replaced by a much weaker geometric condition.

Theorem 4. Assume that $S$ is an open, path-connected subset of $\R^n$ and that $f:\R^n\to \R$ is a function that is differentiable in $S$. If $\nabla f(\bfx )={\bf 0}$ for every $\bfx\in S$, then $f$ is constant on $S$.

The proof is not very difficult, but it is slightly sneaky.

Proof.

We need to show that if $\bfa, \bfb$ are any two points in $S$, then $f(\bfa) = f(\bfb)$. So, fix any $\bfa,\bfb$. By the hypothesis of path-connectedness, there exists $\gamma:[0,1]\to S$ that is continuous such that $\gamma(0)=\bfa$ and $\gamma(1)=\bfb$.

Define $\phi(s) = f(\gamma(s))$. We will show that $\phi'(s)=0$ for every $s\in (0,1)$. Note that we cannot use the chain rule, since we only know that $\gamma$ is continuous, not differentiable.

To do this, fix $s\in (0,1)$. Since $S$ is open, there exists $\ep>0$ such that $B(\ep,\gamma(s))\subset S$. Since $\gamma$ is continuous, there exists $\delta>0$ such that if $|h|<\delta$, then $s+h\in (0,1)$ and $|\gamma(s+h)-\gamma(s)|<\ep$. In other words, $$ |h|<\delta \quad\Rightarrow\quad \gamma(s+h)\in B(\ep,\gamma(s)). $$ However, $B(\ep,\gamma(s))$ is a convex open set on which $\nabla f = {\bf 0}$ everywhere, so Theorem 3 implies that $f(\bfx) = f(\gamma(s))$ for every $\bfx\in B(\ep, \gamma(s))$. In particular, it follows that $$ |h|<\delta \quad \Rightarrow \quad \phi(s+h) - \phi(s) = f(\gamma(s+h)) - f(\gamma(s)) = 0. $$ Hence the difference quotient $(\phi(s+h)-\phi(s))/h$ vanishes whenever $0<|h|<\delta$, and so $\phi'(s)=0$. Since $s$ was arbitrary, we conclude that $\phi'=0$ everywhere in $(0,1)$. Moreover, $\phi$ is continuous on $[0,1]$, since it is a composition of continuous functions. Finally, the $1$-d Mean Value Theorem implies that $$ f(\bfb) - f(\bfa) = \phi(1)-\phi(0) = 0. \qquad\qquad \Box $$
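
Some connectedness hypothesis is genuinely needed, as the following example (chosen for illustration) suggests. Let $S = \{ \bfx\in \R^2 : x_1\ne 0\}$, which is open but not path-connected, because by the Intermediate Value Theorem any continuous path from a point with $x_1<0$ to a point with $x_1>0$ must pass through a point with $x_1=0$. Define $$ f(\bfx) = \begin{cases} 1 &\mbox{ if }x_1>0\\ 0 &\mbox{ if }x_1\le 0. \end{cases} $$ Then $f$ is differentiable in $S$ and $\nabla f(\bfx) = {\bf 0}$ for every $\bfx\in S$, but $f$ is not constant on $S$. On the other hand, the punctured plane $\{\bfx\in \R^2 : \bfx\ne{\bf 0}\}$ is open and path-connected but not convex, so it is covered by Theorem 4 but not by Theorem 3.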

Problems

Basic skills

There are not really any Basic Skills connected to the material in this section.
