2.6: Taylor's Theorem

$\newcommand{\R}{\mathbb R }$ $\newcommand{\N}{\mathbb N }$ $\newcommand{\Z}{\mathbb Z }$ $\newcommand{\bfa}{\mathbf a}$ $\newcommand{\bfb}{\mathbf b}$ $\newcommand{\bfc}{\mathbf c}$ $\newcommand{\bff}{\mathbf f}$ $\newcommand{\bfg}{\mathbf g}$ $\newcommand{\bfG}{\mathbf G}$ $\newcommand{\bfh}{\mathbf h}$ $\newcommand{\bfu}{\mathbf u}$ $\newcommand{\bfx}{\mathbf x}$ $\newcommand{\bfp}{\mathbf p}$ $\newcommand{\bfy}{\mathbf y}$ $\newcommand{\ep}{\varepsilon}$

Taylor's Theorem

  1. Review of Taylor's Theorem in $1$ dimension
  2. Taylor's Theorem in higher dimensions
  3. The quadratic case
  4. More about the general case (optional)
  5. Problems

Review of Taylor's Theorem in $1$ dimension

Assume that $I\subset \R$ is an open interval and that $f:I\to \R$ is a function of class $C^k$ on $I$.

For a point $a\in I$, the $k$th order Taylor polynomial of $f$ at $a$ is the unique polynomial of order at most $k$, denoted $P_{a,k}(h)$, such that \begin{align} f(a) &= P_{a,k}(0) \nonumber \\ f'(a) &= P'_{a,k}(0) \nonumber\\ \vdots & \qquad \vdots \nonumber\\ f^{(k)}(a) &= P^{(k)}_{a,k}(0). \nonumber \end{align}

It is straightforward to check that \begin{align} P_{a,k}(h) &= f(a) + h f'(a) + \frac {h^2}{2}f''(a) + \cdots + \frac {h^k}{k!} f^{(k)}(a) \\ &= \sum_{j=0}^k \frac {h^j}{j!} f^{(j)}(a) \nonumber \end{align}

Taylor's Theorem guarantees that $P_{a,k}(h)$ is a very good approximation of $f(a+h)$, and that the quality of the approximation increases as $k$ increases.
Here is a precise statement:

Theorem 1: Taylor's Theorem in 1 dimension Assume that $I\subset \R$ is an open interval and that $f:I\to \R$ is a function of class $C^k$ on $I$. For $a\in I$ and $h\in \R$ such that $a+h\in I$, let $P_{a,k}(h)$ denote the $k$th-order Taylor polynomial at $a$, and define the remainder $$ R_{a,k}(h) := f(a+h) - P_{a,k}(h). $$ Then $$ \lim_{h\to 0}\frac{R_{a,k}(h)}{h^k} = 0. $$
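The limit in Theorem 1 is easy to observe numerically. The following sketch (the choices $f = \sin$, $a = 0$, $k = 3$ are purely illustrative) builds $P_{a,k}$ from hand-computed derivatives at $a$ and checks that $R_{a,k}(h)/h^k$ shrinks as $h \to 0$:

```python
import math

def taylor_poly(derivs, h):
    """Evaluate sum_j derivs[j] * h**j / j!, the Taylor polynomial in h."""
    return sum(d * h**j / math.factorial(j) for j, d in enumerate(derivs))

# Derivatives of sin at a = 0, orders 0..3: sin(0), cos(0), -sin(0), -cos(0)
derivs = [0.0, 1.0, 0.0, -1.0]

ratios = []
for h in [0.1, 0.01, 0.001]:
    R = math.sin(h) - taylor_poly(derivs, h)  # remainder R_{0,3}(h)
    ratios.append(abs(R) / h**3)              # should tend to 0 with h
```

Here the ratio behaves like $h^2/120$, because for $\sin$ the $h^4$ coefficient vanishes and the first neglected term is $h^5/120$.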

The case $k=2$.

The most important case (after the linear case, $k=1$, which follows directly from the definition of the derivative) is the quadratic case $k=2$. We now consider this in more detail. In this case, Taylor's Theorem relies on the following.

Proposition 1. Assume that $I\subset \R$ is an open interval and that $f:I\to \R$ is a function of class $C^2$ on $I$. For $a\in I$ and $h\in \R$ such that $a+h\in I$, there exists some $\theta\in (0,1)$ such that \begin{equation}\label{ttlr} f(a+h) = f(a) + hf'(a) + \frac {h^2}2 f''(a+\theta h). \end{equation}

This can be considered to be a second-order Mean Value Theorem.

We can see that this implies the $k=2$ case of Taylor's Theorem, since using Proposition 1 we have \begin{align} R_{a,2}(h) &= f(a+h) - [ f(a) + h f'(a) +\frac {h^2}2f''(a)] \nonumber \\ &= \frac {h^2}2 [ f''(a+\theta h) - f''(a)]. \nonumber \end{align} Thus $$ \frac{R_{a,2}(h)}{h^2} = \frac 12[ f''(a+\theta h) - f''(a)] $$ which tends to $0$ as $h\to 0$, since $f''$ is continuous by assumption.
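One can also locate the intermediate point $\theta$ of Proposition 1 numerically for a concrete function. The sketch below (with the illustrative choices $f = \exp$, $a = 0$, $h = 1$) solves equation \eqref{ttlr} for $\theta$ by bisection:

```python
import math

# Proposition 1 for f = exp, a = 0, h = 1 reads:
#   e = 1 + 1 + (1/2) * exp(theta)   for some theta in (0, 1).
def residual(theta):
    return 2.0 + 0.5 * math.exp(theta) - math.e

# residual(0) < 0 < residual(1) and residual is increasing, so bisect:
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if residual(mid) < 0:
        lo = mid
    else:
        hi = mid
theta = 0.5 * (lo + hi)   # exact value: log(2*(e - 2)), roughly 0.362
```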

The proof of Proposition 1 is harder.

Proof of Proposition 1

Step 1. First, to make things as easy as possible, let's assume that $f'(a)=0$ and that $h$ is a point such that $f(a+h)= f(a)$. (We will consider the general case later.) Then the statement we want to prove, see \eqref{ttlr}, reduces to the following: \begin{equation}\label{sc1} \mbox{ if }f(a) = f'(a) = f(a+h) = 0, \ \ \mbox{ then } \ \exists \theta\in (0,1) \mbox{ such that }f''(a+\theta h) = 0. \end{equation}

We will establish this using Rolle's Theorem, which we recall is a special case of the $1$-d Mean Value Theorem. It implies that if $g$ is differentiable on an interval $(c,d)$, and if both $a$ and $a+h$ are points in $(c,d)$ such that $g(a)=g(a+ h)$, then there exists $\alpha\in (0,1)$ such that $g'(a+\alpha h)=0$.

Now, since $f(a) = f(a+h)$, Rolle's Theorem implies that there is some $\theta_1\in (0,1)$ such that $f'(a+\theta_1 h) = 0$.

Next, note that $f'(a) = 0$ by hypothesis, and we have just shown that $f'(a+\theta_1h)= 0$. So we can apply Rolle's Theorem again, this time to $f'$, and with $\theta_1 h$ in place of $h$, to find that there exists some $\theta_2\in (0,1)$ such that $f''(a+ \theta_2\theta_1 h)= 0$. If we define $\theta = \theta_2\theta_1$, then this is exactly \eqref{sc1}. So we have finished Step 1.

Step 2: the general case Now given $f$ of class $C^2$ in $I$ and points $a$ and $a+h \in I$, we want to modify $f$ to reduce to the special case from Step 1. We start by defining $$ g_1(x) = f(x) - f(a) - (x-a) f'(a). $$ Then $g_1(a) = g_1'(a) = 0$, but $g_1(a+h) \ne 0$ in general. To fix this, we define $$ g_2(x) = g_1(x) - \left(\frac {x-a}h\right)^2 g_1(a+h). $$ Then $$ g_2(a) = g_1(a)= 0, \quad g'_2(a) = g'_1(a)= 0, \quad g_2(a+h) = 0. $$ (It is also easy to see that $g_2$ is $C^2$.) Then by applying Step 1 to $g_2$, we find that there exists some $\theta\in (0,1)$ such that $g_2''(a+\theta h) = 0$. If you work out what this means in terms of $f$, it says exactly that \eqref{ttlr} holds. $\quad \Box$
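The Step 2 construction is easy to sanity-check numerically. Below is a sketch with the illustrative choices $f = \cos$, $a = 0$, $h = 1$; it verifies $g_2(a) = g_2'(a) = g_2(a+h) = 0$, estimating the derivative with a central difference:

```python
import math

# Step 2 construction for the illustrative choices f = cos, a = 0, h = 1.
a, h = 0.0, 1.0
f = math.cos

def df(x):           # f'
    return -math.sin(x)

def g1(x):
    # subtract the linear Taylor part, so g1(a) = g1'(a) = 0
    return f(x) - f(a) - (x - a) * df(a)

def g2(x):
    # also subtract a quadratic correction, so g2(a + h) = 0 as well
    return g1(x) - ((x - a) / h) ** 2 * g1(a + h)

eps = 1e-6
g2_prime_at_a = (g2(a + eps) - g2(a - eps)) / (2 * eps)  # central difference
checks = (abs(g2(a)), abs(g2_prime_at_a), abs(g2(a + h)))
```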

The general case (very much optional!)

For completeness, we outline the proof of Taylor's Theorem for $k\ge 3$.


First we need the following generalization of Proposition 1.

Proposition 1, general version. Assume that $I\subset \R$ is an open interval and that $f:I\to \R$ is a function of class $C^k$ on $I$. For every $a\in I$ and $h\in \R$ such that $a+h\in I$, there exists $\theta\in (0,1)$ such that $$ f(a+h) = f(a) + h f'(a) + \frac {h^2}{2}f''(a) + \cdots + \frac {h^{k-1}}{(k-1)!} f^{(k-1)}(a) + \frac {h^k}{k!} f^{(k)}(a+\theta h). $$

Once this is known, it follows that $$ \frac 1{h^k}R_{a,k}(h) = \frac 1{k!}[ f^{(k)}(a+\theta h) - f^{(k)}(a)], $$ and the right-hand side tends to $0$ as $h\to 0$, since $f^{(k)}$ is assumed to be continuous. The proof of this general version of Proposition 1 is similar in spirit to the basic version, but more complicated. It too can be broken into two steps.

Step 1. The special case when $f(a) = f'(a) = \cdots = f^{(k-1)}(a) = f(a+h)=0$. Then one must show the existence of some $\theta\in (0,1)$ such that $f^{(k)}(a+\theta h)=0$.

Step 2. Reduction of the general case to Step 1. One does this by defining $$ g_1(x) = f(x) - [f(a) + (x-a)f'(a) + \cdots + \frac{(x-a)^{k-1}}{(k-1)!}f^{(k-1)}(a)], $$ (note that this is exactly $R_{a,k-1}(x-a)$, not a coincidence) and $$ g_2(x) = g_1(x) - \left(\frac {x-a} h\right)^k g_1(a+h). $$ This ends up implying the conclusion. Check the details if you like!

Taylor's Theorem in higher dimensions

Now assume that $S\subset \R^n$ is an open set and that $f:S\to \R$ is a function of class $C^k$ on $S$.

For a point $\bfa\in S$, the $k$th order Taylor polynomial of $f$ at $\bfa$ is the unique polynomial of order at most $k$, denoted $P_{\bfa,k}(\bfh)$, such that \begin{align}\label{tkRn} f(\bfa) &= P_{\bfa,k}({\bf 0}) \\ \partial^\alpha f(\bfa) &= \partial^\alpha P_{\bfa,k}({\bf 0})\ \ \ \mbox{ for all partial derivatives of order up to }k.\nonumber \end{align}

One of the main difficulties with the theory is just the notation needed to write down explicit formulas for $P_{\bfa,k}(\bfh)$. For $k\ge 3$, it gets either very complicated, or else simple but incomprehensible. For this reason we will focus below on the case $k=2$ (quadratic Taylor polynomials), which is both the most important and the simplest case. But first we will state the general result, which guarantees that $P_{\bfa,k}(\bfh)$ is a very good approximation of $f(\bfa+\bfh)$, and that the quality of the approximation increases as $k$ increases.

Theorem 2: Taylor's Theorem in $n$ dimensions. Assume that $S\subset \R^n$ is an open set and that $f:S\to \R$ is a function of class $C^k$ on $S$. For $\bfa\in S$ and $\bfh\in \R^n$ such that $\bfa+\bfh\in S$, let $P_{\bfa,k}(\bfh)$ denote the $k$th-order Taylor polynomial at $\bfa$, and define the remainder $$ R_{\bfa,k}(\bfh) := f(\bfa+\bfh) - P_{\bfa,k}(\bfh). $$ Then $$ \lim_{\bfh\to {\bf 0}}\frac{R_{\bfa,k}(\bfh)}{|\bfh|^k} = 0. $$

We will not present the proof, but the main ideas will be outlined below in the case $k=2$.

The quadratic case

A formula for $P_{\bfa, 2}(\bfh)$.

According to the definition we have given, the 2nd order Taylor polynomial $P_{\bfa, 2}(\bfh)$ of $f$ at $\bfa$ is the quadratic polynomial such that $f(\bfa) = P_{\bfa, 2}({\bf 0})$, and such that all first and second partial derivatives of $P_{\bfa, 2}(\bfh)$ at $\bf h = 0$ coincide with the first and second partial derivatives of $f$ at $\bfa$.

Proposition 2. \begin{align} P_{\bfa, 2}(\bfh) &= f(\bfa) + \sum_{i=1}^n h_i \partial_i f(\bfa) +\frac 12 \sum_{i,j=1}^n h_i h_j \partial_i\partial_j f(\bfa) \nonumber \\ & = f(\bfa) + \nabla f(\bfa)\cdot \bfh + \frac 12 (H(\bfa) \bfh)\cdot \bfh\label{Pa2} \end{align} where $H(\bfa)$ denotes the matrix of second derivatives of $f$ at $\bfa$: $$ H(\bfa) := n\times n \mbox{ matrix whose }(i,j)\mbox{ entry is } \partial_i\partial_j f(\bfa). $$
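To see the formula of Proposition 2 in action on a concrete example, take the illustrative choice $f(x,y) = e^x\cos y$ at $\bfa = (0,0)$, whose gradient and Hessian below are computed by hand:

```python
import math

def f(x, y):
    return math.exp(x) * math.cos(y)

# Hand-computed data at a = (0, 0):
fa = 1.0
grad = (1.0, 0.0)                  # (e^x cos y, -e^x sin y) at (0, 0)
H = ((1.0, 0.0), (0.0, -1.0))      # Hessian of f at (0, 0)

def P2(h1, h2):
    hh = (h1, h2)
    quad = sum(H[i][j] * hh[i] * hh[j] for i in range(2) for j in range(2))
    return fa + grad[0] * h1 + grad[1] * h2 + 0.5 * quad

err = abs(f(0.1, 0.1) - P2(0.1, 0.1))   # third-order small in |h|
```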

The proof is an exercise; here is how to get started.

This can be proved by considering a general quadratic polynomial $q$ in the variable $\bfh$. Any such polynomial can be written in the form $$ q(\bfh) = \frac 12 (A \bfh)\cdot \bfh + \bfb \cdot \bfh + c, $$ where $A$ is a symmetric $n\times n$ matrix with entries $(a_{ij})$, $\bfb\in \R^n$, and $c\in \R$. One can then differentiate $q$ and see what conditions the coefficients $A,\bfb, c$ must satisfy in order for all derivatives of order up to $2$ at $\bfh = {\bf 0}$ to agree with the corresponding derivatives of $f$ at $\bfa$. (See exercises.)
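The coefficient-matching argument can itself be checked numerically: for $q(\bfh) = \frac 12 (A\bfh)\cdot\bfh + \bfb\cdot\bfh + c$ with $A$ symmetric, finite differences recover $\nabla q({\bf 0}) = \bfb$ and Hessian $A$. A sketch with sample values of $A$, $\bfb$, $c$ (chosen arbitrarily):

```python
# q(h) = (1/2)(A h)·h + b·h + c with A symmetric; check that the first and
# second derivatives of q at 0 recover b and A (sample values below).
A = ((2.0, 1.0), (1.0, 3.0))
b = (0.5, -1.0)
c = 4.0

def q(h1, h2):
    hh = (h1, h2)
    quad = sum(A[i][j] * hh[i] * hh[j] for i in range(2) for j in range(2))
    return 0.5 * quad + b[0] * h1 + b[1] * h2 + c

eps = 1e-4
dq1 = (q(eps, 0) - q(-eps, 0)) / (2 * eps)             # -> b[0]
d11 = (q(eps, 0) - 2 * q(0, 0) + q(-eps, 0)) / eps**2  # -> A[0][0]
d12 = (q(eps, eps) - q(eps, -eps) - q(-eps, eps)
       + q(-eps, -eps)) / (4 * eps**2)                 # -> A[0][1]
```

Since $q$ is exactly quadratic, the central differences are exact up to rounding error.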

In view of its importance, we restate Taylor's Theorem in the case $k=2$ (including some extra details that we did not mention in the general case).

Theorem 3: The quadratic case of Taylor's Theorem. Assume that $S\subset \R^n$ is an open set and that $f:S\to \R$ is a function of class $C^2$ on $S$.
Then for $\bfa\in S$ and $\bfh\in \R^n$ such that the line segment connecting $\bfa$ and $\bfa+\bfh$ is contained in $S$, there exists $\theta\in (0,1)$ such that \begin{equation}\label{ttlr2} f(\bfa+\bfh) = f(\bfa) + \nabla f(\bfa)\cdot \bfh + \frac 12 (H(\bfa+\theta\bfh) \bfh)\cdot \bfh. \end{equation} As a result (see exercises), \begin{equation}\label{tt2} \lim_{\bfh\to {\bf 0}}\frac{R_{\bfa,2}(\bfh)}{|\bfh|^2} = 0, \quad \mbox{ for }R_{\bfa,2}(\bfh) = f(\bfa+\bfh) - P_{\bfa,2}(\bfh) \end{equation} where the formula for $P_{\bfa,2}(\bfh)$ is given in \eqref{Pa2}.
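The limit \eqref{tt2} can be observed numerically. In the sketch below (illustrative choice $f(x,y) = e^x\cos y$ at $\bfa = (0,0)$, with $P_{\bfa,2}$ written out by hand), the ratio $R_{\bfa,2}(\bfh)/|\bfh|^2$ shrinks as $\bfh = t(1,1)$ scales down:

```python
import math

def f(x, y):
    return math.exp(x) * math.cos(y)

def P2(h1, h2):
    # quadratic Taylor polynomial at a = (0, 0): f(a) = 1, grad = (1, 0),
    # Hessian = [[1, 0], [0, -1]], all computed by hand
    return 1.0 + h1 + 0.5 * (h1**2 - h2**2)

ratios = []
for t in [0.1, 0.01, 0.001]:
    R = f(t, t) - P2(t, t)                 # remainder at h = t*(1, 1)
    ratios.append(abs(R) / (2 * t**2))     # |h|^2 = 2 t^2
```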

Sketch of the proof

The idea of the proof is this: define $\phi(s) := f(\bfa+s\bfh)$ for $s\in [0,1]$, apply Proposition 1 to $\phi$, and then use the chain rule to express $\phi'$ and $\phi''$ in terms of the gradient and Hessian of $f$.

See the exercises for the details.

More about the general case (optional)

For completeness, we state the formula for the $k$th order Taylor polynomial, for arbitrary $k\in \N$.

First, recall that in Section 2.5 we introduced the notation \begin{equation}\label{mix} \partial^\alpha f = \left(\frac{\partial}{\partial x_1 }\right)^{\alpha_1}\left(\frac{\partial}{\partial x_{2} }\right)^{\alpha_{2}}\cdots \left(\frac{\partial}{\partial x_n }\right)^{\alpha_n}f, \end{equation} where $\alpha$ is a multi-index; that is, $\alpha$ has the form $(\alpha_1,\ldots, \alpha_n)$, where each $\alpha_j$ is a nonnegative integer. For such a multi-index, we will also use the notation $$ \alpha! = \alpha_1!\alpha_2! \cdots \alpha_n!\ , \qquad \bfh^\alpha := h_1^{\alpha_1}h_2^{\alpha_2}\cdots h_n^{\alpha_n}. $$ With this notation, the Taylor polynomial is given by the formula \begin{equation}\label{tkRn2} P_{\bfa, k}(\bfh) = \sum_{\{\alpha : |\alpha|\le k\}} \frac {\bfh^\alpha}{\alpha!} {\partial^\alpha f(\bfa)}. \end{equation} Recall that $|\alpha| = \alpha_1+\cdots + \alpha_n$ is the order of the multi-index $\alpha$. Thus the formula involves all derivatives of order up to $k$ (including $\alpha = (0,\ldots, 0)$).
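Formula \eqref{tkRn2} can be evaluated directly by enumerating multi-indices. The sketch below uses the illustrative choice $f(x,y) = e^{x+2y}$ at $\bfa = (0,0)$, for which $\partial^\alpha f(\bfa) = 2^{\alpha_2}$ exactly:

```python
import math
from itertools import product

def taylor_multiindex(k, h1, h2):
    """P_{a,k}(h) for f(x, y) = exp(x + 2y) at a = (0, 0), via multi-indices.
    Here d^alpha f(0, 0) = 2**a2 for alpha = (a1, a2)."""
    total = 0.0
    for a1, a2 in product(range(k + 1), repeat=2):
        if a1 + a2 <= k:                              # |alpha| <= k
            total += (h1**a1 * h2**a2 * 2.0**a2
                      / (math.factorial(a1) * math.factorial(a2)))
    return total

approx = taylor_multiindex(4, 0.1, 0.1)
exact = math.exp(0.1 + 2 * 0.1)
err = abs(exact - approx)    # remainder is O(|h|^5)
```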

As in the quadratic case, the idea of the proof of Taylor's Theorem is:

  1. Define $\phi(s) = f(\bfa+s\bfh)$.

  2. Apply the $1$-dimensional Taylor's Theorem (or formula \eqref{ttlr}) to $\phi$.

  3. Use the chain rule and induction (for example) to express the resulting facts about $\phi$ in terms of $f$. This is the hard part of the proof and involves showing that $$ \phi^{(j)}(s) = \sum_{\{\alpha : |\alpha| = j\}} \frac{j!}{\alpha!} \bfh^\alpha\, \partial^\alpha f(\bfa+s\bfh) \quad\mbox{ for every }j\le k. $$
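The chain-rule identity in step 3 can be spot-checked for $j = 2$: with the illustrative choice $f(x,y) = e^{x+2y}$, every $\partial^\alpha f = 2^{\alpha_2} f$, so the right-hand side is easy to write down and can be compared with a finite-difference estimate of $\phi''(s)$:

```python
import math

# phi(s) = f(a + s h) for f(x, y) = exp(x + 2y), a = (0, 0), h = (0.1, 0.2),
# so phi(s) = exp(s * (h1 + 2*h2)).
h1, h2, s = 0.1, 0.2, 0.5

def phi(t):
    return math.exp(t * (h1 + 2 * h2))

# Finite-difference estimate of phi''(s):
eps = 1e-4
phi2_fd = (phi(s + eps) - 2 * phi(s) + phi(s - eps)) / eps**2

# Chain-rule formula: sum over |alpha| = 2 of (2!/alpha!) h^alpha d^alpha f,
# where d^alpha f = 2**a2 * f.  Terms: alpha = (2,0), (1,1), (0,2).
phi2_formula = (1 * h1**2 * 1 + 2 * h1 * h2 * 2 + 1 * h2**2 * 4) * phi(s)
```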

Problems

More questions may be added later.

Basic skills

Compute the second-order Taylor polynomial $P_{\bfa, 2}$ of the function $f = \ldots$ at the point $\bfa = \ldots$.
This question is just asking you to use the formula \eqref{Pa2}.

For example:

  1. Compute the second-order Taylor polynomial of $f(x,y) = \frac{xy +y^2}{1+\cos^2 x}$ at $\bfa = (0,2)$.

  2. Compute the second-order Taylor polynomial of $f(x,y,z) = xy^2e^{z^2}$ at the point $\bfa = (1,1,1)$.

More advanced questions

(These questions are actually not very advanced, but neither do they involve new skills, introduced in this section, that you have to master.)

  1. Complete the proof of Proposition 2.

  2. Complete the proof of Theorem 3 by proving formula \eqref{ttlr2}.

  3. Use \eqref{ttlr2} to prove \eqref{tt2}.

  4. Prove that the formula in \eqref{tkRn2} satisfies \eqref{tkRn}. (very optional!)
    If you do this, since you have to differentiate $P_{\bfa, k}$ many times, you may find it psychologically easier to write it as a function of $\bfx$ rather than $\bfh$. This is just a notational change. (When differentiating a function of $\bfh$, we have to interpret $$ \partial^\alpha = \left(\frac{\partial}{\partial h_1 }\right)^{\alpha_1}\left(\frac{\partial}{\partial h_{2} }\right)^{\alpha_{2}}\cdots \left(\frac{\partial}{\partial h_n }\right)^{\alpha_n}, $$ whereas for a function of $\bfx$, $\partial^\alpha$ is understood exactly as in \eqref{mix}.) The key point is that if $\alpha = (\alpha_1,\ldots, \alpha_n)$ and $\beta = (\beta_1,\ldots, \beta_n)$ are multi-indices, then $$ \mbox{ for }g^\beta(\bfx) := \bfx^\beta,\quad \partial^\alpha g^\beta({\bf 0}) = \begin{cases} \alpha! &\mbox{ if }\alpha = \beta \\ 0&\mbox{ if not. } \end{cases} $$
