2.6: Taylor's Theorem

$\newcommand{\R}{\mathbb R }$ $\newcommand{\N}{\mathbb N }$ $\newcommand{\Z}{\mathbb Z }$ $\newcommand{\bfa}{\mathbf a}$ $\newcommand{\bfb}{\mathbf b}$ $\newcommand{\bfc}{\mathbf c}$ $\newcommand{\bff}{\mathbf f}$ $\newcommand{\bfg}{\mathbf g}$ $\newcommand{\bfG}{\mathbf G}$ $\newcommand{\bfh}{\mathbf h}$ $\newcommand{\bfu}{\mathbf u}$ $\newcommand{\bfx}{\mathbf x}$ $\newcommand{\bfp}{\mathbf p}$ $\newcommand{\bfy}{\mathbf y}$ $\newcommand{\ep}{\varepsilon}$

Taylor's Theorem

  1. Review of Taylor's Theorem in $1$ dimension
  2. Taylor's Theorem in higher dimensions
  3. The quadratic case
  4. More about the general case (optional)
  5. Problems

Review of Taylor's Theorem in $1$ dimension

Assume that $I\subset \R$ is an open interval and that $f:I\to \R$ is a function of class $C^k$ on $I$.

For a point $a\in I$, the $k$th order Taylor polynomial of $f$ at $a$ is the unique polynomial of order at most $k$, denoted $P_{a,k}(h)$, such that \begin{align} f(a) &= P_{a,k}(0) \nonumber \\ f'(a) &= P'_{a,k}(0) \nonumber\\ \vdots & \qquad \vdots \nonumber\\ f^{(k)}(a) &= P^{(k)}_{a,k}(0). \nonumber \end{align}

It is straightforward to check that \begin{align} P_{a,k}(h) &= f(a) + h f'(a) + \frac {h^2}{2}f''(a) + \cdots + \frac {h^k}{k!} f^{(k)}(a) \\ &= \sum_{j=0}^k \frac {h^j}{j!} f^{(j)}(a) \nonumber \end{align}

Taylor's Theorem guarantees that $P_{a,k}(h)$ is a very good approximation of $f(a+h)$, and that the quality of the approximation increases as $k$ increases.
Here is a precise statement:

Theorem 1: Taylor's Theorem in 1 dimension Assume that $I\subset \R$ is an open interval and that $f:I\to \R$ is a function of class $C^k$ on $I$. For $a\in I$ and $h\in \R$ such that $a+h\in I$, let $P_{a,k}(h)$ denote the $k$th-order Taylor polynomial at $a$, and define the remainder $$ R_{a,k}(h) := f(a+h) - P_{a,k}(h). $$ Then $$ \lim_{h\to 0}\frac{R_{a,k}(h)}{h^k} = 0. $$
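The limit in Theorem 1 is easy to observe numerically. The following sketch (the choices $f = \sin$, $a = 0$, $k = 3$ are purely illustrative) builds $P_{a,k}$ from hand-computed derivatives at $a$ and checks that $R_{a,k}(h)/h^k$ shrinks as $h \to 0$:

```python
import math

def taylor_poly(derivs, h):
    """Evaluate sum_j derivs[j] * h**j / j!, the Taylor polynomial in h."""
    return sum(d * h**j / math.factorial(j) for j, d in enumerate(derivs))

# Derivatives of sin at a = 0, orders 0..3: sin(0), cos(0), -sin(0), -cos(0)
derivs = [0.0, 1.0, 0.0, -1.0]

ratios = []
for h in [0.1, 0.01, 0.001]:
    R = math.sin(h) - taylor_poly(derivs, h)  # remainder R_{0,3}(h)
    ratios.append(abs(R) / h**3)              # should tend to 0 with h
```

Here the ratio behaves like $h^2/120$, because for $\sin$ the $h^4$ coefficient vanishes and the first neglected term is $h^5/120$.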

The case $k=2$.

The most important case (after the linear case, $k=1$, which follows directly from the definition of the derivative) is the quadratic case $k=2$. We now consider this in more detail. In this case, Taylor's Theorem relies on the following.

Proposition 1. Assume that $I\subset \R$ is an open interval and that $f:I\to \R$ is a function of class $C^2$ on $I$. For $a\in I$ and $h\in \R$ such that $a+h\in I$, there exists some $\theta\in (0,1)$ such that \begin{equation}\label{ttlr} f(a+h) = f(a) + hf'(a) + \frac {h^2}2 f''(a+\theta h). \end{equation}

This can be considered to be a second-order Mean Value Theorem.

We can see that this implies the $k=2$ case of Taylor's Theorem, since using Proposition 1 we have \begin{align} R_{a,2}(h) &= f(a+h) - [ f(a) + h f'(a) +\frac {h^2}2f''(a)] \nonumber \\ &= \frac {h^2}2 [ f''(a+\theta h) - f''(a)]. \nonumber \end{align} Thus $$ \frac{R_{a,2}(h)}{h^2} = \frac 12[ f''(a+\theta h) - f''(a)] $$ which tends to $0$ as $h\to 0$, since $f''$ is continuous by assumption.
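One can also locate the intermediate point $\theta$ of Proposition 1 numerically for a concrete function. The sketch below (with the illustrative choices $f = \exp$, $a = 0$, $h = 1$) solves equation \eqref{ttlr} for $\theta$ by bisection:

```python
import math

# Proposition 1 for f = exp, a = 0, h = 1 reads:
#   e = 1 + 1 + (1/2) * exp(theta)   for some theta in (0, 1).
def residual(theta):
    return 2.0 + 0.5 * math.exp(theta) - math.e

# residual(0) < 0 < residual(1) and residual is increasing, so bisect:
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if residual(mid) < 0:
        lo = mid
    else:
        hi = mid
theta = 0.5 * (lo + hi)   # exact value: log(2*(e - 2)), roughly 0.362
```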

The proof of Proposition 1 is harder.

Proof of Proposition 1

Step 1. First, to make things as easy as possible, let's assume that $f'(a)=0$ and that $h$ is a point such that $f(a+h)= f(a)$. (We will consider the general case later.) Then the statement we want to prove, see \eqref{ttlr}, reduces to the following: \begin{equation}\label{sc1} \mbox{ if }f(a) = f'(a) = f(a+h) = 0, \ \ \mbox{ then } \ \exists \theta\in (0,1) \mbox{ such that }f''(a+\theta h) = 0. \end{equation}

We will establish this using Rolle's Theorem, which we recall is a special case of the $1$-d Mean Value Theorem. It implies that if $g$ is differentiable on an interval $(c,d)$, and if both $a$ and $a+h$ are points in $(c,d)$ such that $g(a)=g(a+ h)$, then there exists $\alpha\in (0,1)$ such that $g'(a+\alpha h)=0$.

Now, since $f(a) = f(a+h)$, Rolle's Theorem implies that there is some $\theta_1\in (0,1)$ such that $f'(a+\theta_1 h) = 0$.

Next, note that $f'(a) = 0$ by hypothesis, and we have just shown that $f'(a+\theta_1h)= 0$. So we can apply Rolle's Theorem again, this time to $f'$, and with $\theta_1 h$ in place of $h$, to find that there exists some $\theta_2\in (0,1)$ such that $f''(a+ \theta_2\theta_1 h)= 0$. If we define $\theta = \theta_2\theta_1$, then this is exactly \eqref{sc1}. So we have finished Step 1.

Step 2: the general case Now given $f$ of class $C^2$ in $I$ and points $a$ and $a+h \in I$, we want to modify $f$ to reduce to the special case from Step 1. We start by defining $$ g_1(x) = f(x) - f(a) - (x-a) f'(a). $$ Then $g_1(a) = g_1'(a) = 0$, but $g_1(a+h) \ne 0$ in general. To fix this, we define $$ g_2(x) = g_1(x) - \left(\frac {x-a}h\right)^2 g_1(a+h). $$ Then $$ g_2(a) = g_1(a)= 0, \quad g'_2(a) = g'_1(a)= 0, \quad g_2(a+h) = 0. $$ (It is also easy to see that $g_2$ is $C^2$.) Then by applying Step 1 to $g_2$, we find that there exists some $\theta\in (0,1)$ such that $g_2''(a+\theta h) = 0$. If you work out what this means in terms of $f$, it says exactly that \eqref{ttlr} holds. $\quad \Box$
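The Step 2 construction is easy to sanity-check numerically. Below is a sketch with the illustrative choices $f = \cos$, $a = 0$, $h = 1$; it verifies $g_2(a) = g_2'(a) = g_2(a+h) = 0$, estimating the derivative with a central difference:

```python
import math

# Step 2 construction for the illustrative choices f = cos, a = 0, h = 1.
a, h = 0.0, 1.0
f = math.cos

def df(x):           # f'
    return -math.sin(x)

def g1(x):
    # subtract the linear Taylor part, so g1(a) = g1'(a) = 0
    return f(x) - f(a) - (x - a) * df(a)

def g2(x):
    # also subtract a quadratic correction, so g2(a + h) = 0 as well
    return g1(x) - ((x - a) / h) ** 2 * g1(a + h)

eps = 1e-6
g2_prime_at_a = (g2(a + eps) - g2(a - eps)) / (2 * eps)  # central difference
checks = (abs(g2(a)), abs(g2_prime_at_a), abs(g2(a + h)))
```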

The general case (very much optional!)

For completeness, we outline the proof of Taylor's Theorem for $k\ge 3$.


First we need the following generalization of Proposition 1.

Proposition 1, general version. Assume that $I\subset \R$ is an open interval and that $f:I\to \R$ is a function of class $C^k$ on $I$. For every $a\in I$ and $h\in \R$ such that $a+h\in I$, there exists $\theta\in (0,1)$ such that $$ f(a+h) = f(a) + h f'(a) + \frac {h^2}{2}f''(a) + \cdots + \frac {h^{k-1}}{(k-1)!} f^{(k-1)}(a) + \frac {h^k}{k!} f^{(k)}(a+\theta h). $$

Once this is known, it follows that $$ \frac 1{h^k}R_{a,k}(h) = \frac 1{k!}[ f^{(k)}(a+\theta h) - f^{(k)}(a)], $$ and the right-hand side tends to $0$ as $h\to 0$, since $f^{(k)}$ is assumed to be continuous. The proof of this general version of Proposition 1 is similar in spirit to the basic version, but more complicated. It too can be broken into two steps.

Step 1. The special case when $f(a) = f'(a) = \cdots = f^{(k-1)}(a) = f(a+h)=0$. Then one must show the existence of some $\theta\in (0,1)$ such that $f^{(k)}(a+\theta h)=0$.

Step 2. Reduction of the general case to Step 1. One does this by defining $$ g_1(x) = f(x) - [f(a) + (x-a)f'(a) + \cdots + \frac{(x-a)^{k-1}}{(k-1)!}f^{(k-1)}(a)], $$ (note that this is exactly $R_{a,k-1}(x-a)$, not a coincidence) and $$ g_2(x) = g_1(x) - \left(\frac {x-a} h\right)^k g_1(a+h). $$ This ends up implying the conclusion. Check the details if you like!

Taylor's Theorem in higher dimensions

Now assume that $S\subset \R^n$ is an open set and that $f:S\to \R$ is a function of class $C^k$ on $S$.

For a point $\bfa\in S$, the $k$th order Taylor polynomial of $f$ at $\bfa$ is the unique polynomial of order at most $k$, denoted $P_{\bfa,k}(\bfh)$, such that \begin{align}\label{tkRn} f(\bfa) &= P_{\bfa,k}({\bf 0}) \\ \partial^\alpha f(\bfa) &= \partial^\alpha P_{\bfa,k}({\bf 0})\ \ \ \mbox{ for all partial derivatives of order up to }k.\nonumber \end{align}

One of the main difficulties with the theory is just the notation needed to write down explicit formulas for $P_{\bfa,k}(\bfh)$. For $k\ge 3$, it gets either very complicated, or else simple but incomprehensible. For this reason we will focus below on the case $k=2$ (quadratic Taylor polynomials), which is both the most important and the simplest case. But first we will state the general result, which guarantees that $P_{\bfa,k}(\bfh)$ is a very good approximation of $f(\bfa+\bfh)$, and that the quality of the approximation increases as $k$ increases.

Theorem 2: Taylor's Theorem in $n$ dimensions. Assume that $S\subset \R^n$ is an open set and that $f:S\to \R$ is a function of class $C^k$ on $S$. For $\bfa\in S$ and $\bfh\in \R^n$ such that $\bfa+\bfh\in S$, let $P_{\bfa,k}(\bfh)$ denote the $k$th-order Taylor polynomial at $\bfa$, and define the remainder $$ R_{\bfa,k}(\bfh) := f(\bfa+\bfh) - P_{\bfa,k}(\bfh). $$ Then $$ \lim_{\bfh\to {\bf 0}}\frac{R_{\bfa,k}(\bfh)}{|\bfh|^k} = 0. $$

We will not present the proof, but the main ideas will be outlined below in the case $k=2$.

The quadratic case

A formula for $P_{\bfa, 2}(\bfh)$.

According to the definition we have given, the 2nd order Taylor polynomial $P_{\bfa, 2}(\bfh)$ of $f$ at $\bfa$ is the quadratic polynomial such that $f(\bfa) = P_{\bfa, 2}({\bf 0})$, and such that all first and second partial derivatives of $P_{\bfa, 2}(\bfh)$ at $\bf h = 0$ coincide with the first and second partial derivatives of $f$ at $\bfa$.

Proposition 2. \begin{align} P_{\bfa, 2}(\bfh) &= f(\bfa) + \sum_{i=1}^n h_i \partial_i f(\bfa) +\frac 12 \sum_{i,j=1}^n h_i h_j \partial_i\partial_j f(\bfa) \nonumber \\ & = f(\bfa) + \nabla f(\bfa)\cdot \bfh + \frac 12 (H(\bfa) \bfh)\cdot \bfh\label{Pa2} \end{align} where $H(\bfa)$ denotes the matrix of second derivatives of $f$ at $\bfa$: $$ H(\bfa) := n\times n \mbox{ matrix whose }(i,j)\mbox{ entry is } \partial_i\partial_j f(\bfa). $$
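To see the formula of Proposition 2 in action on a concrete example, take the illustrative choice $f(x,y) = e^x\cos y$ at $\bfa = (0,0)$, whose gradient and Hessian below are computed by hand:

```python
import math

def f(x, y):
    return math.exp(x) * math.cos(y)

# Hand-computed data at a = (0, 0):
fa = 1.0
grad = (1.0, 0.0)                  # (e^x cos y, -e^x sin y) at (0, 0)
H = ((1.0, 0.0), (0.0, -1.0))      # Hessian of f at (0, 0)

def P2(h1, h2):
    hh = (h1, h2)
    quad = sum(H[i][j] * hh[i] * hh[j] for i in range(2) for j in range(2))
    return fa + grad[0] * h1 + grad[1] * h2 + 0.5 * quad

err = abs(f(0.1, 0.1) - P2(0.1, 0.1))   # third-order small in |h|
```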

The proof is an exercise; here is how to get started.

This can be proved by considering a general quadratic polynomial $q$ in the variable $\bfh$. Any such polynomial can be written in the form $$ q(\bfh) = \frac 12 (A \bfh)\cdot \bfh + \bfb \cdot \bfh + c, $$ where $A$ is a symmetric $n\times n$ matrix with entries $(a_{ij})$, $\bfb\in \R^n$, and $c\in \R$. One can then differentiate $q$ and see what conditions the coefficients $A,\bfb, c$ must satisfy in order for all derivatives of order up to $2$ at $\bfh = {\bf 0}$ to agree with the corresponding derivatives of $f$ at $\bfa$. (See exercises.)
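The coefficient-matching argument can itself be checked numerically: for $q(\bfh) = \frac 12 (A\bfh)\cdot\bfh + \bfb\cdot\bfh + c$ with $A$ symmetric, finite differences recover $\nabla q({\bf 0}) = \bfb$ and Hessian $A$. A sketch with sample values of $A$, $\bfb$, $c$ (chosen arbitrarily):

```python
# q(h) = (1/2)(A h)·h + b·h + c with A symmetric; check that the first and
# second derivatives of q at 0 recover b and A (sample values below).
A = ((2.0, 1.0), (1.0, 3.0))
b = (0.5, -1.0)
c = 4.0

def q(h1, h2):
    hh = (h1, h2)
    quad = sum(A[i][j] * hh[i] * hh[j] for i in range(2) for j in range(2))
    return 0.5 * quad + b[0] * h1 + b[1] * h2 + c

eps = 1e-4
dq1 = (q(eps, 0) - q(-eps, 0)) / (2 * eps)             # -> b[0]
d11 = (q(eps, 0) - 2 * q(0, 0) + q(-eps, 0)) / eps**2  # -> A[0][0]
d12 = (q(eps, eps) - q(eps, -eps) - q(-eps, eps)
       + q(-eps, -eps)) / (4 * eps**2)                 # -> A[0][1]
```

Since $q$ is exactly quadratic, the central differences are exact up to rounding error.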

In view of its importance, we restate Taylor's Theorem in the case $k=2$ (including some extra details that we did not mention in the general case).

Theorem 3: The quadratic case of Taylor's Theorem. Assume that $S\subset \R^n$ is an open set and that $f:S\to \R$ is a function of class $C^2$ on $S$.
Then for $\bfa\in S$ and $\bfh\in \R^n$ such that the line segment connecting $\bfa$ and $\bfa+\bfh$ is contained in $S$, there exists $\theta\in (0,1)$ such that \begin{equation}\label{ttlr2} f(\bfa+\bfh) = f(\bfa) + \nabla f(\bfa)\cdot \bfh + \frac 12 (H(\bfa+\theta\bfh) \bfh)\cdot \bfh. \end{equation} As a result (see exercises), \begin{equation}\label{tt2} \lim_{\bfh\to {\bf 0}}\frac{R_{\bfa,2}(\bfh)}{|\bfh|^2} = 0, \quad \mbox{ for }R_{\bfa,2}(\bfh) = f(\bfa+\bfh) - P_{\bfa,2}(\bfh) \end{equation} where the formula for $P_{\bfa,2}(\bfh)$ is given in \eqref{Pa2}.
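The limit \eqref{tt2} can be observed numerically. In the sketch below (illustrative choice $f(x,y) = e^x\cos y$ at $\bfa = (0,0)$, with $P_{\bfa,2}$ written out by hand), the ratio $R_{\bfa,2}(\bfh)/|\bfh|^2$ shrinks as $\bfh = t(1,1)$ scales down:

```python
import math

def f(x, y):
    return math.exp(x) * math.cos(y)

def P2(h1, h2):
    # quadratic Taylor polynomial at a = (0, 0): f(a) = 1, grad = (1, 0),
    # Hessian = [[1, 0], [0, -1]], all computed by hand
    return 1.0 + h1 + 0.5 * (h1**2 - h2**2)

ratios = []
for t in [0.1, 0.01, 0.001]:
    R = f(t, t) - P2(t, t)                 # remainder at h = t*(1, 1)
    ratios.append(abs(R) / (2 * t**2))     # |h|^2 = 2 t^2
```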

Sketch of the proof

The idea of the proof is this: define $\phi(s) := f(\bfa+s\bfh)$ for $s\in [0,1]$, apply Proposition 1 to $\phi$, and then use the chain rule to express $\phi'$ and $\phi''$ in terms of the gradient and Hessian of $f$.

See the exercises for the details.

More about the general case (optional)

For completeness, we state the formula for the $k$th order Taylor polynomial, for arbitrary $k\in \N$.

First, recall that in Section 2.5 we introduced the notation \begin{equation}\label{mix} \partial^\alpha f = \left(\frac{\partial}{\partial x_1 }\right)^{\alpha_1}\left(\frac{\partial}{\partial x_{2} }\right)^{\alpha_{2}}\cdots \left(\frac{\partial}{\partial x_n }\right)^{\alpha_n}f, \end{equation} where $\alpha$ is a multi-index; that is, $\alpha$ has the form $(\alpha_1,\ldots, \alpha_n)$, where each $\alpha_j$ is a nonnegative integer. For such a multi-index, we will also use the notation $$ \alpha! = \alpha_1!\alpha_2! \cdots \alpha_n!\ , \qquad \bfh^\alpha := h_1^{\alpha_1}h_2^{\alpha_2}\cdots h_n^{\alpha_n}. $$ With this notation, the Taylor polynomial is given by the formula \begin{equation}\label{tkRn2} P_{\bfa, k}(\bfh) = \sum_{\{\alpha : |\alpha|\le k\}} \frac {\bfh^\alpha}{\alpha!} {\partial^\alpha f(\bfa)}. \end{equation} Recall that $|\alpha| = \alpha_1+\cdots + \alpha_n$ is the order of the multi-index $\alpha$. Thus the formula involves all derivatives of order up to $k$ (including $\alpha = (0,\ldots, 0)$).
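Formula \eqref{tkRn2} can be evaluated directly by enumerating multi-indices. The sketch below uses the illustrative choice $f(x,y) = e^{x+2y}$ at $\bfa = (0,0)$, for which $\partial^\alpha f(\bfa) = 2^{\alpha_2}$ exactly:

```python
import math
from itertools import product

def taylor_multiindex(k, h1, h2):
    """P_{a,k}(h) for f(x, y) = exp(x + 2y) at a = (0, 0), via multi-indices.
    Here d^alpha f(0, 0) = 2**a2 for alpha = (a1, a2)."""
    total = 0.0
    for a1, a2 in product(range(k + 1), repeat=2):
        if a1 + a2 <= k:                              # |alpha| <= k
            total += (h1**a1 * h2**a2 * 2.0**a2
                      / (math.factorial(a1) * math.factorial(a2)))
    return total

approx = taylor_multiindex(4, 0.1, 0.1)
exact = math.exp(0.1 + 2 * 0.1)
err = abs(exact - approx)    # remainder is O(|h|^5)
```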

As in the quadratic case, the idea of the proof of Taylor's Theorem is:

  1. Define $\phi(s) = f(\bfa+s\bfh)$.

  2. Apply the $1$-dimensional Taylor's Theorem (or formula \eqref{ttlr}) to $\phi$.

  3. Use the chain rule and induction (for example) to express the resulting facts about $\phi$ in terms of $f$. This is the hard part of the proof and involves showing that $$ \phi^{(j)}(s) = \sum_{\{\alpha : |\alpha| = j\}} \frac{j!}{\alpha!} \bfh^\alpha\, \partial^\alpha f(\bfa+s\bfh) \quad\mbox{ for every }j\le k. $$
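The chain-rule identity in step 3 can be spot-checked for $j = 2$: with the illustrative choice $f(x,y) = e^{x+2y}$, every $\partial^\alpha f = 2^{\alpha_2} f$, so the right-hand side is easy to write down and can be compared with a finite-difference estimate of $\phi''(s)$:

```python
import math

# phi(s) = f(a + s h) for f(x, y) = exp(x + 2y), a = (0, 0), h = (0.1, 0.2),
# so phi(s) = exp(s * (h1 + 2*h2)).
h1, h2, s = 0.1, 0.2, 0.5

def phi(t):
    return math.exp(t * (h1 + 2 * h2))

# Finite-difference estimate of phi''(s):
eps = 1e-4
phi2_fd = (phi(s + eps) - 2 * phi(s) + phi(s - eps)) / eps**2

# Chain-rule formula: sum over |alpha| = 2 of (2!/alpha!) h^alpha d^alpha f,
# where d^alpha f = 2**a2 * f.  Terms: alpha = (2,0), (1,1), (0,2).
phi2_formula = (1 * h1**2 * 1 + 2 * h1 * h2 * 2 + 1 * h2**2 * 4) * phi(s)
```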

Problems

More questions may be added later.

Basic skills

Compute the second-order Taylor polynomial $P_{\bfa, 2}$ of the function $f = \ldots$ at the point $\bfa = \ldots$.
This question is just asking you to use the formula \eqref{Pa2}.

For example:

  1. Compute the second-order Taylor polynomial of $f(x,y) = \frac{xy +y^2}{1+\cos^2 x}$ at $\bfa = (0,2)$.

  2. Compute the second-order Taylor polynomial of $f(x,y,z) = xy^2e^{z^2}$ at the point $\bfa = (1,1,1)$.

More advanced questions

(These questions are actually not very advanced, but neither do they involve new skills, introduced in this section, that you have to master.)

  1. Complete the proof of Proposition 2.

  2. Complete the proof of Theorem 3 by proving formula \eqref{ttlr2}.

  3. Use \eqref{ttlr2} to prove \eqref{tt2}.

  4. Prove that the formula in \eqref{tkRn2} satisfies \eqref{tkRn}. (very optional!)
    If you do this, since you have to differentiate $P_{\bfa, k}$ many times, you may find it psychologically easier to write it as a function of $\bfx$ rather than $\bfh$. This is just a notational change. (When differentiating a function of $\bfh$, we have to interpret $$ \partial^\alpha = \left(\frac{\partial}{\partial h_1 }\right)^{\alpha_1}\left(\frac{\partial}{\partial h_{2} }\right)^{\alpha_{2}}\cdots \left(\frac{\partial}{\partial h_n }\right)^{\alpha_n}, $$ whereas for a function of $\bfx$, $\partial^\alpha$ is understood exactly as in \eqref{mix}.) The key point is that if $\alpha = (\alpha_1,\ldots, \alpha_n)$ and $\beta = (\beta_1,\ldots, \beta_n)$ are multi-indices, then $$ \mbox{ for }g^\beta(\bfx) := \bfx^\beta,\quad \partial^\alpha g^\beta({\bf 0}) = \begin{cases} \alpha! &\mbox{ if }\alpha = \beta \\ 0&\mbox{ if not. } \end{cases} $$
