\(\renewcommand{\R}{\mathbb R }\)
Recall from MAT 137 that the one-dimensional Taylor polynomial gives us a way to approximate a \(C^k\) function with a polynomial.
Since the \(j\)th derivative of a polynomial evaluated at \(0\) gives the \(j\)th coefficient times \(j!\), we can show that \[\begin{align} P_{a,k}(x) &= f(a) + f'(a)x + f''(a)\frac {x^2}{2} + \cdots + f^{(k)}(a)\frac {x^k}{k!} \label{ttr1}\\ &= \sum_{j=0}^k f^{(j)}(a)\frac {x^j}{j!} \nonumber \end{align}\]
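As a sanity check, formula \(\eqref{ttr1}\) can be evaluated symbolically and compared against a series expansion. A minimal sympy sketch, where the function, the point \(a=0\), and the order \(k=4\) are arbitrary illustrative choices:

```python
import sympy as sp

x, h = sp.symbols('x h')
f = sp.exp(x) * sp.cos(x)    # sample function; any smooth f works
a, k = 0, 4                  # expansion point and order (illustrative choices)

# P_{a,k}(h) = sum_{j <= k} f^(j)(a) h^j / j!
P = sum(sp.diff(f, x, j).subs(x, a) * h**j / sp.factorial(j)
        for j in range(k + 1))

# sympy's own series expansion agrees with this formula
series_poly = sp.series(f, x, a, k + 1).removeO().subs(x, a + h)
assert sp.expand(P - sp.expand(series_poly)) == 0
```

Here both approaches yield \(1 + h - h^3/3 - h^4/6\), since \(f''(0)=0\) for this particular function.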
Taylor’s Theorem guarantees that \(P_{a,k}(h)\) is a very good approximation of \(f(a+h)\) for small \(h\), and that the quality of the approximation increases as \(k\) increases.
When \(k=1\), we have \(P_{a,1}(x)=f(a)+f'(a)x\), and so \[R_{a,1}(h)=f(a+h)-f(a)-f'(a)h.\] Our alternative definition of the derivative tells us that \(\displaystyle{\lim_{h\to 0}\frac{R_{a,1}(h)}{h}} = 0.\) Next, we will show that this extends to higher values of \(k\). Then we will generalize Taylor polynomials to give approximations of multivariable functions, provided their partial derivatives all exist and are continuous up to some order.
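Numerically, this decay of \(R_{a,1}(h)/h\) is easy to observe. A quick sketch, where the function \(\sin\) and the point \(a=0.3\) are arbitrary illustrative choices:

```python
import math

f, fprime = math.sin, math.cos
a = 0.3  # arbitrary expansion point

def ratio(h):
    """R_{a,1}(h) / h, which should tend to 0 as h -> 0."""
    return (f(a + h) - f(a) - fprime(a) * h) / h

for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, ratio(h))  # the ratio shrinks roughly in proportion to h
```

The observed ratios shrink linearly in \(h\), consistent with the error term being of size \(h^2\).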
In this case, Taylor’s Theorem relies on the following lemma (Lemma 2): if \(f\) is \(C^2\) on an interval containing \(a\) and \(a+h\), then there exists \(\theta\in (0,1)\) such that \[\begin{equation}\label{ttlr} f(a+h) = f(a) + h f'(a) + \frac{h^2}{2} f''(a+\theta h). \end{equation}\]
This can be considered a second-order Mean Value Theorem.
This lemma implies the \(k=2\) case of Taylor’s Theorem, since we have \[\begin{align*} R_{a,2}(h) &= f(a+h) - \left[ f(a) + h f'(a) +\frac {h^2}2f''(a)\right] \\ &= \frac {h^2}2 \left[ f''(a+\theta h) - f''(a)\right]. \end{align*}\] Thus \[ \frac{R_{a,2}(h)}{h^2} = \frac 12\left[ f''(a+\theta h) - f''(a)\right] \] which tends to \(0\) as \(h\to 0\), since \(f''\) is continuous by assumption.
The proof of Lemma 2 is harder.
First, to make things as easy as possible, let’s suppose that \(f'(a)=0\) and that \(h\) is such that \(f(a+h)= f(a)\); replacing \(f\) by \(f-f(a)\) if necessary, we may also assume that \(f(a)=0\). We will consider the general case later. Then the statement we want to prove, see \(\eqref{ttlr}\), reduces to the following: \[\begin{equation}\label{sc1} \text{ if }f(a) = f'(a) = f(a+h) = 0, \ \ \text{ then } \ \exists \theta\in (0,1) \text{ such that }f''(a+\theta h) = 0. \end{equation}\]
We will establish this using Rolle’s Theorem, which we recall is a special case of the single variable Mean Value Theorem. It implies that if \(g\) is differentiable on an interval \((c,d)\), and if both \(a\) and \(a+h\) are points in \((c,d)\) such that \(g(a)=g(a+ h)\), then there exists \(\alpha\in (0,1)\) such that \(g'(a+\alpha h)=0\).
Now, since \(f(a) = f(a+h)\), Rolle’s Theorem implies that there is some \(\theta_1\in (0,1)\) such that \(f'(a+\theta_1 h) = 0\).
Next, note that \(f'(a) = 0\) by hypothesis, and we have just shown that \(f'(a+\theta_1h)= 0\). So we can apply Rolle’s Theorem again, this time to \(f'\), and with \(\theta_1 h\) in place of \(h\), to find that there exists some \(\theta_2\in (0,1)\) such that \(f''(a+ \theta_2\theta_1 h)= 0\). If we define \(\theta = \theta_2\theta_1\), then this is exactly \(\eqref{sc1}\). So we have finished Step 1.
For completeness, we outline the proof of Taylor’s Theorem for \(k\ge 3\).
First we need the following generalization of Lemma 2: if \(f\) is \(C^k\) on an interval containing \(a\) and \(a+h\), then there exists \(\theta\in (0,1)\) such that \[ f(a+h) = \sum_{j=0}^{k-1} f^{(j)}(a)\frac{h^j}{j!} + \frac{h^k}{k!}\, f^{(k)}(a+\theta h). \]
Once this is known, it follows that \[ \frac 1{h^k}R_{a,k}(h) = \frac 1{k!}\left[ f^{(k)}(a+\theta h) - f^{(k)}(a)\right], \] and the right-hand side tends to \(0\) as \(h\to 0\), since \(f^{(k)}\) is continuous. The proof of this lemma is similar in spirit to the basic version, but more complicated. It too can be broken into \(2\) steps:
The special case when \(f(a) = f'(a) = \cdots = f^{(k-1)}(a) = f(a+h)=0\). Then we must show the existence of some \(\theta\in (0,1)\) such that \(f^{(k)}(a+\theta h)=0\).
One of the main difficulties with the theory is just the notation needed to write down explicit formulas for \(P_{\mathbf a,k}(\mathbf h)\). They will require \(\binom {n+k}k\) terms. For example, with \(n=2\) and \(k=3\), there are \(10\) terms (the value at the point, \(2\) first derivatives, \(3\) second derivatives, and \(4\) third derivatives). This makes the notation either very complicated, or simple but incomprehensible. For this reason we will focus on the case of quadratic Taylor polynomials, \(k=2\), which is the most important after linear approximation, and the simplest. First we will state the general result, which guarantees that \(P_{\mathbf a,k}(\mathbf h)\) is a very good approximation of \(f(\mathbf a+\mathbf h)\), and that the quality of the approximation increases as \(k\) increases.
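The term count \(\binom{n+k}{k}\) quoted above is easy to verify:

```python
from math import comb

# number of terms in P_{a,k} for a function of n variables
def taylor_terms(n, k):
    return comb(n + k, k)

assert taylor_terms(2, 3) == 10   # value + 2 first + 3 second + 4 third derivatives
assert taylor_terms(3, 5) == 56   # the n=3, k=5 count that appears later in this section
```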
The main ideas are outlined in the case \(k=2\).
According to the definition we have given, the second order Taylor polynomial \(P_{\mathbf a, 2}(\mathbf h)\) of \(f\) at \(\mathbf a\) is the quadratic polynomial such that \(f(\mathbf a) = P_{\mathbf a, 2}({\bf 0})\), and all first and second partial derivatives of \(P_{\mathbf a, 2}(\mathbf h)\) at \(\bf h = 0\) equal the first and second partial derivatives of \(f\) at \(\mathbf a\). Note that \(\mathbf h\) represents a “small” vector, that is, \(\mathbf h=\mathbf x-\mathbf a\) for some \(\mathbf x\) near \(\mathbf a\). We can write these terms in condensed notation using the gradient and the Hessian matrix \(H(\mathbf a)\) of \(f\) at \(\mathbf a\): \[ P_{\mathbf a, 2}(\mathbf h) = f(\mathbf a) + \nabla f(\mathbf a)\cdot \mathbf h + \frac 12\, \mathbf h^T H(\mathbf a)\, \mathbf h. \]
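This gradient–Hessian form, \(f(\mathbf a)+\nabla f(\mathbf a)\cdot\mathbf h+\tfrac12\,\mathbf h^{T}H(\mathbf a)\,\mathbf h\), can be computed symbolically. A sketch with sympy, where the function \(e^x\sin y\) and the expansion point are arbitrary illustrative choices:

```python
import sympy as sp

x, y, h1, h2 = sp.symbols('x y h1 h2')
f = sp.exp(x) * sp.sin(y)    # sample function (illustrative choice)
a = {x: 0, y: sp.pi / 2}     # expansion point (illustrative choice)

h = sp.Matrix([h1, h2])
grad = sp.Matrix([f.diff(x), f.diff(y)]).subs(a)  # gradient at a
H = sp.hessian(f, (x, y)).subs(a)                 # Hessian matrix at a

# P_{a,2}(h) = f(a) + grad f(a) . h + (1/2) h^T H(a) h
P2 = f.subs(a) + (grad.T * h)[0] + sp.Rational(1, 2) * (h.T * H * h)[0]
print(sp.expand(P2))
```

For this choice of \(f\) and \(\mathbf a\), the gradient is \((1,0)\) and the Hessian is \(\mathrm{diag}(1,-1)\), so the sketch prints \(1 + h_1 + h_1^2/2 - h_2^2/2\).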
The proof is exercise 5.
In view of its importance, we restate Taylor’s Theorem in the case \(k=2\) (including some extra details that we did not mention in the general case).
Sketch of the proof
For completeness, we state the formula for the \(k\)th order Taylor polynomial, for arbitrary \(k\in \mathbb N\).
First, recall that in Section 2.5 we introduced the notation \[\begin{equation}\label{mix} \partial^\alpha f = \left(\frac{\partial}{\partial x_1 }\right)^{\alpha_1}\left(\frac{\partial}{\partial x_{2} }\right)^{\alpha_{2}}\cdots \left(\frac{\partial}{\partial x_n }\right)^{\alpha_n}f, \end{equation}\] where \(\alpha\) is a multi-index; that is, \(\alpha\) has the form \((\alpha_1,\ldots, \alpha_n)\), where each \(\alpha_j\) is a nonnegative integer. For such a multi-index, we will also use the notation \[ \alpha! = \alpha_1!\alpha_2! \cdots \alpha_n!\ , \qquad \mathbf h^\alpha = h_1^{\alpha_1}h_2^{\alpha_2}\ldots h_n^{\alpha_n}. \] With this notation, the Taylor polynomial of order \(k\) has the formula \[\begin{equation}\label{tkRn2} P_{\mathbf a, k}(\mathbf h) = \sum_{\{\alpha : |\alpha|\le k\}} \frac {\mathbf h^\alpha}{\alpha!} {\partial^\alpha f(\mathbf a)}. \end{equation}\] Recall that \(|\alpha| = \alpha_1+\ldots + \alpha_n\) is the order of the multi-index \(\alpha\). Thus the formula involves all derivatives of order up to \(k\), including the value at the point, when \(\alpha = (0,\ldots, 0)\).
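Formula \(\eqref{tkRn2}\) translates directly into code. A sketch with sympy for \(n=2\), where the function \(e^{x+y^2}\), the point \(\mathbf a = \mathbf 0\), and the order \(k=3\) are illustrative choices:

```python
import sympy as sp
from itertools import product

x, y, h1, h2 = sp.symbols('x y h1 h2')
f = sp.exp(x + y**2)   # sample function (illustrative choice)
a = {x: 0, y: 0}       # expansion point (illustrative choice)
k = 3                  # order (illustrative choice)

# P_{a,k}(h) = sum over multi-indices alpha with |alpha| <= k of
#              (h^alpha / alpha!) * d^alpha f(a)
P = sp.Integer(0)
for alpha in product(range(k + 1), repeat=2):
    if sum(alpha) > k:
        continue
    d = f
    for var, order in zip((x, y), alpha):
        d = sp.diff(d, var, order)          # d^alpha f
    coeff = (h1**alpha[0] * h2**alpha[1]
             / (sp.factorial(alpha[0]) * sp.factorial(alpha[1])))
    P += coeff * d.subs(a)
print(sp.expand(P))
```

For this \(f\), the degree-\(\le 3\) part of \(e^{h_1}e^{h_2^2}\) is \(1 + h_1 + h_1^2/2 + h_1^3/6 + h_2^2 + h_1 h_2^2\), which is what the loop produces.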
As in the quadratic case, the idea of the proof of Taylor’s Theorem is
Define \(\phi(s) = f(\mathbf a+s\mathbf h)\).
Apply the \(1\)-dimensional Taylor’s Theorem or formula \(\eqref{ttlr}\) to \(\phi\).
Use the chain rule and induction to express the resulting facts about \(\phi\) in terms of \(f\). This is the hard part of the proof and involves showing that \[ \phi^{(j)}(s) = \sum_{\{\alpha : |\alpha| = j\}} \frac{j!}{\alpha!}\, \mathbf h^\alpha\, \partial^\alpha f(\mathbf a+s\mathbf h) \quad\text{ for every }j\le k. \]
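The \(j=2\) case of this chain-rule identity can be checked symbolically for a sample function. A sketch with sympy, where \(f(x,y)=\sin(xy)+x^3\) is an arbitrary illustrative choice:

```python
import sympy as sp

x, y, s, h1, h2, a1, a2 = sp.symbols('x y s h1 h2 a1 a2')
f = sp.sin(x * y) + x**3   # sample function (illustrative choice)
sub = {x: a1 + s * h1, y: a2 + s * h2}

# phi(s) = f(a + s h)
phi = f.subs(sub)

# identity for j = 2: phi''(s) = sum_{|alpha|=2} (2!/alpha!) h^alpha d^alpha f(a+sh),
# i.e. h1^2 f_xx + 2 h1 h2 f_xy + h2^2 f_yy, all evaluated at a + s h
lhs = sp.diff(phi, s, 2)
rhs = (h1**2 * sp.diff(f, x, 2).subs(sub)
       + 2 * h1 * h2 * sp.diff(f, x, 1, y, 1).subs(sub)
       + h2**2 * sp.diff(f, y, 2).subs(sub))
assert sp.simplify(lhs - rhs) == 0
```

The multinomial coefficients \(2!/\alpha!\) are \(1, 2, 1\) for \(\alpha = (2,0), (1,1), (0,2)\), which is where the factor of \(2\) on the mixed term comes from.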
While the \(k\)th order Taylor polynomial can always be computed using formula \(\eqref{tkRn2}\), in practice it is a laborious task and you will not be asked to use this formula for any \(k>2\). Instead, there is a trick to computing higher order Taylor polynomials that avoids having to compute all the partial derivatives \(\partial^\alpha f(\mathbf a)\) appearing in \(\eqref{tkRn2}\).
Recall from definition \(\eqref{tkRn}\) and the following theorem that the Taylor Polynomial \(P_{\mathbf a,k}(\mathbf h)\) is a polynomial of degree at most \(k\) such that \[f(\mathbf a+\mathbf h)=P_{\mathbf a, k}(\mathbf h) + R_{\mathbf a,k}(\mathbf h)\] where \[\lim_{\mathbf h\to\mathbf 0} \frac{R_{\mathbf a,k}(\mathbf h)}{|\mathbf h|^k}=0.\]
It can be shown that if \(Q(\mathbf h)\) is any polynomial of degree \(k\) or lower such that \(f(\mathbf a+\mathbf h)=Q(\mathbf h) + R(\mathbf h)\) and \(\lim_{\mathbf h\to\mathbf 0} R(\mathbf h)/|\mathbf h|^k=0\), then \(Q(\mathbf h)=P_{\mathbf a,k}(\mathbf h)\). In other words, there is a unique best \(k\)th order approximation to \(f\), which is why we have been writing the Taylor polynomial instead of a Taylor polynomial. Instead of using the standard formula and computing all \(k\)th order and lower partial derivatives, we can look for any polynomial \(Q(\mathbf h)\) that satisfies these properties. This is only useful if we have a good idea for a guess, which we will get by using our knowledge of one variable Taylor polynomials. When a multivariable function is built out of simpler one-variable functions, we can manipulate the one variable Taylor polynomials as demonstrated in the example below.
Example 1. Compute the fifth-order Taylor polynomial \(P_{\mathbf 0, 5}\) of \(f(x,y,z) = e^{x^2 - yz^2}\cos\left(xz + y^2\right)\) at \(\mathbf a = \mathbf 0\).

Solution
To do this, recall the Taylor expansions \[
e^s=1+s+\frac{s^2}{2}+\frac{s^3}{3!}+\frac{s^4}{4!}+\frac{s^5}{5!}+\cdots
\] and \[
\cos(t)=1-\frac{t^2}{2!}+\frac{t^4}{4!}+\cdots.
\]
We want to compute \[ f(\mathbf a+\mathbf h)=f(h_1,h_2,h_3)=e^{h_1^2-h_2h_3^2} \cos\left(h_1h_3+h_2^2 \right) \] so let’s substitute \(s=h_1^2-h_2h_3^2\) into the Taylor expansion for \(e^s\) and \(t=h_1h_3 + h_2^2\) into the Taylor expansion for \(\cos(t)\) and only keep track of terms which have total degree in the \(h_i\)’s of 5 or lower: \[ e^{h_1^2-h_2h_3^2}= 1+\left(h_1^2-h_2h_3^2\right) +\frac{\left(h_1^2-h_2h_3^2\right)^2}{2} +\cdots = 1+h_1^2-h_2h_3^2+\frac{h_1^4}{2}-h_1^2h_2h_3^2+\cdots \] \[ \cos(h_1h_3+h_2^2)= 1-\frac{\left(h_1h_3+h_2^2\right)^2}{2}+\cdots = 1-\frac{h_1^2h_3^2+h_2^4}{2}-h_1h_3h_2^2+\cdots \]
Multiplying these and again only keeping track of terms of total degree \(\leq 5\) in the \(h_i\)’s, \[\begin{align*} f(h_1,h_2,h_3)&= \left(1+h_1^2-h_2h_3^2+\frac{h_1^4}{2}-h_1^2h_2h_3^2+\cdots\right) \left(1-\frac{h_1^2h_3^2+h_2^4}{2}-h_1h_3h_2^2+\cdots\right) \\ &=1+h_1^2-h_2h_3^2+\frac{h_1^4}{2}-h_1^2h_2h_3^2 -\frac{h_1^2h_3^2+h_2^4}{2}-h_1h_3h_2^2+\cdots \\ &= Q(\mathbf h)+R(\mathbf h) \end{align*}\] where \[ Q(\mathbf h)=1+h_1^2-h_2h_3^2+\frac{h_1^4}{2}- \frac{h_1^2h_3^2+h_2^4}{2}-h_1h_3h_2^2-h_1^2h_2h_3^2 \] and \(R(\mathbf h)\) contains all the remaining terms of degree 6 or higher we have been ignoring in this computation. Since only degree 6 or higher terms appear in \(R(\mathbf h)\), \[ \lim_{\mathbf h\to{\bf 0}} \frac{R(\mathbf h)}{|\mathbf h|^5} = 0 \] and therefore \[ P_{{\bf0},5}(\mathbf h)=1+h_1^2-h_2h_3^2+\frac{h_1^4}{2}- \frac{h_1^2h_3^2+h_2^4}{2}-h_1h_3h_2^2-h_1^2h_2h_3^2 \] by the uniqueness of Taylor polynomials.

Before deciding that the computation in Example 1 is complicated, try to use formula \(\eqref{tkRn2}\) directly by computing all partial derivatives up to order \(5\) at \(\mathbf 0\). There are \(\binom 85=56\) of them. Is the construction from single variable functions more or less work?
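The substitution computation above can be double-checked symbolically. A sketch with sympy, truncating each one-variable series just far enough that no term of degree \(\le 5\) is lost:

```python
import sympy as sp

h1, h2, h3 = sp.symbols('h1 h2 h3')
s = h1**2 - h2*h3**2      # argument of exp; lowest degree 2
t = h1*h3 + h2**2         # argument of cos; lowest degree 2

# enough one-variable terms: s**3 and t**4 already have degree >= 6
exp_part = sum(s**n / sp.factorial(n) for n in range(3))
cos_part = sum((-1)**n * t**(2*n) / sp.factorial(2*n) for n in range(2))

# multiply, expand, and discard every monomial of total degree > 5
prod = sp.expand(exp_part * cos_part)
P5 = sum(term for term in prod.as_ordered_terms()
         if sp.Poly(term, h1, h2, h3).total_degree() <= 5)

expected = (1 + h1**2 - h2*h3**2 + h1**4/2
            - (h1**2*h3**2 + h2**4)/2 - h1*h2**2*h3 - h1**2*h2*h3**2)
assert sp.expand(P5 - expected) == 0
```

The filtering step plays the same role as the "\(+\cdots\)" bookkeeping in the hand computation: all cross terms between the two series turn out to have degree \(6\) or higher, so only the displayed monomials survive.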
You will be asked to compute the second-order Taylor polynomial \(P_{\mathbf a, 2}\) of a function at a point \(\mathbf a\). Recall that the formula for \(P_{\mathbf a, 2}\) combines the zeroth-order approximation using the value of the function, the first-order correction using the gradient, and the second-order correction using the Hessian, all evaluated at \(\mathbf a\).
For example:
Compute the second-order Taylor polynomial of \(f(x,y) = \frac{xy +y^2}{1+\cos^2 x}\) at \(\mathbf a = (0,2)\).
Compute the second-order Taylor polynomial of \(f(x,y,z) = xy^2e^{z^2}\) at the point \(\mathbf a = (1,1,1)\).
You will also need to compute a higher order Taylor polynomial \(P_{\mathbf a, k}\) of a function at a point. Questions of this type involve using your knowledge of one variable Taylor polynomials to compute a higher order Taylor polynomial.
For example:
Compute the fifth-order Taylor polynomial of \(f(x,y) = \frac{xy +y^2}{1-xy}\) at \(\mathbf a = (0,0)\).
Compute the fourth-order Taylor polynomial of \(f(x,y,z) = xy^2e^{z^2}\) at the point \(\mathbf a = (1,1,1)\).
Here you will practice proof writing by filling in details or completing proofs from this section.
5. Complete the proof of Lemma 2.
Complete the proof of Theorem 3 by proving formula \(\eqref{ttt}\).
Use \(\eqref{ttlr2}\) to prove \(\eqref{tt2}\).
Prove that the formula in \(\eqref{tkRn2}\) satisfies \(\eqref{tkRn}\).
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Canada License.