3.3. The Inverse Mapping Theorem

$\newcommand{\R}{\mathbb R }$ $\newcommand{\N}{\mathbb N }$ $\newcommand{\Z}{\mathbb Z }$ $\newcommand{\bfa}{\mathbf a}$ $\newcommand{\bfb}{\mathbf b}$ $\newcommand{\bfc}{\mathbf c}$ $\newcommand{\bft}{\mathbf t}$ $\newcommand{\bff}{\mathbf f}$ $\newcommand{\bfF}{\mathbf F}$ $\newcommand{\bfk}{\mathbf k}$ $\newcommand{\bfg}{\mathbf g}$ $\newcommand{\bfG}{\mathbf G}$ $\newcommand{\bfh}{\mathbf h}$ $\newcommand{\bfu}{\mathbf u}$ $\newcommand{\bfv}{\mathbf v}$ $\newcommand{\bfx}{\mathbf x}$ $\newcommand{\bfp}{\mathbf p}$ $\newcommand{\bfy}{\mathbf y}$ $\newcommand{\ep}{\varepsilon}$

Transformations and the Inverse Function Theorem

  1. Transformations
  2. Some important coordinate systems
  3. The Inverse Function Theorem
  4. Proof of the Inverse Function Theorem
  5. Proof of the Implicit Function Theorem
  6. Problems

Transformations

In this section we are interested in functions $\bff:U\to V$, where $U$ and $V$ are open subsets of $\R^n$. Such functions (which we will call transformations) are best visualized using before and after sketches, particularly when $n=2$.

We are especially interested in functions $\bff:U\to V$ as above such that $\bff$ is a bijection from $U$ onto $V$, with inverse $\bff^{-1}:V\to U$ also of class $C^1$.

The Inverse Function Theorem, discussed below, can help us to identify such functions (locally at least).

How to visualize a transformation

A good way to visualize a transformation $\bff$ in $2$ dimensions is to draw pairs of pictures showing a region of the plane together with its image under $\bff$.

Thus, the left-hand picture shows "before $\bff$", and the right-hand picture shows "after $\bff$".

Example 1. For instance, the picture below on the right shows how the Cartesian grid (before, on the left) is transformed by the linear mapping $\bff(x,y) = \binom{2y-x}{x+y}$ (after, on the right).

drawing$\qquad$ drawing

The blue lines are transformed into sloping lines parametrized by $\binom{2c-x}{x+c}$, where $c$ is constant and $x$ varies, and the red lines are transformed into sloping lines parametrized by $\binom{2y-c}{c+y}$, where $c$ is constant and $y$ varies.
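These images of the grid lines can be double-checked symbolically. The sympy sketch below (not part of the original notes) takes the blue lines to be the horizontal lines $y=c$ and the red lines to be the vertical lines $x=c$, matching the parametrizations above:

```python
import sympy as sp

x, y, c = sp.symbols('x y c')
f = sp.Matrix([2*y - x, x + y])   # the linear mapping f(x, y) = (2y - x, x + y)

# Image of a horizontal (blue) line y = c: substitute y = c and let x vary.
assert f.subs(y, c) == sp.Matrix([2*c - x, x + c])

# Image of a vertical (red) line x = c: substitute x = c and let y vary.
assert f.subs(x, c) == sp.Matrix([2*y - c, c + y])
```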

Example 2. Below are two figures illustrating the function $\bff(x,y) = (x^2-y^2, 2xy)$. We will think of this as a mapping from the $x-y$ plane to the $u-v$ plane, where $u = x^2-y^2$ and $v=2xy$.

First:

drawing$\qquad\qquad$ drawing

On the left, a regular grid in the $x-y$ plane (that is, the $x-y$ plane before applying $\bff$).
On the right, its image in the $u-v$ plane, where red curves are images of vertical lines and blue curves are images of horizontal lines. This is what becomes of the $x-y$ plane after applying $\bff$.

Second, as a different way of looking at the same function, we can also picture a regular grid in the $u-v$ plane (after, still on the right), and the corresponding curves in the $x-y$ plane (before, on the left).

drawing$\qquad\qquad$ drawing

Note that the blue curves on the left are exactly level sets of $u(x,y) = x^2-y^2$, corresponding to sets in the $u-v$ plane where $u$ is constant (the blue vertical lines on the right), and the red curves are level sets of $v(x,y) = 2xy$, corresponding to the red horizontal lines on the right.

An interesting feature of this function is that in both pictures, the curves always appear to meet at right angles.
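The right-angle observation can be confirmed by computing gradients: the level curves of $u$ and $v$ meet orthogonally wherever $\nabla u \cdot \nabla v = 0$, and here the dot product vanishes identically. A quick sympy check (an illustration, not part of the original notes):

```python
import sympy as sp

x, y = sp.symbols('x y')
u = x**2 - y**2
v = 2*x*y

grad_u = sp.Matrix([sp.diff(u, x), sp.diff(u, y)])   # (2x, -2y)
grad_v = sp.Matrix([sp.diff(v, x), sp.diff(v, y)])   # (2y, 2x)

# The gradients are orthogonal at every point, so the level curves of u
# and v always cross at right angles.
assert sp.simplify(grad_u.dot(grad_v)) == 0
```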

(Since $\bff(x,y) = \bff(-x,-y)$, every point in the $u-v$ plane corresponds to two points in the $x-y$ plane. So if we wished, in both figures above, we could throw out half of the left-hand pictures, for example, the part where $x<0$, without changing the right-hand pictures.)

Some important coordinate systems

polar coordinates in $\R^2$

As we know, polar coordinates $(r,\theta)$ are related to cartesian coordinates $(x,y)$ by \begin{equation}\label{pc} \binom x y = \binom{r\cos\theta}{r\sin\theta} = \bff(r,\theta). \end{equation} For $\bff$ to be a bijection between open sets, we have to restrict its domain and range. For example, one common choice (among many possibilities) is to specify that $\bff$ is a function $U\to V$ where \begin{equation}\label{dr} U = \{(r,\theta) : r>0, |\theta|<\pi\},\qquad V := \R^2 \setminus \{ (x,0) : x\le 0\}. \end{equation} In this case, $\bff^{-1}:V\to U$ exists and is of class $C^1$. This follows from the Inverse Function Theorem (see below).

drawing$\qquad\qquad$ drawing

Above: on the left, the set $U = \{(r,\theta) : r>0, |\theta|<\pi\}$ in the $r-\theta$ plane. On the right, $\theta = \,$constant lines (in blue) and $r = \,$constant curves (in red) in the $x-y$ plane.
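The Jacobian of this transformation confirms that $D\bff$ is invertible at every point of $U$: a one-line sympy computation (an illustration, not part of the original notes) gives $\det D\bff(r,\theta) = r$, which is positive on $U$.

```python
import sympy as sp

r = sp.Symbol('r', positive=True)
theta = sp.Symbol('theta', real=True)
f = sp.Matrix([r*sp.cos(theta), r*sp.sin(theta)])

J = f.jacobian([r, theta])
assert sp.simplify(J.det()) == r   # invertible whenever r > 0
```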

spherical coordinates in $\R^3$

Spherical coordinates $(r,\theta,\varphi)$ are related to cartesian coordinates $(x,y,z)$ by $$ \left( \begin{array}{c} x\\y\\z \end{array} \right) \ = \ \left( \begin{array}{c} r\cos\theta\sin\varphi\\ r\sin\theta\sin\varphi\\ r\cos\varphi \end{array} \right) = \bff(r,\theta,\varphi). $$ See the practice problems.

As above, if we want $\bff$ to be a bijection between open sets $U$ and $V$, it is necessary to restrict the domain and range in some appropriate way.
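The same computation shows where the restriction is forced: $\det D\bff(r,\theta,\varphi) = -r^2\sin\varphi$, which vanishes exactly when $r=0$ or $\sin\varphi = 0$ (that is, on the $z$-axis). A sympy check (an illustration, not part of the original notes):

```python
import sympy as sp

r, theta, phi = sp.symbols('r theta varphi', real=True)
f = sp.Matrix([r*sp.cos(theta)*sp.sin(phi),
               r*sp.sin(theta)*sp.sin(phi),
               r*sp.cos(phi)])

J = f.jacobian([r, theta, phi])
# det Df = -r^2 sin(varphi): zero exactly when r = 0 or sin(varphi) = 0.
assert sp.simplify(J.det() + r**2*sp.sin(phi)) == 0
```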

cylindrical coordinates in $\R^3$

Cylindrical coordinates $(r,\theta, z)$ are related to cartesian coordinates $(x,y,z)$ by $$ \left( \begin{array}{c} x\\y\\z \end{array} \right) \ = \ \left( \begin{array}{c} r\cos\theta\\ r\sin\theta\\ z\end{array} \right) = \bff(r,\theta,z). $$ These are very closely related to polar coordinates in $\R^2$.
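For cylindrical coordinates, the Jacobian determinant is again $r$, so the invertibility condition is the same as in the polar case (a sympy check, not part of the original notes):

```python
import sympy as sp

r, theta, z = sp.symbols('r theta z', real=True)
f = sp.Matrix([r*sp.cos(theta), r*sp.sin(theta), z])

J = f.jacobian([r, theta, z])
assert sp.simplify(J.det()) == r   # the same condition r > 0 as for polar coordinates
```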

The Inverse Function Theorem

The following theorem tells us when a transformation of class $C^1$ has a local inverse of class $C^1$.

Theorem 1: the Inverse Function Theorem Let $U$ and $V$ be open sets in $\R^n$, and assume that $\bff:U\to V$ is a mapping of class $C^1$.

Assume that $\bfa\in U$ is a point such that \begin{equation}\label{inv.hyp} D\bff(\bfa) \mbox{ is invertible}, \end{equation} and let $\bfb := \bff(\bfa)$. Then there exist open sets $M\subset U$ and $N\subset V$ such that $\bfa\in M$ and $\bfb\in N$, and such that $\bff$ is a one-to-one map from $M$ onto $N$, with inverse $\bff^{-1}:N\to M$ of class $C^1$.

Moreover, if $\bfx\in M$ and $\bfy = \bff(\bfx)\in N$, then \begin{equation} D(\bff^{-1})(\bfy) = [D\bff(\bfx)]^{-1}. \label{Dfinv}\end{equation} In particular, $D(\bff^{-1})(\bfb ) = [D\bff(\bfa)]^{-1}$.
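Formula \eqref{Dfinv} can be checked numerically for the mapping $\bff(x,y) = (x^2-y^2, 2xy)$ of Example 2, whose local inverse near $(1,1)$ is the principal complex square root (since $\bff$ is $w = z^2$ with $z = x+iy$). The numpy sketch below is an illustration, not part of the original notes; the finite-difference step $h$ is an arbitrary choice.

```python
import numpy as np

# f(x, y) = (x^2 - y^2, 2xy) from Example 2; its Jacobian at (x, y):
def Df(x, y):
    return np.array([[2*x, -2*y],
                     [2*y,  2*x]])

# Near (1, 1), f has the local inverse given by the principal complex
# square root: f is w = z^2 with z = x + iy, so f^{-1}(w) = sqrt(w).
def finv(u, v):
    z = np.sqrt(complex(u, v))
    return np.array([z.real, z.imag])

a = np.array([1.0, 1.0])
b = np.array([0.0, 2.0])          # b = f(a)

# Finite-difference Jacobian of f^{-1} at b.
h = 1e-6
Dfinv = np.column_stack([
    (finv(b[0] + h, b[1]) - finv(b[0] - h, b[1])) / (2*h),
    (finv(b[0], b[1] + h) - finv(b[0], b[1] - h)) / (2*h),
])

# Compare with [Df(a)]^{-1}, as the theorem predicts.
assert np.allclose(Dfinv, np.linalg.inv(Df(*a)), atol=1e-5)
```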

Remarks. Many things that we have said in Section 3.1 about the Implicit Function Theorem also apply, with some modifications, to the Inverse Function Theorem. For example:

For important and frequently-seen transformations, there are often explicit formulas for the inverse, so the Inverse Function Theorem, which guarantees the existence of an inverse without telling us what it is, may not seem very useful in these situations.

For example, this is the case for the transformation $\bff:U\to V$ that defines polar coordinates, see \eqref{pc} and \eqref{dr} above. For the $U$ and $V$ that we have chosen, we can write $(r,\theta) = \bff^{-1}(x,y)$ --- that is, we can write $r$ and $\theta$ as functions of $x$ and $y$, in a way that inverts $\bff:U\to V$ --- as follows: \begin{equation}\label{rtheta} r = \sqrt{x^2+y^2}\mbox{ for all }(x,y)\in V, \qquad \theta = \begin{cases} {-\pi+\tan^{-1}(y/x)} &\mbox{ if }x<0\mbox{ and }y<0\\\ {-\frac\pi 2} &\mbox{ if }x=0\mbox{ and }y<0\\\ {\tan^{-1}(y/x)} &\mbox{ if }x>0\\ {\frac\pi 2} &\mbox{ if }x=0\mbox{ and }y>0\\ {\pi+\tan^{-1}(y/x)} &\mbox{ if }x<0\mbox{ and }y>0 \end{cases} \end{equation} (This complicated formula would be much simpler if we had chosen $U = \{(r,\theta) : r>0, |\theta|<\pi/2\}$ and $V = \{ (x,y) : x>0\}$. Then we could just write $\theta = \tan^{-1}(y/x)$. This is often done, but it covers only half of the $x-y$ plane.)

The above formula can be found by elementary considerations, without using the Inverse Function Theorem. Is the Inverse Function Theorem unnecessary here?

Not entirely. The above formula is complicated, and it is hard to see, from looking at the formula, whether $\theta$ is a differentiable function of $(x,y)$ at points where $x=0$, and if so, what are $\partial_x\theta$ and $\partial_y\theta$. The easiest way to understand this is by using the Inverse Function Theorem, rather than trying to compute partial derivatives directly, starting from the complicated formula for $\theta(x,y)$.
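A symbolic sketch of that computation (sympy, not part of the original notes): invert $D\bff(r,\theta)$, then rewrite the entries in Cartesian terms using $\cos\theta = x/r$, $\sin\theta = y/r$, $r = \sqrt{x^2+y^2}$. The second row gives $\partial_x\theta = -y/(x^2+y^2)$ and $\partial_y\theta = x/(x^2+y^2)$, which are manifestly smooth everywhere on $V$, including where $x=0$.

```python
import sympy as sp

r = sp.Symbol('r', positive=True)
theta, x, y = sp.symbols('theta x y', real=True)
f = sp.Matrix([r*sp.cos(theta), r*sp.sin(theta)])

# By the Inverse Function Theorem, D(f^{-1})(x, y) = [Df(r, theta)]^{-1}.
Jinv = sp.simplify(f.jacobian([r, theta]).inv())

# Rewrite in Cartesian terms: cos(theta) = x/r, sin(theta) = y/r,
# then r = sqrt(x^2 + y^2).
Jinv_xy = Jinv.subs({sp.cos(theta): x/r, sp.sin(theta): y/r}).subs(r, sp.sqrt(x**2 + y**2))

# Rows are (dr/dx, dr/dy) and (dtheta/dx, dtheta/dy).
expected = sp.Matrix([[x/sp.sqrt(x**2 + y**2), y/sp.sqrt(x**2 + y**2)],
                      [-y/(x**2 + y**2),       x/(x**2 + y**2)]])
assert sp.simplify(Jinv_xy - expected) == sp.zeros(2, 2)
```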

The proof of the Inverse Function Theorem (optional)

You have actually already proved the core of the Inverse Function Theorem in Homeworks 2 and 3. By the end of Homework 3, you proved the following:

Theorem 2. Assume that $\Sigma$ is an open subset of $\R^n$ that contains the origin, and that $\bfF:\Sigma\to \R^n$ is a function of class $C^1$ such that $$ \bfF({\bf 0}) = {\bf 0}, \quad | D\bfF({\bf 0}) - I|< 1. $$ Then there exist $r_0, r_1>0$ such that $$ \mbox{ if }\bfc \in B(r_1, {\bf 0}), \quad\mbox{ then there is a unique }\bfx\in B(r_0, {\bf 0}) \mbox{ such that }\bfF(\bfx) = \bfc. $$
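Although the homework proof is not reproduced here, the mechanism behind Theorem 2 can be illustrated numerically. Assuming a contraction-mapping proof, the point is that when $|D\bfF({\bf 0}) - I| < 1$ near the origin, the map $\bfx \mapsto \bfx - \bfF(\bfx) + \bfc$ is a contraction, and iterating it converges to the unique solution of $\bfF(\bfx) = \bfc$. A numpy sketch with a hypothetical $\bfF$ (not from the notes):

```python
import numpy as np

# A concrete F with F(0) = 0 and DF(0) = I (so |DF(0) - I| = 0 < 1):
def F(p):
    x, y = p
    return np.array([x + 0.1*y**2, y + 0.1*x**2])

c = np.array([0.05, -0.03])   # a small right-hand side, as in Theorem 2

# Iterate the contraction x -> x - F(x) + c; its fixed point solves F(x) = c.
x = np.zeros(2)
for _ in range(50):
    x = x - F(x) + c

assert np.allclose(F(x), c, atol=1e-12)
```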

The idea of the proof of the Inverse Function Theorem is to reduce it to the situation studied in Theorem 2. This involves some messing around with details, but it is easier than the proof of Theorem 2, which you found by yourself.

Sketch of the proof. First, a preliminary technical step. To make the conclusion of Theorem 2 look more like that of the Inverse Function Theorem, one can reformulate it slightly, to assert that there exist open sets $M_0, N_0\subset \R^n$, both containing the origin, such that $\bfF$ is a one-to-one map from $M_0$ onto $N_0$.

Now, to reduce the Inverse Function Theorem to the situation considered in Theorem 2, we want to start with $\bff$ satisfying the hypotheses of the Inverse Function Theorem, and modify it to get a function $\bfF$ satisfying the hypotheses of Theorem 2. To do this, let $A := D\bff(\bfa)$, and define \begin{equation}\label{F.def1} \bfF(\bfx) = A^{-1}\left[ \bff(\bfx + \bfa) - \bfb\right], \quad\mbox{domain of $\bfF :=$ } \{\bfx\in \R^n : \bfx+\bfa \in U\}. \end{equation} One can then check that $\bfF$ satisfies the hypotheses of Theorem 2, and by applying (the reformulated version of) Theorem 2 to $\bfF$, one can deduce that there exist sets $M\subset U$ and $N\subset V$ such that $\bff$ is one-to-one from $M$ onto $N$. Hence $\bff:M\to N$ is invertible.
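For a concrete instance of this reduction (an illustration, not from the notes), take $\bff$ to be the mapping of Example 2 with $\bfa = (1,1)$, and check numerically that the $\bfF$ defined in \eqref{F.def1} satisfies $\bfF({\bf 0}) = {\bf 0}$ and $D\bfF({\bf 0}) = I$:

```python
import numpy as np

# Take f(x, y) = (x^2 - y^2, 2xy) from Example 2 and a = (1, 1).
def f(p):
    x, y = p
    return np.array([x**2 - y**2, 2*x*y])

a = np.array([1.0, 1.0])
b = f(a)
A = np.array([[2.0, -2.0], [2.0, 2.0]])     # A = Df(a)

def F(p):
    return np.linalg.solve(A, f(p + a) - b)  # F(x) = A^{-1} [f(x + a) - b]

# F(0) = 0, and DF(0) = A^{-1} Df(a) = I (checked by central differences):
assert np.allclose(F(np.zeros(2)), 0.0)
h = 1e-6
DF0 = np.column_stack([(F(h*e) - F(-h*e)) / (2*h) for e in np.eye(2)])
assert np.allclose(DF0, np.eye(2), atol=1e-6)
```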

The differentiability of $\bff^{-1}$ is more difficult to establish. The proof, which we will omit, also establishes the validity of formula \eqref{Dfinv} for $D(\bff^{-1})$. Alternately, if we already somehow know that $\bff^{-1}$ is differentiable, we can use the chain rule to check that \eqref{Dfinv} holds. (In fact you have done this in Test 2.) $\quad \Box$

The proof of the Implicit Function Theorem (optional)

Finally, we discuss the proof of the Implicit Function Theorem.

Our idea will be to reduce it to the Inverse Function Theorem

First, let's recall the statement of the Theorem:

Implicit Function Theorem. Assume that $S$ is an open subset of $\R^{n+k}$ and that $\bfF:S\to \R^k$ is a function of class $C^1$. Assume also that $(\bfa, \bfb)$ is a point in $S$ such that $$ \bfF(\bfa, \bfb) = {\bf 0} \qquad\mbox{ and } \qquad \det D_\bfy \bfF(\bfa, \bfb) \ne 0. $$

i. Then there exist $r_0,r_1>0$ such that for every $\bfx\in \R^n$ such that $|\bfx-\bfa|< r_0$, there exists a unique $\bfy\in \R^k$ such that $|\bfy - \bfb|< r_1$ and \begin{equation}\label{ImFT.eq1} \bfF(\bfx, \bfy) = \bf0. \end{equation} In other words, equation \eqref{ImFT.eq1} implicitly defines a function $\bfy = \bff(\bfx)$ for $\bfx\in \R^n$ near $\bfa$, with $\bfy = \bff(\bfx)$ close to $\bfb$. Note in particular that $\bff(\bfa) = \bfb$.

ii. Moreover, the function $\bff:B(r_0, \bfa)\to B(r_1,\bfb)\subset \R^k$ from part (i) above is of class $C^1$, and its derivatives may be determined by implicit differentiation.

Sketch of the proof. Assume that $\bfF$ satisfies the hypotheses of the Implicit Function Theorem, and define $\bfG:S\to \R^{n+k}$ by $$ \bfG(\bfx, \bfy) = (\bfx, \bfF(\bfx, \bfy)). $$ Recalling that all vectors are column vectors by default, this means that \begin{equation}\label{G.def} \bfG(\bfx, \bfy) = \left( \begin{array}{c} x_1\\ \vdots\\ x_n\\ F_1(\bfx, \bfy)\\ \vdots\\ F_k(\bfx, \bfy) \end{array} \right) . \end{equation}

Claim 1. $\det D\bfG(\bfa, \bfb) = \det D_y\bfF(\bfa, \bfb)$.

This is essentially a linear algebra exercise. Note that $$ D\bfG \ = \ \left( \begin{array}{ccccccc} 1&0&\cdots&0&0&\cdots &0\\ 0&1&\cdots&0&0&\cdots &0\\ \vdots&\vdots&\ddots&\vdots&\vdots& &\vdots\\ 0&0&\cdots&1&0&\cdots &0\\ \partial_{x_1}F_1&\partial_{x_2} F_1&\cdots&\partial_{x_n}F_1&\partial_{y_1}F_1&\cdots &\partial_{y_k}F_1 \\ \vdots &\vdots& &\vdots&\vdots& &\vdots\\ \partial_{x_1}F_k&\partial_{x_2} F_k&\cdots&\partial_{x_n}F_k&\partial_{y_1}F_k&\cdots &\partial_{y_k}F_k \\ \end{array} \right) . $$

Here $D\bfG$ denotes the $(n+k)\times (n+k)$ matrix of derivatives of all components of $\bfG$ with respect to all variables $x_1,\ldots, x_n, y_1,\ldots, y_k$. If you are really good at linear algebra, you can stare at this monster and see why Claim 1 is true. If you want to write a detailed proof, induction on $n$ is an option.
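If you prefer a numerical sanity check to staring at the monster, the block structure can be tested with random matrices (numpy; an illustration, not part of the notes). The claim is that a block lower-triangular matrix whose upper-left block is the identity has determinant $\det D_y\bfF$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 3, 2

# DG has the block form  [[ I_n, 0 ], [ DxF, DyF ]]:
DxF = rng.standard_normal((k, n))
DyF = rng.standard_normal((k, k))
DG = np.block([[np.eye(n), np.zeros((n, k))],
               [DxF,       DyF]])

# Claim 1: det DG = det DyF.
assert np.isclose(np.linalg.det(DG), np.linalg.det(DyF))
```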

Thus our assumption $ \det D_y\bfF(\bfa, \bfb)\ne 0$ implies that $\det D\bfG(\bfa, \bfb) \ne 0 $. Therefore, according to the Inverse Function Theorem, there are open sets $M\subset S$ and $N\subset \R^{n+k}$ such that $(\bfa, \bfb)\in M$ and $\bfG:M\to N$ is invertible, with inverse of class $C^1$.

Next, for $\bfx$ such that $(\bfx, {\bf 0})\in N$, define $\bff(\bfx)$ by $$ \bfG^{-1}(\bfx, {\bf 0}) = (\bfx, \bff(\bfx)) = \left( \begin{array}{c} x_1\\ \vdots\\ x_n\\ f_1(\bfx) \\ \vdots\\ f_k(\bfx) \end{array} \right) . $$ This $\bff$ turns out to be the implicit function whose existence we are trying to prove. Its definition says that $$ \bfy = \bff(\bfx) \qquad \iff\qquad (\bfx,\bfy)\in M\mbox{ and }\bfG(\bfx, \bfy) = (\bfx , {\bf 0}). $$ And of course $\bfG(\bfx, \bfy) = (\bfx, {\bf 0 })$ is equivalent to $\bfF(\bfx, \bfy) = \bf 0$.

Since $\bfG^{-1}$ is $C^1$, the same is true of $\bff$.
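As a concrete illustration of this construction (not from the text), take $n=k=1$ and $F(x,y) = x^2+y^2-1$ near the point $(\bfa,\bfb) = (0,1)$, where $D_yF(0,1) = 2 \ne 0$. Inverting $\bfG(x,y) = (x, F(x,y))$ with second component $0$ amounts to solving $F(x,y) = 0$ for $y$, and sympy recovers the implicit function $f(x) = \sqrt{1-x^2}$:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
F = x**2 + y**2 - 1    # n = k = 1, with F(0, 1) = 0 and D_y F(0, 1) = 2 != 0

# G(x, y) = (x, F(x, y)); inverting G with second component 0 means
# solving F(x, y) = 0 for y near y = 1:
sols = sp.solve(sp.Eq(F, 0), y)
f_implicit = [s for s in sols if s.subs(x, 0) == 1][0]

# The branch through (0, 1) is the implicit function f(x) = sqrt(1 - x^2).
assert sp.simplify(f_implicit - sp.sqrt(1 - x**2)) == 0
```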

To complete the proof, it is still necessary to worry about a few details related to the choice of $r_0,r_1$ in the conclusion of the theorem, but what we have said above is the main point. $\quad \Box$

Problems

Basic skills

  1. Let $$ \binom u v = \bff(x,y) = \binom{f_1(x,y)}{f_2(x,y)}, \quad\mbox{ where }\quad f_1(x,y) = y-\frac 1 3x^3 \mbox{ and } f_2(x,y) = ye^x. $$ The picture below shows level sets of $f_1$ (in blue) and $f_2$ (in red). Thus, the red and blue curves show points in the $xy$ plane that correspond to a regular grid of horizontal (red) and vertical (blue) lines in the $uv$ plane.

    drawing$\qquad\qquad$

    (The red curves stop where they do because filling the whole region with them would be computationally demanding, but one can imagine how they should continue.)

  2. Same questions as in the previous problem, for $$ \binom u v = \bff(x,y) = \binom{f_1(x,y)}{f_2(x,y)}, \quad\mbox{ where }\quad \begin{array}{rl} f_1(x,y) &= y+ 2 y^3 - xy^2+x^2y-x^3 \\ f_2(x,y) &= y+x^3. \end{array} $$ with some level sets of $f_1$ (blue) and $f_2$ (red) pictured below.
    Hint: For part (c), it is a good idea to look at the sign of $\partial_i f_j$ for $i,j=1,2$ (that is, the components of the matrix $D\bff$).

    drawing$\qquad\qquad$

  3. Same questions as in the previous problems, except that this time you should also sketch a picture showing some level curves $u = $ constant and $v= $constant, for $$ \binom u v = \bff(x,y) = \binom{y-x^2}{\frac y{1+x^2}}. $$ You might like to use different colors for level sets of $u$ and $v$.

  4. For the functions $r(x,y)$ and $\theta(x,y)$ defined in \eqref{rtheta}, check that they are both $C^1$ functions of $(x,y)$ everywhere in their domain, and compute all partial derivatives $$ \left( \begin{array}{cc} \partial_x r & \partial_y r\\ \partial_x \theta & \partial_y \theta \end{array}\right). $$ Do this by using the Inverse Function Theorem and the easily-differentiated expressions for $(x,y)$ as functions of $(r,\theta)$. You may accept it as true, without verifying in detail, that the functions defined in \eqref{rtheta} are indeed $\bff^{-1}(x,y)$, where $\bff(r,\theta)$ is defined in \eqref{pc}, \eqref{dr}.

  5. Let $$ \bff(x,y) = \binom{e^x(y^2-3x+1)}{x\ln(y^2+1)+y}, $$ and note that $\bff(0,0) = (1,0)$.

  6. Let $\bff:\R^3\to \R^3$ be the function that defines spherical coordinates, $$ \bff(r,\theta,\varphi)\ = \ \left( \begin{array}{c} r\cos\theta\sin\varphi\\ r\sin\theta\sin\varphi\\ r\cos\varphi \end{array} \right) = \left( \begin{array}{c} x\\y\\z \end{array} \right) . $$

Other questions

Please make sure that you master the Basic Skills before looking at these.

  1. Above, we have sketched a proof showing that the Implicit Function Theorem can be deduced from the Inverse Function Theorem.
    Perhaps weirdly, it is also true that the Inverse Function Theorem can be deduced from the Implicit Function Theorem. Do this; that is, assume that you know the Implicit Function Theorem is true, and use it to prove the Inverse Function Theorem.
    Hint. The Inverse Function Theorem asks about the possibility of solving equations of the form $\bff(\bfx)= \bfy$ for $\bfx$ as a function of $\bfy$.
    The Implicit Function Theorem guarantees that under certain hypotheses, you can solve equations of the form $\bfF(\bfx, \bfy) ={\bf 0}$ for $\bfy$ as a function of $\bfx$. (Note, if you want, you could swap the roles of $\bfx$ and $\bfy$ and solve $\bfF(\bfy, \bfx) = \bf 0 $ for $\bfx$ as a function of $\bfy$.)
    What this suggests is: rewrite the equation $\bff(\bfx) = \bfy$ in the form $\bfF(\bfy, \bfx) = \bf0$ for a suitable function $\bfF$. You may be able to use the Implicit Function Theorem to say something about solvability.

  2. Fill in some details in the proof of the Inverse Function Theorem as sketched above. For example, verify that the function $\bfF$ defined in \eqref{F.def1} satisfies the hypotheses of Theorem 2.

  3. Fill in some details in the proof of the Implicit Function Theorem as sketched above. For example, verify that the function $\bfG$ defined in \eqref{G.def} satisfies
    $\det D\bfG(\bfa, \bfb) = \det D_y\bfF(\bfa, \bfb)$, as we have claimed.
