3.3. The Inverse Mapping Theorem

$\newcommand{\R}{\mathbb R }$ $\newcommand{\N}{\mathbb N }$ $\newcommand{\Z}{\mathbb Z }$ $\newcommand{\bfa}{\mathbf a}$ $\newcommand{\bfb}{\mathbf b}$ $\newcommand{\bfc}{\mathbf c}$ $\newcommand{\bft}{\mathbf t}$ $\newcommand{\bff}{\mathbf f}$ $\newcommand{\bfF}{\mathbf F}$ $\newcommand{\bfk}{\mathbf k}$ $\newcommand{\bfg}{\mathbf g}$ $\newcommand{\bfG}{\mathbf G}$ $\newcommand{\bfh}{\mathbf h}$ $\newcommand{\bfu}{\mathbf u}$ $\newcommand{\bfv}{\mathbf v}$ $\newcommand{\bfx}{\mathbf x}$ $\newcommand{\bfp}{\mathbf p}$ $\newcommand{\bfy}{\mathbf y}$ $\newcommand{\ep}{\varepsilon}$

Transformations and the Inverse Function Theorem

  1. Transformations
  2. Some important coordinate systems
  3. The Inverse Function Theorem
  4. Proof of the Inverse Function Theorem
  5. Proof of the Implicit Function Theorem
  6. Problems

Transformations

In this section we are interested in functions $\bff:U\to V$, where $U$ and $V$ are open subsets of $\R^n$. Such functions (which we will call transformations) are best visualized using before and after sketches, particularly when $n=2$.

We are especially interested in functions $\bff:U\to V$ as above such that $\bff$ is a bijection from $U$ onto $V$, with inverse $\bff^{-1}:V\to U$ also of class $C^1$.

The Inverse Function Theorem, discussed below, can help us to identify such functions (locally at least).

How to visualize a transformation

A good way to visualize a transformation $\bff$ in $2$ dimensions is to draw pairs of pictures showing a region of the plane together with its image under $\bff$.

Thus, the left-hand picture shows "before $\bff$", and the right-hand picture shows "after $\bff$".

Example 1. For instance, the picture below on the right shows how the Cartesian grid (before, on the left) is transformed by the linear mapping $\bff(x,y) = \binom{2y-x}{x+y}$ (after, on the right).

drawing$\qquad$ drawing

The blue lines are transformed into sloping lines parametrized by $\binom{2c-x}{x+c}$, where $c$ is constant and $x$ varies, and the red lines are transformed into sloping lines parametrized by $\binom{2y-c}{c+y}$, where $c$ is constant and $y$ varies.
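These images of the grid lines can be double-checked symbolically. The sympy sketch below (not part of the original notes) takes the blue lines to be the horizontal lines $y=c$ and the red lines to be the vertical lines $x=c$, matching the parametrizations above:

```python
import sympy as sp

x, y, c = sp.symbols('x y c')
f = sp.Matrix([2*y - x, x + y])   # the linear mapping f(x, y) = (2y - x, x + y)

# Image of a horizontal (blue) line y = c: substitute y = c and let x vary.
assert f.subs(y, c) == sp.Matrix([2*c - x, x + c])

# Image of a vertical (red) line x = c: substitute x = c and let y vary.
assert f.subs(x, c) == sp.Matrix([2*y - c, c + y])
```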

Example 2. Below are two figures illustrating the function $\bff(x,y) = (x^2-y^2, 2xy)$. We will think of this as a mapping from the $x-y$ plane to the $u-v$ plane, where $u = x^2-y^2$ and $v=2xy$.

First:

drawing$\qquad\qquad$ drawing

On the left, a regular grid in the $x-y$ plane (that is, the $x-y$ plane before applying $\bff$).
On the right, its image in the $u-v$ plane, where red curves are images of vertical lines and blue curves are images of horizontal lines. This is what becomes of the $x-y$ plane after applying $\bff$.

Second, as a different way of looking at the same function, we can also picture a regular grid in the $u-v$ plane (after, still on the right), and the corresponding curves in the $x-y$ plane (before, on the left).

drawing$\qquad\qquad$ drawing

Note that the blue curves on the left are exactly level sets of $u(x,y) = x^2-y^2$, corresponding to sets in the $u-v$ plane where $u$ is constant (the blue vertical lines on the right), and the red curves are level sets of $v(x,y) = 2xy$, corresponding to the red horizontal lines on the right.

An interesting feature of this function is that in both pictures, the curves always appear to meet at right angles.
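The right-angle observation can be confirmed by computing gradients: the level curves of $u$ and $v$ meet orthogonally wherever $\nabla u \cdot \nabla v = 0$, and here the dot product vanishes identically. A quick sympy check (an illustration, not part of the original notes):

```python
import sympy as sp

x, y = sp.symbols('x y')
u = x**2 - y**2
v = 2*x*y

grad_u = sp.Matrix([sp.diff(u, x), sp.diff(u, y)])   # (2x, -2y)
grad_v = sp.Matrix([sp.diff(v, x), sp.diff(v, y)])   # (2y, 2x)

# The gradients are orthogonal at every point, so the level curves of u
# and v always cross at right angles.
assert sp.simplify(grad_u.dot(grad_v)) == 0
```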

(Since $\bff(x,y) = \bff(-x,-y)$, every point in the $u-v$ plane corresponds to two points in the $x-y$ plane. So if we wished, in both figures above, we could throw out half of the left-hand pictures, for example, the part where $x<0$, without changing the right-hand pictures.)

Some important coordinate systems

polar coordinates in $\R^2$

As we know, polar coordinates $(r,\theta)$ are related to cartesian coordinates $(x,y)$ by \begin{equation}\label{pc} \binom x y = \binom{r\cos\theta}{r\sin\theta} = \bff(r,\theta). \end{equation} For $\bff$ to be a bijection between open sets, we have to restrict its domain and range. For example, one common choice (among many possibilities) is to specify that $\bff$ is a function $U\to V$ where \begin{equation}\label{dr} U = \{(r,\theta) : r>0, |\theta|<\pi\},\qquad V := \R^2 \setminus \{ (x,0) : x\le 0\}. \end{equation} In this case, $\bff^{-1}:V\to U$ exists and is of class $C^1$. This follows from the Inverse Function Theorem (see below).

drawing$\qquad\qquad$ drawing

Above: on the left, the set $U = \{(r,\theta) : r>0, |\theta|<\pi\}$ in the $r-\theta$ plane. On the right, $\theta = \,$constant lines (in blue) and $r = \,$constant curves (in red) in the $x-y$ plane.
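The Jacobian of this transformation confirms that $D\bff$ is invertible at every point of $U$: a one-line sympy computation (an illustration, not part of the original notes) gives $\det D\bff(r,\theta) = r$, which is positive on $U$.

```python
import sympy as sp

r = sp.Symbol('r', positive=True)
theta = sp.Symbol('theta', real=True)
f = sp.Matrix([r*sp.cos(theta), r*sp.sin(theta)])

J = f.jacobian([r, theta])
assert sp.simplify(J.det()) == r   # invertible whenever r > 0
```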

spherical coordinates in $\R^3$

Spherical coordinates $(r,\theta,\varphi)$ are related to cartesian coordinates $(x,y,z)$ by $$ \left( \begin{array}{c} x\\y\\z \end{array} \right) \ = \ \left( \begin{array}{c} r\cos\theta\sin\varphi\\ r\sin\theta\sin\varphi\\ r\cos\varphi \end{array} \right) = \bff(r,\theta,\varphi). $$ See the practice problems.

As above, if we want $\bff$ to be a bijection between open sets $U$ and $V$, it is necessary to restrict the domain and range in some appropriate way.
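The same computation shows where the restriction is forced: $\det D\bff(r,\theta,\varphi) = -r^2\sin\varphi$, which vanishes exactly when $r=0$ or $\sin\varphi = 0$ (that is, on the $z$-axis). A sympy check (an illustration, not part of the original notes):

```python
import sympy as sp

r, theta, phi = sp.symbols('r theta varphi', real=True)
f = sp.Matrix([r*sp.cos(theta)*sp.sin(phi),
               r*sp.sin(theta)*sp.sin(phi),
               r*sp.cos(phi)])

J = f.jacobian([r, theta, phi])
# det Df = -r^2 sin(varphi): zero exactly when r = 0 or sin(varphi) = 0.
assert sp.simplify(J.det() + r**2*sp.sin(phi)) == 0
```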

cylindrical coordinates in $\R^3$

Cylindrical coordinates $(r,\theta, z)$ are related to cartesian coordinates $(x,y,z)$ by $$ \left( \begin{array}{c} x\\y\\z \end{array} \right) \ = \ \left( \begin{array}{c} r\cos\theta\\ r\sin\theta\\ z\end{array} \right) = \bff(r,\theta,z). $$ These are very closely related to polar coordinates in $\R^2$.
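For cylindrical coordinates, the Jacobian determinant is again $r$, so the invertibility condition is the same as in the polar case (a sympy check, not part of the original notes):

```python
import sympy as sp

r, theta, z = sp.symbols('r theta z', real=True)
f = sp.Matrix([r*sp.cos(theta), r*sp.sin(theta), z])

J = f.jacobian([r, theta, z])
assert sp.simplify(J.det()) == r   # the same condition r > 0 as for polar coordinates
```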

The Inverse Function Theorem

The following theorem tells us when a transformation of class $C^1$ has a local inverse of class $C^1$.

Theorem 1: the Inverse Function Theorem Let $U$ and $V$ be open sets in $\R^n$, and assume that $\bff:U\to V$ is a mapping of class $C^1$.

Assume that $\bfa\in U$ is a point such that \begin{equation}\label{inv.hyp} D\bff(\bfa) \mbox{ is invertible}, \end{equation} and let $\bfb := \bff(\bfa)$. Then there exist open sets $M\subset U$ and $N\subset V$ such that $\bfa\in M$ and $\bfb\in N$, and such that $\bff$ is a one-to-one map from $M$ onto $N$, with inverse $\bff^{-1}:N\to M$ of class $C^1$.

Moreover, if $\bfx\in M$ and $\bfy = \bff(\bfx)\in N$, then \begin{equation} D(\bff^{-1})(\bfy) = [D\bff(\bfx)]^{-1}. \label{Dfinv}\end{equation} In particular, $D(\bff^{-1})(\bfb ) = [D\bff(\bfa)]^{-1}$.
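Formula \eqref{Dfinv} can be checked numerically for the mapping $\bff(x,y) = (x^2-y^2, 2xy)$ of Example 2, whose local inverse near $(1,1)$ is the principal complex square root (since $\bff$ is $w = z^2$ with $z = x+iy$). The numpy sketch below is an illustration, not part of the original notes; the finite-difference step $h$ is an arbitrary choice.

```python
import numpy as np

# f(x, y) = (x^2 - y^2, 2xy) from Example 2; its Jacobian at (x, y):
def Df(x, y):
    return np.array([[2*x, -2*y],
                     [2*y,  2*x]])

# Near (1, 1), f has the local inverse given by the principal complex
# square root: f is w = z^2 with z = x + iy, so f^{-1}(w) = sqrt(w).
def finv(u, v):
    z = np.sqrt(complex(u, v))
    return np.array([z.real, z.imag])

a = np.array([1.0, 1.0])
b = np.array([0.0, 2.0])          # b = f(a)

# Finite-difference Jacobian of f^{-1} at b.
h = 1e-6
Dfinv = np.column_stack([
    (finv(b[0] + h, b[1]) - finv(b[0] - h, b[1])) / (2*h),
    (finv(b[0], b[1] + h) - finv(b[0], b[1] - h)) / (2*h),
])

# Compare with [Df(a)]^{-1}, as the theorem predicts.
assert np.allclose(Dfinv, np.linalg.inv(Df(*a)), atol=1e-5)
```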

Remarks. Many things that we have said in Section 3.1 about the Implicit Function Theorem also apply, with some modifications, to the Inverse Function Theorem. For example:

For important and frequently-seen transformations, there are often explicit formulas for the inverse, so the Inverse Function Theorem, which guarantees the existence of an inverse without telling us what it is, may not seem very useful in these situations.

For example, this is the case for the transformation $\bff:U\to V$ that defines polar coordinates, see \eqref{pc} and \eqref{dr} above. For the $U$ and $V$ that we have chosen, we can write $(r,\theta) = \bff^{-1}(x,y)$ --- that is, we can write $r$ and $\theta$ as functions of $x$ and $y$, in a way that inverts $\bff:U\to V$ --- as follows: \begin{equation}\label{rtheta} r = \sqrt{x^2+y^2}\mbox{ for all }(x,y)\in V, \qquad \theta = \begin{cases} {-\pi+\tan^{-1}(y/x)} &\mbox{ if }x<0\mbox{ and }y<0\\\ {-\frac\pi 2} &\mbox{ if }x=0\mbox{ and }y<0\\\ {\tan^{-1}(y/x)} &\mbox{ if }x>0\\ {\frac\pi 2} &\mbox{ if }x=0\mbox{ and }y>0\\ {\pi+\tan^{-1}(y/x)} &\mbox{ if }x<0\mbox{ and }y>0 \end{cases} \end{equation} (This complicated formula would be much simpler if we had chosen $U = \{(r,\theta) : r>0, |\theta|<\pi/2\}$ and $V = \{ (x,y) : x>0\}$. Then we could just write $\theta = \tan^{-1}(y/x)$. This is often done, but it covers only half of the $x-y$ plane.)

The above formula can be found by elementary considerations, without using the Inverse Function Theorem. Is the Inverse Function Theorem unnecessary here?

Not entirely. The above formula is complicated, and it is hard to see, from looking at the formula, whether $\theta$ is a differentiable function of $(x,y)$ at points where $x=0$, and if so, what are $\partial_x\theta$ and $\partial_y\theta$. The easiest way to understand this is by using the Inverse Function Theorem, rather than trying to compute partial derivatives directly, starting from the complicated formula for $\theta(x,y)$.
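A symbolic sketch of that computation (sympy, not part of the original notes): invert $D\bff(r,\theta)$, then rewrite the entries in Cartesian terms using $\cos\theta = x/r$, $\sin\theta = y/r$, $r = \sqrt{x^2+y^2}$. The second row gives $\partial_x\theta = -y/(x^2+y^2)$ and $\partial_y\theta = x/(x^2+y^2)$, which are manifestly smooth everywhere on $V$, including where $x=0$.

```python
import sympy as sp

r = sp.Symbol('r', positive=True)
theta, x, y = sp.symbols('theta x y', real=True)
f = sp.Matrix([r*sp.cos(theta), r*sp.sin(theta)])

# By the Inverse Function Theorem, D(f^{-1})(x, y) = [Df(r, theta)]^{-1}.
Jinv = sp.simplify(f.jacobian([r, theta]).inv())

# Rewrite in Cartesian terms: cos(theta) = x/r, sin(theta) = y/r,
# then r = sqrt(x^2 + y^2).
Jinv_xy = Jinv.subs({sp.cos(theta): x/r, sp.sin(theta): y/r}).subs(r, sp.sqrt(x**2 + y**2))

# Rows are (dr/dx, dr/dy) and (dtheta/dx, dtheta/dy).
expected = sp.Matrix([[x/sp.sqrt(x**2 + y**2), y/sp.sqrt(x**2 + y**2)],
                      [-y/(x**2 + y**2),       x/(x**2 + y**2)]])
assert sp.simplify(Jinv_xy - expected) == sp.zeros(2, 2)
```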

The proof of the Inverse Function Theorem (optional)

You have actually already proved the core of the Inverse Function Theorem in Homeworks 2 and 3. By the end of Homework 3, you proved the following:

Theorem 2. Assume that $\Sigma$ is an open subset of $\R^n$ that contains the origin, and that $\bfF:\Sigma\to \R^n$ is a function of class $C^1$ such that $$ \bfF({\bf 0}) = {\bf 0}, \quad | D\bfF({\bf 0}) - I|< 1. $$ Then there exist $r_0, r_1>0$ such that $$ \mbox{ if }\bfc \in B(r_1, {\bf 0}), \quad\mbox{ then there is a unique }\bfx\in B(r_0, {\bf 0}) \mbox{ such that }\bfF(\bfx) = \bfc. $$
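Although the homework proof is not reproduced here, the mechanism behind Theorem 2 can be illustrated numerically. Assuming a contraction-mapping proof, the point is that when $|D\bfF({\bf 0}) - I| < 1$ near the origin, the map $\bfx \mapsto \bfx - \bfF(\bfx) + \bfc$ is a contraction, and iterating it converges to the unique solution of $\bfF(\bfx) = \bfc$. A numpy sketch with a hypothetical $\bfF$ (not from the notes):

```python
import numpy as np

# A concrete F with F(0) = 0 and DF(0) = I (so |DF(0) - I| = 0 < 1):
def F(p):
    x, y = p
    return np.array([x + 0.1*y**2, y + 0.1*x**2])

c = np.array([0.05, -0.03])   # a small right-hand side, as in Theorem 2

# Iterate the contraction x -> x - F(x) + c; its fixed point solves F(x) = c.
x = np.zeros(2)
for _ in range(50):
    x = x - F(x) + c

assert np.allclose(F(x), c, atol=1e-12)
```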

The idea of the proof of the Inverse Function Theorem is to reduce it to the situation studied in Theorem 2. This involves some messing around with details, but it is easier than the proof of Theorem 2, which you found by yourself.

Sketch of the proof. First, a preliminary technical step. To make the conclusion of Theorem 2 look more like that of the Inverse Function Theorem, one can reformulate it slightly, to assert that there exist open sets $M_0, N_0\subset \R^n$, both containing the origin, such that $\bfF$ is a one-to-one map from $M_0$ onto $N_0$.

Now, to reduce the Inverse Function Theorem to the situation considered in Theorem 2, we want to start with $\bff$ satisfying the hypotheses of the Inverse Function Theorem, and modify it to get a function $\bfF$ satisfying the hypotheses of Theorem 2. To do this, let $A := D\bff(\bfa)$, and define \begin{equation}\label{F.def1} \bfF(\bfx) = A^{-1}\left[ \bff(\bfx + \bfa) - \bfb\right], \quad\mbox{domain of $\bfF :=$ } \{\bfx\in \R^n : \bfx+\bfa \in U\}. \end{equation} One can then check that $\bfF$ satisfies the hypotheses of Theorem 2, and by applying (the reformulated version of) Theorem 2 to $\bfF$, one can deduce that there exist sets $M\subset U$ and $N\subset V$ such that $\bff$ is one-to-one from $M$ onto $N$. Hence $\bff:M\to N$ is invertible.
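For a concrete instance of this reduction (an illustration, not from the notes), take $\bff$ to be the mapping of Example 2 with $\bfa = (1,1)$, and check numerically that the $\bfF$ defined in \eqref{F.def1} satisfies $\bfF({\bf 0}) = {\bf 0}$ and $D\bfF({\bf 0}) = I$:

```python
import numpy as np

# Take f(x, y) = (x^2 - y^2, 2xy) from Example 2 and a = (1, 1).
def f(p):
    x, y = p
    return np.array([x**2 - y**2, 2*x*y])

a = np.array([1.0, 1.0])
b = f(a)
A = np.array([[2.0, -2.0], [2.0, 2.0]])     # A = Df(a)

def F(p):
    return np.linalg.solve(A, f(p + a) - b)  # F(x) = A^{-1} [f(x + a) - b]

# F(0) = 0, and DF(0) = A^{-1} Df(a) = I (checked by central differences):
assert np.allclose(F(np.zeros(2)), 0.0)
h = 1e-6
DF0 = np.column_stack([(F(h*e) - F(-h*e)) / (2*h) for e in np.eye(2)])
assert np.allclose(DF0, np.eye(2), atol=1e-6)
```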

The differentiability of $\bff^{-1}$ is more difficult to establish. The proof, which we will omit, also establishes the validity of formula \eqref{Dfinv} for $D(\bff^{-1})$. Alternately, if we already somehow know that $\bff^{-1}$ is differentiable, we can use the chain rule to check that \eqref{Dfinv} holds. (In fact you have done this in Test 2.) $\quad \Box$

The proof of the Implicit Function Theorem (optional)

Finally, we discuss the proof of the Implicit Function Theorem.

Our idea will be to reduce it to the Inverse Function Theorem

First, let's recall the statement of the Theorem:

Implicit Function Theorem. Assume that $S$ is an open subset of $\R^{n+k}$ and that $\bfF:S\to \R^k$ is a function of class $C^1$. Assume also that $(\bfa, \bfb)$ is a point in $S$ such that $$ \bfF(\bfa, \bfb) = {\bf 0} \qquad\mbox{ and } \qquad \det D_\bfy \bfF(\bfa, \bfb) \ne 0. $$

i. Then there exist $r_0,r_1>0$ such that for every $\bfx\in \R^n$ such that $|\bfx-\bfa|< r_0$, there exists a unique $\bfy\in \R^k$ such that $|\bfy - \bfb|< r_1$ and \begin{equation}\label{ImFT.eq1} \bfF(\bfx, \bfy) = \bf0. \end{equation} In other words, equation \eqref{ImFT.eq1} implicitly defines a function $\bfy = \bff(\bfx)$ for $\bfx\in \R^n$ near $\bfa$, with $\bfy = \bff(\bfx)$ close to $\bfb$. Note in particular that $\bff(\bfa) = \bfb$.

ii. Moreover, the function $\bff:B(r_0, \bfa)\to B(r_1,\bfb)\subset \R^k$ from part (i) above is of class $C^1$, and its derivatives may be determined by implicit differentiation.

Sketch of the proof. Assume that $\bfF$ satisfies the hypotheses of the Implicit Function Theorem, and define $\bfG:S\to \R^{n+k}$ by $$ \bfG(\bfx, \bfy) = (\bfx, \bfF(\bfx, \bfy)). $$ Recalling that all vectors are column vectors by default, this means that \begin{equation}\label{G.def} \bfG(\bfx, \bfy) = \left( \begin{array}{c} x_1\\ \vdots\\ x_n\\ F_1(\bfx, \bfy)\\ \vdots\\ F_k(\bfx, \bfy) \end{array} \right) . \end{equation}

Claim 1. $\det D\bfG(\bfa, \bfb) = \det D_y\bfF(\bfa, \bfb)$.

This is essentially a linear algebra exercise. Note that $$ D\bfG \ = \ \left( \begin{array}{ccccccc} 1&0&\cdots&0&0&\cdots &0\\ 0&1&\cdots&0&0&\cdots &0\\ \vdots&\vdots&\ddots&\vdots&\vdots& &\vdots\\ 0&0&\cdots&1&0&\cdots &0\\ \partial_{x_1}F_1&\partial_{x_2} F_1&\cdots&\partial_{x_n}F_1&\partial_{y_1}F_1&\cdots &\partial_{y_k}F_1 \\ \vdots &\vdots& &\vdots&\vdots& &\vdots\\ \partial_{x_1}F_k&\partial_{x_2} F_k&\cdots&\partial_{x_n}F_k&\partial_{y_1}F_k&\cdots &\partial_{y_k}F_k \\ \end{array} \right) . $$

Here $D\bfG$ denotes the $(n+k)\times (n+k)$ matrix of derivatives of all components of $\bfG$ with respect to all variables $x_1,\ldots, x_n, y_1,\ldots, y_k$. If you are really good at linear algebra, you can stare at this monster and see why Claim 1 is true. If you want to write a detailed proof, induction on $n$ is an option.
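If you prefer a numerical sanity check to staring at the monster, the block structure can be tested with random matrices (numpy; an illustration, not part of the notes). The claim is that a block lower-triangular matrix whose upper-left block is the identity has determinant $\det D_y\bfF$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 3, 2

# DG has the block form  [[ I_n, 0 ], [ DxF, DyF ]]:
DxF = rng.standard_normal((k, n))
DyF = rng.standard_normal((k, k))
DG = np.block([[np.eye(n), np.zeros((n, k))],
               [DxF,       DyF]])

# Claim 1: det DG = det DyF.
assert np.isclose(np.linalg.det(DG), np.linalg.det(DyF))
```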

Thus our assumption $ \det D_y\bfF(\bfa, \bfb)\ne 0$ implies that $\det D\bfG(\bfa, \bfb) \ne 0 $. Therefore, according to the Inverse Function Theorem, there are open sets $M\subset S$ and $N\subset \R^{n+k}$ such that $(\bfa, \bfb)\in M$ and $\bfG:M\to N$ is invertible, with inverse of class $C^1$.

Next, for $\bfx$ such that $(\bfx, {\bf 0})\in N$, define $\bff(\bfx)$ by $$ \bfG^{-1}(\bfx, {\bf 0}) = (\bfx, \bff(\bfx)) = \left( \begin{array}{c} x_1\\ \vdots\\ x_n\\ f_1(\bfx) \\ \vdots\\ f_k(\bfx) \end{array} \right) . $$ This $\bff$ turns out to be the implicit function whose existence we are trying to prove. Its definition says that $$ \bfy = \bff(\bfx) \qquad \iff\qquad (\bfx,\bfy)\in M\mbox{ and }\bfG(\bfx, \bfy) = (\bfx , {\bf 0}). $$ And of course $\bfG(\bfx, \bfy) = (\bfx, {\bf 0 })$ is equivalent to $\bfF(\bfx, \bfy) = \bf 0$.

Since $\bfG^{-1}$ is $C^1$, the same is true of $\bff$.
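As a concrete illustration of this construction (not from the text), take $n=k=1$ and $F(x,y) = x^2+y^2-1$ near the point $(\bfa,\bfb) = (0,1)$, where $D_yF(0,1) = 2 \ne 0$. Inverting $\bfG(x,y) = (x, F(x,y))$ with second component $0$ amounts to solving $F(x,y) = 0$ for $y$, and sympy recovers the implicit function $f(x) = \sqrt{1-x^2}$:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
F = x**2 + y**2 - 1    # n = k = 1, with F(0, 1) = 0 and D_y F(0, 1) = 2 != 0

# G(x, y) = (x, F(x, y)); inverting G with second component 0 means
# solving F(x, y) = 0 for y near y = 1:
sols = sp.solve(sp.Eq(F, 0), y)
f_implicit = [s for s in sols if s.subs(x, 0) == 1][0]

# The branch through (0, 1) is the implicit function f(x) = sqrt(1 - x^2).
assert sp.simplify(f_implicit - sp.sqrt(1 - x**2)) == 0
```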

To complete the proof, it is still necessary to worry about a few details related to the choice of $r_0,r_1$ in the conclusion of the theorem, but what we have said above is the main point. $\quad \Box$

Problems

Basic skills

  1. Let $$ \binom u v = \bff(x,y) = \binom{f_1(x,y)}{f_2(x,y)}, \quad\mbox{ where }\quad f_1(x,y) = y-\frac 1 3x^3 \mbox{ and } f_2(x,y) = ye^x. $$ The picture below shows level sets of $f_1$ (in blue) and $f_2$ (in red). Thus, the red and blue curves show points in the $xy$ plane that correspond to a regular grid of horizontal (red) and vertical (blue) lines in the $uv$ plane.

    drawing$\qquad\qquad$

    (The red curves stop where they do because filling the whole region with them would be computationally demanding, but one can imagine how they should continue.)

  2. Same questions as in the previous problem, for $$ \binom u v = \bff(x,y) = \binom{f_1(x,y)}{f_2(x,y)}, \quad\mbox{ where }\quad \begin{array}{rl} f_1(x,y) &= y+ 2 y^3 - xy^2+x^2y-x^3 \\ f_2(x,y) &= y+x^3. \end{array} $$ with some level sets of $f_1$ (blue) and $f_2$ (red) pictured below.
    Hint: For part (c), it is a good idea to look at the sign of $\partial_i f_j$ for $i,j=1,2$ (that is, the components of the matrix $D\bff$).

    drawing$\qquad\qquad$

  3. Same questions as in the previous problems, except that this time you should also sketch a picture showing some level curves $u = $ constant and $v= $constant, for $$ \binom u v = \bff(x,y) = \binom{y-x^2}{\frac y{1+x^2}}. $$ You might like to use different colors for level sets of $u$ and $v$.

  4. For the functions $r(x,y)$ and $\theta(x,y)$ defined in \eqref{rtheta}, check that they are both $C^1$ functions of $(x,y)$ everywhere in their domain, and compute all partial derivatives $$ \left( \begin{array}{cc} \partial_x r & \partial_y r\\ \partial_x \theta & \partial_y \theta \end{array}\right). $$ Do this by using the Inverse Function Theorem and the easily-differentiated expressions for $(x,y)$ as functions of $(r,\theta)$. You may accept it as true, without verifying in detail, that the functions defined in \eqref{rtheta} are indeed $\bff^{-1}(x,y)$, where $\bff(r,\theta)$ is defined in \eqref{pc}, \eqref{dr}.

  5. Let $$ \bff(x,y) = \binom{e^x(y^2-3x+1)}{x\ln(y^2+1)+y}, $$ and note that $\bff(0,0) = (1,0)$.

  6. Let $\bff:\R^3\to \R^3$ be the function that defines spherical coordinates, $$ \bff(r,\theta,\varphi)\ = \ \left( \begin{array}{c} r\cos\theta\sin\varphi\\ r\sin\theta\sin\varphi\\ r\cos\varphi \end{array} \right) = \left( \begin{array}{c} x\\y\\z \end{array} \right) . $$

Other questions

Please make sure that you master the Basic Skills before looking at these.

  1. Above, we have sketched a proof showing that the Implicit Function Theorem can be deduced from the Inverse Function Theorem.
    Perhaps weirdly, it is also true that the Inverse Function Theorem can be deduced from the Implicit Function Theorem. Do this; that is, assume that you know the Implicit Function Theorem is true, and use it to prove the Inverse Function Theorem.
    Hint. The Inverse Function Theorem asks about the possibility of solving equations of the form $\bff(\bfx)= \bfy$ for $\bfx$ as a function of $\bfy$.
    The Implicit Function Theorem guarantees that under certain hypotheses, you can solve equations of the form $\bfF(\bfx, \bfy) ={\bf 0}$ for $\bfy$ as a function of $\bfx$. (Note, if you want, you could swap the roles of $\bfx$ and $\bfy$ and solve $\bfF(\bfy, \bfx) = \bf 0 $ for $\bfx$ as a function of $\bfy$.)
    What this suggests is: rewrite the equation $\bff(\bfx) = \bfy$ in the form $\bfF(\bfy, \bfx) = \bf0$ for a suitable function $\bfF$. You may be able to use the Implicit Function Theorem to say something about solvability.

  2. Fill in some details in the proof of the Inverse Function Theorem as sketched above. For example, verify that the function $\bfF$ defined in \eqref{F.def1} satisfies the hypotheses of Theorem 2.

  3. Fill in some details in the proof of the Implicit Function Theorem as sketched above. For example, verify that the function $\bfG$ defined in \eqref{G.def} satisfies
    $\det D\bfG(\bfa, \bfb) = \det D_y\bfF(\bfa, \bfb)$, as we have claimed.
