$\newcommand{\R}{\mathbb R }$ $\newcommand{\N}{\mathbb N }$ $\newcommand{\Z}{\mathbb Z }$ $\newcommand{\bfa}{\mathbf a}$ $\newcommand{\bfb}{\mathbf b}$ $\newcommand{\bfc}{\mathbf c}$ $\newcommand{\bft}{\mathbf t}$ $\newcommand{\bff}{\mathbf f}$ $\newcommand{\bfF}{\mathbf F}$ $\newcommand{\bfk}{\mathbf k}$ $\newcommand{\bfg}{\mathbf g}$ $\newcommand{\bfG}{\mathbf G}$ $\newcommand{\bfh}{\mathbf h}$ $\newcommand{\bfu}{\mathbf u}$ $\newcommand{\bfv}{\mathbf v}$ $\newcommand{\bfx}{\mathbf x}$ $\newcommand{\bfp}{\mathbf p}$ $\newcommand{\bfy}{\mathbf y}$ $\newcommand{\ep}{\varepsilon}$
In this section we are interested in functions $\bff:U\to V$, where $U$ and $V$ are open subsets of $\R^n$.
Such functions (which we will call transformations) are best visualized using "before and after" sketches, particularly when $n=2$.
We are especially interested in functions $\bff:U\to V$ as above that are bijections, with both $\bff$ and $\bff^{-1}$ of class $C^1$.
The Inverse Function Theorem, discussed below, can help us to identify such functions (locally at least).
A good way to visualize a transformation $\bff$ in $2$ dimensions is to draw pairs of pictures showing the effect of $\bff$ on a grid or region. Thus, the left-hand picture shows "before $\bff$", and the right-hand picture shows "after $\bff$".
Example 1.
For instance, the picture below on the right shows how the Cartesian grid ("before", on the left) is transformed by the linear mapping $\bff(x,y) = \binom{2y-x}{x+y}$ ("after", on the right).
The blue lines are transformed into sloping lines parametrized by $\binom{2c-x}{x+c}$, where $c$ is constant and $x$ varies, and the red lines are transformed into sloping lines parametrized by $\binom{2y-c}{c+y}$, where $c$ is constant and $y$ varies.
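This is easy to verify directly; the following Python sketch (grid values chosen arbitrarily) applies $\bff$ to points on a horizontal line $y=c$ and checks that the images lie on the claimed curve:

```python
# Numerical check of Example 1 (an illustrative sketch, not part of the notes).
def f(x, y):
    """The linear map f(x, y) = (2y - x, x + y)."""
    return (2 * y - x, x + y)

c = 3.0  # a horizontal grid line y = c
for x in [-2.0, -1.0, 0.0, 1.0, 2.0]:
    u, v = f(x, c)
    # The image should be parametrized by (2c - x, x + c):
    assert (u, v) == (2 * c - x, x + c)
print("horizontal line y = c maps onto the curve (2c - x, x + c)")
```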
Example 2. Below are two figures illustrating the function $\bff(x,y) = (x^2-y^2, 2xy)$. We will think of this as a mapping from the $x-y$ plane to the $u-v$ plane, where $u = x^2-y^2$ and $v=2xy$.
First:
On the left, a regular grid in the $x-y$ plane (that is, the $x-y$ plane before applying $\bff$). On the right, its image in the $u-v$ plane, where red curves are images of vertical lines and blue curves are images of horizontal lines. This is what becomes of the $x-y$ plane after applying $\bff$.
Second, as a different way of looking at the same function, we can also picture a regular grid in the $u-v$ plane (after, still on the right), and the corresponding curves in the $x-y$ plane (before, on the left).
Note that the blue curves on the left are exactly level sets of $u(x,y) = x^2-y^2$, corresponding to sets in the $u-v$ plane where $u$ is constant (the blue vertical lines on the right), and the red curves are level sets of $v(x,y) = 2xy$, corresponding to the red horizontal lines on the right.
An interesting feature of this function is that in both pictures, the curved lines appear always to meet at right angles.
(Since $\bff(x,y) = \bff(-x,-y)$, every point in the $u-v$ plane corresponds to two points in the $x-y$ plane. So if we wished, in both figures above, we could throw out half of the left-hand pictures, for example, the part where $x<0$, without changing the right-hand pictures.)
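The right angles are no accident: the gradients $\nabla u = (2x,-2y)$ and $\nabla v = (2y,2x)$ satisfy $\nabla u\cdot\nabla v = 4xy - 4xy = 0$ at every point, and each level curve is perpendicular to the gradient of its function. A quick numerical sketch of this computation (sample points random):

```python
import random

# u = x^2 - y^2 and v = 2xy; their gradients are normal to the blue and
# red level curves, so the curves meet at right angles exactly when the
# gradients are orthogonal.
def grad_u(x, y):
    return (2 * x, -2 * y)

def grad_v(x, y):
    return (2 * y, 2 * x)

random.seed(0)
for _ in range(1000):
    x, y = random.uniform(-5, 5), random.uniform(-5, 5)
    gu, gv = grad_u(x, y), grad_v(x, y)
    dot = gu[0] * gv[0] + gu[1] * gv[1]   # = 4xy - 4xy = 0
    assert abs(dot) < 1e-12
print("grad u . grad v = 0 at every sampled point")
```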
As we know, polar coordinates $(r,\theta)$ are related to cartesian coordinates $(x,y)$ by \begin{equation}\label{pc} \binom x y = \binom{r\cos\theta}{r\sin\theta} = \bff(r,\theta). \end{equation} For $\bff$ to be a bijection between open sets, we have to restrict its domain and range. For example, one common choice (among many possibilities) is to specify that $\bff$ is a function $U\to V$ where \begin{equation}\label{dr} U = \{(r,\theta) : r>0, |\theta|<\pi\},\qquad V := \R^2 \setminus \{ (x,0) : x\le 0\}. \end{equation} In this case, $\bff^{-1}:V\to U$ exists and is of class $C^1$. This follows from the implicit function theorem (see below).
Above: on the left, the set $U = \{(r,\theta) : r>0, |\theta|<\pi\}$
in the $r-\theta$ plane. On the right, $\theta = \,$constant
lines (in blue) and $r = \,$constant
curves (in red) in the $x-y$ plane.
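For this choice of $U$ and $V$, the inverse can be written down concretely: Python's `math.atan2` returns the angle in $(-\pi,\pi]$, which agrees with our choice of $U$ away from the negative $x$-axis. A round-trip sketch (sample points arbitrary):

```python
import math

def f(r, theta):
    """Polar-to-Cartesian map on U = {r > 0, |theta| < pi}."""
    return (r * math.cos(theta), r * math.sin(theta))

def f_inv(x, y):
    """Inverse on V = R^2 minus the non-positive x-axis.
    math.atan2 returns an angle in (-pi, pi], matching U."""
    return (math.hypot(x, y), math.atan2(y, x))

# Check f_inv(f(r, theta)) == (r, theta) at a few points of U:
for r, theta in [(1.0, 0.5), (2.5, -3.0), (0.1, 3.1)]:
    x, y = f(r, theta)
    r2, t2 = f_inv(x, y)
    assert abs(r2 - r) < 1e-12 and abs(t2 - theta) < 1e-12
print("f_inv inverts f on U")
```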
Spherical coordinates $(r,\theta,\varphi)$ are related to cartesian coordinates $(x,y,z)$ by $$ \left( \begin{array}{c} x\\y\\z \end{array} \right) \ = \ \left( \begin{array}{c} r\cos\theta\sin\varphi\\ r\sin\theta\sin\varphi\\ r\cos\varphi \end{array} \right) = \bff(r,\theta,\varphi). $$ See the practice problems.
As above, if we want $\bff$ to be a bijection between open sets $U$ and $V$, it is necessary to restrict the domain and range in some appropriate way.
Cylindrical coordinates $(r,\theta, z)$ are related to cartesian coordinates $(x,y,z)$ by $$ \left( \begin{array}{c} x\\y\\z \end{array} \right) \ = \ \left( \begin{array}{c} r\cos\theta\\ r\sin\theta\\ z\end{array} \right) = \bff(r,\theta,z). $$ These are very closely related to polar coordinates in $\R^2$.
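Since cylindrical coordinates simply append the $z$-axis to polar coordinates in the plane, $D\bff$ is block diagonal and $\det D\bff = r$, just as for polar coordinates. A small numerical sketch (the finite-difference helper and sample point are our own additions, not part of the notes):

```python
import math

def f(r, theta, z):
    """Cylindrical-to-Cartesian map."""
    return (r * math.cos(theta), r * math.sin(theta), z)

def jacobian(g, p, h=1e-6):
    """Rough finite-difference Jacobian matrix of g at the point p."""
    n = len(p)
    J = [[0.0] * n for _ in range(n)]
    gp = g(*p)
    for j in range(n):
        q = list(p)
        q[j] += h
        gq = g(*q)
        for i in range(n):
            J[i][j] = (gq[i] - gp[i]) / h
    return J

def det3(J):
    a, b, c = J[0]
    d, e, f_ = J[1]
    g, h_, i = J[2]
    return a * (e * i - f_ * h_) - b * (d * i - f_ * g) + c * (d * h_ - e * g)

r, theta, z = 2.0, 0.7, -1.0
J = jacobian(f, (r, theta, z))
# det Df = r, exactly as for polar coordinates in the plane:
assert abs(det3(J) - r) < 1e-4
print("det Df is approximately r")
```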
The following theorem tells us when a transformation of class $C^1$ has a local inverse of class $C^1$.
Theorem 1: the Inverse Function Theorem Let $U$ and $V$ be open sets in $\R^n$, and assume that $\bff:U\to V$ is a mapping of class $C^1$.
Assume that $\bfa\in U$ is a point such that \begin{equation}\label{inv.hyp} D\bff(\bfa) \mbox{ is invertible}, \end{equation} and let $\bfb := \bff(\bfa)$. Then there exist open sets $M\subset U$ and $N\subset V$, with $\bfa\in M$ and $\bfb\in N$, such that $\bff$ is a one-to-one map from $M$ onto $N$, and the inverse $\bff^{-1}:N\to M$ is of class $C^1$.
Moreover, if $\bfx\in M$ and $\bfy = \bff(\bfx)\in N$, then \begin{equation} D(\bff^{-1})(\bfy) = [D\bff(\bfx)]^{-1}. \label{Dfinv}\end{equation} In particular, $D(\bff^{-1})(\bfb ) = [D\bff(\bfa)]^{-1}$.
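Formula \eqref{Dfinv} can be tested numerically on the map $\bff(x,y) = (x^2-y^2, 2xy)$ of Example 2: in complex notation it is $w = z^2$, so near a point with $x>0$ a local inverse is the principal square root. The sketch below (base point chosen arbitrarily) compares a finite-difference Jacobian of the inverse with $[D\bff(\bfx)]^{-1}$:

```python
import cmath

# f(x, y) = (x^2 - y^2, 2xy) is w = z^2 in complex notation, so near a
# point with x > 0 its local inverse is the principal square root.
def f_inv(u, v):
    w = cmath.sqrt(complex(u, v))
    return (w.real, w.imag)

a = (2.0, 1.0)                                # base point x = a
b = (a[0] ** 2 - a[1] ** 2, 2 * a[0] * a[1])  # b = f(a) = (3, 4)

# [Df(a)]^{-1} by hand, from Df = [[2x, -2y], [2y, 2x]]:
x, y = a
det = 4 * (x * x + y * y)
Dinv_formula = [[2 * x / det, 2 * y / det],
                [-2 * y / det, 2 * x / det]]

# Finite-difference Jacobian of f_inv at b:
h = 1e-6
p0 = f_inv(*b)
Dinv_numeric = [[0.0, 0.0], [0.0, 0.0]]
for j in range(2):
    q = list(b)
    q[j] += h
    p1 = f_inv(*q)
    for i in range(2):
        Dinv_numeric[i][j] = (p1[i] - p0[i]) / h

for i in range(2):
    for j in range(2):
        assert abs(Dinv_numeric[i][j] - Dinv_formula[i][j]) < 1e-4
print("D(f_inv)(b) matches [Df(a)]^{-1}")
```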
Remarks. Many things that we have said in Section 3.1 about the Implicit Function Theorem also apply, with some modifications, to the Inverse Function Theorem. For example:
In general, a nonlinear system of equations $\bff(\bfx) = \bfy$ cannot be solved by hand. It may be very difficult even to know if any solution exists. The Inverse Function Theorem says: if we know that $\bff(\bfa) = \bfb$, then for $\bfy$ near $\bfb$, the solvability of the nonlinear system can be established by considering a much easier question about linear algebra (i.e., whether the matrix $D\bff(\bfa)$ is invertible).
For important and frequently-seen transformations, there are often explicit formulas for the inverse, so the Inverse Function Theorem, which guarantees the existence of an inverse without telling us what it is, may not seem very useful in these situations.
For example, this is the case for the transformation
$\bff:U\to V$ that defines polar coordinates,
see \eqref{pc} and \eqref{dr} above. For the $U$ and
$V$ that we have chosen,
we can write $(r,\theta) = \bff^{-1}(x,y)$ --- that is, we can
write $r$ and $\theta$ as functions of $x$ and $y$, in a way that
inverts
$\bff:U\to V$ --- as follows:
\begin{equation}\label{rtheta}
r = \sqrt{x^2+y^2}\mbox{ for all }(x,y)\in V,
\qquad
\theta =
\begin{cases}
{-\pi+\tan^{-1}(y/x)} &\mbox{ if }x<0\mbox{ and }y<0\\
{-\frac\pi 2} &\mbox{ if }x=0\mbox{ and }y<0\\
{\tan^{-1}(y/x)} &\mbox{ if }x>0\\
{\frac\pi 2} &\mbox{ if }x=0\mbox{ and }y>0\\
{\pi+\tan^{-1}(y/x)} &\mbox{ if }x<0\mbox{ and }y>0
\end{cases}
\end{equation}
(This complicated formula would be much simpler if
we had chosen $U = \{(r,\theta) : r>0, |\theta|<\pi/2\}$ and
$V = \{ (x,y) : x>0\}$. Then we could simply write $\theta = \tan^{-1}(y/x)$.
This is often done, but it covers only half of
the $xy$ plane.)
The above formula can be found by elementary considerations, without using the Inverse Function Theorem. Is the Inverse Function Theorem unnecessary here?
Not entirely. The above formula is complicated, and it is hard to see from the formula whether $\theta$ is a differentiable function of $(x,y)$ at points where $x=0$, and if so, what $\partial_x\theta$ and $\partial_y\theta$ are. The easiest way to understand this is to use the Inverse Function Theorem, rather than trying to compute partial derivatives directly from the complicated formula for $\theta(x,y)$.
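Indeed, inverting $D\bff(r,\theta)$ gives the tidy formulas $\partial_x\theta = -y/(x^2+y^2)$ and $\partial_y\theta = x/(x^2+y^2)$, which make sense even when $x=0$. A quick numerical check (a Python sketch; `math.atan2` agrees with the piecewise formula above on all of $V$):

```python
import math

# By the Inverse Function Theorem, D(f^{-1})(x, y) = [Df(r, theta)]^{-1},
# which yields
#   d theta / dx = -y / (x^2 + y^2),   d theta / dy = x / (x^2 + y^2),
# valid everywhere in V, including on the y-axis where x = 0.
def theta(x, y):
    return math.atan2(y, x)   # agrees with the piecewise formula on V

x, y = 0.0, 2.0               # a point with x = 0, where the piecewise
h = 1e-6                      # formula is hardest to read
dth_dx = (theta(x + h, y) - theta(x - h, y)) / (2 * h)
dth_dy = (theta(x, y + h) - theta(x, y - h)) / (2 * h)

r2 = x * x + y * y
assert abs(dth_dx - (-y / r2)) < 1e-8
assert abs(dth_dy - (x / r2)) < 1e-8
print("theta is differentiable across x = 0, with the predicted partials")
```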
You have actually already proved the core of the Inverse Function Theorem in Homeworks 2 and 3. By the end of Homework 3, you proved the following:
Theorem 2. Assume that $\Sigma$ is an open subset of $\R^n$ that contains the origin, and that $\bfF:\Sigma\to \R^n$ is a function of class $C^1$ such that $$ \bfF({\bf 0}) = {\bf 0}, \quad | D\bfF({\bf 0}) - I|< 1. $$ Then there exist $r_0, r_1>0$ such that $$ \mbox{ if }\bfc \in B(r_1, {\bf 0}), \quad\mbox{ then there is a unique }\bfx\in B(r_0, {\bf 0}) \mbox{ such that }\bfF(\bfx) = \bfc. $$
The idea of the proof of the Inverse Function Theorem is to reduce it to the situation studied in Theorem 2. This involves some messing around with details, but it is easier than the proof of Theorem 2, which you found by yourself.
Sketch of the proof. First, a preliminary technical step. To make the conclusion of Theorem 2 look more like that of the Inverse Function Theorem, one can reformulate it slightly, to assert that there exist open sets $M_0, N_0\subset \R^n$, both containing the origin, such that $\bfF$ is a one-to-one map from $M_0$ onto $N_0$.
Now, to reduce the Inverse Function Theorem to the situation considered in Theorem 2, we want to start with $\bff$ satisfying the hypotheses of the Inverse Function Theorem, and modify it to get a function $\bfF$ satisfying the hypotheses of Theorem 2. To do this, let $A := D\bff(\bfa)$, and define \begin{equation}\label{F.def1} \bfF(\bfx) = A^{-1}\left[ \bff(\bfx + \bfa) - \bfb\right], \quad\mbox{domain of $\bfF :=$ } \{\bfx\in \R^n : \bfx+\bfa \in U\}. \end{equation} One can then check that $\bfF$ satisfies the hypotheses of Theorem 2, and by applying (the reformulated version of) Theorem 2 to $\bfF$, one can deduce that there exist sets $M\subset U$ and $N\subset V$ such that $\bff$ is one-to-one from $M$ onto $N$. Hence $\bff:M\to N$ is invertible.
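To see what \eqref{F.def1} does in a concrete case, here is a sketch using the map $\bff(x,y) = (x^2-y^2, 2xy)$ of Example 2 at the arbitrarily chosen base point $\bfa = (2,1)$: we check that $\bfF({\bf 0}) = {\bf 0}$ and that $D\bfF({\bf 0})$ is (numerically) the identity, so $\bfF$ comfortably satisfies the hypotheses of Theorem 2.

```python
# Check the construction F(x) = A^{-1} [ f(x + a) - b ] numerically on
# f(x, y) = (x^2 - y^2, 2xy) at the base point a = (2, 1).
def f(x, y):
    return (x * x - y * y, 2 * x * y)

a = (2.0, 1.0)
b = f(*a)                           # b = f(a) = (3, 4)
A = [[2 * a[0], -2 * a[1]],         # A = Df(a)
     [2 * a[1], 2 * a[0]]]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A_inv = [[A[1][1] / det, -A[0][1] / det],
         [-A[1][0] / det, A[0][0] / det]]

def F(x, y):
    """F(x) = A^{-1} [ f(x + a) - b ]."""
    u, v = f(x + a[0], y + a[1])
    u, v = u - b[0], v - b[1]
    return (A_inv[0][0] * u + A_inv[0][1] * v,
            A_inv[1][0] * u + A_inv[1][1] * v)

assert F(0.0, 0.0) == (0.0, 0.0)    # F(0) = 0
h = 1e-6                            # DF(0) should be the identity matrix:
for j in range(2):
    p = [0.0, 0.0]
    p[j] = h
    col = F(*p)
    for i in range(2):
        expected = 1.0 if i == j else 0.0
        assert abs(col[i] / h - expected) < 1e-4
print("F(0) = 0 and DF(0) is approximately I, as Theorem 2 requires")
```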
The differentiability of $\bff^{-1}$ is more difficult to establish. The proof, which we will omit, also establishes the validity of formula \eqref{Dfinv} for $D(\bff^{-1})$. Alternatively, if we already somehow know that $\bff^{-1}$ is differentiable, we can use the chain rule to check that \eqref{Dfinv} holds. (In fact, you did this in Test 2.) $\quad \Box$
Finally, we discuss the proof of the Implicit Function Theorem. Our idea will be to reduce it to the Inverse Function Theorem. First, let's recall the statement of the theorem:
Implicit Function Theorem. Assume that $S$ is an open subset of $\R^{n+k}$ and that $\bfF:S\to \R^k$ is a function of class $C^1$. Assume also that $(\bfa, \bfb)$ is a point in $S$ such that $$ \bfF(\bfa, \bfb) = {\bf 0} \qquad\mbox{ and } \qquad \det D_\bfy \bfF(\bfa, \bfb) \ne 0. $$
i. Then there exist $r_0,r_1>0$ such that for every $\bfx\in \R^n$ such that $|\bfx-\bfa|< r_0$, there exists a unique $\bfy\in \R^k$ such that $|\bfy - \bfb|< r_1$ and \begin{equation}\label{ImFT.eq1} \bfF(\bfx, \bfy) = \bf0. \end{equation} In other words, equation \eqref{ImFT.eq1} implicitly defines a function $\bfy = \bff(\bfx)$ for $\bfx\in \R^n$ near $\bfa$, with $\bfy = \bff(\bfx)$ close to $\bfb$. Note in particular that $\bff(\bfa) = \bfb$.
ii. Moreover, the function $\bff:B(r_0, \bfa)\to B(r_1,\bfb)\subset \R^k$ from part (i) above is of class $C^1$, and its derivatives may be determined by implicit differentiation.
Sketch of the Proof. Assume that $\bfF$ satisfies the hypotheses of the Implicit Function Theorem, and define $\bfG:S\to \R^{n+k}$ by $$ \bfG(\bfx, \bfy) = (\bfx, \bfF(\bfx, \bfy)). $$ Recalling that all vectors are column vectors by default, this means that \begin{equation}\label{G.def} \bfG(\bfx, \bfy) = \left( \begin{array}{c} x_1\\ \vdots\\ x_n\\ F_1(\bfx, \bfy)\\ \vdots\\ F_k(\bfx, \bfy) \end{array} \right) . \end{equation}
Claim 1. $\det D\bfG(\bfa, \bfb) = \det D_y\bfF(\bfa, \bfb)$.
This is essentially a linear algebra exercise. Note that $$ D\bfG \ = \ \left( \begin{array}{ccccccc} 1&0&\cdots&0&0&\cdots &0\\ 0&1&\cdots&0&0&\cdots &0\\ \vdots&\vdots&\ddots&\vdots&\vdots& &\vdots\\ 0&0&\cdots&1&0&\cdots &0\\ \partial_{x_1}F_1&\partial_{x_2} F_1&\cdots&\partial_{x_n}F_1&\partial_{y_1}F_1&\cdots &\partial_{y_k}F_1 \\ \vdots &\vdots& &\vdots&\vdots& &\vdots\\ \partial_{x_1}F_k&\partial_{x_2} F_k&\cdots&\partial_{x_n}F_k&\partial_{y_1}F_k&\cdots &\partial_{y_k}F_k \end{array} \right) . $$
Here $D\bfG$ denotes the $(n+k)\times (n+k)$ matrix of derivatives of all components of $\bfG$ with respect to all variables $x_1,\ldots, x_n, y_1,\ldots, y_k$. If you are really good at linear algebra, you can stare at this monster and see why Claim 1 is true. If you want to write a detailed proof, induction on $n$ is an option.
Thus our assumption $ \det D_y\bfF(\bfa, \bfb)\ne 0$ implies that $\det D\bfG(\bfa, \bfb) \ne 0 $. Therefore, according to the Inverse Function Theorem, there are open sets $M\subset S$ and $N\subset \R^{n+k}$ such that $(\bfa, \bfb)\in M$ and $\bfG:M\to N$ is invertible, with inverse of class $C^1$.
Next, for $\bfx$ such that $(\bfx, {\bf 0})\in N$, define $\bff(\bfx)$ by $$ \bfG^{-1}(\bfx, {\bf 0}) = (\bfx, \bff(\bfx)) = \left( \begin{array}{c} x_1\\ \vdots\\ x_n\\ f_1(\bfx) \\ \vdots\\ f_k(\bfx) \end{array} \right) . $$ This $\bff$ turns out to be the implicit function whose existence we are trying to prove. Its definition says that $$ \bfy = \bff(\bfx) \qquad \iff\qquad (\bfx,\bfy)\in M\mbox{ and }\bfG(\bfx, \bfy) = (\bfx , {\bf 0}). $$ And of course $\bfG(\bfx, \bfy) = (\bfx, {\bf 0 })$ is equivalent to $\bfF(\bfx, \bfy) = \bf 0$.
Since $\bfG^{-1}$ is $C^1$, the same is true of $\bff$.
To complete the proof, it is still necessary to worry about a few details related to the choice of $r_0,r_1$ in the conclusion of the theorem, but what we have said above is the main point. $\quad \Box$
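Here is a sketch of the construction on a concrete example of our own choosing: $\bfF(x,y) = x^2+y^2-1$ near $(\bfa,\bfb) = (0,1)$, where the implicit function is $f(x) = \sqrt{1-x^2}$. Inverting $\bfG(x,y) = (x, F(x,y))$ numerically (here by Newton's method in $y$) and reading off the last slot of $\bfG^{-1}(x,{\bf 0})$ recovers $f$:

```python
import math

# Illustrative example: F(x, y) = x^2 + y^2 - 1 near (a, b) = (0, 1),
# whose implicit function is f(x) = sqrt(1 - x^2).
def G(x, y):
    return (x, x * x + y * y - 1.0)

def G_inv(u, w):
    """Invert G near (0, 1): the first coordinate is free; solve the
    second equation, w = x^2 + y^2 - 1, for y by Newton's method."""
    x, y = u, 1.0                  # start near b = 1
    for _ in range(50):
        val = x * x + y * y - 1.0 - w
        y -= val / (2.0 * y)       # Newton step in y only
    return (x, y)

for x in [-0.5, 0.0, 0.3]:
    _, fx = G_inv(x, 0.0)          # f(x) := last slot of G^{-1}(x, 0)
    assert abs(fx - math.sqrt(1.0 - x * x)) < 1e-10
    assert abs(G(x, fx)[1]) < 1e-10   # i.e. F(x, f(x)) = 0
print("G^{-1}(x, 0) = (x, f(x)) recovers the implicit function")
```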
Let
$$
\binom u v = \bff(x,y) = \binom{f_1(x,y)}{f_2(x,y)},
\quad\mbox{ where }\quad f_1(x,y) = y-\frac 1 3x^3 \mbox{ and } f_2(x,y) = ye^x.
$$
The picture below
shows level sets of $f_1$ (in blue) and $f_2$ (in red).
Thus, the red and blue curves show points in the $xy$ plane that
correspond to a regular grid of horizontal (red) and vertical (blue) lines in the $uv$ plane.
Same questions as in the previous problem, for
$$
\binom u v = \bff(x,y) = \binom{f_1(x,y)}{f_2(x,y)},
\quad\mbox{ where }\quad
\begin{array}{rl}
f_1(x,y) &= y+ 2 y^3 - xy^2+x^2y-x^3 \\
f_2(x,y) &= y+x^3.
\end{array}
$$
with some level sets of $f_1$ (blue) and $f_2$ (red) pictured below.
Hint: For part (c), it is a good idea to look at the sign of $\partial_i f_j$ for $i,j=1,2$ (that is, the components of the matrix $D\bff$).
Same questions as in the previous problems, except that this time you should also sketch a picture showing some level curves $u = $ constant and $v= $ constant, for $$ \binom u v = \bff(x,y) = \binom{y-x^2}{\frac y{1+x^2}}. $$ You might like to use different colors for level sets of $u$ and $v$.
For the functions $r(x,y)$ and $\theta(x,y)$ defined in \eqref{rtheta}, check that they are both $C^1$ functions of $(x,y)$ everywhere in their domain, and compute all partial derivatives $$ \left( \begin{array}{cc} \partial_x r & \partial_y r\\ \partial_x \theta & \partial_y \theta \end{array}\right). $$ Do this by using the Inverse Function Theorem and the easily-differentiated expressions for $(x,y)$ as functions of $(r,\theta)$. You may accept it as true, without verifying in detail, that the functions defined in \eqref{rtheta} are indeed $\bff^{-1}(x,y)$, where $\bff(r,\theta)$ is defined in \eqref{pc}, \eqref{dr}.
Let $$ \bff(x,y) = \binom{e^x(y^2-3x+1)}{x\ln(y^2+1)+y}, $$ and note that $\bff(0,0) = (1,0)$.
Let $\bff:\R^3\to \R^3$ be the function that defines spherical coordinates, $$ \bff(r,\theta,\varphi)\ = \ \left( \begin{array}{c} r\cos\theta\sin\varphi\\ r\sin\theta\sin\varphi\\ r\cos\varphi \end{array} \right) = \left( \begin{array}{c} x\\y\\z \end{array} \right) . $$
How can we restrict the domain of $\bff$ so that it is one-to-one, and so that its image covers all, or almost all, of $\R^3$?
Describe the surfaces in $\R^3$ that are images of planes $r=$ constant, $\theta=$ constant, $\varphi=$ constant. (These are level sets of the inverse function $(x,y,z)\mapsto (r,\theta,\varphi) = \bff^{-1}(x,y,z)$ .) If you like, you can restrict the domain of $\bff$ in a way that you have determined above.
Compute $D\bff$ and $\det D\bff$.
If we consider the domain of $\bff$ to be all of $\R^3$, find all points $(r,\theta, \varphi)$ at which $\det D\bff = 0$. Also find the image of these points in $xyz$ space.
Please make sure that you master the Basic Skills before looking at these.
Above, we have sketched a proof showing that the Implicit Function Theorem can be deduced from the Inverse Function Theorem.
Perhaps weirdly, it is also true that the Inverse Function Theorem can be deduced from the Implicit Function Theorem. Do this; that is, assume that you know the Implicit Function Theorem is true, and use it to prove the Inverse Function Theorem.
Hint.
The Inverse Function Theorem asks about the possibility of solving equations of the form $\bff(\bfx)= \bfy$ for $\bfx$ as a function of $\bfy$.
The Implicit Function Theorem guarantees that under certain hypotheses, you can solve equations of the form $\bfF(\bfx, \bfy) ={\bf 0}$ for $\bfy$ as a function of $\bfx$. (Note, if you want, you could swap the roles of $\bfx$ and $\bfy$ and solve $\bfF(\bfy, \bfx) = \bf 0 $ for $\bfx$ as a function of $\bfy$.)
What this suggests is: rewrite the equation $\bff(\bfx) = \bfy$ in the form $\bfF(\bfy, \bfx) = \bf0$ for a suitable function $\bfF$. You may be able to use the Implicit Function Theorem to say something about solvability.
Fill in some details in the proof of the Inverse Function Theorem as sketched above. For example, verify that the function $\bfF$ defined in \eqref{F.def1} satisfies the hypotheses of Theorem 2.
Fill in some details in the proof of the Implicit Function Theorem as sketched above. For example, verify that the function $\bfG$ defined in \eqref{G.def} satisfies
$\det D\bfG(\bfa, \bfb) = \det D_y\bfF(\bfa, \bfb)$, as we have claimed.