
Linear systems - Part A

There is only one 1-dimensional linear system, namely
\dot{x} = a x,
where a \in \R. The general solution is given by
x(t) = C e^{at}.
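As a quick numerical sanity check (a minimal sketch, not part of the original notes; the values of a and C below are arbitrary), the closed form can be compared with a general-purpose ODE solver:

    # Compare x(t) = C e^{a t} with a numerical solution of x' = a x.
    import numpy as np
    from scipy.integrate import solve_ivp

    a, C = -1.5, 2.0                                  # arbitrary sample values
    sol = solve_ivp(lambda t, x: a * x, (0.0, 3.0), [C], dense_output=True, rtol=1e-8)
    t = np.linspace(0.0, 3.0, 7)
    print(np.abs(sol.sol(t)[0] - C * np.exp(a * t)).max())   # tiny (solver tolerance)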
A general 2-dimensional linear system, that is, one in which both \dot{x}_1 and \dot{x}_2 are linear functions of x_1 and x_2 (with no constant term!), can be written as
\begin{cases} \dot{x}_1 = a x_1 + b x_2, \\ \dot{x}_2 = c x_1 + d x_2. \end{cases}
This can be conveniently written in the matrix form, that is
\dot{\bx} = \bmat{\dot{x}_1 \\ \dot{x}_2} = \bmat{a & b \\ c & d} \bmat{x_1 \\ x_2} = A \bx.
The matrix form for linear systems makes sense for any dimension.
Exercise 0.1.
How many equilibria does a linear system have? Can you describe them?

1. Two dimensional linear systems, the diagonalizable case

The simplest two-dimensional system is just two one-dimensional systems. That is
\dot{x}_1 = a x_1, \quad \dot{x}_2 = d x_2,\qquad (1.1)
or A = \bmat{a & 0 \\ 0 & d}. Since the equations of x_1 and x_2 do not affect each other (we say the variables are separated, or the system is decoupled), the solution is easy to find:
x_1(t) = C_1 e^{at}, \quad x_2(t) = C_2 e^{dt}.\qquad (1.2)
Even in this simple case, it is helpful to understand how the system depends on the parameters a and d (recall principle 4 of ODE) in the following cases:
  1. a < d < 0.
  2. a = d < 0.
  3. a < d = 0.
  4. a < 0 < d.
  5. a = 0 < d.
  6. 0 < a = d.
  7. 0 < a < d.
Remark 1.1.
We only consider the case a \le d, because if a \ge d we can switch x_1 and x_2 to get the same picture. When the general case reduces easily to a special case like this, we often say we treat the special case "without loss of generality".

1.1. The node

If a < d < 0, we have as t \to \infty:
x_1(t) = C_1 e^{at} \to 0, \quad x_2(t) = C_2 e^{dt} \to 0, \quad \frac{x_1(t)}{x_2(t)} = \frac{C_1}{C_2} e^{(a-d)t} \to 0 \quad (\text{for } C_2 \ne 0).
Therefore, any trajectory (x_1(t), x_2(t)) approaches 0 as t \to \infty, but it will lean towards the x_2 axis (as x_1 decreases much faster than x_2).
snode.svg

Figure 1. A stable node (a < d < 0).

This picture is called a stable node. The word stable comes from the observation that (0, 0) is a stable equilibrium.
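As a small numerical sketch of this leaning behaviour (my own illustration, assuming the explicit solution (1.2); the values of a, d, C_1, C_2 are arbitrary with a < d < 0 and C_2 \ne 0):

    # For a < d < 0 the ratio x1/x2 = (C1/C2) e^{(a-d)t} tends to 0:
    # trajectories flatten onto the x2 axis as they approach the origin.
    import numpy as np

    a, d = -2.0, -0.5                    # arbitrary values with a < d < 0
    C1, C2 = 1.0, 1.0                    # arbitrary constants, C2 != 0
    for t in [0.0, 1.0, 2.0, 4.0]:
        x1, x2 = C1 * np.exp(a * t), C2 * np.exp(d * t)
        print(t, x1, x2, x1 / x2)        # x1/x2 = e^{-1.5 t} -> 0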
The case a = d is special since
\frac{x_1(t)}{x_2(t)} = \frac{C_1}{C_2}
is a constant. All trajectories are straight lines. This is sometimes called a stable star.
Remark 1.2.
If we think of a and d as separate dials one can turn, then the case a = d is a coincidence: the condition is fragile, since a small turn of either dial destroys it. This type of condition is sometimes called degenerate.
On the other hand, if a < d, the condition is still satisfied if a and d are changed slightly. This is an open condition.
For the case 0 < a < d or 0 < a = d, we use a common trick called "time reversal". Recall that
\dot{x}_1 = a x_1, \quad \dot{x}_2 = d x_2.
Consider the new functions
y_1(t) = x_1(-t), \quad y_2(t) = x_2(-t)
then
\begin{aligned} \frac{dy_1}{dt}(t) & = \frac{d}{dt}\left( x_1(-t)\right) = (-1) \frac{dx_1}{dt}(-t) \\ & = \text{ (using the equation) } (-1) (a x_1(-t)) = - a x_1(-t) = - a y_1(t). \end{aligned}
Doing the same to the second equation, we get
\dot{y}_1 = - ay_1, \quad \dot{y}_2 = - d y_2.
Since -a, -d < 0, the system for y_1, y_2 is a stable node. The system for x_1, x_2 is therefore the time reversal of a stable node, called the unstable node.
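The time-reversal trick can also be checked numerically. Below is a minimal sketch (my own, with arbitrary 0 < a < d and an arbitrary initial condition): integrating \dot{\bx} = A\bx backward in time agrees with integrating \dot{\by} = -A\by forward in time.

    # Time reversal: if x(t) solves x' = A x, then y(t) = x(-t) solves y' = -A y.
    import numpy as np
    from scipy.integrate import solve_ivp

    a, d = 0.7, 1.3                                   # arbitrary, 0 < a < d
    A = np.diag([a, d])
    x0 = np.array([1.0, -2.0])                        # arbitrary initial condition

    # Solve x' = A x backward in time (t from 0 down to -2) ...
    back = solve_ivp(lambda t, x: A @ x, (0.0, -2.0), x0, dense_output=True, rtol=1e-9)
    # ... and y' = -A y forward in time (s from 0 up to 2).
    fwd = solve_ivp(lambda s, y: -A @ y, (0.0, 2.0), x0, dense_output=True, rtol=1e-9)

    s = np.linspace(0.0, 2.0, 5)
    print(np.abs(back.sol(-s) - fwd.sol(s)).max())    # essentially zero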

1.2. The degenerate case

Suppose a < d = 0. The solution is simply
x_1(t) = C_1 e^{at}, \quad x_2(t) = C_2.
Each solution converges to the point (0, C_2) on the x_2 axis, moving along the horizontal line x_2 = C_2. Every single point on the x_2 axis is an equilibrium.
snode2.svg

Figure 2. The degenerate case a < d = 0: every point on the x_2 axis is an equilibrium.

1.3. The saddle

Suppose a < 0 < d. We have as t \to \infty:
x_1(t) \to 0, \quad x_2(t) \to \pm \infty \text{ depending on the sign of } C_2,
and as t \to -\infty:
x_1(t) \to \pm \infty \text{ depending on the sign of } C_1, \quad x_2(t) \to 0.
saddle.svg

Figure 3. A saddle (a < 0 < d).

1.4. The diagonalizable case

Consider the general system \dot{\bx} = A\bx, and suppose that A has two linearly independent eigenvectors \bv_1 and \bv_2, i.e.
A \bv_1 = \lambda_1 \bv_1, \quad A \bv_2 = \lambda_2 \bv_2,\qquad (1.3)
\lambda_1, \lambda_2 \in \R. We will show that there exist solutions \bx(t) that are always parallel to \bv_1 or to \bv_2.
Consider the function
\bw(t) = f(t) \bv_1 \st \R \to \R^2,
where f(t) is a scalar function. We check under which condition \dot{\bw}(t) = A\bw(t) holds (i.e., when \bw solves the equation). Indeed, the left hand side is
\dot{\bw} = \frac{d}{dt} (f(t) \bv_1) = \dot{f}(t) \bv_1,
while the right hand side is
A \bw(t) = A(f(t) \bv_1) = f(t) A \bv_1 = \lambda_1 f(t) \bv_1.
Since \bv_1 \ne 0 (eigenvectors are nonzero by definition!)
\dot{\bw}(t) = A\bw(t) \quad \iff \quad \dot{f}(t) = \lambda_1 f(t).
The latter equation is easy to solve: f(t) = C_1 e^{\lambda_1 t}. We obtain:
C_1 e^{\lambda_1 t} \bv_1
is a solution to \dot{\bx} = A \bx. The same calculation works for \lambda_2 and \bv_2. To get all the solutions, we use the following result about linear systems:
Proposition 1.3.
If \bv(t), \bw(t) both solve the linear equation \dot{\bx} = A \bx, then so does any linear combination
C_1 \bv(t) + C_2 \bw(t).
Corollary 1.4.
Suppose (1.3) holds, then the general solution to \dot{\bx} = A \bx is given by
C_1 e^{\lambda_1 t} \bv_1 + C_2 e^{\lambda_2 t} \bv_2.\qquad (1.4)
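Corollary 1.4 is easy to check numerically. The sketch below is my own illustration (the matrix A and the initial condition are arbitrary, chosen so that the eigenvalues are real and distinct): the constants C_1, C_2 are read off from \bx(0) = C_1\bv_1 + C_2\bv_2, and (1.4) is compared with a direct numerical integration.

    # Build C1 e^{l1 t} v1 + C2 e^{l2 t} v2 and compare with a numerical solution.
    import numpy as np
    from scipy.integrate import solve_ivp

    A = np.array([[1.0, 2.0],
                  [3.0, 0.0]])               # arbitrary; eigenvalues 3 and -2
    lams, V = np.linalg.eig(A)               # columns of V are eigenvectors
    x0 = np.array([1.0, -1.0])               # arbitrary initial condition
    C = np.linalg.solve(V, x0)               # constants from x(0) = C1 v1 + C2 v2

    def exact(t):
        return V @ (C * np.exp(lams * t))    # formula (1.4) at time t

    num = solve_ivp(lambda t, x: A @ x, (0.0, 1.0), x0, rtol=1e-10, atol=1e-12)
    print(np.abs(num.y[:, -1] - exact(num.t[-1])).max())   # essentially zero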
The phase portrait in this case is very similar to the decoupled case (1.1), except that the axes are skewed: the roles of the x_1 and x_2 axes are now played by the lines along \bv_1 and \bv_2.
On the other hand, the equation (1.1) is a special case of (1.3), since the eigenvectors in (1.1) are simply
\bv_1 = \bmat{1 \\ 0}, \quad \bv_2 = \bmat{0 \\ 1}.

1.5. The fundamental set of solutions

The general solution (1.4) can also be written in the matrix form
\bmat{x_1(t) \\ x_2(t)} = \bmat{e^{\lambda_1 t} \bv_1 & e^{\lambda_2 t} \bv_2} \bmat{C_1 \\ C_2} = \bmat{\bv_1 & \bv_2} \bmat{e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t}} \bmat{C_1 \\ C_2}.
By plugging in t = 0, we have
\bx(0) = \bmat{x_1(0) \\ x_2(0)} = \bmat{\bv_1 & \bv_2} \bmat{C_1 \\ C_2},
or (write P = \bmat{\bv_1 & \bv_2})
\bmat{C_1 \\ C_2} = \bmat{\bv_1 & \bv_2}^{-1} \bx(0) = P^{-1} \bx(0).
Finally we obtain the formula
\bmat{x_1(t) \\ x_2(t)} = P \bmat{e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t}} P^{-1} \bx(0) = \bfM(t) \bx(0).\qquad (1.5)
The matrix \bfM(t) is called the matrix solution of the linear ODE.
In analogy with the solution
x(t) = e^{a t}x(0)
of the ODE
\dot{x} = ax,
we write the matrix solution as \bfM(t) = e^{t A}, so that the formula
\bx(t) = e^{t A} \bx(0)
is still valid. In fact, this is more than an analogy. Define
e^{tA} = I + t A + \frac{1}{2} t^2 A^2 + \cdots + \frac{1}{n!} t^n A^n + \cdots.
Theorem 1.5.
Suppose
A = P \bmat{\lambda_1 & 0 \\ 0 & \lambda_2} P^{-1},
where P = \bmat{\bv_1 & \bv_2} is the matrix formed by the linearly independent eigenvectors. Then
e^{tA} = P \bmat{e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t}} P^{-1}.
Proof.
Denote
D = \bmat{\lambda_1 & 0 \\ 0 & \lambda_2},
we first compute that
I + tD + \frac{t^2}{2!}D^2 + \cdots = \bmat{1 + t\lambda_1 + \frac{t^2}{2!}\lambda_1^2 + \cdots & 0 \\ 0 & 1 + t\lambda_2 + \frac{t^2}{2!}\lambda_2^2 + \cdots } = \bmat{e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t}}.
Then A = P D P^{-1}. Observe that
A^2 = P D P^{-1} P D P^{-1} = P D^2 P^{-1}, \cdots , A^n = P D^n P^{-1}, \cdots.
This means
\begin{aligned} & \quad I + t A + \frac{t^2}{2!} A^2 + \cdots = P I P^{-1} + t PDP^{-1} + \frac{t^2}{2!} PD^2P^{-1} + \cdots \\ & = P \left( I + tD + \frac{t^2}{2!} D^2 + \cdots \right) P^{-1} = P \bmat{e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t}}P^{-1}. \end{aligned}
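Theorem 1.5 can be verified numerically. A minimal sketch (my own; the diagonalizable matrix is arbitrary, and scipy's expm plays the role of e^{tA}): the eigenvector formula and a truncated power series should both agree with expm.

    # Check e^{tA} = P diag(e^{l1 t}, e^{l2 t}) P^{-1} against scipy and the series.
    import numpy as np
    from scipy.linalg import expm

    A = np.array([[1.0, 2.0],
                  [3.0, 0.0]])                    # arbitrary diagonalizable matrix
    t = 0.5
    lams, P = np.linalg.eig(A)
    via_eig = P @ np.diag(np.exp(lams * t)) @ np.linalg.inv(P)

    series, term = np.zeros_like(A), np.eye(2)
    for n in range(1, 30):                        # I + tA + (tA)^2/2! + ...
        series = series + term
        term = term @ (t * A) / n
    print(np.abs(expm(t * A) - via_eig).max())    # roundoff level
    print(np.abs(expm(t * A) - series).max())     # roundoff level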

2. Two dimensional system, complex eigenvalues

2.1. Complex eigenvectors of real matrices

Existence of eigenvalues and eigenvectors is an algebraic fact: it doesn't matter what number system you are using. If
A = \bmat{a & b \\ c & d}, \quad a, b, c, d \in \C
is a complex matrix, then you can solve for a complex eigenvector from the equation
A \bz = \lambda \bz, \quad \lambda \in \C, \, \bz \in \C^2.
However, what we are usually interested in is the complex eigenvalues/eigenvectors of a real matrix, i.e.
A = \bmat{a & b \\ c & d}, \quad a, b, c, d \in \R.
Linear algebra tells us that if \lambda is an eigenvalue, then
\det(A - \lambda I) = \det \bmat{a - \lambda & b \\ c & d - \lambda} = \lambda^2 - (a + d) \lambda + ad - bc = 0.\qquad (2.1)
Equation (2.1) has real coefficients, hence its solutions (when complex) must come in conjugate pairs:
\lambda, \bar{\lambda} = \alpha \pm i \beta.
(This is clear from the quadratic formula). In particular, if \lambda is not real (\beta \ne 0), then the two eigenvalues are different.

2.2. Linear ODEs with complex eigenvalues

Consider now the ODE
\dot{\bz} = A \bz,\qquad (2.2)
such that
\bz = \bz(t) \in \C^2
is complex, but both time t and the matrix A remain real. Write
\bz(t) = \bx(t) + i \by(t), \quad \bx, \by \in \R^2,
then
\dot\bz = \dot{\bx} + i \dot{\by}, \quad A \bz = A \bx + i A \by.
Comparing the real and imaginary parts, we get
\dot\bx = A \bx, \quad \dot{\by} = A \by
with both \bx(t), \by(t) \in \R^2 (real!). To summarize:
Theorem 2.1.
If A is real and \bz(t) \in \C^2 is a complex solution of (2.2), then the real and imaginary parts of \bz(t) are real solutions of (2.2).
We now assume that A has a complex eigenvalue \lambda = \alpha + i \beta and a complex eigenvector
\bv + i \bw, \quad \bv, \bw \in \R^2.
The analysis performed in section 1.4 still works, so
\bz(t) = e^{(\alpha + i \beta) t} (\bv + i \bw)
is a solution of (2.2). Now we expand:
\begin{aligned} \bz(t) & = e^{\alpha t} \left(\cos (\beta t) + i \sin(\beta t)\right) (\bv + i \bw) \\ & = e^{\alpha t} \left( \cos(\beta t) \bv - \sin(\beta t) \bw\right) + i e^{\alpha t} \left( \sin (\beta t) \bv + \cos(\beta t) \bw\right). \end{aligned}
The real part and imaginary part are both solutions:
\bx_1 = e^{\alpha t} \left( \cos(\beta t) \bv - \sin(\beta t) \bw\right), \quad \bx_2 = e^{\alpha t} \left( \sin (\beta t) \bv + \cos(\beta t) \bw\right).\qquad (2.3)
We can take linear combinations to get the general solution:
e^{\alpha t} \left( (C_1 \cos(\beta t) + C_2 \sin (\beta t)) \bv + (C_2 \cos(\beta t) - C_1 \sin(\beta t)) \bw \right).
We also have the matrix solution:
Proposition 2.2.
For a 2\times 2 matrix A, suppose the eigenvalues are \alpha \pm i\beta (\beta \ne 0) and the corresponding eigenvectors are \bv \pm i \bw. Then the matrix solution M(t) is
P \bmat{e^{\alpha t} \cos \beta t & e^{\alpha t}\sin \beta t \\ - e^{\alpha t} \sin \beta t & e^{\alpha t} \cos \beta t } P^{-1}, \quad \text{ where } P = \bmat{\bv & \bw}.
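Proposition 2.2 can be checked numerically as well. The sketch below is my own example (the matrix is arbitrary with non-real eigenvalues); \bv and \bw are extracted from the complex eigenvector returned by numpy, and the rotation formula is compared with scipy's matrix exponential.

    # Check M(t) = P [[e^{at} cos bt, e^{at} sin bt], [-e^{at} sin bt, e^{at} cos bt]] P^{-1},
    # where v + i w is an eigenvector for the eigenvalue alpha + i beta.
    import numpy as np
    from scipy.linalg import expm

    A = np.array([[1.0, -2.0],
                  [3.0,  0.0]])                   # arbitrary; eigenvalues 0.5 +- i sqrt(23)/2
    lams, Z = np.linalg.eig(A)
    k = np.argmax(lams.imag)                      # pick the eigenvalue with beta > 0
    alpha, beta = lams[k].real, lams[k].imag
    v, w = Z[:, k].real, Z[:, k].imag
    P = np.column_stack([v, w])

    t = 0.7
    R = np.exp(alpha * t) * np.array([[ np.cos(beta * t), np.sin(beta * t)],
                                      [-np.sin(beta * t), np.cos(beta * t)]])
    print(np.abs(P @ R @ np.linalg.inv(P) - expm(t * A)).max())   # roundoff level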

2.3. Classifying the phase portrait

To understand the solution, let's take only \bx_1 from (2.3) and also the special case that \bv = (1, 0), \bw = (0, 1):
\bx_1 = e^{\alpha t} \left( \cos(\beta t) \bmat{1 \\ 0} - \sin(\beta t) \bmat{ 0 \\ 1}\right) = e^{\alpha t}\bmat{\cos(\beta t) \\ - \sin(\beta t)}.
The vector part is a rotation on the unit circle. In the general case, the function
\cos(\beta t) \bv - \sin(\beta t) \bw
produces a "rotation" along an ellipse.
  • When \alpha = 0, the solution is a rotation along an ellipse. (Elliptic centre).
  • When \alpha > 0, the solution spirals to infinity in positive time (Unstable focus).
  • When \alpha < 0, the solution spirals to 0 in positive time (Stable focus). This is also the time-reversal of the \alpha > 0 case.
1-harmonic-phase.svg

Figure 4. Elliptic centre

sspiral.svg

Figure 5. A stable spiral
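To tie the cases of Sections 1 and 2 together, here is a rough classification sketch (my own addition; it only handles the generic cases discussed so far and uses a crude numerical tolerance):

    # Rough classification of the equilibrium at the origin from the eigenvalues of A.
    import numpy as np

    def classify(A, eps=1e-12):
        l1, l2 = np.linalg.eigvals(A)
        if abs(l1.imag) > eps:                    # complex pair alpha +- i beta
            a = l1.real
            if abs(a) < eps:
                return "elliptic centre"
            return "stable focus" if a < 0 else "unstable focus"
        l1, l2 = sorted([l1.real, l2.real])
        if l1 < 0 < l2:
            return "saddle"
        if l2 < 0:
            return "stable node"
        if l1 > 0:
            return "unstable node"
        return "degenerate (a zero eigenvalue)"

    print(classify(np.array([[-2.0, 0.0], [0.0, -0.5]])))   # stable node
    print(classify(np.array([[ 1.0, -2.0], [3.0,  0.0]])))  # unstable focus
    print(classify(np.array([[ 1.0,  2.0], [3.0,  0.0]])))  # saddle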

3. The Jordan block

We have discussed the case where A has two linearly independent real eigenvectors, and the case where A has no real eigenvector (but has two independent complex eigenvectors). The only remaining case is when A has only one independent real eigenvector. The prototypical example is
A = \bmat{\lambda & 1 \\ 0 & \lambda},
called a Jordan block. We check that the equation
(A - \lambda I) \bx = 0
has only one independent solution parallel to
\bv = \bmat{1 \\ 0}.
Luckily, the equation \dot{\bx} = A \bx is still solvable by standard techniques. Indeed, the equation reads
\dot{x}_1 = \lambda x_1 + x_2, \quad \dot{x}_2 = \lambda x_2.\qquad (3.1)
Compared to (1.1), we note that the equation for x_1 involves x_2, but the equation for x_2 does not involve x_1! We can solve for x_2 by itself; once x_2 is known as a function of t, we can plug it into the first equation and solve for x_1. This type of relation between variables is called a skew product in dynamical systems theory.
Let us solve (3.1) using this idea. From the second equation,
x_2(t) = x_2(0) e^{\lambda t}.
Plugging into the first equation, we get
\dot{x}_1 = \lambda x_1 + x_2(0) e^{\lambda t}.
This is a first-order inhomogeneous equation that can be solved using an integrating factor. Moving the \lambda x_1 term to the left and multiplying by the integrating factor e^{-\lambda t}, we get
e^{-\lambda t} \dot{x}_1 - \lambda e^{-\lambda t} x_1 = \frac{d}{dt} \left( e^{-\lambda t} x_1(t)\right) = x_2(0).
Integrate from 0 to t and simplify to get (do the calculations yourself, at least once!)
x_1(t) = x_1(0) e^{\lambda t} + x_2(0) t e^{\lambda t}.
Put back into vector form, we have
\bmat{x_1(t) \\ x_2(t)} = \bmat{x_1(0) e^{\lambda t} + x_2(0)\, t e^{\lambda t} \\ x_2(0) e^{\lambda t}} = x_1(0) e^{\lambda t} \bmat{1 \\ 0} + x_2(0) e^{\lambda t} \bmat{t \\ 1}.
Note that this is also the general solution with x_1(0), x_2(0) playing the role of the constants C_1, C_2.
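A minimal numerical sketch (my own check; \lambda and the initial condition below are arbitrary): the formula above should agree with the matrix exponential of the Jordan block.

    # Check x1(t) = x1(0) e^{lt} + x2(0) t e^{lt},  x2(t) = x2(0) e^{lt}
    # against the matrix exponential of the Jordan block.
    import numpy as np
    from scipy.linalg import expm

    lam = -0.8                                     # arbitrary eigenvalue
    A = np.array([[lam, 1.0],
                  [0.0, lam]])
    x0 = np.array([1.0, 2.0])                      # arbitrary initial condition
    t = 1.3
    formula = np.array([x0[0] * np.exp(lam * t) + x0[1] * t * np.exp(lam * t),
                        x0[1] * np.exp(lam * t)])
    print(np.abs(expm(t * A) @ x0 - formula).max())   # roundoff level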
Let's discuss the phase portrait:
degnode.svg

Figure 6. Phase portrait of the Jordan block system (3.1).

Let us now assume that
A \bv = \lambda \bv
but there isn't another linearly independent vector with this property. It turns out there always exists a generalized eigenvector \bw solving the equation
(A - \lambda I) \bw = \bv, \quad \text{ or } \quad A \bw = \lambda \bw + \bv.
Let us try to find a solution to \dot{\bx} = A \bx of the form
\bx(t) = f(t) \bv + g(t) \bw.
We have
\begin{aligned} \dot{\bx} & = \dot{f} \bv + \dot{g} \bw, \\ A \bx & = f(t) A \bv + g(t) A \bw = f(t) \lambda \bv + g(t) (\lambda \bw + \bv) \\ & = (\lambda f(t) + g(t)) \bv + \lambda g(t) \bw \end{aligned}
The equation \dot{\bx} = A \bx is equivalent to
\dot{f} = \lambda f + g, \quad \dot{g} = \lambda g,
identical to (3.1)! As a result
f(t) = C_1 e^{\lambda t} + C_2 t e^{\lambda t}, \quad g(t) = C_2 e^{\lambda t}.
The general solution when there is only one eigenvector is
\bx(t) = C_1 e^{\lambda t} \bv + C_2 e^{\lambda t} \left(t \bv + \bw \right).\qquad (3.2)
We can also solve for the matrix solution similarly to what we had before:
Proposition 3.1.
Suppose the 2\times 2 matrix A has a repeated eigenvalue \lambda, and \bv, \bw are linearly independent vectors that satisfy
A \bv = \lambda \bv, \quad (A - \lambda I) \bw = \bv,
then the matrix solution is
M(t) = P \bmat{e^{\lambda t} & t e^{\lambda t} \\ 0 & e^{\lambda t}} P^{-1}, \quad P = \bmat{\bv & \bw}.
Proof.
Rewrite (3.2) in the matrix form:
\bx(t) = \bmat{\bv & \bw} \bmat{e^{\lambda t} & t e^{\lambda t} \\ 0 & e^{\lambda t}} \bmat{C_1 \\ C_2}.
Plugging in t = 0, we get
\bx(0) = P \bmat{C_1 \\ C_2}, \text{ hence } \bmat{C_1 \\ C_2} = P^{-1} \bx(0).
We get
\bx(t) = P \bmat{e^{\lambda t} & t e^{\lambda t} \\ 0 & e^{\lambda t}} P^{-1} \bx(0).
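As a concluding numerical sketch of Proposition 3.1 (my own example; the matrix below is arbitrary but has the repeated eigenvalue 2 with a single eigenvector), \bv and \bw are written down by hand and the formula is compared with scipy's matrix exponential.

    # Check M(t) = P [[e^{lt}, t e^{lt}], [0, e^{lt}]] P^{-1} for a matrix with a
    # repeated eigenvalue and only one independent eigenvector.
    import numpy as np
    from scipy.linalg import expm

    A = np.array([[ 3.0, 1.0],
                  [-1.0, 1.0]])                # arbitrary; eigenvalue 2, repeated
    lam = 2.0
    v = np.array([1.0, -1.0])                  # (A - 2I) v = 0
    w = np.array([1.0,  0.0])                  # (A - 2I) w = v
    P = np.column_stack([v, w])

    t = 0.4
    J = np.array([[np.exp(lam * t), t * np.exp(lam * t)],
                  [0.0,             np.exp(lam * t)]])
    print(np.abs(P @ J @ np.linalg.inv(P) - expm(t * A)).max())   # roundoff level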