
$\renewcommand{\Re}{\operatorname{Re}}$ $\renewcommand{\Im}{\operatorname{Im}}$ $\newcommand{\erf}{\operatorname{erf}}$ $\newcommand{\dag}{\dagger}$ $\newcommand{\const}{\mathrm{const}}$ $\newcommand{\arcsinh}{\operatorname{arcsinh}}$

Chapter 10. Variational methods

10.1. Functionals, extrema and variations

  1. Functionals: definitions
  2. Variations of functionals
  3. Stationary points of functionals
  4. Extrema of functionals

Functionals: definitions

Definition 1. A functional is a map from some space of functions (or a subset of such a space) $\mathsf{H}$ to $\mathbb{R}$ (or $\mathbb{C}$): \begin{equation} \Phi: \mathsf{H}\ni u\mapsto \Phi[u]\in \mathbb{R}. \label{eq-10.1.1} \end{equation}

Remark 1. It is important that the argument is the whole function, not its value at some particular point!

Example 1.

  1. On the space $C(I)$ of continuous functions on the closed interval $I$ consider the functional $\Phi[u]=u(a)$, where $a\in I$ (the value at a fixed point);
  2. On $C(I)$ consider the functionals $\Phi[u]=\max_{x\in I} u(x)$, $\Phi[u]=\min_{x\in I} u(x)$ and $\Phi[u]=\max_{x\in I} |u(x)|$, $\Phi[u]=\min_{x\in I} |u(x)|$;
  3. On $C(I)$ consider $\Phi[u]=\int_I f(x) u(x)\,dx$, where $f$ is some fixed integrable function;
  4. On the space $C^1(I)$ of continuously differentiable functions on the closed interval $I$ consider the functional $\Phi[u]=u'(a)$.
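
These examples are easy to model in code: a functional is simply a routine whose argument is itself a function. Below is a minimal Python sketch; the names and the grid-based quadrature are our illustration choices, not part of the text.

```python
import numpy as np

# A functional is an ordinary Python function whose argument u is itself a function.
I = (0.0, 1.0)
grid = np.linspace(I[0], I[1], 10001)
quadrature = lambda vals: float((vals.sum() - 0.5 * (vals[0] + vals[-1])) * (grid[1] - grid[0]))

point_value       = lambda a: (lambda u: u(a))                           # item 1: Phi[u] = u(a)
max_abs           = lambda u: float(np.max(np.abs(u(grid))))             # item 2: Phi[u] = max |u|
weighted_integral = lambda f: (lambda u: quadrature(f(grid) * u(grid)))  # item 3: Phi[u] = int f u dx

u = np.sin                                   # the "argument" is the whole function u
print(point_value(0.5)(u))                   # u(0.5)
print(max_abs(u))                            # max of |sin x| on [0,1], i.e. sin 1
print(weighted_integral(lambda x: x)(u))     # int_0^1 x sin x dx, approximately 0.3012
```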

Definition 2.

  1. The sum of functionals $\Phi_1+\Phi_2$ is defined by $(\Phi_1+\Phi_2)[u]=\Phi_1[u]+\Phi_2[u]$;
  2. The product of a functional and a number $\lambda$: $\lambda \Phi$ is defined by $(\lambda \Phi)[u]=\lambda (\Phi[u])$;
  3. A function of functionals: $F(\Phi_1,\ldots,\Phi_s)$ is defined by $F(\Phi_1,\ldots,\Phi_s)[u]=F (\Phi_1 [u],\ldots,\Phi_s[u])$.
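
Continuing the sketch above, the operations of Definition 2 act on functionals pointwise. A hedged Python rendering (the helper names `func_sum`, `func_scale`, `func_compose` and the sample functionals are ours):

```python
def func_sum(Phi1, Phi2):
    """(Phi1 + Phi2)[u] = Phi1[u] + Phi2[u]."""
    return lambda u: Phi1(u) + Phi2(u)

def func_scale(lam, Phi):
    """(lam * Phi)[u] = lam * Phi[u]."""
    return lambda u: lam * Phi(u)

def func_compose(F, *Phis):
    """F(Phi_1, ..., Phi_s)[u] = F(Phi_1[u], ..., Phi_s[u])."""
    return lambda u: F(*(Phi(u) for Phi in Phis))

# Usage with two simple evaluation functionals (hypothetical example):
Phi1 = lambda u: u(0.0)                               # Phi1[u] = u(0)
Phi2 = lambda u: u(1.0)                               # Phi2[u] = u(1)
osc = func_compose(lambda a, b: b - a, Phi1, Phi2)    # F(Phi1, Phi2)[u] = u(1) - u(0)
print(func_sum(Phi1, Phi2)(lambda x: x**2), osc(lambda x: x**2))   # 1.0  1.0
```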

Definition 3. A functional $\Phi[u]$ is called linear if \begin{gather} \Phi [u+v]= \Phi[u]+\Phi [v],\label{eq-10.1.2}\\ \Phi [\lambda u]= \lambda \Phi[u] \label{eq-10.1.3} \end{gather} for all functions $u$, $v$ and scalars $\lambda$.

Remark 2. Linear functionals will be crucial in the definition of distributions later.

Exercise 1. Which functionals of Example 1 are linear?
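
One way to experiment with Exercise 1 is numerical: represent functions by their values on a grid and test (\ref{eq-10.1.2})-(\ref{eq-10.1.3}) on random samples. The sketch below only suggests an answer, it proves nothing; the grid size, tolerance and the particular functionals are our assumptions.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 2001)          # grid on I = [0, 1]; u, v are grid values
dx = x[1] - x[0]

def point_value(u):       return float(np.interp(0.3, x, u))     # Phi[u] = u(0.3)
def max_abs(u):           return float(np.max(np.abs(u)))        # Phi[u] = max |u|
def weighted_integral(u): return float(np.dot(x, u)) * dx        # Phi[u] ~ int x u(x) dx

def looks_linear(Phi, trials=100, tol=1e-8):
    """Test additivity and homogeneity on random grid functions."""
    rng = np.random.default_rng(0)
    for _ in range(trials):
        u, v = rng.standard_normal((2, x.size))
        lam = float(rng.standard_normal())
        if abs(Phi(u + v) - Phi(u) - Phi(v)) > tol: return False
        if abs(Phi(lam * u) - lam * Phi(u)) > tol:  return False
    return True

for Phi in (point_value, max_abs, weighted_integral):
    print(Phi.__name__, looks_linear(Phi))
```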

Variations of functionals: 1-variable

We start from the classical variational problem for a single real-valued function $q(t)$ of $t\in [t_0,t_1]$; later we consider vector-valued functions. This leads to ODEs (or systems of ODEs) and rightfully belongs to an advanced ODE course.

Let us consider the functional \begin{equation} S[q]= \int_{I} L(q(t),\dot{q}(t),t)\,dt \label{eq-10.1.4} \end{equation} where, in the tradition of Lagrangian mechanics, we interpret $t\in I=[t_0,t_1]$ as time, $q(t)$ as a coordinate, and $\dot{q}(t):=q'_t(t)$ as a velocity.
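
For concreteness, such an action is easy to evaluate numerically. A sketch, in which the Lagrangian $L=\tfrac12\dot q^2-\tfrac12 q^2$ (a unit-mass harmonic oscillator) and the path $q=\sin t$ are our choices, made purely for illustration:

```python
import numpy as np

def action(q, qdot, L, t0=0.0, t1=np.pi, n=4001):
    """Approximate S[q] = integral of L(q(t), qdot(t), t) over [t0, t1] by the trapezoidal rule."""
    t = np.linspace(t0, t1, n)
    f = L(q(t), qdot(t), t)
    return (t[1] - t[0]) * (f.sum() - 0.5 * (f[0] + f[-1]))

# Sample Lagrangian (our choice): unit-mass harmonic oscillator, L = qdot^2/2 - q^2/2
L = lambda q, qdot, t: 0.5 * qdot**2 - 0.5 * q**2

print(action(np.sin, np.cos, L))   # S[sin] on [0, pi]; the exact value is 0
```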

Let us consider $q+\delta q$, where $\delta q$ is a "small" function. We do not formalize this notion; we simply regard $\delta q=\varepsilon \varphi$, with $\varphi$ fixed and $\varepsilon\to 0$, as small. We call $\delta q$ a variation of $q$; the important point is that we change the function as a whole object. Let us consider

\begin{multline} \delta S:=S[q+\delta q]-S[q]= \int_I\Bigl(L(q+\delta q,\dot{q} + \delta \dot{q},t) -L(q,\dot{q},t)\Bigr)\,dt\\ \approx \int_I \Bigl(\frac{\partial L}{\partial q}\delta q + \frac{\partial L}{\partial \dot{q}}\delta \dot{q}\Bigr)\,dt\qquad \label{eq-10.1.5} \end{multline} where we kept only the part of the expression in parentheses that is linear in $\delta q$; if $\delta q=\varepsilon \varphi$ and all functions are sufficiently smooth, then $\approx$ means "equal modulo $o(\varepsilon)$ as $\varepsilon\to 0$".
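
One can check this numerically: for small $\varepsilon$ the ratio $\bigl(S[q+\varepsilon\varphi]-S[q]\bigr)/\varepsilon$ should approach the integral in (\ref{eq-10.1.5}). A sketch, where the Lagrangian, the path and the variation are all our choices:

```python
import numpy as np

t = np.linspace(0.0, np.pi, 4001)
dt = t[1] - t[0]
trap = lambda f: dt * (f.sum() - 0.5 * (f[0] + f[-1]))   # trapezoidal rule on the grid

# Sample data (ours): L = qdot^2/2 - q^2/2, path q = t^2, variation phi = t(pi - t)
q,  qdot  = t**2,            2 * t
ph, phdot = t * (np.pi - t), np.pi - 2 * t

S = lambda Q, Qd: trap(0.5 * Qd**2 - 0.5 * Q**2)          # S[q] for this L
linear_part = trap((-q) * ph + qdot * phdot)              # int (L_q phi + L_qdot phidot) dt

for eps in (1e-1, 1e-2, 1e-3):
    dS = S(q + eps * ph, qdot + eps * phdot) - S(q, qdot)
    print(eps, dS / eps, linear_part)   # dS/eps approaches the linear part as eps -> 0
```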

Definition 4.

  1. The function $L$ is called the Lagrangian.
  2. The right-hand expression of (\ref{eq-10.1.5}), which is a linear functional with respect to $\delta q$, is called the variation of the functional $S$.

Assumption 1. All functions are sufficiently smooth.

Under this assumption, we can integrate the right-hand expression of (\ref{eq-10.1.5}) by parts: \begin{multline} \delta S= \int_I \Bigl(\frac{\partial L}{\partial q}\delta q + \frac{\partial L}{\partial \dot{q}}\delta \dot{q}\Bigr)\,dt\\ = \int_I\Bigl(\frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}} \Bigr)\delta q \,dt + \Bigl( \frac{\partial L}{\partial \dot{q}} \Bigr)\delta q \Bigr|_{t=t_0}^{t=t_1}.\qquad \label{eq-10.1.6} \end{multline}
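
The integration by parts, including the boundary term, can be verified symbolically for concrete data. A sketch with sympy; the Lagrangian, path and variation below are our choices, and the variation deliberately does not vanish at the endpoints so that the boundary term matters:

```python
import sympy as sp

t, qs, vs = sp.symbols('t q v')                             # qs, vs stand for q and qdot in L
L = sp.Rational(1, 2) * vs**2 - sp.Rational(1, 2) * qs**2   # sample Lagrangian (ours)

q  = t**2                          # a trial path (ours)
dq = 1 + t                         # a variation that does NOT vanish at the endpoints

on_path = {qs: q, vs: sp.diff(q, t)}
Lq  = sp.diff(L, qs).subs(on_path)                   # dL/dq along the path
Lqd = sp.diff(L, vs).subs(on_path)                   # dL/dqdot along the path

lhs = sp.integrate(Lq * dq + Lqd * sp.diff(dq, t), (t, 0, sp.pi))
rhs = (sp.integrate((Lq - sp.diff(Lqd, t)) * dq, (t, 0, sp.pi))
       + (Lqd * dq).subs(t, sp.pi) - (Lqd * dq).subs(t, 0))
print(sp.simplify(lhs - rhs))      # 0: both forms of delta S agree, boundary term included
```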

Stationary points of functionals

Definition 5. If $\delta S=0$ for all admissible variations $\delta q$, we call $q$ a stationary point (or extremal) of the functional $S$.

Remark 3.

  1. We consider $q$ as a point in the function space.
  2. In this definition we did not specify which variations are admissible. Let us consider as admissible all variations which are $0$ at both ends of $I$: \begin{equation} \delta q (t_0)=\delta q(t_1)=0. \label{eq-10.1.7} \end{equation} We will consider different admissible variations later.

In this framework \begin{equation} \delta S= \int_I\Bigl(\frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}} \Bigr)\delta q \,dt . \label{eq-10.1.8} \end{equation}

Lemma 1. Let $f$ be a continuous function on $I$. If $\int_I f(t)\varphi(t)\,dt=0$ for all continuous $\varphi$ with $\varphi(t_0)=\varphi(t_1)=0$, then $f=0$ on $I$.

Proof. Indeed, let us assume that $f(\bar{t})> 0$ at some point $\bar{t}\in I$ (the case $f(\bar{t})< 0$ is treated in the same way). Then $f(t)>0$ in some vicinity $\mathcal{V}$ of $\bar{t}$. Consider a continuous function $\varphi(t)$ which is $0$ outside of $\mathcal{V}$, $\varphi\ge 0$ in $\mathcal{V}$, and $\varphi(\bar{t})>0$. Then $f(t)\varphi(t)$ has the same properties and $\int_I f(t)\varphi(t)\, dt>0$. Contradiction!

As a corollary we arrive at

Theorem 2. Consider the functional (\ref{eq-10.1.4}) and consider as admissible all variations $\delta q$ satisfying (\ref{eq-10.1.7}). Then $q$ is a stationary point of $S$ if and only if it satisfies the Euler-Lagrange equation \begin{equation} \frac{\delta S}{\delta q}:= \frac{\partial L}{\partial q} - \frac{d}{dt} \left(\frac{\partial L}{\partial \dot{q}}\right) =0. \label{eq-10.1.9} \end{equation}
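
For a concrete Lagrangian, the left-hand side of (\ref{eq-10.1.9}) can be produced symbolically. A sketch with sympy for $L=\tfrac12 m\dot q^2-V(q)$; the Lagrangian is our choice, and the code simply implements the formula above:

```python
import sympy as sp

t = sp.symbols('t')
m = sp.symbols('m', positive=True)
a, b = sp.symbols('a b')           # placeholders for q and qdot inside L
q = sp.Function('q')
V = sp.Function('V')

# Sample Lagrangian (our choice): L(q, qdot) = m qdot^2/2 - V(q)
L = sp.Rational(1, 2) * m * b**2 - V(a)

on_path = {a: q(t), b: sp.diff(q(t), t)}
# Euler-Lagrange expression  dL/dq - d/dt (dL/dqdot), evaluated along the path q(t)
EL = sp.diff(L, a).subs(on_path) - sp.diff(sp.diff(L, b).subs(on_path), t)
print(EL)   # -V'(q(t)) - m*q''(t)   (sympy may display V' through a Subs object)
# sympy also provides sympy.calculus.euler.euler_equations for this computation.
```

Setting this expression to zero gives $m\ddot q=-V'(q)$, i.e. Newton's second law.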

Remark 4.

  1. Equation (\ref{eq-10.1.9}) is a second-order ODE. Indeed, \begin{gather*} \frac{d}{dt} \frac{\partial L}{\partial \dot{q}}= \frac{\partial ^2L}{\partial \dot{q}\partial t} +\frac{\partial ^2L}{\partial \dot{q}\partial q}\dot{q} + \frac{\partial ^2L}{\partial \dot{q}^2}\ddot{q}. \end{gather*}

  2. If $L_{q}=0$, then (\ref{eq-10.1.9}) integrates to \begin{equation} \frac{\partial L}{\partial \dot{q}}=C. \label{eq-10.1.10} \end{equation}

  3. The following equality holds: \begin{equation} \frac{d}{dt} \left(\frac{\partial L}{\partial \dot{q}}\dot{q}-L\right)=-\frac{\partial L}{\partial t}. \label{eq-10.1.11} \end{equation} The proof will be provided for vector-valued $\mathbf{q}(t)$.

  4. In particular, if $\frac{\partial L}{\partial t}=0$ ($L$ does not depend explicitly on $t$), then \begin{equation} \frac{d}{dt} \left(\frac{\partial L}{\partial \dot{q}}\dot{q}-L\right)=0\implies \frac{\partial L}{\partial \dot{q}}\dot{q}-L=C. \label{eq-10.1.12} \end{equation}
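
Statement 4 is easy to check symbolically on an example: for a $t$-independent Lagrangian, $\frac{\partial L}{\partial \dot{q}}\dot{q}-L$ is constant along any solution of (\ref{eq-10.1.9}). A sketch, where the Lagrangian $L=\tfrac12\dot q^2-\tfrac12 q^2$ and the solution $q=\sin t$ of its Euler-Lagrange equation $\ddot q+q=0$ are our choices:

```python
import sympy as sp

t, qs, vs = sp.symbols('t q v')
L = sp.Rational(1, 2) * vs**2 - sp.Rational(1, 2) * qs**2    # no explicit t-dependence (ours)

q = sp.sin(t)                                  # a solution of the EL equation q'' + q = 0
on_path = {qs: q, vs: sp.diff(q, t)}

H = (sp.diff(L, vs) * vs - L).subs(on_path)    # dL/dqdot * qdot - L along the solution
print(sp.simplify(H), sp.simplify(sp.diff(H, t)))   # 1/2  0 : constant, as (10.1.12) predicts
```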

Extrema of functionals

Definition 6. If $S[q]\ge S[q+\delta q]$ for all small admissible variations $\delta q$, we call $q$ a local maximum of the functional $S$. If $S[q]\le S[q+\delta q]$ for all small admissible variations $\delta q$, we call $q$ a local minimum of the functional $S$.

Here again we do not specify what a small admissible variation is.

Theorem 3. If $q$ is a local extremum (that is, either a local minimum or a local maximum) of $S$ and the variation $\delta S$ exists, then $q$ is a stationary point.

Proof. Consider the case of a minimum and suppose $q$ is not a stationary point, i.e. $(\delta S)(\varphi)\ne 0$ for some admissible $\varphi$. Let $\delta q =\varepsilon \varphi$. Then $S[q+\delta q]- S [q]=\varepsilon (\delta S)(\varphi) +o(\varepsilon)$. Choosing $\varepsilon$ small and of sign opposite to that of $(\delta S)(\varphi)$, we get $\varepsilon (\delta S)(\varphi)\le -2\sigma |\varepsilon|$ with $\sigma=\frac12|(\delta S)(\varphi)|>0$. Meanwhile, for sufficiently small $|\varepsilon|$ the term $o(\varepsilon)$ is much smaller, so $S [q+\delta q]- S [q]\le -\sigma |\varepsilon|<0$ and $q$ is not a local minimum. Contradiction.

Remark 5.

  1. We used the fact that $\varepsilon$ can take both signs. If $\varepsilon$ can be only positive (or only negative), then the conclusion fails.
  2. We consider neither sufficient conditions for extrema nor second variations (analogous to second differentials). In some cases extremality will be obvious.

Example 2. Find the curve of fastest descent from a fixed point $A$ to a fixed point $B$ (the brachistochrone problem).

Solution. Assume that the speed at $A=(0,0)$ is $0$ and measure $y$ downward from $A$. By conservation of energy, $\tfrac12 v^2=gy$, so the speed at a point $(x,y)$ on the curve is $v=\sqrt{2gy}$. Since the element of arc length is $ds=\sqrt{1+y'^2}\,dx$ and $dt=ds/v$, the time of descent from $A$ to $B=(a,h)$, $h>0$, is \begin{gather*} T[y]=\int_0^a \frac {\sqrt{1+y'^2}}{\sqrt{2gy}}\,dx. \end{gather*}

[Figure 10.1.1]
Since $L(y,y'):= \dfrac {\sqrt{1+y'^2}}{\sqrt{2gy}}$ does not depend on $x$, we use (\ref{eq-10.1.12}) (with $x$ in the role of $t$): \begin{multline*} \frac{\partial L}{\partial y'}\, y' - L =-\frac {1}{\sqrt{1+y'^2}\,\sqrt{2gy}}=C \implies y'= \sqrt{\frac{D-y}{y}}\implies dx= \sqrt{\frac{y}{D-y}}\,dy \end{multline*} with constants $C$ and $D=(2gC^2)^{-1}$.

This equation has the solution \begin{align*} x= &r\bigl(\varphi - \sin\varphi\bigr),\\ y=&r\bigl(1-\cos\varphi\bigr) \end{align*} with $r=D/2$.
So the brachistochrone is an arc of an inverted cycloid, the curve traced by a point on a circle as it rolls along a straight line without slipping.
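
As a sanity check, one can verify symbolically that this parametrization solves $dx=\sqrt{y/(D-y)}\,dy$ with $D=2r$. A short sketch (our code); we compare squares to avoid branch issues with the square root, which is legitimate for $0\le\varphi\le\pi$, where both sides are nonnegative:

```python
import sympy as sp

phi, r = sp.symbols('phi r', positive=True)

x = r * (phi - sp.sin(phi))        # cycloid parametrization from the solution above
y = r * (1 - sp.cos(phi))
D = 2 * r

# The cycloid should satisfy dx/dphi = sqrt(y/(D-y)) * dy/dphi
lhs = sp.diff(x, phi)
rhs = sp.sqrt(y / (D - y)) * sp.diff(y, phi)
print(sp.simplify(lhs**2 - rhs**2))    # 0, confirming the identity up to sign
```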

