$\renewcommand{\Re}{\operatorname{Re}}$ $\renewcommand{\Im}{\operatorname{Im}}$ $\newcommand{\erf}{\operatorname{erf}}$ $\newcommand{\dag}{\dagger}$ $\newcommand{\const}{\mathrm{const}}$ $\newcommand{\arcsinh}{\operatorname{arcsinh}}$
Definition 1. A functional is a map from some space of functions (or a subset of such a space) $\mathsf{H}$ to $\mathbb{R}$ (or $\mathbb{C}$): \begin{equation} \Phi: \mathsf{H}\ni u\to \Phi[u]\in \mathbb{R}. \label{eq-10.1.1} \end{equation}
Remark 1. It is important that we consider the whole function as the argument, not its value at some particular point!
Example 1.
Definition 2.
Definition 3. A functional $\Phi[u]$ is called linear if \begin{gather} \Phi [u+v]= \Phi[u]+\Phi [v],\label{eq-10.1.2}\\ \Phi [\lambda u]= \lambda \Phi[u] \label{eq-10.1.3} \end{gather} for all functions $u$, $v$ and scalars $\lambda$.
Remark 2. Linear functionals will be crucial in the definition of distributions later.
Exercise 1. Which functionals of Example 1 are linear?
Let us consider the functional \begin{equation} \Phi[u]= \iiint_\Omega L(x, u,\nabla u)\,dx \label{eq-10.1.4} \end{equation} where $\Omega$ is an $n$-dimensional domain and $L$ is some function of $2n+1$ variables (the $n$ components of $x$, the value $u$, and the $n$ components of $\nabla u$). Let us consider $u+\delta u$ where $\delta u $ is a "small" function. We do not formalize this notion; we just regard $\varepsilon \phi$ with fixed $\phi$ and $\varepsilon\to 0$ as small. We call $\delta u$ a variation of $u$; what is important is that we change the function as a whole object. Let us consider \begin{multline} \Phi[u+\delta u]-\Phi[u]= \iiint_\Omega \Bigl(L(x,u+\delta u,\nabla u +\nabla \delta u)-L(x, u,\nabla u) \Bigr)\,dx\\ \approx \iiint_\Omega \Bigl(\frac{\partial L}{\partial u}\delta u +\sum_{1\le j\le n} \frac{\partial L}{\partial u_{x_j}}\delta u_{x_j} \Bigr)\,dx\qquad \label{eq-10.1.5} \end{multline} where we calculated the linear part of the expression in the parentheses; if $\delta u=\varepsilon \phi$ and all functions are sufficiently smooth then $\approx$ means "equal modulo $o(\varepsilon)$ as $\varepsilon\to 0$".
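To see what the linear part looks like in a concrete case, here is a small worked illustration. The one-dimensional Lagrangian $L=\frac{1}{2}\bigl(u_x^2+u^2\bigr)$ on $\Omega=(0,1)$ is chosen here purely for illustration and does not come from the text above. For $\delta u=\varepsilon\phi$ the difference can be expanded exactly: \begin{equation*} \Phi[u+\varepsilon\phi]-\Phi[u]= \varepsilon\int_0^1 \bigl(u\,\phi+u_x\phi_x\bigr)\,dx +\frac{\varepsilon^2}{2}\int_0^1\bigl(\phi^2+\phi_x^2\bigr)\,dx. \end{equation*} The first term is the linear part, in agreement with (\ref{eq-10.1.5}) since here $\partial L/\partial u=u$ and $\partial L/\partial u_x=u_x$; the second term is $O(\varepsilon^2)$ and hence $o(\varepsilon)$ as $\varepsilon\to 0$.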
Definition 4.
Assumption 1. All functions are sufficiently smooth.
Under this assumption, we can integrate the right-hand expression of (\ref{eq-10.1.5}) by parts: \begin{multline} \delta \Phi:= \iiint_\Omega \Bigl(\frac{\partial L}{\partial u}\delta u +\sum_{1\le j\le n} \frac{\partial L}{\partial u_{x_j}}\delta u_{x_j} \Bigr)\,dx\\ = \iiint_\Omega\Bigl(\frac{\partial L}{\partial u} - \sum_{1\le j\le n} \frac{\partial\ }{\partial x_j} \frac{\partial L}{\partial u_{x_j}} \Bigr)\delta u \,dx - \iint_{\partial \Omega} \Bigl(\sum_{1\le j\le n} \frac{\partial L}{\partial u_{x_j}}\nu_j \Bigr)\delta u \,d\sigma\qquad \label{eq-10.1.6} \end{multline} where $d\sigma$ is the area element and $\nu$ is the unit interior normal to $\partial \Omega$.
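The integration by parts behind (\ref{eq-10.1.6}) can be spelled out as follows. This is only a sketch under Assumption 1, and the shorthand $F_j:=\partial L/\partial u_{x_j}$, evaluated at $(x,u(x),\nabla u(x))$, is introduced here for brevity. By the product rule \begin{equation*} F_j\,\delta u_{x_j}= \frac{\partial\ }{\partial x_j}\bigl(F_j\,\delta u\bigr) - \frac{\partial F_j}{\partial x_j}\,\delta u, \end{equation*} where $\partial/\partial x_j$ acts on the composite function $x\mapsto F_j(x,u(x),\nabla u(x))$. By the divergence theorem, written with the interior normal $\nu$ (which is where the minus sign comes from), \begin{equation*} \iiint_\Omega \sum_{1\le j\le n}\frac{\partial\ }{\partial x_j}\bigl(F_j\,\delta u\bigr)\,dx = -\iint_{\partial\Omega}\Bigl(\sum_{1\le j\le n} F_j\nu_j\Bigr)\delta u\,d\sigma. \end{equation*} Summing the first identity over $j$, integrating over $\Omega$ and substituting the second identity yields the volume and boundary terms on the right-hand side of (\ref{eq-10.1.6}).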
Definition 5. If $\delta \Phi=0$ for all admissible variations $\delta u$ we call $u$ a stationary point or extremal of the functional $\Phi$.
Remark 3.
In this framework the boundary integral in (\ref{eq-10.1.6}) vanishes and \begin{equation} \delta \Phi= \iiint_\Omega\Bigl(\frac{\partial L}{\partial u} - \sum_{1\le j\le n} \frac{\partial\ }{\partial x_j} \frac{\partial L}{\partial u_{x_j}} \Bigr)\delta u \,dx . \label{eq-10.1.8} \end{equation}
Lemma 1. Let $f$ be a continuous function in $\Omega$. If $\iiint_\Omega f(x)\phi(x)\,dx=0$ for all $\phi$ such that $\phi|_{\partial \Omega}=0$ then $f=0$ in $\Omega$.
Proof. Indeed, let us assume that $f(\bar{x})> 0$ at some point $\bar{x}\in \Omega$ (the case $f(\bar{x})< 0$ is analyzed in the same way). Then, by continuity, $f(x)>0$ in some vicinity $\mathcal{V}\subset\Omega$ of $\bar{x}$. Consider a continuous function $\phi(x)$ which is $0$ outside of $\mathcal{V}$, with $\phi\ge 0$ in $\mathcal{V}$ and $\phi(\bar{x})>0$. Then $f(x)\phi(x)$ has the same properties and therefore $\iiint_{\Omega} f(x)\phi(x)\, dx>0$. Contradiction!
As a corollary we arrive at
Theorem 1. Let us consider the functional (\ref{eq-10.1.4}) and consider as admissible all $\delta u$ satisfying (\ref{eq-10.1.7}). Then $u$ is a stationary point of $\Phi$ if and only if it satisfies the Euler-Lagrange equation \begin{equation} \frac{\delta \Phi}{\delta u}:= \frac{\partial L}{\partial u} - \sum_{1\le j\le n} \frac{\partial\ }{\partial x_j} \left(\frac{\partial L}{\partial u_{x_j}}\right) =0. \label{eq-10.1.9} \end{equation}
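As a quick illustration of how (\ref{eq-10.1.9}) is used, consider the Lagrangian $L=\frac{1}{2}|\nabla u|^2-f(x)u$ with a given function $f$; this particular $L$ and the function $f$ are assumptions made here only for the sake of the example. Then \begin{equation*} \frac{\partial L}{\partial u}=-f,\qquad \frac{\partial L}{\partial u_{x_j}}=u_{x_j}, \end{equation*} so that \begin{equation*} \frac{\delta \Phi}{\delta u}= -f-\sum_{1\le j\le n}\frac{\partial\ }{\partial x_j}u_{x_j} = -f-\Delta u , \end{equation*} and the Euler-Lagrange equation becomes the Poisson equation $-\Delta u=f$.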
Definition 6. If $\Phi[u]\ge \Phi[u+\delta u]$ for all small admissible variations $\delta u$ we call $u$ a local maximum of functional $\Phi$. If $\Phi[u]\le \Phi[u+\delta u]$ for all small admissible variations $\delta u$ we call $u$ a local minimum of functional $\Phi$.
Here again we do not specify what a small admissible variation is.
Theorem 2. If $u$ is a local extremum (that means either a local minimum or a local maximum) of $\Phi$ and the variation exists, then $u$ is a stationary point.
Proof. Consider the case of a minimum. Let $\delta u =\varepsilon \phi$. Then $\Phi [u+\delta u]- \Phi [u]=\varepsilon (\delta \Phi)(\phi) +o(\varepsilon)$. If $\pm (\delta \Phi)(\phi)> 0$ for some $\phi$ then, choosing $\mp \varepsilon >0$, we make $\varepsilon (\delta \Phi)(\phi)\le -2\epsilon_0 |\varepsilon|$ with some $\epsilon_0>0$. Meanwhile for sufficiently small $|\varepsilon|$ the term "$o(\varepsilon)$" is much smaller, at most $\epsilon_0|\varepsilon|$ in absolute value, so $\Phi [u+\delta u]- \Phi [u]\le -\epsilon_0 |\varepsilon|<0$ and $u$ is not a local minimum.
Remark 4. We consider neither sufficient conditions for extrema nor second variations (similar to second differentials). In some cases they will be obvious.
Example 2.
Then the Euler-Lagrange equation is \begin{equation} -\frac{\partial\ }{\partial x} \Bigl(u_x\bigl(1+u_x^2+u_y^2\bigr)^{-\frac{1}{2}}\Bigr)- \frac{\partial\ }{\partial y} \Bigl(u_y\bigl(1+u_x^2+u_y^2\bigr)^{-\frac{1}{2}}\Bigr)=0. \label{eq-10.1.11} \end{equation} b. Assuming that $|u_x|, |u_y| \ll 1$ one can approximate $A(\Sigma)-A(\Omega)$ by \begin{equation} \frac{1}{2}\iint_{\Omega} \bigl(u_x^2+u_y^2\bigr)\,dxdy \label{eq-10.1.12} \end{equation} and for this functional the Euler-Lagrange equation is \begin{equation} -\Delta u=0. \label{eq-10.1.13} \end{equation} c. Both (a) and (b) can be generalized to higher dimensions.
Remark 5. Both equations (\ref{eq-10.1.11}) and (\ref{eq-10.1.13}) come with the boundary condition $u|_{\partial\Omega}=g$. In the next section we analyse the case when such a condition is imposed in the original variational problem only on a part of the boundary.
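The approximation in (b) can be checked by a short computation. We assume here, as the form of (\ref{eq-10.1.11}) suggests, that the area integrand is $\bigl(1+u_x^2+u_y^2\bigr)^{\frac{1}{2}}$. Since $(1+s)^{\frac{1}{2}}=1+\frac{1}{2}s+O(s^2)$ as $s\to 0$, \begin{equation*} \bigl(1+u_x^2+u_y^2\bigr)^{\frac{1}{2}}-1= \frac{1}{2}\bigl(u_x^2+u_y^2\bigr)+O\bigl((u_x^2+u_y^2)^2\bigr), \end{equation*} and integrating over $\Omega$ gives (\ref{eq-10.1.12}) up to higher order terms. For $L=\frac{1}{2}\bigl(u_x^2+u_y^2\bigr)$ we have $\partial L/\partial u=0$, $\partial L/\partial u_x=u_x$ and $\partial L/\partial u_y=u_y$, so the Euler-Lagrange equation reads \begin{equation*} -\frac{\partial\ }{\partial x}u_x-\frac{\partial\ }{\partial y}u_y=-\Delta u=0, \end{equation*} which is (\ref{eq-10.1.13}).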