
Differentiation on Euclidean Space

For every problem there is always, at least, a solution which seems quite plausible. It is simple and clean, direct, neat and nice, and yet very wrong. #Anawim, justtothepoint.com



From Single-Variable to Multivariable Derivatives: Core Analogy

In single-variable calculus, the derivative $f'(a)$ is the slope of the tangent line at x = a, satisfying $f(a + h) - f(a) = f'(a)h + o(|h|)$ as $h \to 0$.
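
For instance, with $f(x) = x^2$ we have $f(a + h) - f(a) = 2ah + h^2$; the error term $h^2$ satisfies $|h^2|/|h| = |h| \to 0$, so $f'(a) = 2a$.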

In multivariable calculus ($F: \mathbb{R}^n \to \mathbb{R}^m$), the concept generalizes: the derivative is not a single number, but a linear transformation $DF(a): \mathbb{R}^n \to \mathbb{R}^m$ such that $F(a + h) - F(a) = DF(a)h + o(\|h\|)$ as $h \to 0$.

This linear map $DF(a)$ is the best linear approximation of F near a.

Euclidean Space ℝⁿ

Definition. The n-dimensional Euclidean space ℝⁿ is the set of all ordered n-tuples: $\mathbb{R}^n = \{(x_1, x_2, \ldots, x_n) : x_i \in \mathbb{R}\}$ equipped with componentwise addition and scalar multiplication, the standard inner product $\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i$, and the induced Euclidean norm $\|x\| = \sqrt{\langle x, x \rangle}$.

Definition. The standard basis for ℝⁿ consists of the vectors: $e_1 = (1, 0, 0, \ldots, 0), \quad e_2 = (0, 1, 0, \ldots, 0), \quad \ldots, \quad e_n = (0, 0, \ldots, 0, 1)$

More precisely: $(e_j)_i = \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$

The standard basis $\{ e_j \}_{j=1}^n$ spans $\mathbb{R}^n$. Any vector can be written as: $x = (x_1, x_2, \ldots, x_n) = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n = \sum_{j=1}^{n} x_j e_j$. x has coordinates $(x_1, x_2, \ldots, x_n)$ relative to this basis.

Linear Maps

Definition. A function L: ℝⁿ → ℝᵐ is linear if:

  1. $L(x + y) = L(x) + L(y)$ for all $x, y \in \mathbb{R}^n$
  2. $L(cx) = cL(x)$ for all $x \in \mathbb{R}^n$, $c \in \mathbb{R}$

Equivalently: $L(\alpha x + \beta y) = \alpha L(x) + \beta L(y)$ for all scalars α, β.

Key Property: A linear map is completely determined by its action on basis vectors: $L(x) = L\left(\sum_{j=1}^{n} x_j e_j\right) = \sum_{j=1}^{n} x_j L(e_j)$

Every linear map L: ℝⁿ → ℝᵐ can be represented by an m × n matrix A: $L(x) = Ax$

The columns of A are the images of the standard basis vectors:

$$A = \begin{pmatrix} | & | & & | \\ L(e_1) & L(e_2) & \cdots & L(e_n) \\ | & | & & | \end{pmatrix}$$

Example: The linear map L: ℝ² → ℝ³ with L(e₁) = (1, 2, 3)ᵀ and L(e₂) = (4, 5, 6)ᵀ has matrix:

$$A = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}$$
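
As a quick sanity check, here is a minimal NumPy sketch (the variable names are illustrative, not from the text) that assembles this matrix column by column and verifies $L(x) = x_1 L(e_1) + x_2 L(e_2)$:

```python
import numpy as np

# Images of the standard basis vectors under L: R^2 -> R^3
L_e1 = np.array([1.0, 2.0, 3.0])
L_e2 = np.array([4.0, 5.0, 6.0])

# The matrix of L has these images as its columns
A = np.column_stack([L_e1, L_e2])

x = np.array([2.0, -1.0])
print(A @ x)                       # [-2. -1.  0.]
print(x[0] * L_e1 + x[1] * L_e2)   # the same vector: L is determined by L(e_1), L(e_2)
```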

The Fréchet Derivative: Best Linear Approximation

For a function F: U ⊆ ℝⁿ → ℝᵐ, we want to approximate F near a point a using the simplest possible function: a linear map.

The derivative is not a number or a vector; it is a linear transformation that captures the “first-order” behavior of F at a.

The approximation takes the form: $F(a + h) \approx F(a) + L(h)$ where $L: \mathbb{R}^n \to \mathbb{R}^m$ is linear and the error vanishes faster than $\|h\|$. The derivative is the best such approximation.

Definition (Fréchet Derivative). Let F: U ⊆ ℝⁿ → ℝᵐ where U is open, and let a ∈ U. We say F is differentiable at a if there exists a linear map L: ℝⁿ → ℝᵐ such that:

$$\lim_{h \to 0} \frac{\|F(a + h) - F(a) - L(h)\|}{\|h\|} = 0$$

Equivalent Formulations

The definition can be rewritten as: $F(a + h) = F(a) + L(h) + o(\|h\|) \quad \text{as } h \to 0$ where $o(\|h\|)$ denotes a term with $\lim_{h \to 0} \frac{o(\|h\|)}{\|h\|} = 0$.

Or more explicitly: $F(a + h) = F(a) + L(h) + \|h\| \cdot \varepsilon(h)$ where $\varepsilon(h) \to 0$ as $h \to 0$.
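
To see the definition in action, the following sketch (assuming the map $F(s, t) = (s^2 + t^3,\, 2st,\, s + 3t)$ from the examples below and its hand-computed Jacobian) checks numerically that the error ratio $\|F(a + h) - F(a) - L(h)\| / \|h\|$ shrinks as $h \to 0$:

```python
import numpy as np

def F(p):
    s, t = p
    return np.array([s**2 + t**3, 2*s*t, s + 3*t])

def L(p, h):
    """Candidate linear map: the Jacobian of F at p applied to h."""
    s, t = p
    J = np.array([[2*s, 3*t**2],
                  [2*t, 2*s],
                  [1.0, 3.0]])
    return J @ h

a = np.array([1.0, 2.0])
direction = np.array([1.0, -0.5])
for k in range(1, 6):
    h = direction / 10**k
    ratio = np.linalg.norm(F(a + h) - F(a) - L(a, h)) / np.linalg.norm(h)
    print(k, ratio)   # the ratio tends to 0, as the definition requires
```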

Theorem (Uniqueness). If F is differentiable at a, the derivative $DF_a$ is unique.

Proof. Suppose L₁ and L₂ both satisfy the definition. Then, $\frac{\|L_1(h) - L_2(h)\|}{\|h\|} = \frac{\|[F(a+h) - F(a) - L_2(h)] - [F(a+h) - F(a) - L_1(h)]\|}{\|h\|}$

By the triangle inequality, $\leq \frac{\|F(a+h) - F(a) - L_1(h)\|}{\|h\|} + \frac{\|F(a+h) - F(a) - L_2(h)\|}{\|h\|}$

Both terms on the right tend to 0 as $h \to 0$. Hence, $\lim_{h \to 0} \frac{\|L_1(h) - L_2(h)\|}{\|h\|} = 0 (\star).$

Because $L_1$ and $L_2$ are linear, for any fixed non-zero vector u and any real $t \ne 0$, $L_i(tu) = tL_i(u)$.

Next, take h = tu with $t \to 0$. Then, $\|h\| = |t|\|u\|$ and $\frac{\|L_1(tu) - L_2(tu)\|}{|t|\|u\|} = \frac{|t| \|L_1(u) - L_2(u)\|}{|t|\|u\|} = \frac{\|L_1(u) - L_2(u)\|}{\|u\|}$

This expression does not depend on t. Since the limit as $t \to 0$ must be 0 $(\star)$, we obtain $\frac{\|L_1(u) - L_2(u)\|}{\|u\|} = 0 \implies L_1(u) = L_2(u)$

The equality holds for every non‑zero vector u; for u = 0 it is trivial because linear maps send 0 to 0. Therefore, $L_1$ and $L_2$ agree on the whole space, i.e., $L_1 = L_2$ ∎

Why does this matter? Uniqueness guarantees that the derivative is well‑defined; otherwise the notation $DF_a$ would be ambiguous. Furthermore, it justifies calling the derivative the best linear approximation.

Partial Derivatives

While the total derivative $DF_a$ captures how F changes in all directions simultaneously, partial derivatives measure rates of change along coordinate axes only.

Definition. Let F: U ⊆ ℝⁿ → ℝᵐ be a function and a ∈ U. The partial derivative of F with respect to $x_j$ at a is: $\frac{\partial F}{\partial x_j}(a) = \lim_{t \to 0} \frac{F(a + te_j) - F(a)}{t}$ where $e_j$ is the j-th standard basis vector, provided the limit exists.

The partial derivative $\frac{\partial F}{\partial x_j}(a)$ is the rate of change of F when we move from a in the $e_j$ direction, keeping all other coordinates fixed.

Example: F: ℝ² → ℝ³ defined by F(s, t) = (s² + t³, 2st, s + 3t) where $F_1(s,t) = s^2 + t^3$, $F_2(s,t) = 2st$, and $F_3(s,t) = s + 3t$

Partial with respect to s: $\frac{\partial F}{\partial s} = \begin{pmatrix} \frac{\partial F_1}{\partial s} \\[6pt] \frac{\partial F_2}{\partial s} \\[6pt] \frac{\partial F_3}{\partial s} \end{pmatrix} = \begin{pmatrix} 2s \\ 2t \\ 1 \end{pmatrix}$

Partial with respect to t: $\frac{\partial F}{\partial t} = \begin{pmatrix} \frac{\partial F_1}{\partial t} \\[6pt] \frac{\partial F_2}{\partial t} \\[6pt] \frac{\partial F_3}{\partial t} \end{pmatrix} = \begin{pmatrix} 3t^2 \\ 2s \\ 3 \end{pmatrix}$
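
As a numerical cross-check (a sketch using central finite differences, not part of the computation above), these hand-computed partials can be compared against difference quotients:

```python
import numpy as np

def F(p):
    s, t = p
    return np.array([s**2 + t**3, 2*s*t, s + 3*t])

def partial(F, a, j, eps=1e-6):
    """Central-difference approximation of dF/dx_j at a."""
    e = np.zeros_like(a)
    e[j] = eps
    return (F(a + e) - F(a - e)) / (2 * eps)

a = np.array([1.0, 2.0])
print(partial(F, a, 0))   # ~ (2s, 2t, 1)   = (2, 4, 1) at a
print(partial(F, a, 1))   # ~ (3t^2, 2s, 3) = (12, 2, 3) at a
```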

The Jacobian Matrix

Since $DF_a$ is a linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$, it can be represented by an $m \times n$ matrix. This is called the Jacobian Matrix.

Definition. Let F: U ⊆ ℝⁿ → ℝᵐ with component functions $F = (F_1, F_2, \ldots, F_m)$. The Jacobian matrix of F at a point a ∈ U is the m × n matrix formed by all partial derivatives evaluated at x = a:

$$J_F(a) = \begin{pmatrix} \frac{\partial F_1}{\partial x_1}(a) & \frac{\partial F_1}{\partial x_2}(a) & \cdots & \frac{\partial F_1}{\partial x_n}(a) \\[8pt] \frac{\partial F_2}{\partial x_1}(a) & \frac{\partial F_2}{\partial x_2}(a) & \cdots & \frac{\partial F_2}{\partial x_n}(a) \\[8pt] \vdots & \vdots & \ddots & \vdots \\[8pt] \frac{\partial F_m}{\partial x_1}(a) & \frac{\partial F_m}{\partial x_2}(a) & \cdots & \frac{\partial F_m}{\partial x_n}(a) \end{pmatrix}$$

Two Ways to View the Jacobian

Row View. The rows of the Jacobian are the transposes of the gradients of the component functions: $J_F(a) = \begin{pmatrix} — (\nabla F_1(a))^T — \\ — (\nabla F_2(a))^T — \\ \vdots \\ — (\nabla F_m(a))^T — \end{pmatrix}$

Column View: Each column is a partial derivative vector:

$$J_F = \begin{pmatrix} | & | & & | \\[4pt] \frac{\partial F}{\partial x_1} & \frac{\partial F}{\partial x_2} & \cdots & \frac{\partial F}{\partial x_n} \\[4pt] | & | & & | \end{pmatrix}$$

Computing the Differential

To find the actual change vector $DF_a(h)$ for a specific displacement $h$, we perform matrix multiplication: $DF_a(h) = J_F(a) \cdot \begin{pmatrix} h_1 \\ \vdots \\ h_n \end{pmatrix}$

The differential $DF_a$ is the best linear approximation of the change in F near the point a. The Jacobian matrix $J_F(a)$ is the matrix that represents this linear transformation. When you multiply the Jacobian matrix by the vector h, you get the approximate change in F corresponding to the small displacement h.

Examples:

  1. Vector-Valued Function (2D to 3D). Let $F: \mathbb{R}^2 \to \mathbb{R}^3$ be defined by $F(s, t) = (s^2+t^3, 2st, s + 3t)$.
    Partials with respect to s: $(2s, 2t, 1)^T$
    Partials with respect to t: $(3t^2, 2s, 3)^T$
    $J_F(s,t) = \begin{pmatrix} 2s & 3t^2 \\ 2t & 2s \\ 1 & 3 \end{pmatrix}$
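
Continuing this example, a short sketch compares the linear prediction $J_F(a)\,h$ with the actual change $F(a + h) - F(a)$ for a small displacement:

```python
import numpy as np

def F(p):
    s, t = p
    return np.array([s**2 + t**3, 2*s*t, s + 3*t])

def J_F(p):
    s, t = p
    return np.array([[2*s, 3*t**2],
                     [2*t, 2*s],
                     [1.0, 3.0]])

a = np.array([1.0, 2.0])
h = np.array([0.01, -0.02])

print(F(a + h) - F(a))   # actual change:        ~ [-0.2175, -0.0004, -0.05]
print(J_F(a) @ h)        # linear approximation: [-0.22, 0.0, -0.05]
```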

Complex Derivatives vs. Real Derivatives

We can think of a complex number z = x + iy as a point $(x, y) \in \mathbb{R}^2$. So any function $f: \mathbb{C} \to \mathbb{C}$ can also be seen as a function $F: \mathbb{R}^2 \to \mathbb{R}^2,\quad F(x, y) = (u(x, y), v(x, y)),$ where f(x + iy) = u(x, y) + iv(x, y).

So there are two notions of differentiability:

  1. real differentiability of F as a map $\mathbb{R}^2 \to \mathbb{R}^2$ (the Fréchet derivative above), and
  2. complex differentiability of f, i.e. the existence of $\lim_{h \to 0} \frac{f(z_0 + h) - f(z_0)}{h}$ with $h \in \mathbb{C}$.

They are related, but not the same. Complex differentiability is much more restrictive.

Every complex number $a = \alpha + i\beta$ defines a real-linear map $\mathbb{R}^2 \to \mathbb{R}^2$ via multiplication: $a(x + iy) = (\alpha x - \beta y) + i(\beta x + \alpha y).$

In matrix form, this is: $\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} \alpha & -\beta \\ \beta & \alpha \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.$

So complex multiplication corresponds exactly to matrices of the form $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$.

These are precisely the matrices that represent rotation + uniform scaling (no reflection, no shear).
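
A small sketch (illustrative, using Python's built-in complex type) confirms that multiplication by $a = \alpha + i\beta$ acts on $(x, y)$ exactly as this matrix:

```python
import numpy as np

alpha, beta = 2.0, 3.0            # a = alpha + i*beta
a = complex(alpha, beta)
M = np.array([[alpha, -beta],
              [beta,  alpha]])    # the matrix of "multiply by a"

z = complex(5.0, -1.0)            # z = x + iy
w = a * z                         # complex multiplication: (2+3i)(5-i) = 13+13i
print([w.real, w.imag])           # [13.0, 13.0]
print(M @ np.array([z.real, z.imag]))   # [13. 13.] -- the same point
```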

However, a general real derivative $DF(x_0, y_0)$ is an arbitrary matrix $\begin{pmatrix} A & B \\ C & D \end{pmatrix}$, with no relation between A, B, C, and D.

For complex differentiability, we require that this matrix comes from a single complex number, i.e. $\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}$ for some real a, b. That forces: $A = D, \quad C = -B$.

These are exactly the Cauchy–Riemann equations in disguise.

Write f(z) = u(x, y) + iv(x, y), with z = x + iy. If f is complex differentiable at $z_0$, then:

  1. u and v are real-differentiable at $(x_0,y_0)$.
  2. The Jacobian matrix $J_F(x_0, y_0) = \begin{pmatrix} u_x & u_y \\ v_x & v_y \end{pmatrix}$ must be of the special form $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}.$

Matching entries gives: $u_x=v_y,\quad u_y=-v_x.$ These are the Cauchy–Riemann equations. They are exactly the condition that the real derivative is not just any linear map, but one that comes from complex multiplication.
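
For instance, $f(z) = z^2$ gives $u(x, y) = x^2 - y^2$ and $v(x, y) = 2xy$; a quick sketch checks the Cauchy–Riemann equations numerically with difference quotients:

```python
import numpy as np

# f(z) = z^2  =>  u = x^2 - y^2,  v = 2xy
u = lambda x, y: x**2 - y**2
v = lambda x, y: 2 * x * y

x0, y0, eps = 1.3, -0.7, 1e-6

# Central-difference partials at (x0, y0)
u_x = (u(x0 + eps, y0) - u(x0 - eps, y0)) / (2 * eps)   # ~  2*x0
u_y = (u(x0, y0 + eps) - u(x0, y0 - eps)) / (2 * eps)   # ~ -2*y0
v_x = (v(x0 + eps, y0) - v(x0 - eps, y0)) / (2 * eps)   # ~  2*y0
v_y = (v(x0, y0 + eps) - v(x0, y0 - eps)) / (2 * eps)   # ~  2*x0

print(np.isclose(u_x, v_y), np.isclose(u_y, -v_x))      # True True
```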

Conclusion: a function f: ℂ → ℂ is complex differentiable at a point precisely when the associated map F: ℝ² → ℝ² is real differentiable there and its Jacobian has the rotation-plus-scaling form, i.e. when u and v satisfy the Cauchy–Riemann equations.

