
Differentiable Functions: A Rigorous Perspective

All things are difficult before they are easy, Thomas Fuller.


Differentiability at a point

Definition. Let F: U ⊆ ℝⁿ → ℝᵐ where U is open, and let a ∈ U. The function F is differentiable at a if there exists a matrix $DF(a) \in \mathbb{R}^{m \times n}$ such that $\lim_{h \to 0} \frac{\|F(a + h) - F(a) - DF(a) \cdot h\|}{\|h\|} = 0$

The matrix DF(a) is called the derivative, differential, or Jacobian matrix of F at a. It captures the idea that a function can be locally approximated by a linear transformation.
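The limit in this definition can be probed numerically as a sanity check. The sketch below (Python with NumPy) uses an illustrative map F(x, y) = (x² + y², xy) and its Jacobian at a = (1, 1); the map, the point, and the step sizes are arbitrary choices made for this demonstration, not part of the definition.

```python
import numpy as np

# Illustrative map F: R^2 -> R^2 and its Jacobian at a = (1, 1).
def F(v):
    x, y = v
    return np.array([x**2 + y**2, x * y])

a = np.array([1.0, 1.0])
DF_a = np.array([[2.0, 2.0],    # gradient of x^2 + y^2 at (1, 1)
                 [1.0, 1.0]])   # gradient of x*y       at (1, 1)

rng = np.random.default_rng(0)
for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    h = t * rng.standard_normal(2)   # a small random increment
    ratio = np.linalg.norm(F(a + h) - F(a) - DF_a @ h) / np.linalg.norm(h)
    print(f"||h|| = {np.linalg.norm(h):.1e}   ||F(a+h) - F(a) - DF(a)h|| / ||h|| = {ratio:.2e}")
# The ratio shrinks with ||h||, consistent with differentiability at a.
```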


Figure. For f(x,y)=x²+y², the red plane at (1,1) is the Jacobian’s linear approximation.

Visual Example: f(x,y) = x² + y²

For the quadratic function f(x, y) = x² + y², visualize the surface as an elliptic paraboloid opening upward, symmetric about the z-axis. The visualization above demonstrates several key aspects of multivariable differentiability:

  1. Local Linear Approximation: At the point $(x_0, y_0) = (1, 1)$, f(1, 1) = 2, and the red tangent plane represents the Jacobian matrix’s action as a linear map. This plane provides the best linear approximation to the function near this point. Geometrically, it represents the flattest surface that locally matches the curvature of the paraboloid.
  2. Jacobian Matrix: For our example f(x, y) = x² + y², $\nabla f = (\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}) = (2x, 2y)$. At (1, 1): $\nabla f(1, 1) = (2·1, 2·1) = (2, 2)$. So the Jacobian matrix at (1, 1) is $Df(1, 1) = \begin{pmatrix} 2 & 2 \end{pmatrix}$. This matrix defines the linear transformation that maps small changes $(h_1, h_2)$ in x and y to changes in f(x, y). It encodes the directional slopes of the surface at (1, 1).
  3. Approximation Property: The formula f(x + h) = f(x) + Df(x)h + o(||h||) is precisely what we’re seeing visually. The tangent plane (defined by Df(x)·h) approximates the actual surface (f(x+h)) with increasing accuracy as we approach the point of tangency.

The linear approximation formula: $f(1 + h_1, 1 + h_2) \approx f(1, 1) + \nabla f(1, 1) \cdot (h_1, h_2) = 2 + 2h_1 + 2h_2$

For any differentiable function f(x, y), the tangent plane at $(x_0, y_0)$ is given by $z = f(x_0, y_0) + \frac{\partial f}{\partial x}|_{(x_0, y_0)}(x -x_0) + \frac{\partial f}{\partial y}|_{(x_0, y_0)}(y - y_0)$. This equation represents the linear approximation of f near $(x_0, y_0)$.

The tangent plane equation is: z = 2 + 2(x-1) + 2(y-1) = 2x + 2y - 2. The term 2(x−1) captures the slope of the surface in the x-direction at (1,1). The term 2(y−1) captures the slope in the y-direction.

The tangent plane “hugs” the paraboloid tightly near (1, 1). As you zoom in, the surface becomes indistinguishable from the plane.

$f(1 + h_1, 1 + h_2) = (1 + h_1)^2 + (1 + h_2)^2 = \underbrace{2 + 2h_1 + 2h_2}_\text{Linear approx.} + \underbrace{h_1^2 + h_2^2}_\text{Error}$.

The error $ h_1^2 + h_2^2 = O(\|h\|^2) $ vanishes faster than $\|h\|$ as $ h \to 0 $, confirming differentiability. It represents the quadratic curvature that the linear approximation intentionally ignores.
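The decay of this error can be checked with a few lines of Python (an illustration only; the direction of h below is an arbitrary choice): for this function the quotient error/‖h‖ equals ‖h‖ exactly.

```python
import numpy as np

f = lambda x, y: x**2 + y**2
linear = lambda h1, h2: 2 + 2 * h1 + 2 * h2   # tangent-plane value at (1 + h1, 1 + h2)

for t in [1e-1, 1e-2, 1e-3]:
    h1, h2 = t, -0.5 * t                      # an arbitrary direction, scaled down
    err = f(1 + h1, 1 + h2) - linear(h1, h2)  # equals h1^2 + h2^2 exactly
    norm_h = np.hypot(h1, h2)
    print(f"||h|| = {norm_h:.1e}   error = {err:.3e}   error/||h|| = {err / norm_h:.3e}")
# error/||h|| = ||h|| -> 0, confirming differentiability at (1, 1).
```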

The Jacobian Matrix. If F = (F₁, F₂, …, Fₘ): U ⊆ ℝⁿ → ℝᵐ is differentiable at a, the Jacobian matrix is: $(DF(a))_{ij} = \frac{\partial F_i}{\partial x_j}(a)$

$$DF(a) = J_F(a) = \begin{pmatrix} \frac{\partial F_1}{\partial x_1}(a) & \frac{\partial F_1}{\partial x_2}(a) & \cdots & \frac{\partial F_1}{\partial x_n}(a) \\[8pt] \frac{\partial F_2}{\partial x_1}(a) & \frac{\partial F_2}{\partial x_2}(a) & \cdots & \frac{\partial F_2}{\partial x_n}(a) \\[8pt] \vdots & \vdots & \ddots & \vdots \\[8pt] \frac{\partial F_m}{\partial x_1}(a) & \frac{\partial F_m}{\partial x_2}(a) & \cdots & \frac{\partial F_m}{\partial x_n}(a) \end{pmatrix}$$
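In practice, the Jacobian can be approximated entry by entry with finite differences and compared against the analytic formula. Below is a minimal Python sketch for an illustrative map F: ℝ² → ℝ³; the map, the point a, and the step size eps are assumptions of the demo.

```python
import numpy as np

def numerical_jacobian(F, a, eps=1e-6):
    """Central-difference approximation of the m x n Jacobian of F at a."""
    a = np.asarray(a, dtype=float)
    m, n = len(F(a)), len(a)
    J = np.zeros((m, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        J[:, j] = (F(a + e) - F(a - e)) / (2 * eps)   # column j: dF_i/dx_j for all i
    return J

# Illustrative map F(x, y) = (x*y, sin(x), x^2 + y^2).
F = lambda v: np.array([v[0] * v[1], np.sin(v[0]), v[0]**2 + v[1]**2])
a = np.array([1.0, 2.0])

J_exact = np.array([[a[1],         a[0]],       # gradient of x*y
                    [np.cos(a[0]), 0.0],        # gradient of sin(x)
                    [2 * a[0],     2 * a[1]]])  # gradient of x^2 + y^2

print(numerical_jacobian(F, a))
print("max abs error:", np.abs(numerical_jacobian(F, a) - J_exact).max())
```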

Definition. A function F: U → ℝᵐ is differentiable (or differentiable on U) if:

  1. U is an open subset of ℝⁿ
  2. F is differentiable at every point a ∈ U

Definition. F is continuously differentiable (written $F \in C^1$) if:

  1. F is differentiable on U
  2. All partial derivatives are continuous on U

More generally, $F \in C^k$ if all partial derivatives up to order k exist and are continuous. In particular, $F \in C^\infty$ (smooth) if it has continuous partial derivatives of all orders.

Important Remarks

Quick Jacobian Recipes

| Function Type | F(x) | Jacobian DF(x) |
|---|---|---|
| Identity | x | I |
| Linear | Ax | A |
| Affine | Ax + b | A |
| Quadratic | xᵀAx | xᵀ(A + Aᵀ) |
| Squared norm | ∥x∥² | 2xᵀ |
| Component-wise | (f₁(x), …, fₘ(x)) | (∇f₁ᵀ; …; ∇fₘᵀ) |
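The quadratic-form row of this table can be spot-checked numerically; the sketch below (Python/NumPy) uses a randomly generated, not necessarily symmetric matrix A as an arbitrary test case.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))      # arbitrary test matrix, not necessarily symmetric
x = rng.standard_normal(3)

f = lambda v: v @ A @ v              # f(x) = x^T A x (a scalar)
claimed = x @ (A + A.T)              # table entry: Df(x) = x^T (A + A^T)

# Central-difference gradient for comparison.
eps = 1e-6
numeric = np.array([(f(x + eps * np.eye(3)[j]) - f(x - eps * np.eye(3)[j])) / (2 * eps)
                    for j in range(3)])
print("max abs difference:", np.abs(claimed - numeric).max())   # effectively zero up to floating-point error
```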

Differentiable function

Definition. Let $f: \text{dom}(f) \to \mathbb{R}^m$ be a function, where its domain $\text{dom}(f) \subseteq \mathbb{R}^n$ is an open set. The function $f$ is said to be differentiable if it is differentiable at every point $\mathbf{x} \in \text{dom}(f)$.

Formally, $f$ is differentiable at a point $\mathbf{x}$ if there exists a linear map $L: \mathbb{R}^n \to \mathbb{R}^m$ (represented by an $m \times n$ matrix, the Jacobian matrix $Df(\mathbf{x})$), such that $\lim_{\mathbf{h} \to \mathbf{0}, \mathbf{h} \in \mathbb{R}^n} \frac{\| f(\mathbf{x} + \mathbf{h}) - f(\mathbf{x}) - Df(\mathbf{x})\mathbf{h} \|}{\|\mathbf{h}\|} = 0$

Here, $\| \cdot \|$ denotes a norm (e.g., the Euclidean norm) on $\mathbb{R}^n$ and $\mathbb{R}^m$. This limit must hold for sequences $\mathbf{h} \to 0$ from any direction within $\mathbb{R}^n$.

Key Conditions and important notes

  1. Open Domain. The requirement that the domain $\text{dom}(f)$ is open is fundamental. An open set means that for every point $\mathbf{x}$ in the domain, there exists a small open ball (or neighborhood) around $\mathbf{x}$ that is entirely contained within $\text{dom}(f)$. This condition is essential because the definition of the derivative involves a limit as $\mathbf{h} \to \mathbf{0}$.
    To ensure that $f(\mathbf{x} + \mathbf{h})$ is defined for all sufficiently small perturbations $\mathbf{h}$, the point $\mathbf{x}$ must not lie on the boundary of the domain. This allows us to approach $\mathbf{x}$ from all possible directions within $\mathbb{R}^n$.
  2. Pointwise Differentiability. The function must be differentiable at every point of its domain: at each point $\mathbf{x}$, the function must be locally approximable by a linear map. The Jacobian matrix $Df(\mathbf{x})$, which contains all first-order partial derivatives of $f$, represents this best linear approximation.
    The limit condition ensures that the error of this linear approximation, $f(\mathbf{x} + \mathbf{h}) - f(\mathbf{x}) - Df(\mathbf{x})\mathbf{h}$, shrinks to zero faster than $||\mathbf{h}||$ does. This is a much stronger statement than just saying all partial derivatives exist. In other words, the best linear approximation to the function exists at each point and behaves well locally.
  3. Geometric Intuition. Recall that a function is differentiable if it is “smooth” and has no abrupt changes. In the one-variable case, this means the graph has no corners (like the absolute value function $f(x)=|x|$ at $x=0$), cusps (a point on a curve where the graph sharply changes direction, like $f(x)=x^{2/3}$ at $x=0$), vertical tangents (i.e., the slope is undefined like $f(x)=\sqrt[3]{x}$ at $x=0$) or discontinuities within its domain.
    In higher dimensions, the intuition is similar: the function’s graph is a surface that, when you zoom in close enough on any point, begins to look flat, like a plane. In other words, the function can be locally approximated by a linear map (i.e., its Jacobian matrix).
  4. Differentiability implies continuity, but not vice versa. The existence of a good linear approximation prevents any jumps.
  5. A function may be differentiable on an open set but may not be differentiable at points on the boundary of that set. This is precisely because the limit definition cannot be checked on the boundary, where approaching from all directions might leave the domain.
  6. Sufficient condition for differentiability ($C^1$). If all first-order partial derivatives of f exist and are continuous on an open set, then f is continuously differentiable on that set, often denoted as $f \in C^1(U)$.

    If the partial derivatives exist and are continuous, it implies that the rate of change of the function with respect to each variable is well-behaved and doesn’t have sudden jumps or discontinuities. The continuity of partial derivatives ensures that the linear approximation is well-defined and consistent across the open set. This is what makes the function continuously differentiable.

Examples of Differentiable Functions

Consider $f(x, y) = \sin(x) + y^2$. All partial derivatives exist and are continuous everywhere: $\frac{\partial f}{\partial x} = \cos(x)$ is continuous on ℝ² because cosine is continuous everywhere, and $\frac{\partial f}{\partial y} = 2y$ is continuous on ℝ² as it is a linear function. Hence both partial derivatives are continuous on ℝ². By the standard theorem (continuous partials ⇒ differentiable), f is differentiable everywhere. Jacobian: $Df(x, y) = (\cos x, 2y)$.

Geometric intuition. In the x-direction, the surface z = sin(x) + y² undulates like a sine wave. In the y-direction, it opens upward parabolically. Because these two behaviors blend smoothly (no corners or cusps), the surface admits a well-defined tangent plane at every point.

Next, consider $f(x, y) = \dfrac{x}{x^2+y^2+1}$. Partials: $\frac{\partial f}{\partial x} = \frac{(x^2+y^2+1) - x \cdot 2x}{(x^2+y^2+1)^2} = \frac{y^2 - x^2 + 1}{(x^2+y^2+1)^2}, \quad \frac{\partial f}{\partial y} = \frac{-2xy}{(x^2+y^2+1)^2}$.

Both are quotients of polynomials whose denominators never hit zero, so each partial is continuous everywhere (ℝ²). By the standard theorem (continuous partials ⇒ differentiable), $f \in C^{\infty}$ and therefore differentiable everywhere.

Finally, consider $f(x, y) = \sin(x)\cosh(y)$. Partials: $\frac{\partial f}{\partial x} = \cos(x) \cosh(y), \quad \frac{\partial f}{\partial y} = \sin(x) \sinh(y)$. Both are products of smooth functions, hence $f \in C^\infty$.
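The partials quoted in these three examples can be reproduced symbolically. Here is a short SymPy sketch using the functions discussed above:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)

examples = {
    'sin(x) + y^2':        sp.sin(x) + y**2,
    'x / (x^2 + y^2 + 1)': x / (x**2 + y**2 + 1),
    'sin(x)*cosh(y)':      sp.sin(x) * sp.cosh(y),
}

for name, f in examples.items():
    fx = sp.simplify(sp.diff(f, x))
    fy = sp.simplify(sp.diff(f, y))
    print(f"{name}:  f_x = {fx},  f_y = {fy}")
# Each partial is built from polynomials, trigonometric and hyperbolic functions
# (with nonvanishing denominators), hence continuous on all of R^2, so each f is C^1.
```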

$C^\infty$ Functions

$C^\infty$ denotes the class of functions that are infinitely differentiable (or “smooth”) on their domain. A function f is in $C^\infty$ if, for every non-negative integer k, the k-th derivative of f exists and is a continuous function.

In simpler terms, no matter how many times you differentiate a $C^\infty$ function, you will never encounter a point where the derivative fails to exist or becomes discontinuous.

Key Properties and Intuition

  1. Existence of All Derivatives: The defining feature of a $C^\infty$ function is that derivatives of every order exist.
  2. Continuity of All Derivatives: Not only do these derivatives exist, but they are also continuous. The continuity of derivatives ensures that the function’s behavior is “uniformly” predictable at every order of approximation.
  3. Smoothness Hierarchy: Being $C^\infty$ is the pinnacle of the differentiability hierarchy:
    (i) $C^0$: Continuous functions.
    (ii) $C^1$: Functions with continuous first derivatives (continuously differentiable).
    (iii) $C^2$: Functions with continuous second derivatives. […]
    $C^\infty$: Functions with continuous derivatives of all orders. It implies the original function is continuous and differentiable as many times as you like.

    As we move up the hierarchy, the functions become progressively more well-behaved.

Standard Theorems for Building Smooth Functions.

Local vs. Global Behavior: A function can be $C^\infty$ on an open set but behave badly outside it, or even fail to be defined. For example, $f(x) = 1/x$ is $C^\infty$ on $(0, \infty)$, even though it has a singularity at $x=0$. The definition is always relative to a specific domain.
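As a small symbolic illustration of this point (a SymPy sketch), the derivatives of f(x) = 1/x of every order are again rational functions whose only singularity is at x = 0, so they all exist and are continuous on (0, ∞):

```python
import sympy as sp

x = sp.symbols('x', positive=True)   # restrict attention to (0, oo)
f = 1 / x

for k in range(1, 5):
    print(f"f^({k})(x) =", sp.diff(f, x, k))
# f^(k)(x) = (-1)^k * k! / x^(k+1): defined and continuous for every x > 0,
# so f is C^infinity on (0, oo) even though it blows up at x = 0.
```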

$C^\infty$ vs. Analytic Functions ($C^\omega$).

Counterexamples

The existence of all partial derivatives at a point does NOT guarantee differentiability. Differentiability requires the function to be well-approximated by a LINEAR map in ALL directions simultaneously. Consider the classic counterexample $f(x, y) = \dfrac{xy}{\sqrt{x^{2}+y^{2}}}$ for $(x, y) \ne (0, 0)$, with $f(0, 0) = 0$.

  1. Continuity at the origin (0, 0). To verify continuity, compute the limit as (x, y) → (0, 0): $\lim_{(x, y) \to (0,0)} \dfrac{xy}{\sqrt{x^{2}+y^{2}}}$. Switching to polar coordinates $x = r\cos\theta$, $y = r\sin\theta$: $\dfrac{xy}{\sqrt{x^{2}+y^{2}}} = \dfrac{r^2\cos\theta\sin\theta}{r} = r\cos\theta\sin\theta \to 0 = f(0, 0)$ as $r \to 0^+$, confirming continuity at (0, 0).
  2. Compute Partial Derivatives at Origin: $\frac{\partial f}{\partial x}(0, 0) = \lim_{h \to 0} \frac{f(h, 0) - f(0, 0)}{h} = \lim_{h \to 0} \frac{0}{h} = 0, \frac{\partial f}{\partial y}(0, 0) = \lim_{h \to 0} \frac{f(0, h) - f(0, 0)}{h} = \lim_{h \to 0} \frac{0}{h} = 0$. Both partials exist and equal 0. Thus, $\nabla f(0,0)=(0,0).$
  3. Directional Derivatives at the Origin. The directional derivative of a function f at a point in a given direction measures the instantaneous rate of change of f along that direction. It’s a generalization of the partial derivative, which measures the rate of change in the directions of the coordinate axes. Essentially, it tells you how much the function’s value will change if you move a small amount in a specific direction from a given point. $D_{\mathbf{u}}f(a, b) = \lim_{h \to 0} \dfrac{f(a + hu_1, b + hu_2) - f(a, b)}{h}$, where $\mathbf{u} = (u_1, u_2)$ is a unit vector.
    In our case, the directional derivative of f at the origin in the direction $\mathbf{u} = (a, b)$ (with $\sqrt{a^2+b^2}=1$) is: $D_{\mathbf{u}}f(0, 0) = \lim_{h \to 0} \dfrac{f(ha, hb) - f(0, 0)}{h} = \lim_{h \to 0} \dfrac{(ha)(hb)}{h\sqrt{(ha)^2+(hb)^2}} = \lim_{h \to 0} \dfrac{h^2ab}{h^2\sqrt{a^2+b^2}} = \dfrac{ab}{\sqrt{a^2+b^2}} = ab$ (taking h > 0 and using $\sqrt{a^2+b^2} = 1$). This limit exists for all directions u, so directional derivatives exist in every direction at (0, 0).

    This is already suspicious: a true derivative must be a linear function of u, but ab is not linear.

  4. Failure of Total Differentiability. For f to be differentiable at (0, 0), the directional derivatives must be linear in u and agree with the total derivative Df(0, 0). That is, f is differentiable at (0, 0) only if there is a linear map L (the Jacobian) such that $L(\mathbf{u}) = D_{\mathbf{u}}f(0, 0) = \nabla f(0, 0)\cdot \mathbf{u}$ for every unit direction $\mathbf{u} = (a, b)$ with $a^2 + b^2 = 1$.
  5. The directional derivative ab is not linear in $\mathbf{u} = (a, b)$ with $a^2 + b^2 = 1$ (the map $\mathbf{u} \mapsto D_{\mathbf{u}}f(0,0)$ must be linear for it to represent a true derivative).
    Homogeneity test: Consider a scalar c ∈ ℝ and the direction cu = (ca, cb). The unit vector in this direction is $\frac{(ca, cb)}{||c\mathbf{u}||}=\dfrac{(ca, cb)}{|c|\sqrt{a^2+b^2}}=(\dfrac{ca}{|c|}, \dfrac{cb}{|c|})$. The corresponding directional derivative is $D_{c\mathbf{u}}f(0, 0) = \dfrac{ca}{|c|}·\dfrac{cb}{|c|} = \dfrac{c^2ab}{|c|^2} = ab$. Linearity requires $D_{c\mathbf{u}}f(0, 0) = c·D_{\mathbf{u}}f(0, 0) = c(ab)$. This holds only if ab = c(ab) for all c, which is false unless ab = 0.
    For example, if $\mathbf{u} = (\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})$, then $D_{\mathbf{u}}f(0, 0) = \frac{1}{\sqrt{2}}·\frac{1}{\sqrt{2}} = \frac{1}{2}$. For c = 2, $D_{2\mathbf{u}}f(0, 0) = \frac{1}{2} \ne 2·\frac{1}{2} = 1$. Hence, homogeneity fails.
    Additivity test. A linear map must satisfy $L(u_1 + u_2) = L(u_1) + L(u_2)$. Take $u_1 = (1, 0)$ and $u_2 = (0, 1)$, so $u_1 + u_2 = (1, 1)$, whose unit vector is $(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})$; then $D_{u_1 + u_2}f(0, 0) = \dfrac{1}{\sqrt{2}}·\dfrac{1}{\sqrt{2}} = \frac{1}{2}$. Additivity requires $D_{u_1 + u_2}f(0, 0) = D_{u_1}f(0, 0) + D_{u_2}f(0, 0)$, but $\frac{1}{2} \ne 0 + 0$; hence additivity fails, too.
  6. Agreement with the Total Derivative. The directional derivatives must agree with the total derivative Df(0, 0). However,

Partial derivatives at (0, 0) are zero: $\dfrac{\partial f}{\partial x}(0, 0) = \lim_{h \to 0} \dfrac{f(h, 0) - f(0, 0)}{h} = \lim_{h \to 0} \dfrac{0}{h} = 0$, and similarly $\dfrac{\partial f}{\partial y}(0, 0) = 0$. Thus, the gradient is ∇f(0, 0) = (0, 0). If f were differentiable at (0, 0), every directional derivative would satisfy $D_{\mathbf{u}}f(0, 0) = \nabla f(0, 0)\cdot\mathbf{u} = (0, 0)\cdot(a, b) = 0$ for all u. However, we have already computed that for $\mathbf{u} = (\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})$, $D_{\mathbf{u}}f(0, 0) = \frac{1}{2} \ne 0$, a contradiction. Therefore, f is not differentiable at (0, 0).
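A short numerical illustration of this failure (a sketch; the diagonal direction below is an arbitrary choice): with the only candidate derivative Df(0, 0) = (0 0), the approximation error does not vanish relative to ‖h‖.

```python
import numpy as np

def f(x, y):
    return 0.0 if (x, y) == (0.0, 0.0) else x * y / np.hypot(x, y)

# Candidate derivative at the origin: the gradient (0, 0), so Df(0,0)h = 0.
for t in [1e-1, 1e-3, 1e-5]:
    h = t * np.array([1.0, 1.0]) / np.sqrt(2)   # step along the diagonal direction
    ratio = abs(f(h[0], h[1]) - 0.0) / np.linalg.norm(h)
    print(f"||h|| = {np.linalg.norm(h):.1e}   |f(h) - f(0,0) - Df(0,0)h| / ||h|| = {ratio:.3f}")
# The ratio stays at 0.5 instead of tending to 0, so f is not differentiable at (0, 0).
```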
