Series: Matrix calculus

Jacobian of a vector-valued function

The gradient is defined only for scalar-valued functions. What if the output is vector-valued? The idea is to view an $ m $-dimensional output as $ m $ separate $ 1 $-dimensional outputs.

0. Key idea

Let $ f: R^n \rightarrow R^2 $ be a vector-valued function. One can think of it as two scalar-valued functions $ f_1, f_2: R^n \rightarrow R $, one per output component. The gradient is defined for $ f_1 $ and $ f_2 $ since they are scalar-valued.

$ f:R^n \rightarrow R^2 $ can be seen as two scalar-valued functions $ f_1, f_2:R^n \rightarrow R $

Thus, for a small change in input $ dx $, the small change in the output of $ f $ can be written in terms of the small changes in the outputs of $ f_1 $ and $ f_2 $:

\[ \begin{aligned} df &= \begin{bmatrix} df_1 \\ df_2 \end{bmatrix} \\ &= \begin{bmatrix} \langle \nabla f_1, dx \rangle \\ \langle \nabla f_2, dx \rangle \end{bmatrix} \\ &= \underbrace{\amber{\begin{bmatrix} ⎯ & \nabla f_1^T & ⎯ \\ ⎯ & \nabla f_2^T & ⎯ \end{bmatrix}}}_{\amber{\text{Jacobian of} \ f}} \begin{bmatrix} | \\ dx \\ | \end{bmatrix} \end{aligned} \]

Thus, the Jacobian matrix can be viewed as a matrix whose rows are the transposed gradients of the scalar-valued component functions. However, a more abstract view of the Jacobian is the following:

$$ \underbrace{\text{change in output}}_{\rose{R^{\text{out}}}} = \underbrace{\text{Jacobian}}_{\amber{R^{\text{out}}, R^\text{in}}} \times \underbrace{\text{change in input}}_{\rose{R^\text{in}}} $$

This relation is a first-order (linear) approximation, so it holds only for small changes in input and output.

Thus the shape of the Jacobian matrix is $ \amber{(\text{output dim}, \text{input dim})} $.
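
As a minimal sketch of this relation, the snippet below uses a hypothetical function $ f: R^3 \rightarrow R^2 $ (the function, the point `x` and the step `dx` are illustrative choices, not taken from the text). It stacks the two gradients as rows, checks that the shape is (output dim, input dim), and compares the Jacobian-times-`dx` prediction with the actual change in output.

```python
import math

def f(x):
    """Hypothetical example f: R^3 -> R^2, seen as two scalar functions f_1, f_2."""
    x1, x2, x3 = x
    return [x1 * x2 + x3, math.sin(x1) + x3 ** 2]

def jacobian_f(x):
    """Analytic Jacobian of f: row i is the transposed gradient of f_i."""
    x1, x2, x3 = x
    return [
        [x2, x1, 1.0],                   # gradient of f_1
        [math.cos(x1), 0.0, 2.0 * x3],   # gradient of f_2
    ]

def matvec(J, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in J]

x = [0.5, -1.0, 2.0]          # illustrative input point
dx = [1e-4, -2e-4, 3e-4]      # small change in input

J = jacobian_f(x)
shape = (len(J), len(J[0]))   # (output dim, input dim)

# Actual change in output versus the first-order prediction J @ dx.
x_moved = [xi + di for xi, di in zip(x, dx)]
actual_df = [a - b for a, b in zip(f(x_moved), f(x))]
predicted_df = matvec(J, dx)
max_error = max(abs(a - p) for a, p in zip(actual_df, predicted_df))

print("shape:", shape)
print("largest gap between actual and predicted change:", max_error)
```

The gap that remains is second order in `dx`, which is why the approximation is only trusted for small steps.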

1. Properties of Jacobian

1.1 Jacobian of a composition of functions

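Given two functions $ f: R^a \rightarrow R^b, g: R^b \rightarrow R^c $, the Jacobian of $ g \circ f $ at a point $ x $ is given by the chain rule, where the Jacobian of $ g $ is evaluated at $ f(x) $ and the Jacobian of $ f $ is evaluated at $ x $:$$ \underbrace{\underbrace{\text{Jacobian of} \ g}_{\rose{R^c, R^b}} \times \underbrace{\text{Jacobian of} \ f}_{\rose{R^b, R^a}}}_{\amber{R^c, R^a}} $$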
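
The composition rule can be checked numerically. The sketch below assumes two hypothetical functions $ f: R^2 \rightarrow R^3 $ and $ g: R^3 \rightarrow R^2 $ with hand-derived Jacobians (all names are illustrative), multiplies the Jacobian of $ g $ at $ f(x) $ by the Jacobian of $ f $ at $ x $, and compares the result with a central-difference estimate of the Jacobian of $ g \circ f $.

```python
import math

def f(x):                       # hypothetical f: R^2 -> R^3
    x1, x2 = x
    return [x1 * x2, x1 + x2, math.sin(x1)]

def jac_f(x):                   # shape (3, 2)
    x1, x2 = x
    return [[x2, x1], [1.0, 1.0], [math.cos(x1), 0.0]]

def g(y):                       # hypothetical g: R^3 -> R^2
    y1, y2, y3 = y
    return [y1 * y3, y2 ** 2]

def jac_g(y):                   # shape (2, 3)
    y1, y2, y3 = y
    return [[y3, 0.0, y1], [0.0, 2.0 * y2, 0.0]]

def matmul(A, B):
    """Matrix product for nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def numerical_jacobian(func, x, eps=1e-6):
    """Central-difference Jacobian; column j approximates d func / d x_j."""
    cols = []
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        cols.append([(a - b) / (2 * eps)
                     for a, b in zip(func(x_plus), func(x_minus))])
    return [list(row) for row in zip(*cols)]   # transpose columns into rows

x = [0.7, -1.3]                 # illustrative input point

# Chain rule: Jacobian of g at f(x), times Jacobian of f at x.
J_chain = matmul(jac_g(f(x)), jac_f(x))
J_numeric = numerical_jacobian(lambda t: g(f(t)), x)

chain_shape = (len(J_chain), len(J_chain[0]))   # (c, a)
chain_error = max(abs(a - b)
                  for row_a, row_b in zip(J_chain, J_numeric)
                  for a, b in zip(row_a, row_b))
print(chain_shape, chain_error)
```

Note that the inner dimension $ b $ cancels in the product, which is why the result has shape $ (c, a) $.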

Some interesting properties of the Jacobian of a function $ f: R^n \rightarrow R^n $ (input and output dimensions are the same, so the Jacobian is square) are described below.

1.2 Jacobian determinant

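The determinant of the Jacobian matrix describes how the space around a point is stretched or compressed: its absolute value is the factor by which volumes are scaled locally. If $ f $ is continuously differentiable and the determinant is non-zero at a point, then $ f $ is invertible in a neighbourhood of that point (inverse function theorem). This is a local statement and does not by itself make $ f $ invertible everywhere. If the determinant is positive, the orientation of the space is preserved; if it is negative, the orientation is flipped. The absolute value of the determinant is also used extensively in normalizing flows, where it appears in the change-of-variables formula for probability densities.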
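
As an illustration, the sketch below uses the polar-to-Cartesian map $ (r, \theta) \mapsto (r\cos\theta, r\sin\theta) $, an example chosen here rather than taken from the text. Its Jacobian determinant is $ r $, and the code compares that value with the ratio between the area of a small patch after the map and before it. A reflection is included as an assumed example of a map with negative determinant.

```python
import math

def jac_polar(p):
    """Analytic Jacobian of (r, theta) -> (r cos(theta), r sin(theta))."""
    r, theta = p
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def det2(M):
    """Determinant of a 2x2 matrix given as nested lists."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

r, theta = 2.0, math.pi / 3     # illustrative point
det_J = det2(jac_polar([r, theta]))   # analytically equal to r

# A small patch [r, r + dr] x [theta, theta + dtheta] in the input space
# maps to an annular sector whose area is known in closed form.
dr, dtheta = 1e-3, 1e-3
source_area = dr * dtheta
image_area = 0.5 * ((r + dr) ** 2 - r ** 2) * dtheta
area_ratio = image_area / source_area  # approaches det_J as the patch shrinks

# Reflection (x, y) -> (x, -y): areas are kept, orientation is flipped.
det_reflection = det2([[1.0, 0.0], [0.0, -1.0]])

print(det_J, area_ratio, det_reflection)
```

The determinant vanishes at $ r = 0 $, which matches the fact that the polar map is not invertible at the origin.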

1.3 Jacobian inverse

If a function $ f: R^n \rightarrow R^n $ is differentiable and invertible near a point $ x \in R^n $, and its Jacobian at $ x $ is invertible, then the Jacobian of its inverse $ f^{-1} $ at $ f(x) $ is the inverse of the Jacobian of $ f $ at $ x $. It also implies that the determinant of the Jacobian of $ f^{-1} $ at $ f(x) $ is the reciprocal of the determinant of the Jacobian of $ f $ at $ x $.
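
A small numerical check of both statements, assuming the hypothetical invertible map $ f(x, y) = (e^x, x + y) $ with inverse $ f^{-1}(u, v) = (\ln u, v - \ln u) $ (both chosen here for illustration): the Jacobian of $ f^{-1} $ is evaluated at $ f(x) $, the Jacobian of $ f $ at $ x $, and their product should be the identity.

```python
import math

def f(p):
    """Hypothetical invertible map f(x, y) = (exp(x), x + y)."""
    x, y = p
    return [math.exp(x), x + y]

def jac_f(p):
    x, y = p
    return [[math.exp(x), 0.0],
            [1.0, 1.0]]

def jac_f_inv(q):
    """Jacobian of the inverse map f_inv(u, v) = (log(u), v - log(u))."""
    u, v = q
    return [[1.0 / u, 0.0],
            [-1.0 / u, 1.0]]

def matmul(A, B):
    """Matrix product for nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def det2(M):
    """Determinant of a 2x2 matrix given as nested lists."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

p = [0.3, -1.2]                 # illustrative input point
J = jac_f(p)                    # Jacobian of f at x
J_inv = jac_f_inv(f(p))         # Jacobian of the inverse map at f(x)

# The product should be the 2x2 identity matrix.
product = matmul(J_inv, J)
identity = [[1.0, 0.0], [0.0, 1.0]]
identity_error = max(abs(product[i][j] - identity[i][j])
                     for i in range(2) for j in range(2))

# The two determinants should be reciprocals of each other.
det_f = det2(J)
det_f_inv = det2(J_inv)
print(identity_error, det_f * det_f_inv)
```

The evaluation points matter: the two Jacobians are inverses of each other only when one is taken at $ x $ and the other at $ f(x) $.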
