3. A Different Way to Look at Matrix-Vector Multiplication

In the previous sections, we emphasized how a matrix-vector multiplication y = Mx is nothing more than a weighted sum of the columns of the matrix M. We will now change this perspective a little: the new perspective offers insights that may not be obvious from the "weighted-sum-of-columns" view.
3.1. Matrix is a function
Consider the matrix-vector product y = Mx. In this new perspective, you can consider a matrix M as a function. The input of this function is the vector x and the output is the vector y. This shift in perspective can also be expressed programmatically as:
3.1.1. A weighted-sum view of matrix-vector multiplication
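def matrix_vector_multiplication(matrix, x):
    # matrix is a sequence of columns; scale() and add() are assumed helpers that
    # multiply a column by a number and sum columns element by element
    y = add( scale(column, weight) for (column, weight) in zip(matrix, x) )
    return y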
3.1.2. A matrix-as-a-function view of matrix-vector multiplication
def matrix(x):
    # you can think of matrix_columns as an internal state of this function
    y = add( scale(column, weight) for (column, weight) in zip(matrix_columns, x) )
    return y
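The two snippets above rely on the helpers scale and add without defining them. Below is a minimal runnable sketch of both views. The bodies of scale and add, and the name make_matrix, are assumptions made for illustration; the helpers may be defined differently elsewhere.

```python
def scale(column, weight):
    """Multiply every element of a column by a scalar weight (assumed helper)."""
    return [weight * value for value in column]


def add(columns):
    """Add equal-length columns element by element (assumed helper)."""
    return [sum(values) for values in zip(*columns)]


def matrix_vector_multiplication(matrix, x):
    """Weighted-sum view: matrix is a list of columns, x holds the weights."""
    return add(scale(column, weight) for column, weight in zip(matrix, x))


def make_matrix(matrix_columns):
    """Matrix-as-a-function view: return a function that remembers its columns.

    make_matrix is a hypothetical name; the columns are the internal state.
    """
    def matrix(x):
        return add(scale(column, weight) for column, weight in zip(matrix_columns, x))
    return matrix


# Columns of the 2-by-2 matrix with rows (0.5, -0.5) and (-0.8, -1.0).
columns = [[0.5, -0.8], [-0.5, -1.0]]
M = make_matrix(columns)
y = M([2.0, 0.0])  # twice the first column
```

Both views compute the same numbers. The only difference is whether the columns are passed in on every call or are captured once and hidden inside the function.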
3.1.3. Input type of this function
The function matrix takes as input a vector (array) x whose number of elements is equal to the number of columns in the matrix.
3.1.4. Output type of this function
The function matrix returns as output a vector (array) y whose number of elements is equal to the number of elements in each column of the matrix.
Another way of saying this is that a matrix of shape (r,c) where r = number of rows and c = number of columns takes as input a vector of shape (c,1) and returns as output a vector of shape (r,1).
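As a small illustration of these input and output types, here is a sketch with a matrix of shape (3, 2). The helper matvec and its length check are assumptions added for this example, with the matrix stored as a list of columns.

```python
def matvec(columns, x):
    """Weighted sum of the columns, with the elements of x as weights.

    columns is a list of c columns, each holding r numbers.
    """
    if len(columns) != len(x):
        raise ValueError("the input needs one element per column of the matrix")
    return [sum(w * col[i] for col, w in zip(columns, x))
            for i in range(len(columns[0]))]


# A matrix of shape (3, 2): two columns with three elements each.
columns = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]]

x = [1.0, 1.0]          # input: as many elements as there are columns
y = matvec(columns, x)  # output: as many elements as there are rows
```

Passing a vector with three elements to this matrix raises a ValueError, because there would be a weight with no column to apply it to.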
3.2. Expanding the perspective of matrix-as-a-function
Note that a vector is just a point in space. In our new perspective, an r-by-c matrix (a row-by-column matrix) maps the (input) point x in a c-dimensional space to an (output) point y in an r-dimensional space. We want to study how a matrix maps a set of points from the input space (which is c-dimensional) to the output space (which is r-dimensional), not just one point. Note that if the matrix is 2-by-2, its input is a two-dimensional vector and its output is also a two-dimensional vector.
3.3. Some Transformations done by a matrix
We will see how a matrix maps a set of points arranged in an orderly fashion. We will illustrate this using 2-by-2 matrices.
3.3.1. Line → Line
If the input points lie on a straight line, the output points also lie on a straight line. In the example below, we show five points arranged in a line in a 2D plane. You can see how they continue to stay in a straight line no matter what the matrix is (in the extreme case, a matrix can collapse the whole line onto a single point).
\( \begin{bmatrix} 0.5 & -0.5 \\ -0.8 & -1 \end{bmatrix} \)
Matrix that transforms the input points
Input points arranged along the line:
\( \frac{x}{2.0} + \frac{y}{3.0} = 1 \)
Output points are also arranged in a straight line
Instead of saying "a matrix maps a set of points in a straight line to another set of points in a straight line", we say "a matrix maps a straight line to another straight line". We can also visualize this subtle change by showing the set of points not as vectors but as a straight line:
\( \begin{bmatrix} 0.5 & -0.5 \\ -0.8 & -1 \end{bmatrix} \)
A matrix transforms a straight line to another straight line
The input line
\( \frac{x}{2.0} + \frac{y}{3.0} = 1 \)
The output line
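The line-to-line behaviour can be checked numerically. The sketch below takes five points on the line x/2.0 + y/3.0 = 1, maps them with the matrix shown above, and tests whether the outputs are collinear. The helpers matvec and collinear, and the particular five points, are assumptions chosen for illustration.

```python
def matvec(columns, x):
    """Weighted sum of the columns, with the elements of x as weights."""
    return [sum(w * col[i] for col, w in zip(columns, x))
            for i in range(len(columns[0]))]


def collinear(points, tol=1e-9):
    """True if all 2D points lie on the line through the first two points."""
    (x0, y0), (x1, y1) = points[0], points[1]
    return all(abs((x1 - x0) * (py - y0) - (y1 - y0) * (px - x0)) <= tol
               for px, py in points[2:])


# Columns of the matrix with rows (0.5, -0.5) and (-0.8, -1.0).
columns = [[0.5, -0.8], [-0.5, -1.0]]

# Five points on x/2.0 + y/3.0 = 1, written as (2t, 3(1 - t)).
inputs = [[2.0 * t, 3.0 * (1.0 - t)] for t in (0.0, 0.25, 0.5, 0.75, 1.0)]
outputs = [matvec(columns, p) for p in inputs]

inputs_on_a_line = collinear(inputs)
outputs_on_a_line = collinear(outputs)
```

Swapping in other columns should leave outputs_on_a_line true, which is the numerical counterpart of "a matrix maps a straight line to another straight line".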
3.3.2. Parallel Lines → Parallel Lines: Scale equivariance
Two (or more) parallel lines (or curves or any random shapes) remain parallel even after being transformed by a matrix. Changing the scale of an input point (and by extension a set of points such as a line or a curve) also changes the scale of the output point by the same factor. This is obvious algebraically: M(cx) = c(Mx), where c is a constant, x is the input vector and M is a matrix.
How the scale varies in the input space is the same as (or equal to) how the scale varies in the output space. We call this property scale equivariance.
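A quick numerical check of M(cx) = c(Mx) is sketched below. The helper matvec and the sample values of x and c are assumptions for this example. Results are compared with a small tolerance because floating-point arithmetic is not exact.

```python
import math


def matvec(columns, x):
    """Weighted sum of the columns, with the elements of x as weights."""
    return [sum(w * col[i] for col, w in zip(columns, x))
            for i in range(len(columns[0]))]


# Columns of the matrix with rows (0.8, -0.3) and (-0.8, -0.6).
columns = [[0.8, -0.8], [-0.3, -0.6]]
x = [1.0, 2.0]
c = 3.0

scaled_then_mapped = matvec(columns, [c * v for v in x])  # M(cx)
mapped_then_scaled = [c * v for v in matvec(columns, x)]  # c(Mx)

scale_equivariant = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
    for a, b in zip(scaled_then_mapped, mapped_then_scaled)
)
```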
This raises a few questions:
- Is the matrix transformation translation equivariant? In other words, does translating the input point by a certain value also translate the output point by the same value?
- What about rotation? Does rotating the input point by a certain angle also rotate the output point by the same angle?
- If two parallel lines remain parallel, does it mean the angle between two non-parallel lines also remains the same before and after a matrix transformation (for any matrix)?
We can get the answers to these questions (all the answers are No, by the way) and also visualize how parallel lines remain parallel by looking at how a trapezium is transformed by a matrix:
\( \begin{bmatrix} 0.8 & -0.3 \\ -0.8 & -0.6 \end{bmatrix} \)
Matrix that transforms the trapezium
A trapezium that is input to the matrix.
The transformed trapezium.
Note how the parallel sides remain parallel.
Also note how translating the input also translates the output, though not by the same amount.
Note that translating the input does not change the shape of the output: since M(x + t) = Mx + Mt, it simply translates the output points by Mt, in a direction that is generally different from the direction of translation in the input space.
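The three "No" answers can be checked with one concrete matrix. The sketch below uses the matrix that transforms the trapezium. The helper names (matvec, rot90, angle_deg) and the sample vectors x and t are assumptions for this example.

```python
import math


def matvec(columns, x):
    """Weighted sum of the columns, with the elements of x as weights."""
    return [sum(w * col[i] for col, w in zip(columns, x))
            for i in range(len(columns[0]))]


def rot90(v):
    """Rotate a 2D vector by 90 degrees counter-clockwise."""
    return [-v[1], v[0]]


def angle_deg(u, v):
    """Angle between two 2D vectors, in degrees."""
    dot = u[0] * v[0] + u[1] * v[1]
    return math.degrees(math.acos(dot / (math.hypot(*u) * math.hypot(*v))))


# Columns of the matrix with rows (0.8, -0.3) and (-0.8, -0.6).
columns = [[0.8, -0.8], [-0.3, -0.6]]
x, t = [1.0, 2.0], [3.0, 0.0]

# Translation: the output moves by Mt, not by t.
Mx, Mt = matvec(columns, x), matvec(columns, t)
M_shifted = matvec(columns, [x[0] + t[0], x[1] + t[1]])

# Rotation: rotating before the matrix differs from rotating after it.
rotate_then_map = matvec(columns, rot90(x))
map_then_rotate = rot90(Mx)

# Angles: [1, 0] and [0, 1] are perpendicular, their images are not.
angle_in = angle_deg([1.0, 0.0], [0.0, 1.0])
angle_out = angle_deg(columns[0], columns[1])
```

Here M_shifted equals Mx + Mt up to rounding, while Mt differs from t, the two rotation results differ, and angle_out is not 90 degrees.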
3.3.3. Circle → Ellipse
If the input points lie on a circle, the output points lie on an ellipse. The shape of the ellipse depends only on the matrix, not on the radius of the circle. The radius of the circle only determines the size of the ellipse. We will explore this property in detail later on.
We can describe this as "a matrix maps a circle to an ellipse".
\( \begin{bmatrix} 0.8 & -0.3 \\ -0.8 & -0.6 \end{bmatrix} \)
A matrix maps (points on) a circle to (points on) an ellipse
Input points lying on the circle.
Output points lie on an ellipse.
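The claim that the radius only sets the size of the ellipse can be probed numerically. The sketch below samples points on circles of two radii centred at the origin, maps them, and compares the ratio of the farthest to the nearest output point from the origin. This ratio is only a rough stand-in for the "shape" of the ellipse, and the helper names are assumptions. The sketch does not prove that the output is an ellipse.

```python
import math


def matvec(columns, x):
    """Weighted sum of the columns, with the elements of x as weights."""
    return [sum(w * col[i] for col, w in zip(columns, x))
            for i in range(len(columns[0]))]


def image_of_circle(columns, radius, n=360):
    """Map n evenly spaced points on a circle centred at the origin."""
    angles = [2.0 * math.pi * k / n for k in range(n)]
    return [matvec(columns, [radius * math.cos(a), radius * math.sin(a)])
            for a in angles]


def extent_ratio(points):
    """Largest distance from the origin divided by the smallest one."""
    norms = [math.hypot(px, py) for px, py in points]
    return max(norms) / min(norms)


# Columns of the matrix with rows (0.8, -0.3) and (-0.8, -0.6).
columns = [[0.8, -0.8], [-0.3, -0.6]]

small = image_of_circle(columns, radius=1.0)
large = image_of_circle(columns, radius=3.0)

ratio_small = extent_ratio(small)
ratio_large = extent_ratio(large)
```

The two ratios agree up to rounding, while every point of large is three times as far from the origin as the matching point of small.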
3.3.4. Generalizing Point → Point to Curve → Curve and Space → Space
You may have noticed a pattern here: we began by saying that a matrix maps an input point to an output point, but soon we were saying that a matrix maps a curve (a line, a circle, etc.) to another curve. The matrix is still mapping points on one curve to points on another curve; we have simply abstracted our view to think about how a whole curve is mapped by a matrix. We can generalize this to any area in a 2D plane or a volume in a 3D (or higher-dimensional) space: we want to see how this area or volume is transformed by a matrix, which provides some beautiful insights. Let us see some examples:
3.3.5. Convex Shape → Convex Shape
Any convex shape remains convex even after a matrix transformation. In the example below, you can see how a convex polygon remains convex no matter what the matrix is.
A (non-rigorous?) definition of convex shape: a shape is convex if for any two points inside the shape, the entire line segment connecting those two points is also inside the shape.
\( \begin{bmatrix} 0.32 & 0.26 \\ -0.81 & -0.16 \end{bmatrix} \)
The matrix that transforms a convex polygon.
A convex polygon that is input to a matrix.
The output polygon is also convex!
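One way to test this on a polygon is to walk around its vertices and check that it always turns the same way. The sketch below does this before and after applying the matrix shown above. The helper is_convex assumes a simple polygon (no self-intersections) with vertices listed in order, and the sample pentagon is made up for illustration.

```python
def matvec(columns, x):
    """Weighted sum of the columns, with the elements of x as weights."""
    return [sum(w * col[i] for col, w in zip(columns, x))
            for i in range(len(columns[0]))]


def is_convex(vertices):
    """True if a simple polygon turns the same way at every vertex."""
    n = len(vertices)
    turns = set()
    for i in range(n):
        (ax, ay), (bx, by), (cx, cy) = (vertices[i],
                                        vertices[(i + 1) % n],
                                        vertices[(i + 2) % n])
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross != 0:
            turns.add(cross > 0)
    return len(turns) <= 1


# Columns of the matrix with rows (0.32, 0.26) and (-0.81, -0.16).
columns = [[0.32, -0.81], [0.26, -0.16]]

# A hypothetical convex pentagon, vertices in counter-clockwise order.
polygon = [[0.0, 0.0], [2.0, 0.0], [3.0, 1.0], [2.0, 3.0], [0.0, 2.0]]
transformed = [matvec(columns, v) for v in polygon]

input_is_convex = is_convex(polygon)
output_is_convex = is_convex(transformed)
```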
3.3.6. Mapping the entire input space!
How does the entire 2D plane get transformed by a matrix? We can visualize this for a 2-by-2 matrix. Our input space is the standard Cartesian plane with an x-axis, a y-axis and a grid of horizontal and vertical lines. We want to see how all these lines are transformed by a matrix:
A couple of things to note here:
1. The vector [1,0] is mapped to the first column of the matrix.
2. The vector [0,1] is mapped to the second column of the matrix.
This also applies to a matrix of any shape (m, n):
1. The vector [1,0,...,0] is mapped to the first column of the matrix.
2. The vector [0,1,...,0] is mapped to the second column of the matrix.
...
n. The vector [0,0,...,1] is mapped to the n-th column of the matrix.
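The list above follows directly from the weighted-sum view: a weight of 1 picks one column and weights of 0 drop the others. A small sketch with a matrix of shape (3, 4) is given below. The helper names and the matrix entries are assumptions chosen for illustration.

```python
def matvec(columns, x):
    """Weighted sum of the columns, with the elements of x as weights."""
    return [sum(w * col[i] for col, w in zip(columns, x))
            for i in range(len(columns[0]))]


def basis_vector(n, j):
    """Vector of length n with a 1 at position j and 0 everywhere else."""
    return [1.0 if k == j else 0.0 for k in range(n)]


# A matrix of shape (3, 4): four columns with three elements each.
columns = [[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0],
           [7.0, 8.0, 9.0],
           [-1.0, 0.5, 0.0]]
n = len(columns)

# Image of each basis vector, in order.
images = [matvec(columns, basis_vector(n, j)) for j in range(n)]
```

The list images reproduces the columns of the matrix one by one.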
\( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \)
The transformation matrix
Input space with the basis vectors [1,0] and [0,1]
This is what the transformed space looks like. Note how
[1,0] is mapped to the first column of the matrix [1.00,0.00]
[0,1] is mapped to the second column of the matrix [0.00,1.00]
3.4. What Happens if the two columns of a matrix point in the same direction?
We will explore this question in detail in the next section. For now, let us just look at an example of a 2-by-2 matrix. Look at what the transformed space looks like as the two columns of the matrix start pointing in similar directions:
\( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \)
Notice the columns of the matrix.
Also note how the space is squished as the columns start pointing in the same direction.
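One way to put a number on this squishing is to track the area of the shape that the unit square is mapped to, which is the parallelogram spanned by the two columns. This measure is an assumption chosen for the sketch, not something taken from the visualization. The sketch keeps the second column at [0, 1] and rotates the first column from [1, 0] toward it.

```python
import math


def unit_square_image_area(columns):
    """Area of the parallelogram spanned by the two columns of a 2-by-2 matrix."""
    (a, c), (b, d) = columns  # first column [a, c], second column [b, d]
    return abs(a * d - b * c)


second = [0.0, 1.0]
areas = []
for degrees in (0, 30, 60, 85, 90):
    theta = math.radians(degrees)
    # Rotate the first column from [1, 0] toward the second column [0, 1].
    first = [math.cos(theta), math.sin(theta)]
    areas.append(unit_square_image_area([first, second]))
```

The areas shrink as the angle grows and are essentially zero at 90 degrees, where both columns point in the same direction and the whole plane is flattened onto a line.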
In the next section we will study this squishing in more detail.