4. Rank and Reversibility

4.1. Let's talk about unique values in an array
How can you find the number of unique values in an array of numbers? What if you have an array of vectors of a fixed dimension? Recall that a vector of dimension d is just an array of d numbers. This leads to the question: how is uniqueness defined?
4.1.1. Duplicate numbers: what makes two numbers unique?
The answer is obvious here: two numbers are unique if and only if they are not equal. In this case, the solution may look something like this (without worrying about algorithmic complexity):
def get_number_unique_elements(array):
    # Count occurrences of each element; the number of distinct
    # keys is the number of unique values.
    element_to_count_dict = {}
    for element in array:
        element_to_count_dict[element] = element_to_count_dict.get(element, 0) + 1
    return len(element_to_count_dict)
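A quick sanity check (the input values here are arbitrary):
get_number_unique_elements([1, 2, 2, 3, 3, 3])  # returns 3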
4.1.2. Duplicate vectors: what makes two vectors unique?
We can think of two vectors as unique if and only if they point in different directions.
Note that this is not the formal definition of uniqueness of vectors. But it emphasizes the fact that if two vectors point in the same direction, they are just scaled versions of each other - in that sense, they are not unique.
Consider two vectors [2,3] and [6,9]. They point in the same direction ([6,9] is just 3 times [2,3]), so they are not considered unique. However, [2,3] and [2.5,3] are considered unique.
Let's see how this looks visually: in an array of vectors [[2,3],[-3,-4.5],[3,4.5],[3,2]], we only have two unique directions i.e. two unique vectors.
Only two of the four vectors can be considered unique i.e. they have unique directions
A possible way to find the number of unique vectors is this:
def are_two_vectors_unique(v1, v2):
    # Two vectors are "unique" if they are not scaled versions of each other.
    assert len(v1) == len(v2)
    if len(v1) == 1: return False  # any two 1-d vectors are scalings of each other
    scale = v1[0] / v2[0]
    for (x, y) in zip(v1[1:], v2[1:]):
        # A differing component ratio means the directions differ.
        if x / y != scale: return True
    return False
    
def get_number_of_unique_vectors(array_of_vectors):
    unique_vectors = []
    for i in range(len(array_of_vectors)):
        vector_i = array_of_vectors[i]
        vector_i_is_unique = True
        # Keep vector_i only if no later vector shares its direction.
        for j in range(i + 1, len(array_of_vectors)):
            vector_j = array_of_vectors[j]
            if not are_two_vectors_unique(vector_i, vector_j):
                vector_i_is_unique = False
                break
        if vector_i_is_unique:
            unique_vectors.append(vector_i)
    return len(unique_vectors)
This function doesn't handle all possible edge cases - floating-point errors, division by zero and so on - but let's not worry about that here.
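Running it on the example array from earlier (ignoring the caveats above) confirms the two unique directions:
get_number_of_unique_vectors([[2,3],[-3,-4.5],[3,4.5],[3,2]])  # returns 2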
4.2. What is the dimension of the space covered by weighted sums of vectors?
Now let's jump to another question which may seem unrelated to the previous one at first but is actually closely connected: given a list of vectors, what is the dimension, r, of the space covered by weighted sums of those vectors?
We define this space as a collection of infinitely many points, where each point is a possible output of a weighted sum of the vectors. Given vectors \( v_1, v_2, ..., v_n \), the set of points \( c_1v_1 + c_2v_2 + ... + c_nv_n \) for all possible values of \( c_1, c_2, ..., c_n \) is the space formed by all possible weighted sums of \( v_1, v_2, ..., v_n \). Is r the same as the number of unique directions among the vectors being summed? Let's look at some properties of this dimension r first:
4.2.1. It cannot be greater than the dimension of the vectors being added up
This may seem like an obvious fact: you cannot get a 3-dimensional vector by adding two or more 2-dimensional vectors. Thus the dimensionality of the output space cannot be greater than two. Can it be smaller than two? Absolutely! The set of weighted sums of a single 2-dimensional vector is just a line. E.g. the weighted sum of the vector [2,3] with a weight w is the vector [2w,3w]. As you change the value of w, you will get different points on the line 3x - 2y = 0.
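A tiny sanity check of this in plain Python (the sample weights are arbitrary):
for w in [-1.5, 0.5, 2.0]:
    x, y = 2 * w, 3 * w  # the point [2w, 3w] on the line
    assert 3 * x - 2 * y == 0  # it satisfies 3x - 2y = 0 for every weight w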
The visuals below show you these two facts:
The weighted sum of one vector [2,3] is a 1-dimensional space: the line 3x-2y=0
No matter how many 2-dimensional vectors we add, we still remain in the 2-d plane.
Here we observe something interesting: even if we have 3 or more unique vectors (i.e. pointing in unique directions), each of dimension 2, their weighted sums can cover at most the entire 2-dimensional plane. And covering the plane is something we can already do with just two vectors (e.g. [1,0] and [0,1]).
4.2.2. It cannot be greater than the number of vectors being added up
This may not be obvious at first, but it will start to make sense later: if you add three 100-dimensional vectors weighted (scaled) in all possible manners, you will get a space which is at most three-dimensional. Let's consider some examples to understand:
1. The output space of weighted sums of one 3-dimensional vector is a one-dimensional space - a line in 3-dimensional space. This is easy to visualize - the space is basically just one direction - the direction of the input vector. There is no other unique direction in this space.
2. The output space of weighted sums of two 3-dimensional vectors which point in the same direction is a one-dimensional space - a line in 3-dimensional space. E.g. the sum of w1 * [1,2,2] and w2 * [-2,-4,-4] is [c,2c,2c] where c = w1 - 2*w2. There is only one unique direction in the output space - the direction of the two input vectors.
3. The output space of weighted sums of two unique 3-dimensional vectors is a two-dimensional space - a plane in 3-dimensional space. For example, if you add the vectors w1 * [1,0,0] and w2 * [0,1,0], the output is the space of vectors [w1, w2, 0]. No matter how you weigh these two vectors, you can never get to the vector [1,1,1].
4. Similarly, if you take three vectors \( v_1, v_2, v_3 \) of any dimension n ≥ 3, all possible weighted sums of these three vectors will form a space of dimension at most 3 (exactly 3 when none of them is a weighted sum of the other two).
4.3. Rank of a matrix
The dimensionality of all possible weighted sums of the vectors in an array is a property of that array of vectors. Note that an array of vectors of the same dimension is nothing but a matrix. We call this property the rank of that matrix. For example, the array of vectors [[2,3],[-3,-4.5],[3,4.5],[3,2]] in matrix form is:
\( \begin{bmatrix} 2 & -3 & 3 & 3 \\ 3 & -4.5 & 4.5 & 2 \end{bmatrix} \)
This matrix has rank 2 as we saw earlier.
Similarly, the matrices below have rank 1:
\( \begin{bmatrix} 2 & -3 \\ 3 & -4.5 \end{bmatrix} \), \( \begin{bmatrix} 2 & -3 \\ 3 & -4.5 \\ 0 & 0 \end{bmatrix} \), \( \begin{bmatrix} -1 & 2 & 0.5 \\ -2 & 4 & 1 \\ 4 & -8 & -2 \end{bmatrix} \)
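We can verify these ranks numerically. Here's a minimal sketch, assuming numpy is available (its numpy.linalg.matrix_rank function computes exactly this dimension):
import numpy as np

# The rank-2 example from above
print(np.linalg.matrix_rank(np.array([[2, -3, 3, 3], [3, -4.5, 4.5, 2]])))  # 2
# Two of the rank-1 examples
print(np.linalg.matrix_rank(np.array([[2, -3], [3, -4.5]])))                # 1
print(np.linalg.matrix_rank(np.array([[-1, 2, 0.5], [-2, 4, 1], [4, -8, -2]])))  # 1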
4.4. Linearly (in)dependent vectors
Given vectors \( v_1, v_2, ..., v_n \), we want to find weights \( w_1, w_2, ..., w_n \) such that the weighted sum \( w_1v_1 + w_2v_2 + ... + w_nv_n = 0 \). In other words:
\( \begin{bmatrix} v_1 & v_2 & ... & v_n \end{bmatrix} \)\( \begin{bmatrix} w_1 \\ w_2 \\ ... \\ w_n \end{bmatrix} \)\(=\)\( \begin{bmatrix} 0 \\ 0 \\ ... \\ 0 \end{bmatrix} \)
Note that we can always find at least one solution: setting all weights \( w_1, w_2, ..., w_n \) to zero. This is called the trivial solution.
• If this is the only solution for \( w_1, w_2, ..., w_n \), the vectors \( v_1, v_2, ..., v_n \) are said to be linearly independent.
• If there exists a non-trivial solution (i.e. at least one of the weights is non-zero), the vectors are said to be linearly dependent.
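This condition can be checked numerically. Below is a minimal sketch, assuming numpy is available; it leans on the connection to rank made formal in section 4.4.2:
import numpy as np

def are_linearly_independent(vectors):
    # Stack the vectors as columns of a matrix; only the trivial
    # solution exists iff the rank equals the number of vectors.
    matrix = np.column_stack(vectors)
    return np.linalg.matrix_rank(matrix) == len(vectors)

print(are_linearly_independent([[2, 3], [3, 2]]))      # True
print(are_linearly_independent([[2, 3], [-3, -4.5]]))  # False: [-3,-4.5] = -1.5 * [2,3]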
4.4.1. Why the word "dependent"?
Let's say \( w_i \neq 0 \). Then we can rewrite the equation \( w_1v_1 + w_2v_2 + ... + w_nv_n = 0 \) as \( w_iv_i = -\sum_{k \neq i}{w_kv_k} \), or \( v_i = -\sum_{k \neq i}{(w_k/w_i)v_k} \).
Thus the vector \( v_i \) can be expressed as a weighted sum of the remaining vectors. We can think of the vector \( v_i \) as not being independent of the remaining vectors.
In this scenario, the space of weighted sums of all vectors except \( v_i \) is the same as the space of weighted sums of all vectors. In other words, adding a new vector \( v_i \) does not increase the dimensionality of the space of all possible weighted sums.
But if the trivial solution is the only possible solution, it means that each vector \( v_i \) is independent of the remaining vectors. In this case, the dimensionality of the space of all possible weighted sums of the vectors \( v_1, v_2, ..., v_{i-1}, v_{i+1}, ..., v_n \) is smaller than the dimensionality of the space of all possible weighted sums of the vectors \( v_1, v_2, ..., v_n \) by exactly 1.
4.4.2. Relationship between linear independence and rank of a matrix
Given a matrix whose columns are the vectors \( v_1, v_2, ..., v_n \), the rank of the matrix is equal to the size of the largest subset S of the vectors \( v_1, v_2, ..., v_n \) such that the vectors in S are linearly independent.
Consider the matrix \( \begin{bmatrix} 2 & -3 & 3 & 1 \\ 3 & -4.5 & 2 & 1 \end{bmatrix} \). The largest subset S of the column vectors of this matrix such that the vectors in S are linearly independent can only contain two vectors. E.g. \( \begin{bmatrix} 2 & 1 \\ 3 & 1 \end{bmatrix} \), \( \begin{bmatrix} 3 & -3 \\ 2 & -4.5 \end{bmatrix} \), \( \begin{bmatrix} 2 & 3 \\ 3 & 2 \end{bmatrix} \). Thus the rank of this matrix is 2.
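Here is a brute-force sketch of this relationship (assuming numpy; itertools is from the standard library): enumerate subsets of columns and find the size of the largest linearly independent one.
import numpy as np
from itertools import combinations

def size_of_largest_independent_subset(matrix):
    columns = list(np.array(matrix).T)
    best = 0
    for size in range(1, len(columns) + 1):
        for subset in combinations(columns, size):
            # A subset is linearly independent iff the matrix with
            # those columns has rank equal to the subset's size.
            if np.linalg.matrix_rank(np.column_stack(subset)) == size:
                best = size
    return best

M = [[2, -3, 3, 1], [3, -4.5, 2, 1]]
print(size_of_largest_independent_subset(M))  # 2
print(np.linalg.matrix_rank(np.array(M)))     # 2 - the two always agree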
4.5. Building a case for square matrices
Consider the following weighted sum of three 2-dimensional vectors:
\( \begin{bmatrix} 4 & -2 & -1 \\ 1 & 1 & -4 \end{bmatrix} \) \( \begin{bmatrix} 0.8 \\ 2 \\ 1.3 \end{bmatrix} \) = \( \begin{bmatrix} -2.1 \\ -2.4 \end{bmatrix} \)
We can actually get to any point in the output space using just two vectors. As long as those two vectors are unique (or in other words, as long as the 2-by-2 matrix formed by them has rank 2), their weighted sums can reach any point in the 2d plane. A few examples of such matrices are shown below.
\( \begin{bmatrix} 1 & -2 \\ -2 & -1 \end{bmatrix} \) \( \begin{bmatrix} 0.54 \\ 1.32 \end{bmatrix} \) = \( \begin{bmatrix} -2.1 \\ -2.4 \end{bmatrix} \)
\( \begin{bmatrix} 2 & 1 \\ 0.5 & -1 \end{bmatrix} \) \( \begin{bmatrix} -1.8 \\ 1.5 \end{bmatrix} \) = \( \begin{bmatrix} -2.1 \\ -2.4 \end{bmatrix} \)
\( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \) \( \begin{bmatrix} -2.1 \\ -2.4 \end{bmatrix} \) = \( \begin{bmatrix} -2.1 \\ -2.4 \end{bmatrix} \)
In other words, a weighted sum of three or more 2-dimensional vectors can also be represented as a weighted sum of just two 2-dimensional vectors (i.e. a 2-by-2 matrix). Similarly, any weighted sum of four or more 3-dimensional vectors can be represented as a weighted sum of three 3-dimensional vectors (a 3-by-3 matrix). In this sense, these 2-by-2 and 3-by-3 matrices seem to be "efficient": an n-by-n matrix contains just enough columns to scale-and-add its way to any point in n-dimensional space.
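Finding the two weights amounts to solving a 2-by-2 linear system. A sketch with numpy's standard np.linalg.solve, using the first example matrix above:
import numpy as np

M = np.array([[1, -2], [-2, -1]])      # a rank-2 matrix from the examples above
target = np.array([-2.1, -2.4])        # the point we want to reach

weights = np.linalg.solve(M, target)   # solves M @ weights == target
print(weights)                         # [0.54 1.32]
print(M @ weights)                     # [-2.1 -2.4]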
4.5.1. Input and output vectors of square matrices have the same dimensionality
If we consider the matrix as a function, an n-by-n matrix takes as input an n-dimensional vector and gives as output an n-dimensional vector. In other words, it maps a point in n-dimensional space to another point in n-dimensional space. For example, the matrix \( \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \) maps the point \( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \) to \( \begin{bmatrix} 3 \\ 3 \end{bmatrix} \). This allows us to draw the input and the output points in the same 2-dimensional plane.
4.5.2. Drawing input and output points in the same plane
Since both the input and the output points lie in the same 2d plane, we can draw them in the same plane. We can draw an arrow from the input point to the output point to show the transformation by the matrix. The previous example can be visualized like so:
The matrix \( \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \) maps \( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \) to \( \begin{bmatrix} 3 \\ 3 \end{bmatrix} \).
It can be shown as an arrow from the input point to the output point.
We can also do this for not just one but a whole set of input points. Again, for each input point, we calculate the output point and draw an arrow from the input point to the output point. In the visual below, we sample points on a circle:
Arrows showing how the input points on a circle are mapped to points on an ellipse by the matrix \( \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \)
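A sketch of this mapping with numpy (the choice of 8 sample points is arbitrary):
import numpy as np

M = np.array([[2, 1], [1, 2]])

# Sample points on the unit circle, one per column
angles = np.linspace(0, 2 * np.pi, 8, endpoint=False)
inputs = np.stack([np.cos(angles), np.sin(angles)])  # shape (2, 8)

# Each output column is the corresponding point on the ellipse
outputs = M @ inputs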
4.6. Transforming output vector back into input vector
We can ask the question: what matrix brings \( \begin{bmatrix} 3 \\ 3 \end{bmatrix} \) back to \( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \)?
Whatever this matrix is, it inverts the action of the matrix M = \( \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \) (which sent the vector \( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \) to the vector \( \begin{bmatrix} 3 \\ 3 \end{bmatrix} \)). We call this matrix the inverse of matrix \(M\) and write it as \(M^{-1}\) (pronounced M inverse).
4.6.1. Matrix inverse is simply inverting the arrows
If we use arrows from input points to output points to visualize these transformations, the effect of an inverse matrix can be seen as simply changing the direction of the arrows:
Matrix \(M\) maps \( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \) to \( \begin{bmatrix} 3 \\ 3 \end{bmatrix} \)
Matrix \(M^{-1}\) maps \( \begin{bmatrix} 3 \\ 3 \end{bmatrix} \) to \( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \)
Matrix \(M\) maps points on a circle to points on an ellipse
Matrix \(M^{-1}\) reverses that action and maps those points on an ellipse to points on a circle
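Numerically, this means applying \(M^{-1}\) after \(M\) returns every point to where it started. A quick check, assuming numpy:
import numpy as np

M = np.array([[2, 1], [1, 2]])
M_inv = np.linalg.inv(M)

point = np.array([1, 1])
print(M @ point)            # [3 3]
print(M_inv @ (M @ point))  # [1. 1.] - back to the input point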
4.6.2. When is it possible?
A function can have the same output for two different inputs. But it can never have two different outputs for the same input. Also remember that a matrix is just a function.
Now, if a matrix sends two input points p1 and p2 to the same output point q, its inverse would have to send that output point q back to both p1 and p2. But that is not possible, since an input point cannot have two different output points. Thus the inverse does not exist.
Now comes the interesting part: if a 2-by-2 matrix has rank 2, it will always map two different input points to two different output points. But if it has rank 1, there exists a set of points which are all mapped to the same output point (the origin).
In fact, a rank-one 2-by-2 matrix squishes the entire 2d plane into a line:
A rank-one matrix, e.g. \( \begin{bmatrix} 2.98 & -1.95 \\ -1.95 & 1.28 \end{bmatrix} \), maps points on a circle to points on a line
In fact, a rank-one matrix maps the entire 2d plane onto a line
Thus an inverse does not exist for a rank-one 2-by-2 matrix. In general, for an n-by-n matrix, an inverse exists if and only if the rank of that matrix is n. Also, an inverse can only exist for square matrices. The reason is similar: a matrix with more columns than rows always maps a whole set of points to the origin (just like the rank-one matrix above), and a matrix with more rows than columns cannot reach every point of its output space - either way, the action cannot be reversed for every point.
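numpy reports this failure directly. A sketch with an exactly rank-one matrix (the matrix below is an illustrative choice, not the one from the visual above):
import numpy as np

singular = np.array([[2, 3], [4, 6]])    # second row is 2x the first, so rank 1
print(np.linalg.matrix_rank(singular))   # 1

try:
    np.linalg.inv(singular)
except np.linalg.LinAlgError as error:
    print(error)                         # Singular matrix - no inverse exists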
4.6.3. Inverse of a 2-by-2 matrix
The inverse of a matrix \( M = \begin{bmatrix} a & c \\ b & d \end{bmatrix} \) is \( M^{-1} = \frac{1}{ad-bc} \begin{bmatrix} d & -c \\ -b & a \end{bmatrix} \).
Thus the inverse of \( \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \) is \( \begin{bmatrix} 0.67 & -0.33 \\ -0.33 & 0.67 \end{bmatrix} \) (entries rounded to two decimal places). You can verify that \( \begin{bmatrix} 0.67 & -0.33 \\ -0.33 & 0.67 \end{bmatrix} \)\( \begin{bmatrix} 3 \\ 3 \end{bmatrix} \) = \( \begin{bmatrix} 1 \\ 1 \end{bmatrix} \) (up to rounding).
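A quick numerical check of this formula against numpy's built-in inverse:
import numpy as np

a, b, c, d = 2, 1, 1, 2  # M = [[a, c], [b, d]] = [[2, 1], [1, 2]]
inverse_by_formula = np.array([[d, -c], [-b, a]]) / (a * d - b * c)
print(inverse_by_formula)  # approximately [[0.667, -0.333], [-0.333, 0.667]]
print(np.allclose(inverse_by_formula, np.linalg.inv([[2, 1], [1, 2]])))  # True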
4.7. Another term for weighted sums of vectors
We have been and will be using the term weighted sums of vectors quite a lot here. This is to emphasize the fact that a matrix-vector product is just a weighted sum of the columns of the matrix. But in general, this action is more commonly described as taking a linear combination of vectors. We will use both phrases interchangeably.
In the next section, we will use this knowledge of rank and inverse of a matrix to solve an interesting problem - determining whether a point is inside or outside a triangle.