1. Let's talk about weighted sums

Let's start by calculating a weighted sum for an array of numbers x = \( [x_1, x_2, x_3] \), where the weights are given by the array w = \( [w_1, w_2, w_3] \):
\( weighted\_sum(x,w) = w_1 * x_1 + w_2 * x_2 + w_3 * x_3 \)
def get_weighted_sum(values, weights):
    return sum((v*w) for (v,w) in zip(values, weights))
It can be seen as a two step process:
a) we scale each value by its corresponding weight
b) we add the scaled values up
def scale(value, weight):
    return value * weight

def add(scaled_values):
    return sum(scaled_values)

def get_weighted_sum(values, weights):
    return add(scale(v, w) for (v, w) in zip(values, weights))
You might wonder what we achieve by rewriting the original get_weighted_sum function. One thing we achieve is separating the logic for scaling from the logic for adding the values up. For instance, you can provide your own definitions of scaling and adding and pass them as parameters to get_weighted_sum:
def get_weighted_sum(values, weights, *, scale_function, add_function):
    return add_function(
        scale_function(v, w) 
        for (v, w) in zip(values, weights)
    )
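As a quick check, here is a minimal sketch of how the parameterized version might be called with the conventional scale and add functions from above. The input numbers are made up for illustration.

```python
def scale(value, weight):
    return value * weight

def add(scaled_values):
    return sum(scaled_values)

def get_weighted_sum(values, weights, *, scale_function, add_function):
    return add_function(
        scale_function(v, w)
        for (v, w) in zip(values, weights)
    )

# Illustrative inputs: 1*6 + 2*1 + 3*3
result = get_weighted_sum(
    [1, 2, 3], [6, 1, 3],
    scale_function=scale,
    add_function=add,
)
print(result)  # 17
```

Because scale_function and add_function are keyword-only, every call has to name them explicitly, which makes it hard to swap the two by accident.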
1.1. How to scale and add
Now we can simply focus on different ways to scale and add two values together. You already saw the scale and add functions above. But there are many other possible ways to scale and add. One way is described below (this leads to the exciting area of Tropical Geometry):
def tropical_scale(value, weight):
    return value + weight

def tropical_add(scaled_values):
    return max(scaled_values)

weighted_sum = get_weighted_sum(
    [1,2,3], [6,1,3], 
    scale_function = tropical_scale, 
    add_function = tropical_add
)
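Working this call through by hand may help: tropical_scale adds each value to its weight, and tropical_add keeps the largest of the results.
\( weighted\_sum = \max(1 + 6, 2 + 1, 3 + 3) = \max(7, 3, 6) = 7 \)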
However, let us stick to the conventional notions of scaling and adding: we scale two numbers by multiplying them, scale(x,y) = x*y, and we add two numbers by summing them, add(x,y) = x+y. But we are going to get creative with what we scale and add. Instead of numbers, we will scale and add arrays of numbers.

So how do we scale an array array_of_numbers with a given weight weight? We can scale each value in the array with the given weight.
def scale_array(array_of_numbers, weight):
    return [number * weight for number in array_of_numbers]
How do we add two or more arrays of numbers? We simply do an elementwise addition (assuming all arrays have the same length):
def add_arrays(array_of_arrays):
    # get_weighted_sum passes a generator, so materialize it before indexing
    array_of_arrays = list(array_of_arrays)
    result = [0] * len(array_of_arrays[0])
    for array in array_of_arrays:
        result = [(r+v) for (r,v) in zip(result, array)]
    return result
Then we can pass these functions to get_weighted_sum, along with the arrays to scale and add and their weights.
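A minimal sketch of such a call is shown here. It assumes a version of add_arrays that first turns its input into a list, since get_weighted_sum hands it a generator rather than a list. The input arrays and weights are made up for illustration.

```python
def scale_array(array_of_numbers, weight):
    return [number * weight for number in array_of_numbers]

def add_arrays(array_of_arrays):
    # Assumption: accept any iterable (including a generator) by
    # materializing it into a list before indexing into it.
    array_of_arrays = list(array_of_arrays)
    result = [0] * len(array_of_arrays[0])
    for array in array_of_arrays:
        result = [(r + v) for (r, v) in zip(result, array)]
    return result

def get_weighted_sum(values, weights, *, scale_function, add_function):
    return add_function(
        scale_function(v, w) for (v, w) in zip(values, weights)
    )

# Illustrative inputs: 2 * [1, 2, 3] + 3 * [4, 5, 6]
weighted_sum_of_arrays = get_weighted_sum(
    [[1, 2, 3], [4, 5, 6]], [2, 3],
    scale_function=scale_array,
    add_function=add_arrays,
)
print(weighted_sum_of_arrays)  # [14, 19, 24]
```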
1.2. Weighted sums are everywhere
Let's look at some examples of weighted sums of numbers:
1.2.1. Indexing
Given an array array = [x,y,z], we can get its i-th element (counting from 0) by array[i]. This can be seen as a weighted sum where the weight corresponding to the i-th element is 1 and all other weights are zero.
array, weights = [2,3,5], [0,0,1]
index = 2
# this is the same as 
# array_at_index = array[index]
array_at_index = get_weighted_sum(
    array, weights, 
    scale_function = scale, 
    add_function = add
)
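The same idea can be checked for every index at once. The sketch below uses the plain two-argument get_weighted_sum from the start of this section, and one_hot is a hypothetical helper introduced only for this illustration.

```python
def get_weighted_sum(values, weights):
    return sum(v * w for (v, w) in zip(values, weights))

def one_hot(length, index):
    # Hypothetical helper: weight 1 at `index`, 0 everywhere else.
    return [1 if position == index else 0 for position in range(length)]

array = [2, 3, 5]
for index in range(len(array)):
    weights = one_hot(len(array), index)
    # The weighted sum picks out exactly the element at `index`.
    assert get_weighted_sum(array, weights) == array[index]

print(one_hot(3, 2))  # [0, 0, 1]
```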
1.2.2. Expected value of a random variable
You are given a six-sided die with values values = [1,2,3,4,5,6]. The probabilities of the die landing on these sides are given by probs = [p1, p2, p3, p4, p5, p6]. Then the expected value of the die (roughly speaking, the mean of the values the die lands on if we toss it many, many, many times) is given by:
expected_value = get_weighted_sum(
    values, probs, 
    scale_function = scale, 
    add_function = add
)
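For a concrete number, here is a small sketch that assumes a fair die, so every probability is 1/6. It uses the plain two-argument get_weighted_sum and exact fractions to avoid floating-point rounding.

```python
from fractions import Fraction

def get_weighted_sum(values, weights):
    return sum(v * w for (v, w) in zip(values, weights))

values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6   # assumption: a fair die

expected_value = get_weighted_sum(values, probs)
print(expected_value)  # 7/2
```

The result, 3.5, is not a value the die can actually land on, which is a useful reminder that an expected value is an average and not a prediction for a single toss.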
Now let's look at some examples of weighted sums of arrays of numbers:
1.2.3. Rotation
Any point on a circle of unit radius with center at origin is given by [cos(θ), sin(θ)]. It can be seen as a weighted sum of two arrays [1,0] and [0,1] (what are the corresponding weights?):
import math
v1, v2 = [1,0], [0,1]
angle_in_radians = math.pi / 6
weights = [math.cos(angle_in_radians), math.sin(angle_in_radians)]
get_weighted_sum(
    [v1, v2], weights, 
    scale_function = scale_array, 
    add_function = add_arrays
)
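Written out by hand, the weighted sum computed above is:
\( \cos(\theta) . [1, 0] + \sin(\theta) . [0, 1] = [\cos(\theta), 0] + [0, \sin(\theta)] = [\cos(\theta), \sin(\theta)] \)
For the angle used in the code, \( \theta = \pi / 6 \), this gives \( [\sqrt{3}/2, 1/2] \approx [0.866, 0.5] \).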
1.2.4. Polynomials
A polynomial itself is nothing but a weighted sum: \( p(x) = a + b.x + c.x^2 + d.x^3 + ... = \text{weighted_sum}([a,b,c,d,...], [1,x,x^2,x^3,...]) \)
def get_polynomial(weights):
    def polynomial(x):
        values = [x ** index for index in range(len(weights))]
        return get_weighted_sum(
            values, weights, 
            scale_function = scale, 
            add_function = add
        )
    return polynomial
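Here is a short usage sketch. It assumes get_polynomial returns the inner function itself, so that the result can be called with different values of x, and it uses the plain weighted sum of numbers inline. The coefficients are made up for illustration.

```python
def get_polynomial(weights):
    def polynomial(x):
        values = [x ** index for index in range(len(weights))]
        return sum(v * w for (v, w) in zip(values, weights))
    return polynomial  # return the function itself, not a value

p = get_polynomial([1, 2, 3])   # p(x) = 1 + 2x + 3x^2
print(p(2))  # 17
print(p(0))  # 1
```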
But a more interesting thing to note is that two or more polynomials can be scaled and added just like arrays of numbers. Given two polynomials
\(p(x) = p_1 + p_2.x + p_3.x^2 + p_4.x^3 + ... \) and
\( q(x) = q_1 + q_2.x + q_3.x^2 + q_4.x^3 + ... \)
their sum is given by:
\( p(x) + q(x) = (p_1 + q_1) + (p_2 + q_2).x + (p_3 + q_3).x^2 + (p_4 + q_4).x^3 + ... \)
Similarly, a polynomial can be scaled by a weight w like so:
\( w * p(x) = w.p_1 + w.p_2.x + w.p_3.x^2 + w.p_4.x^3 + ... \)
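To see this in code, a polynomial can be stored as its array of coefficients. The sketch below (evaluate is a hypothetical helper, and the coefficients and weights are made up) checks that taking a weighted sum of the coefficient arrays gives the same polynomial as taking the weighted sum of the evaluated values.

```python
def evaluate(coefficients, x):
    # Hypothetical helper: coefficients[i] multiplies x ** i.
    return sum(c * x ** i for (i, c) in enumerate(coefficients))

p = [1, 2, 3]   # 1 + 2x + 3x^2
q = [4, 0, 1]   # 4 + x^2
w_p, w_q = 2, 3

# Scale each coefficient array by its weight, then add elementwise.
combined = [w_p * a + w_q * b for (a, b) in zip(p, q)]
print(combined)  # [14, 4, 9]

# The combined coefficients agree with w_p * p(x) + w_q * q(x).
for x in [-1, 0, 2, 5]:
    assert evaluate(combined, x) == w_p * evaluate(p, x) + w_q * evaluate(q, x)
```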
1.3. A new format for weighted sums
Using a good format can greatly affect the way you understand and perceive certain ideas. Using an array of arrays seems a bit cumbersome. It obfuscates some obvious properties of scaling and adding arrays. Let's use a different format to represent an array. We will represent the array [x1,x2,x3] as:
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
The format for scaling [x1,x2,x3] with a weight w is:
\( w . \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} w . x_1 \\ w . x_2 \\ w . x_3 \end{bmatrix} \)
The format for adding two (or more) arrays [x1, x2, x3] and [y1, y2, y3] is:
\( \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} + ... = \begin{bmatrix} x_1 + y_1 + ...\\ x_2 + y_2 + ... \\ x_3 + y_3 + ... \end{bmatrix} \)
1.4. Some observations
The first thing to note with this choice of scale and add is that each item of the result is itself a weighted sum:
\( w_1 . \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + w_2 . \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} + w_3... = \begin{bmatrix} w_1.x_1 + w_2.y_1 + w_3...\\ w_1.x_2 + w_2.y_2 + w_3... \\ w_1.x_3 + w_2.y_3 + w_3... \end{bmatrix} \)
This observation allows us to implement get_weighted_sum for an array of arrays in yet another way:
def get_weighted_sum_for_array_of_arrays(array_of_arrays, weights):
    return [
        get_weighted_sum(
            [array[i] for array in array_of_arrays], 
            weights, 
            scale_function=scale, 
            add_function=add
        ) 
        for i in range(len(array_of_arrays[0]))
    ]
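As a sanity check, the sketch below compares the two routes on made-up inputs: computing each output item as a weighted sum of numbers, and scaling whole arrays before adding them elementwise. It uses the plain two-argument get_weighted_sum to stay short.

```python
def get_weighted_sum(values, weights):
    return sum(v * w for (v, w) in zip(values, weights))

def get_weighted_sum_for_array_of_arrays(array_of_arrays, weights):
    return [
        get_weighted_sum([array[i] for array in array_of_arrays], weights)
        for i in range(len(array_of_arrays[0]))
    ]

arrays, weights = [[1, 2, 3], [4, 5, 6]], [2, 3]

# Item by item: each output item is a weighted sum of numbers.
item_by_item = get_weighted_sum_for_array_of_arrays(arrays, weights)

# Array by array: scale each array, then add elementwise.
scaled = [[number * w for number in array] for (array, w) in zip(arrays, weights)]
array_by_array = [sum(column) for column in zip(*scaled)]

print(item_by_item)  # [14, 19, 24]
assert item_by_item == array_by_array
```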
This is not only hard to read, it is also inefficient: it rebuilds a list of the i-th elements for every output position. Our newly concocted format makes it much easier to see what we are trying to say:
\( \text{weighted_sum}(\begin{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} \end{bmatrix}, \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}) = \begin{bmatrix} \text{weighted_sum}([x_1, y_1], [w_1, w_2]) \\ \text{weighted_sum}([x_2, y_2], [w_1, w_2]) \\ \text{weighted_sum}([x_3, y_3], [w_1, w_2]) \end{bmatrix} \)
We can simplify the format even more by stacking all the arrays as columns and dropping the text weighted_sum, since that is all we have been talking about:
\( \text{weighted_sum}(\begin{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} \end{bmatrix}, \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}) = \begin{bmatrix} x_1 & y_1 \\ x_2 & y_2 \\ x_3 & y_3 \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} \)
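A small worked example with made-up numbers may help. Take the arrays [1,2,3] and [4,5,6] with weights 2 and 3:
\( \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 1 \times 2 + 4 \times 3 \\ 2 \times 2 + 5 \times 3 \\ 3 \times 2 + 6 \times 3 \end{bmatrix} = \begin{bmatrix} 14 \\ 19 \\ 24 \end{bmatrix} \)
Each row of the result is the weighted sum of the corresponding row of the stacked arrays.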
1.4.1. Who is scaling whom?
Note that an array of numbers values = [x,y,z] scaled by weights = [w1, w2, w3] and then summed up is the same as weights scaled by values and then summed up:
\( weighted\_sum(values,weights) = weighted\_sum(weights,values) \)
Using our new format, we can show this as:
\( \begin{bmatrix} x & y & z \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} w_1 & w_2 & w_3 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \)
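The symmetry is easy to confirm numerically. The sketch below uses the plain two-argument get_weighted_sum and made-up integer inputs.

```python
def get_weighted_sum(values, weights):
    return sum(v * w for (v, w) in zip(values, weights))

values, weights = [2, 3, 5], [7, 1, 4]

# Swapping the roles of values and weights leaves the result unchanged.
print(get_weighted_sum(values, weights))  # 37
print(get_weighted_sum(weights, values))  # 37
```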
This seemingly innocuous equation has some really beautiful interpretations, as we will see next. But for now, let us focus on the symmetry between the roles of values and weights.
1.5. What was this all about? What's next?
We began with taking weighted sums of numbers and then moved on to taking weighted sums of arrays. When we realized the code was getting messier, we came up with a different way to represent weighted sums of arrays. This new way (hopefully) made it easier to look at what was going on. It also enabled us to do some simple algebra on arrays.
But what we did not do was give any special meaning to an array of numbers which was being scaled and summed with another array. For instance, the arrays [1,1,1] and [2,2,2] seem quite similar in some way; in fact, they are just scaled versions of each other. Similarly, [1,2,3] and [0.9, 2, 3.1] seem to be close to each other, because the values in the arrays are quite similar. A good way to make such notions of similarity and closeness precise is to visualize the arrays. In other words, we give these arrays some geometry. This geometry, combined with the algebra in the format we came up with, gives rise to some good things. Let's jump into the geometry first.