1 Linear Algebra
The aim of this section is to discuss the basics of matrix, vector, and number theory that we need for the later chapters, not to introduce the whole of linear algebra. Nevertheless, we will rely on some basic definitions that can be found in any linear algebra book. For notation we refer to Golub and Van Loan (2013).
In Python, the module `numpy` is used to represent vectors and matrices. We can import it like this (and give it the shorthand `np`):

```python
import numpy as np
```
1.1 Notation
We will refer to \[ v \in \mathbb{R}^{n} \quad \Leftrightarrow \quad v = \left[ \begin{array}{c} v_1 \\ v_2 \\ v_3 \\ \vdots \\ v_n \end{array} \right], \quad v_i \in \mathbb{R}, \] as a vector \(v\) with \(n\) elements. The set \((\mathbb{R}^n, +, \cdot)\) forms a so-called vector space with the vector addition \(+\) and the scalar multiplication \(\cdot\).
```python
v = np.array([1, 2, 3, 4])
# show the shape
print(f"{v.shape=}")
# access a single element
print(f"{v[0]=}")
# use slicing to access multiple elements
print(f"{v[0:3]=}")
print(f"{v[2:]=}")
print(f"{v[:2]=}")
print(f"{v[0::2]=}")
alpha = 0.5
w = alpha * v
print(f"{w=}")
```

```
v.shape=(4,)
v[0]=np.int64(1)
v[0:3]=array([1, 2, 3])
v[2:]=array([3, 4])
v[:2]=array([1, 2])
v[0::2]=array([1, 3])
w=array([0.5, 1. , 1.5, 2. ])
```
Note that while in math we start indices at 1, Python starts at 0.
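In addition, numpy supports negative indices, which count from the end of the array; a small example for completeness:

```python
# negative indices count from the end
print(f"{v[-1]=}")   # last element
print(f"{v[-2:]=}")  # last two elements
```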
From vectors we can move to matrices, where \[ A \in \mathbb{R}^{m\times n} \quad \Leftrightarrow \quad A = (a_{ij}) = \left[ \begin{array}{cccc} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots\\ a_{m1} & a_{m2} & \dots & a_{mn} \\ \end{array} \right],\quad a_{ij} \in \mathbb{R}, \] is called an \(m \times n\) (\(m\) times \(n\)) matrix. If its values are real numbers we say it is an element of \(\mathbb{R}^{m\times n}\).
We will use capital letters for matrices and lowercase letters for vectors.
```python
A = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])
# show the shape
print(f"{A.shape=}")
# access a single element
print(f"{A[0, 0]=}")
# use slicing to access multiple elements
print(f"{A[0, :]=}")
print(f"{A[:, 2]=}")
```

```
A.shape=(3, 4)
A[0, 0]=np.int64(1)
A[0, :]=array([1, 2, 3, 4])
A[:, 2]=array([ 3,  7, 11])
```
Consequently, we can say that a vector is an \(n \times 1\) matrix. It is sometimes also referred to as a column vector, and its counterpart, a \(1 \times n\) matrix, as a row vector.
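Note that a 1-D numpy array is neither a row nor a column vector; if the distinction matters, reshaping makes it explicit. A small sketch (the names col and row are our own):

```python
# turn the 1-D array v into explicit column and row vectors
col = v.reshape(-1, 1)  # n x 1 column vector
row = v.reshape(1, -1)  # 1 x n row vector
print(f"{col.shape=}")  # (4, 1)
print(f"{row.shape=}")  # (1, 4)
```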
If we want to refer to a row or a column of a matrix \(A\) we will use the following shorthands:
- \(A_{i-}\) for row \(i\),
- \(A_{-j}\) for column \(j\).
We can multiply a matrix with a vector, as long as the dimensions fit. Note that usually no \(\cdot\) is used to indicate multiplication: \[ Av = \left[ \begin{array}{cccc} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots\\ a_{m1} & a_{m2} & \dots & a_{mn} \\ \end{array} \right] \left[ \begin{array}{c} v_1 \\ v_2 \\ \vdots \\ v_n \end{array} \right] = A_{-1} v_1 + A_{-2} v_2 + \dots + A_{-n} v_n. \] The result is again a vector, but this time in \(\mathbb{R}^m\).
In Python, the `*` operator usually indicates multiplication. In numpy, however, it is interpreted as element-wise multiplication, so we use `@` for matrix-vector and matrix-matrix products.
```python
w = A @ v
# show the shape
print(f"{w.shape=}")
# show the result
print(f"{w=}")
# doing the same by hand, using the column view of the product
w_tilde = np.zeros(A.shape[0])
for i, bb in enumerate(v):
    w_tilde += A[:, i] * bb
print(f"{w_tilde=}")
```

```
w.shape=(3,)
w=array([ 30,  70, 110])
w_tilde=array([ 30.,  70., 110.])
```
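To see the difference between the two operators, compare the element-wise product of a vector with itself to its inner product (a small check reusing v from above):

```python
# element-wise product vs. inner product
print(f"{v * v=}")  # array([ 1,  4,  9, 16])
print(f"{v @ v=}")  # 30, the sum of the element-wise products
```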
As we can see from the above equation, we can view the matrix \(A\) as a linear mapping or linear function between two vector spaces, namely from \(\mathbb{R}^{n}\) to \(\mathbb{R}^{m}\).
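We can verify the defining property of a linear mapping, \(A(\alpha u + x) = \alpha Au + Ax\), numerically; a small check reusing \(A\) from above (the vectors u, x and the scalar alpha are our own choices):

```python
# check linearity of the mapping v -> A v
u = np.array([1, 0, 2, 0])
x = np.array([0, 3, 0, 1])
alpha = 2.0
print(f"{np.allclose(A @ (alpha * u + x), alpha * (A @ u) + A @ x)=}")  # True
```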
A linear mapping of special interest to us is the transpose of a matrix defined by turning rows into columns and vice versa: \[ C = A^{\mathsf{T}}, \quad \Rightarrow \quad c_{ij} = a_{ji}. \] Consequently, the transpose of a (row) vector is a column vector.
print(f"{A=}")
print(f"{A.shape=}")
= A.transpose()
B print(f"{B=}")
print(f"{B.shape=}")
A=array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
A.shape=(3, 4)
B=array([[ 1, 5, 9],
[ 2, 6, 10],
[ 3, 7, 11],
[ 4, 8, 12]])
B.shape=(4, 3)
With this operation we can define two more mappings: the inner product \(\langle v, w \rangle = v^{\mathsf{T}} w = \sum_i v_i w_i\), which maps two vectors to a scalar, and the outer product \(v w^{\mathsf{T}}\), which maps two vectors to a matrix.
```python
v = np.array([1, 2, 3, 4])
w = np.array([1, 1, 1, 1])
# alternatively we can define w with
w = np.ones(v.shape)
alpha = np.vdot(v, w)
print(f"{alpha=}")
```

```
alpha=np.float64(10.0)
```
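The same value can be computed in several equivalent ways; a quick check reusing v and w from above:

```python
# three equivalent ways to compute the inner product
print(f"{np.vdot(v, w)=}")
print(f"{v @ w=}")
print(f"{np.sum(v * w)=}")
```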
As \(\mathbb{R}^n\) is a Euclidean vector space, this function is also called the inner product. Its counterpart, the outer product, produces a matrix:
```python
C = np.outer(v, w)
print(f"{C=}")
```

```
C=array([[1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [4., 4., 4., 4.]])
```
We can also multiply matrices \(A\) and \(B\) by applying the matrix-vector multiplication to each column of \(B\); more precisely:
For an \({m \times p}\) matrix \(A\) and a \({p \times n}\) matrix \(B\) the matrix-matrix multiplication (\(\mathbb{R}^{m\times p} \times \mathbb{R}^{p\times n} \to \mathbb{R}^{m\times n}\)) \[C=AB \quad \Rightarrow\quad c_{ij} = \sum_{k=1}^p a_{ik}b_{kj}\] forms an \({m \times n}\) matrix.
```python
C = A @ A.transpose()
print(f"{C=}")
D = A.transpose() @ A
print(f"{D=}")
```

```
C=array([[ 30,  70, 110],
       [ 70, 174, 278],
       [110, 278, 446]])
D=array([[107, 122, 137, 152],
       [122, 140, 158, 176],
       [137, 158, 179, 200],
       [152, 176, 200, 224]])
```
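To connect the sum formula for \(c_{ij}\) to code, here is a minimal sketch of the triple loop behind matrix-matrix multiplication (the helper matmul_naive is our own; in practice `@` is far more efficient):

```python
def matmul_naive(A, B):
    """Compute C = A B via c_ij = sum_k a_ik * b_kj."""
    m, p = A.shape
    q, n = B.shape
    assert p == q, "inner dimensions must match"
    C = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            for k in range(p):
                C[i, j] += A[i, k] * B[k, j]
    return C

print(f"{np.allclose(matmul_naive(A, A.transpose()), A @ A.transpose())=}")  # True
```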
From the computation of \(C\) and \(D\) above we can easily see that matrix-matrix multiplication is, in general, not commutative; here \(AA^{\mathsf{T}}\) and \(A^{\mathsf{T}}A\) do not even have the same shape.
1.2 Norms
There are multiple norms that can be useful for vectors; the most common are listed below, with a numerical check after the list:
- the one norm \[ \| v \|_1 = \sum_i |v_i|,\]
- the two norm (Euclidean norm) \[ \| v \| = \| v \|_2 = \sqrt{\sum_i |v_i|^2} = \sqrt{\langle v, v \rangle},\]
- more generally the \(p\)-norms (for \(1\leq p < \infty\)) \[ \| v \|_p = \left(\sum_i |v_i|^p\right)^{\frac{1}{p}},\]
- the \(\infty\) norm \[ \| v \|_\infty = \max_i |v_i|.\]
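These definitions translate directly to numpy; a small sketch with a helper p_norm of our own (before we use the built-in routines below):

```python
def p_norm(v, p):
    """Vector p-norm straight from the definition."""
    return np.sum(np.abs(v) ** p) ** (1 / p)

v = np.array([1, 2, 3, 4])
print(f"{p_norm(v, 1)=}")       # 10.0, the one norm
print(f"{p_norm(v, 2)=}")       # sqrt(30), approx. 5.477
print(f"{np.max(np.abs(v))=}")  # 4, the infinity norm
```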
And for matrices (again with a numerical check after the list):
- the one norm (column sum norm) \[ \| A \|_1 = \max_j \sum_i |a_{ij}|,\]
- the Frobenius norm \[ \| A \| = \| A \|_F = \sqrt{\sum_i \sum_j |a_{ij}|^2},\]
- the entrywise \(p\)-norms \[ \| A \|_p = \left(\sum_i \sum_j |a_{ij}|^p\right)^{\frac1p},\] which generalize the Frobenius norm (and should not be confused with the induced \(p\)-norms),
- the \(\infty\) norm (row sum norm) \[ \| A \|_\infty = \max_i \sum_j |a_{ij}|.\]
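The matrix norms can be checked against their definitions as well (a sketch, reusing the \(3 \times 4\) matrix A from above):

```python
# column sums, row sums, and the Frobenius norm from the definitions
print(f"{np.max(np.sum(np.abs(A), axis=0))=}")  # one norm: 24
print(f"{np.max(np.sum(np.abs(A), axis=1))=}")  # infinity norm: 42
print(f"{np.sqrt(np.sum(np.abs(A) ** 2))=}")    # Frobenius norm: approx. 25.495
```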
```python
# The norms can be found in the linalg package of numpy
from numpy import linalg as LA

norm_v = LA.norm(v)
print(f"{norm_v=}")
norm_v2 = LA.norm(v, 2)
print(f"{norm_v2=}")
norm_A = LA.norm(A, 1)
print(f"{norm_A=}")
norm_Afr = LA.norm(A, "fro")
print(f"{norm_Afr=}")
```

```
norm_v=np.float64(5.477225575051661)
norm_v2=np.float64(5.477225575051661)
norm_A=np.float64(24.0)
norm_Afr=np.float64(25.495097567963924)
```
The function `norm` from the `numpy.linalg` package can be used to compute other norms and matrix properties as well; see the documentation.
1.3 Basis of vector spaces
As we will be using the notion of a basis vector, or a basis of a vector space, we should introduce them properly.
The standard basis of \(\mathbb{R}^n\) consists of the vectors \(e_i\) that are zero everywhere except for a one at index \(i\). The associated matrix \[ I_n = \left[ \begin{array}{cccc} 1 & 0 & \dots & 0\\ 0 & 1 & \ddots & \vdots \\ \vdots & \ddots & 1 & 0\\ 0 & \dots & 0 & 1 \\ \end{array} \right] \in \mathbb{R}^{n \times n} \] is called the identity matrix. Note that the index \(n\) is often omitted, as it should be clear from the dimensions of the matrix.
The easiest way to create one of the standard basis vectors, let's say \(e_3 \in \mathbb{R}^3\), in Python is:
```python
# We need to keep the index shift in mind
n = 3
e_3 = np.zeros(n)
e_3[3 - 1] = 1
print(f"{e_3=}")
```

```
e_3=array([0., 0., 1.])
```
and the identity matrix by
```python
n = 4
I_4 = np.eye(n)
print(f"{I_4=}")
```

```
I_4=array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
```
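As a quick sanity check, multiplying with the identity matrix leaves a vector unchanged (reusing I_4 from above):

```python
v = np.array([1, 2, 3, 4])
print(f"{np.allclose(I_4 @ v, v)=}")  # True, since I v = v
```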
1.4 The inverse of a matrix
For a square matrix \(A \in \mathbb{R}^{n \times n}\), the inverse \(A^{-1}\), if it exists, is the unique matrix satisfying \(A^{-1}A = AA^{-1} = I_n\). The following holds true for the inverse of matrices (and we verify it numerically below):
- the inverse of a product is the product of the inverses \[ (AB)^{-1} = B^{-1}A^{-1},\]
- the inverse of the transpose is the transpose of the inverse \[ (A^{-1})^{\mathsf{T}} = (A^{\mathsf{T}})^{-1} \equiv A^{-\mathsf{T}}.\]
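Both properties can be checked numerically; a quick sketch using random matrices, which are invertible with probability one (the seed and names are our own choices):

```python
rng = np.random.default_rng(42)
A = rng.random((3, 3))
B = rng.random((3, 3))
# (AB)^{-1} = B^{-1} A^{-1}
print(f"{np.allclose(LA.inv(A @ B), LA.inv(B) @ LA.inv(A))=}")
# (A^{-1})^T = (A^T)^{-1}
print(f"{np.allclose(LA.inv(A).transpose(), LA.inv(A.transpose()))=}")
```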
```python
A = np.random.rand(3, 3)
print(f"{A=}")
X = LA.inv(A)
print(f"{X=}")
print(f"{A @ X=}")
print(f"{np.allclose(A @ X, np.eye(A.shape[0]))=}")
# Note that exact equality is hard to achieve for floats
np.testing.assert_equal(A @ X, np.eye(A.shape[0]), verbose=False)
```
```
A=array([[0.14428232, 0.3535935 , 0.59134082],
       [0.59725813, 0.84366491, 0.00543072],
       [0.3461897 , 0.54736427, 0.27627259]])
X=array([[-59.43477  , -58.37106377, 128.3630867 ],
       [ 42.13376962,  42.5803167 , -91.02118882],
       [ -9.00130829, -11.21892982,  23.10677243]])
A @ X=array([[ 1.00000000e+00,  7.18868957e-16, -6.80126911e-15],
       [ 2.20779142e-15,  1.00000000e+00, -9.87048987e-15],
       [ 5.22083444e-16, -2.58630720e-15,  1.00000000e+00]])
np.allclose(A @ X, np.eye(A.shape[0]))=True
```
```
AssertionError: Arrays are not equal

Mismatched elements: 9 / 9 (100%)
Max absolute difference among violations: 9.87048987e-15
Max relative difference among violations: 4.6629367e-15
```
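The float-friendly counterpart is np.testing.assert_allclose, which compares up to a tolerance; with a small absolute tolerance the same check passes silently:

```python
# passes: comparison up to the absolute tolerance atol
np.testing.assert_allclose(A @ X, np.eye(A.shape[0]), atol=1e-12)
```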
There are special basis vectors, respectively matrices, that allow for easy computation of the inverse. A set of vectors \(q_1, \dots, q_n \in \mathbb{R}^n\) is called orthonormal if \(\langle q_i, q_j \rangle = 0\) for \(i \neq j\) and \(\| q_i \|_2 = 1\). Extending this to a matrix \(Q\) (and to that end a basis) whose columns are orthonormal, we get \(Q^{\mathsf{T}} Q = I\) and therefore \(Q^{-1} = Q^{\mathsf{T}}\).
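We can produce such a matrix \(Q\) and verify both claims numerically; a minimal sketch that uses numpy's QR decomposition (not introduced here) merely as a convenient source of an orthogonal matrix:

```python
# the QR decomposition of a random matrix yields an orthogonal factor Q
Q, _ = LA.qr(np.random.rand(3, 3))
print(f"{np.allclose(Q.transpose() @ Q, np.eye(3))=}")  # Q^T Q = I
print(f"{np.allclose(LA.inv(Q), Q.transpose())=}")      # Q^{-1} = Q^T
```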
For now, this concludes our introduction to linear algebra. We will come back to more in later sections.