Linear Algebra Notes
Notes on some linear algebra topics. Sometimes I find things are just stated
as being obvious, for example, the dot product of two orthogonal vectors
is zero. Well... why is this? The result is pretty simple, but
nonetheless it took a little while to think it through properly.
Hence, these notes... they started off as my musings on the "why", even if the
"why" might be obvious to the rest of the world!
A lot of these notes are now just based on the amazing Khan Academy linear algebra tutorials. In fact, I'd say the vast majority are... they are literally just notes on Sal's lectures!
Another resource that is really worth watching is 3Blue1Brown's series "The essence of linear algebra".
Page Contents
Vector Stuff
Vector Spaces
References:
- What isn't a vector space?, Mathematics Stack Exchange.
- Things that are/aren't vector spaces, Dept. of Mathematics and Statistics, University of New Mexico.
- Vector Spaces, physics.miami.edu.
- Vector space, Wikipedia.
A vector is an element of a vector space. A vector space, also known as a
linear space, is a set of vectors that is closed under vector addition and scalar
multiplication and that satisfies the following axioms, for any vectors $\vec{u}, \vec{v}, \vec{w}$
in the space and any scalars $c, d$:

| # | Axiom | Rule |
|---|-------|------|
| 1 | Closed under addition | $\vec{u} + \vec{v}$ is in the space |
| 2 | Commutative | $\vec{u} + \vec{v} = \vec{v} + \vec{u}$ |
| 3 | Associative | $(\vec{u} + \vec{v}) + \vec{w} = \vec{u} + (\vec{v} + \vec{w})$ |
| 4 | Additive identity | $\vec{u} + \vec{0} = \vec{u}$ |
| 5 | Inverse | $\vec{u} + (-\vec{u}) = \vec{0}$ |
| 6 | Closed under scalar multiply | $c\vec{u}$ is in the space |
| 7 | Distributive | $c(\vec{u} + \vec{v}) = c\vec{u} + c\vec{v}$ |
| 8 | Distributive | $(c + d)\vec{u} = c\vec{u} + d\vec{u}$ |
| 9 | Distributive | $c(d\vec{u}) = (cd)\vec{u}$ |
| 10 | Multiply identity | $1\vec{u} = \vec{u}$ |
Function Spaces
First let's properly define a function. If $f$ associates exactly one element of a set $Y$ with each element of a set $X$, then $f$ is a function from $X$ (the domain) to $Y$ (the codomain), written $f: X \to Y$.
One example of a vector space is the set of real-valued functions of a real variable, defined
on the domain of all real numbers: adding two such functions, or scaling one by a real number, gives another real-valued function on the same domain.
We've said that a vector is any element in a vector space, but usually we think of this in
terms of, say, a coordinate vector in $\mathbb{R}^2$ or $\mathbb{R}^3$.
What Isn't A Vector Space
It sounded a lot like everything was a vector space to me! So I did a quick bit of googling and found the paper "Things that are/aren't vector spaces". It does a really good job of showing how in maths we might build up a model of something and expect it to behave "normally" (i.e., obey the vector space axioms), when in fact it doesn't, in which case it becomes a lot harder to reason about. This is why vector spaces are nice... we can reason about them using logic and operations that we are very familiar with and that feel intuitive.
Linear Combinations
A linear combination of vectors is one that only uses the addition and scalar multiplication
of those variables. So, given two vectors,
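As a quick sketch of the idea in code (using NumPy, with made-up example vectors):

import numpy as np

# Two arbitrary example vectors in R^2.
a = np.array([1.0, 2.0])
b = np.array([3.0, -1.0])

# A linear combination only uses scalar multiplication and addition.
c1, c2 = 2.0, -0.5
combo = c1 * a + c2 * b

print(combo)  # [0.5 4.5]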
Spans
The set of all linear combinations of a set of vectors is called the span of those vectors.
In the graph to the right there are two vectors; by scaling and adding them it looks like we can reach any point in the 2D plane.
From this we may be tempted to say that any set of two 2D vectors spans $\mathbb{R}^2$, but, as the next section shows, that is not always true.
Let's write the above out a little more thoroughly...
Linear Independence
We saw above that a set of two 2D vectors does not always span $\mathbb{R}^2$: this happens when the vectors are linearly dependent.
For example, a set of vectors in which one vector is a scalar multiple of another is linearly dependent; the redundant vector adds nothing new to the span.
A set of vectors will be said to be linearly dependent when a non-trivial linear combination of them gives the zero vector:
$c_1\vec{v}_1 + c_2\vec{v}_2 + \cdots + c_n\vec{v}_n = \vec{0}$, with at least one $c_i \neq 0$.
Think of vectors in a 2D plane: two vectors are linearly independent if the only way to combine them to get back to the origin is to scale both of them by zero.
This shows that neither vector is a linear combination of the other, because if it were, we could use a non-zero scaling of one, added to the other, to get us back to the origin.
We can also show that two non-zero vectors are linearly independent when their dot product is zero. This is covered in a later section.
Another way of thinking about LI is that there is only one solution to
the equation $c_1\vec{v}_1 + c_2\vec{v}_2 + \cdots + c_n\vec{v}_n = \vec{0}$: the trivial solution, in which every coefficient $c_i$ is zero.
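A minimal sketch of checking this with sympy (the vectors here are arbitrary examples): put the vectors in the columns of a matrix and confirm that the only solution of the homogeneous equation is the trivial one.

from sympy import Matrix

# Arbitrary example vectors as the columns of a matrix.
A = Matrix([[1, 3],
            [2, -1]])

# The columns are linearly independent iff the only solution of A*c = 0
# is c = 0, i.e. the null space is trivial.
print(A.nullspace())       # [] -> only the trivial solution
print(A.rank() == A.cols)  # True -> columns are linearly independent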
Linear Subspace
If $V$ is a linear subspace of $\mathbb{R}^n$ then it:
- contains the zero vector: $\vec{0} \in V$,
- is closed under scalar multiplication: if $\vec{x} \in V$ and $c$ is any scalar, then $c\vec{x} \in V$,
- is closed under addition: if $\vec{a} \in V$ and $\vec{b} \in V$, then $\vec{a} + \vec{b} \in V$.
What these rules mean is that, using linear operations, you can't do anything to "break" out of the subspace using vectors from that subspace. Hence "closed".
Surprising, at least for me, was that the zero vector on its own, $\{\vec{0}\}$, is actually a subspace of $\mathbb{R}^n$: it contains the zero vector and is trivially closed under scalar multiplication and addition.
A very useful thing is that a span is always a valid subspace. Take, for example, $V = \operatorname{span}(\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_n)$ and check the three conditions.
First, does it contain the zero vector? Recall that a span is the set of all linear combinations
of the span vectors. So we know that the following linear combination is valid, and hence the
span contains the zero vector: $0\vec{v}_1 + 0\vec{v}_2 + \cdots + 0\vec{v}_n = \vec{0}$.
Second, is it closed under scalar multiplication? Yes: multiplying a linear combination of the span vectors by a scalar just scales the coefficients, which gives another linear combination.
Third, is it closed under vector addition? Well, yes, the same argument as made above would apply: adding two linear combinations gives another linear combination.
Interestingly, using the same method as above, we can see that even something trivial,
like the span of a single vector (a line through the origin), is a valid subspace.
It can also be shown that any subspace basis will have the same number of elements. We say that the dimension of a subspace is the number of elements in a basis for the subspace. So, for example, if we had a basis for our subspace consisting of 13 vectors, the dimension of the subspace would be lucky 13.
Basis Of A Subspace
Recall, the span of a set of vectors is the set of all their linear combinations, and a span is a valid subspace.
If we choose a subset of those vectors that still spans the whole subspace and is also linearly independent, then that subset is a basis for the subspace.
Put more formally: let $V = \operatorname{span}(\vec{v}_1, \ldots, \vec{v}_n)$. The set $S = \{\vec{v}_1, \ldots, \vec{v}_n\}$ is a basis for $V$ if:
- The vectors span $V$, and
- They are linearly independent.
A subspace can have an infinite number of bases, but there are standard bases. For
example, in $\mathbb{R}^2$ the standard basis is $\left\{ \begin{bmatrix}1\\0\end{bmatrix}, \begin{bmatrix}0\\1\end{bmatrix} \right\}$.
Thus, we can say that every subspace $V$ has a basis, although not a unique one.
A basis is a good thing because we can break down any element in the space that the basis defines into a linear combination of basic building blocks, i.e. the basis vectors.
To find a basis use the following method:
- Form a matrix where the jth column is the jth vector $\vec{v}_j$.
- Reduce the matrix to row-echelon form.
- Take the columns that contain a pivot; the corresponding vectors (columns of the original matrix) are the basis vectors.
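For example, a minimal sympy sketch of this method (the vectors are made-up, with the third deliberately chosen to be the sum of the first two):

from sympy import Matrix

# Candidate spanning vectors as the columns of a matrix.
A = Matrix([[1, 0, 1],
            [2, 1, 3],
            [0, 1, 1]])

# rref() returns the reduced matrix and the indices of the pivot columns.
R, pivot_cols = A.rref()
print(pivot_cols)                        # (0, 1)
basis = [A.col(i) for i in pivot_cols]   # take columns of the *original* matrix
print(basis)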
TODO discuss the canonical vectors in R2 and R3.
TODO unique coefficients. implied by the linear independence
TODO most important basis are orthogonal and orthonormal.
Orthonormal bases are good because we can find the coefficients in the new basis easily: if the basis $\{\hat{b}_1, \ldots, \hat{b}_n\}$ is orthonormal, the coefficient of $\hat{b}_i$ for a vector $\vec{v}$ is just $\vec{v} \cdot \hat{b}_i$.
The Dot Product
Definition
This is a good reference.
The dot product is one definition of vector multiplication, one that results in a scalar.
Given two equal-length vectors, $\vec{a}, \vec{b} \in \mathbb{R}^n$, the dot product looks like this:
$\vec{a} \cdot \vec{b} = a_1b_1 + a_2b_2 + \cdots + a_nb_n = \sum_{i=1}^{n} a_ib_i$
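A quick sketch in plain Python (the values are arbitrary examples):

# Dot product: multiply corresponding components and sum the results.
def dot(a, b):
    assert len(a) == len(b), "vectors must have equal length"
    return sum(x * y for x, y in zip(a, b))

print(dot([1, 2, 3], [4, -5, 6]))  # 1*4 + 2*(-5) + 3*6 = 12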
Properties
The operation has the following properties:
| Property | Rule |
|----------|------|
| Commutative | $\vec{a} \cdot \vec{b} = \vec{b} \cdot \vec{a}$ |
| Distributive | $\vec{a} \cdot (\vec{b} + \vec{c}) = \vec{a} \cdot \vec{b} + \vec{a} \cdot \vec{c}$ |
| Associative (with scalar multiplication) | $(c\vec{a}) \cdot \vec{b} = c(\vec{a} \cdot \vec{b})$ |
| Cauchy-Schwarz inequality | $\lvert \vec{a} \cdot \vec{b} \rvert \le \lVert \vec{a} \rVert \lVert \vec{b} \rVert$ |
| Vector triangle inequality | $\lVert \vec{a} + \vec{b} \rVert \le \lVert \vec{a} \rVert + \lVert \vec{b} \rVert$ |
Angle Between Vectors
The dot product also defines the angle between vectors. Take two unit vectors, $\hat{u}$ and $\hat{v}$: then $\hat{u} \cdot \hat{v} = \cos\theta$, where $\theta$ is the angle between them. For general vectors, $\vec{a} \cdot \vec{b} = \lVert \vec{a} \rVert \lVert \vec{b} \rVert \cos\theta$.
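As a rough numerical sketch (NumPy, made-up vectors):

import numpy as np

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])

# cos(theta) = (a . b) / (|a| |b|)
cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(np.degrees(np.arccos(cos_theta)))  # ~45.0 degrees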
Vector Length
The length (or norm) of a vector can be defined using the dot product: $\lVert \vec{a} \rVert = \sqrt{\vec{a} \cdot \vec{a}} = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}$.
Vector Transposes And The Dot Product
In linear algebra vectors are implicitly column vectors. I.e., when we talk about $\vec{v} \in \mathbb{R}^n$ we mean the $n \times 1$ matrix whose single column holds the components of $\vec{v}$. Using the transpose, the dot product can therefore be written as a matrix product: $\vec{a} \cdot \vec{b} = \vec{a}^T\vec{b}$.
Dot Product With Matrix Vector Product
If $A$ is an $m \times n$ matrix and $\vec{x} \in \mathbb{R}^n$, then $A\vec{x}$ is a vector in $\mathbb{R}^m$.
Then, if $\vec{y} \in \mathbb{R}^m$, the dot product interacts with the matrix-vector product like this: $(A\vec{x}) \cdot \vec{y} = (A\vec{x})^T\vec{y} = \vec{x}^TA^T\vec{y} = \vec{x} \cdot (A^T\vec{y})$.
Unit Length Vectors
A unit vector is a vector that has a length of 1. If $\hat{u}$ is a unit vector then $\lVert \hat{u} \rVert = 1$.
Interestingly, we can also write this using the dot product: $\hat{u} \cdot \hat{u} = \lVert \hat{u} \rVert^2 = 1$.
To construct a unit vector that points in the same direction as a vector $\vec{v}$, divide $\vec{v}$ by its length: $\hat{u} = \frac{\vec{v}}{\lVert \vec{v} \rVert}$.
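A one-line sketch of the construction (NumPy, arbitrary example vector):

import numpy as np

v = np.array([3.0, 4.0])      # arbitrary example vector, length 5
u = v / np.linalg.norm(v)     # unit vector in the same direction

print(u, np.linalg.norm(u))   # [0.6 0.8] 1.0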
Orthogonal Vectors and Linear Independence

Figure 1
Orthogonal vectors are vectors that are perpendicular to each other.
In a standard 2D or 3D graph, this means that
they are at right angles to each other and we can visualise them
as seen in Figure 1:
In any set of vectors, if every vector is orthogonal to every other vector (and none of them is the zero vector), then the vectors are linearly independent.
So, how do we tell if one vector is orthogonal to another? The answer is
the dot product, which is defined as follows: $\vec{a} \cdot \vec{b} = \sum_{i} a_ib_i$.
We know that two vectors are orthogonal when their dot product is
zero: $\vec{a} \cdot \vec{b} = 0$.
But why is this the case? Let's imagine any two arbitrary vectors, each on the circumference of a unit circle (so we know that they have the same length and are therefore a proper rotation of a vector around the center of the circle). This is shown in Figure 2. From the figure, we know the following:

Figure 2
Another way of looking at this is to consider the law of cosines: $c^2 = a^2 + b^2 - 2ab\cos\theta$.
Here $a$, $b$ and $c$ are the lengths of the sides of a triangle, and $\theta$ is the angle between the sides of length $a$ and $b$. In vector terms, with the third side joining the tips of $\vec{a}$ and $\vec{b}$: $\lVert \vec{a} - \vec{b} \rVert^2 = \lVert \vec{a} \rVert^2 + \lVert \vec{b} \rVert^2 - 2\lVert \vec{a} \rVert \lVert \vec{b} \rVert \cos\theta$.
I.e., expanding the left-hand side using the dot product gives $\lVert \vec{a} \rVert^2 - 2\,\vec{a} \cdot \vec{b} + \lVert \vec{b} \rVert^2$, and comparing the two expressions gives $\vec{a} \cdot \vec{b} = \lVert \vec{a} \rVert \lVert \vec{b} \rVert \cos\theta$.
When the angle between the vectors is 90 degrees, $\cos 90^\circ = 0$, so the dot product is zero.
Based on this we can say that two non-zero vectors are orthogonal if and only if their dot product is zero.
Matrices
Properties
Khan Academy has great articles. This is another good reference from MIT.
Scalar Multiplication
- Associative: $(cd)A = c(dA)$
- Distributive: $c(A + B) = cA + cB$ and $(c + d)A = cA + dA$
- Identity: $1A = A$
- Zero: $0A = O$ (the zero matrix)
- Closure: $cA$ is a matrix of the same dimensions as $A$.
Note that scalar multiplication is not commutative:
Matrix Addition
- Commutative: $A + B = B + A$
- Associative: $(A + B) + C = A + (B + C)$
- Zero: For any matrix $A$, there is a unique matrix $O$ such that $A + O = A$
- Inverse: For every matrix $A$ there is a unique matrix $-A$ such that $A + (-A) = O$
- Closure: $A + B$ is a matrix of the same dimensions as $A$ and $B$.
Matrix Multiplication
- Associative: $(AB)C = A(BC)$
- Distributive: $A(B + C) = AB + AC$ and $(A + B)C = AC + BC$
- Identity: $IA = AI = A$
- Zero: $OA = AO = O$
- Inverse: Only if $A$ is a square matrix can $A^{-1}$ be its inverse, with $AA^{-1} = A^{-1}A = I$. Note that not all matrices are invertible.
- Dimension: If $A$ is an $m \times n$ matrix and $B$ is an $n \times k$ matrix then $AB$ is an $m \times k$ matrix.
Note that matrix multiplication is not commutative: in general, $AB \neq BA$.
Be careful: the equation $AB = AC$ does not, in general, imply that $B = C$; there is no cancellation law for matrix multiplication.
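A tiny NumPy sketch of the non-commutativity (the matrices are arbitrary examples):

import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

print(A @ B)  # [[2 1], [4 3]] -> the columns of A swapped
print(B @ A)  # [[3 4], [1 2]] -> the rows of A swapped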
Matrix Transpose
- Associative: $\left(A^T\right)^T = A$
- Distributive (into addition): $(A + B)^T = A^T + B^T$
- Distributive (into mult): $(AB)^T = B^TA^T$. Generalises to $(ABC)^T = C^TB^TA^T$.
- Identity: $I^T = I$
- Inverse: $\left(A^{-1}\right)^T = \left(A^T\right)^{-1}$
- Name?:
- Name?:
- Name?:
- If the columns of $A$ are LI then the columns of $A^TA$ are LI. $A^TA$ is square and also invertible.
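As a quick numerical sanity check of the reversed-order rule for the transpose of a product (arbitrary example matrices):

import numpy as np

A = np.array([[1, 2, 0],
              [3, -1, 4]])   # 2x3
B = np.array([[2, 1],
              [0, 1],
              [5, -2]])      # 3x2

# (AB)^T equals B^T A^T -- note the reversed order.
print(np.array_equal((A @ B).T, B.T @ A.T))  # True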
Matrix Inverse
- Associative (self with inverse): $AA^{-1} = A^{-1}A = I$
- Distributive (into scalar mult): $(cA)^{-1} = \frac{1}{c}A^{-1}$, for $c \neq 0$
- Distributive (into mult): $(AB)^{-1} = B^{-1}A^{-1}$
- Identity: $I^{-1} = I$
- Name?:
- Name?:
Matrices, Simultaneous Linear Equations And Reduced Row Echelon Form
References:
- Lecture 18: The Rank of a Matrix and Consistency of Linear Systems, Ohio University.
- Types of Solution Sets.
- Row Echelon Form and Number of Solutions.
A set of linear equations can be written as an augmented matrix and manipulated using the elementary row operations, none of which change the solution set:
- Row interchange: swap two rows, $R_i \leftrightarrow R_j$.
- Row scaling: multiply a row by a non-zero constant, $R_i \to cR_i$.
- Row addition: add a multiple of one row to another, $R_i \to R_i + cR_j$.
In row echelon form:
- All rows containing only zeros appear below rows with nonzero entries, and
- The first nonzero entry in any row appears in a column to the right of the first nonzero entry in any preceding row.
The first non-zero entry in a row is called the pivot for that row.
And woop, woop, even more important than echelon form is reduced row-echelon form (rref). This is when:
- If a row has all zeros then it is below all rows with a non-zero entry.
- The left-most non-zero entry in each row is 1; it is called the pivot.
- The left-most non-zero entry in each row is the only non-zero entry in its column.
- Each row has more zeros to the left of the pivot than the row above.
Or, to put it another way, when it is a matrix in row-echelon form where:
- All pivots are equal to 1, and
- Each pivot is the only non-zero in its column.
Some important terms are pivot position and pivot column. The position is the location of a leading 1, which meets the above criteria, and the column is a column that contains a pivot position.
So why do we give a damn about RREF if we have row-echelon form? Well, one reason is it lets us figure out if a set of linear equations has a solution and which variables are free and which are dependent. The reason for this is that the RREF of every matrix is unique (whereas merely row reducing a matrix can have many resulting forms).
A free variable is one that you have control over, i.e. you can choose and manipulate its value. A dependent variable is one whose value is determined by the values of the free variables (or is constant).
Pivot columns represent variables that are dependent. Why?
Recall that the column vector containing a pivot is all-zeros except
for the pivot. So if element $j$ in a row of a RREF matrix is a pivot, then in our
system of equations the variable $x_j$ appears in only that one equation, where it is expressed in terms of the free variables and constants; it is therefore dependent.
Because column vectors containing a pivot are all-zeros except for the pivot, the set of pivot columns must also be LI (see later). You can always represent the free columns as linear combinations of the pivot columns.
If we have an augmented matrix $[A \mid \vec{b}]$, then, looking at its RREF, we can classify the solutions as follows.
- No Solution. If the last (augmented) column of $[A \mid \vec{b}]$ (i.e. the column which contains $\vec{b}$) contains a pivot entry.
- Restricted Solution Set. If the last row of the matrix is all zeros except the pivot column, which is a function of $\vec{b}$.
- Exactly One Solution. If the last (augmented) column of $[A \mid \vec{b}]$ (i.e. the column which contains $\vec{b}$) does not contain a pivot entry and all of the columns of the coefficient part (i.e. the columns of $A$) do contain a pivot entry.
- Infinitely Many Solutions. If the last (augmented) column of $[A \mid \vec{b}]$ (i.e. the column which contains $\vec{b}$) does not contain a pivot entry and at least one of the columns of the coefficient part (i.e. the columns of $A$) also does not contain a pivot entry.
Let's look at a quick example of a matrix in echelon-form:
We can get the solution to the set of simultaneous linear equations using back substitution. Directly
from the above row-echelon form matrix (useful because the last row has one pivot and no free variables) we know that:
We can use RREF to accomplish the same thing and find the solutions to the above.
We can not say much of interest about the general solution vector, $\vec{x}$, here because the solution is unique and there are no free variables.
Let's, therefore, look at something more interesting. Take the RREF matrix below:
The above can then be represented using the general solution vector:
We can find the reduced row echelon form of a matrix in Python using the sympy
package and the rref() function:
from sympy import Matrix, init_printing

init_printing(use_unicode=True)

system = Matrix((
    (3, 6, -2, 11),
    (1, 2, 1, 32),
    (1, -1, 1, 1),
))
system
system.rref()[0]
Which outputs:
⎡1  0  0  -17/3⎤
⎢              ⎥
⎢0  1  0  31/3 ⎥
⎢              ⎥
⎣0  0  1   17  ⎦
To then apply this to our augmented matrix we could do something like the following. Let's
say we have a set of 3 simultaneous equations with 3 unknowns:
$3x + 6y - 2z = 11$
$x + 2y + z = 32$
$x - y + z = 1$
from sympy import Matrix, solve_linear_system, init_printing
from sympy.abc import x, y, z

init_printing(use_unicode=True)

system = Matrix((
    (3, 6, -2, 11),
    (1, 2, 1, 32),
    (1, -1, 1, 1),
))
system
solve_linear_system(system, x, y, z)
Which will give the following output:
⎡3  6   -2  11⎤
⎢             ⎥
⎢1  2   1   32⎥
⎢             ⎥
⎣1  -1  1   1 ⎦

{x: -17/3, y: 31/3, z: 17}
Matrix / Vector Multiplication
Matrix/vector multiplication is pretty straightforward: the $i$th component of $A\vec{x}$ is the dot product of row $i$ of $A$ with $\vec{x}$, i.e. $(A\vec{x})_i = \sum_j a_{ij}x_j$.
The concept of matrix multiplication, the type I learnt in school, is shown in the little animation below. For each row vector of the LHS matrix we take the dot product of that vector with each column vector from the RHS matrix to produce the result:
But having said that, I'd never really thought of it as shown below until watching some stuff on Khan Academy:
We can think of it as the linear combination of the column vectors of the matrix where the coefficients are the members of the vector, where the first column vector is multiplied by the first element, the second column vector by the second element and so on.
So, for any $m \times n$ matrix $A$ with column vectors $\vec{a}_1, \ldots, \vec{a}_n$, and any $\vec{x} \in \mathbb{R}^n$, we can write $A\vec{x} = x_1\vec{a}_1 + x_2\vec{a}_2 + \cdots + x_n\vec{a}_n$.
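A small NumPy sketch showing that the two views agree (arbitrary example values):

import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])   # arbitrary 3x2 matrix
x = np.array([10, -1])

# Row view: each entry is the dot product of a row of A with x.
row_view = np.array([np.dot(row, x) for row in A])

# Column view: a linear combination of the columns of A, weighted by x.
col_view = x[0] * A[:, 0] + x[1] * A[:, 1]

print(row_view)                            # [ 8 26 44]
print(np.array_equal(row_view, col_view))  # True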
Null Space Of A Matrix
Definition
For an $n \times m$ matrix $A$, the null space is the set of all vectors that the matrix maps to the zero vector: $N(A) = \{\vec{x} \in \mathbb{R}^m \mid A\vec{x} = \vec{0}\}$.
The system of linear equations $A\vec{x} = \vec{0}$ is called homogeneous; it always has at least the trivial solution $\vec{x} = \vec{0}$.
The null space of a matrix is a valid subspace of $\mathbb{R}^m$ because it:
- contains the zero vector: $A\vec{0} = \vec{0}$,
- is closed under scalar multiplication: if $A\vec{x} = \vec{0}$ then $A(c\vec{x}) = cA\vec{x} = \vec{0}$,
- is closed under addition: if $A\vec{a} = \vec{0}$ and $A\vec{b} = \vec{0}$ then $A(\vec{a} + \vec{b}) = A\vec{a} + A\vec{b} = \vec{0}$.
It can also be shown that $N(A) = N(\operatorname{rref}(A))$, because row operations do not change the solution set of $A\vec{x} = \vec{0}$.
Relation To L.I. Of Column Vectors
The nullspace of a matrix can be related to the linear independence of its column
vectors. Using the above definition of nullspace, we can write $A\vec{x} = x_1\vec{a}_1 + x_2\vec{a}_2 + \cdots + x_m\vec{a}_m = \vec{0}$. The columns are linearly independent if and only if the only solution is $\vec{x} = \vec{0}$, i.e. if and only if the null space contains only the zero vector.
Relation To Determinant
If the determinant of a (square) matrix is zero then its null space will be non-trivial.
Calculating The Null Space
To find the null space of a matrix $A$, solve the homogeneous system $A\vec{x} = \vec{0}$: reduce $A$ to RREF and express the pivot (dependent) variables in terms of the free variables.
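A minimal sympy sketch (the matrix is a made-up example whose third column is the sum of the first two, so the null space is non-trivial):

from sympy import Matrix

A = Matrix([[1, 0, 1],
            [2, 1, 3],
            [0, 1, 1]])

# nullspace() returns a basis for N(A), one vector per free variable.
for v in A.nullspace():
    print(v.T)   # Matrix([[-1, -1, 1]])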
Example #1
Let's look at a trivial example:
Example #2
Let's consider another trivial example.
Example #3
Let's try one more example. This one is right out of Khan Academy rather than my own made-up rubbish :)
An Intuitive Meaning
This StackOverflow thread gives some really good intuitive meaning to the null space. I've quoted some of the answers verbatim here:
It's good to think of the matrix as a linear transformation; if you let $h(\vec{x}) = A\vec{x}$, then the null-space is ... the set of all vectors that are sent to the zero vector by $h$. Think of this as the set of vectors that lose their identity as $h$ is applied to them.
Let's suppose that the matrix A represents a physical system. As an example, let's assume our system is a rocket, and A is a matrix representing the directions we can go based on our thrusters. So what do the null space and the column space represent?
Well let's suppose we have a direction that we're interested in. Is it in our column space? If so, then we can move in that direction. The column space is the set of directions that we can achieve based on our thrusters. Let's suppose that we have three thrusters equally spaced around our rocket. If they're all perfectly functional then we can move in any direction. In this case our column space is the entire range. But what happens when a thruster breaks? Now we've only got two thrusters. Our linear system will have changed (the matrix A will be different), and our column space will be reduced.
What's the null space? The null space is the set of thruster instructions that completely waste fuel. They're the set of instructions where our thrusters will thrust, but the direction will not be changed at all.
Another example: Perhaps A can represent a rate of return on investments. The range are all the rates of return that are achievable. The null space are all the investments that can be made that wouldn't change the rate of return at all.
Nullity Of A Matrix
The dimension of a subspace is the number of elements in a basis for that subspace. The dimension of the null space is called the "nullity": $\operatorname{nullity}(A) = \dim(N(A))$.
In the null space you will always have as many basis vectors as you have free variables in the RREF of the matrix. So the nullity is the number of free variables in the RREF of the matrix with which this null space is associated.
Column Space Of A Matrix
The column space of a matrix is the set of all linear combinations
of its column vectors, which is the span of the column vectors: $C(A) = \operatorname{span}(\vec{a}_1, \ldots, \vec{a}_n)$.
For the system of equations $A\vec{x} = \vec{b}$, the product $A\vec{x}$ is, as we saw above, just a linear combination of the columns of $A$, so $A\vec{x}$ is always a member of the column space.
Therefore, if $\vec{b}$ is not in the column space of $A$, the system $A\vec{x} = \vec{b}$ has no solution.
The column vectors of $A$ may or may not be linearly independent. When reduced to a linearly independent set, we have a basis for the column space of the matrix.
We saw the following in the section on null spaces: it can be shown that the column vectors of a matrix are linearly independent if and only if its null space contains only the zero vector.
How do we figure out the set of column vectors that are LI? To do this,
put the matrix in RREF and select the columns that have pivot entries (the pivot columns).
The pivot columns indicate which vectors form a basis: the basis vectors for $C(A)$ are the corresponding columns of the original matrix $A$ (not of the RREF).
Because column vectors containing a pivot are all-zeros except for the pivot, the set of pivot columns must also be LI. You can always represent the free columns as linear combinations of the pivot columns.
So to find the column space of $A$: put the matrix into RREF, identify the pivot columns, and take the corresponding columns of the original matrix as a basis; the column space is their span.
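A minimal sympy sketch (same made-up matrix as before, whose third column is the sum of the first two):

from sympy import Matrix

A = Matrix([[1, 0, 1],
            [2, 1, 3],
            [0, 1, 1]])

# columnspace() returns a basis for C(A): the pivot columns of the original A.
for v in A.columnspace():
    print(v.T)   # Matrix([[1, 2, 0]]) then Matrix([[0, 1, 1]])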
A really interesting property of the column space is the column space criterion, which states that a linear system $A\vec{x} = \vec{b}$ has a solution if and only if $\vec{b}$ is in the column space of $A$.
Rank Of A Matrix
We can now define what the rank of a matrix is: the rank of $A$ is the dimension of its column space, i.e. the number of vectors in a basis for $C(A)$.
The rank equation (the rank-nullity theorem) is $\operatorname{rank}(A) + \operatorname{nullity}(A) = n$, where $n$ is the number of columns of $A$.
If $A$ is an $m \times n$ matrix then:
- $\operatorname{nullity}(A)$ is the number of free variables in the solution space of $A\vec{x} = \vec{0}$, which equals the number of pivot-free columns in $\operatorname{rref}(A)$,
- $\operatorname{rank}(A)$ is the number of pivots in $\operatorname{rref}(A)$,
- $\operatorname{rank}(A) + \operatorname{nullity}(A)$ equals $n$, the number of columns of $A$.
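A quick sympy check of the rank equation (same made-up matrix as in the null space and column space sketches):

from sympy import Matrix

A = Matrix([[1, 0, 1],
            [2, 1, 3],
            [0, 1, 1]])

rank = A.rank()
nullity = len(A.nullspace())
print(rank, nullity, rank + nullity == A.cols)   # 2 1 True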
TODO: https://math.stackexchange.com/questions/21100/importance-of-matrix-rank
Basis Of The Column Space VS Basis Of The Null Space
Really good answer by John-Paul Meyer on the Khan Academy website:
... a "basis of a column space" is different than a "basis of the null space", for the same matrix....
A basis is a set of vectors related to a particular mathematical "space" (specifically, to what is known as a vector space). A basis must:
- be linearly independent and
- span the space.
The "space" could be, for example:
$\mathbb{R}^2$, $\mathbb{R}^3$, $\mathbb{R}^4$, ..., $\mathbb{R}^n$, the column space of a matrix, the null space of a matrix, a subspace of any of these, and so on...
Orthogonal Matrices
So, onto orthogonal matrices. A matrix $A$ is orthogonal if its transpose is its inverse: $A^TA = AA^T = I$.
The pattern of ones on the diagonal of the identity matrix implies that each column of $A$ must be a unit vector: $\vec{a}_i \cdot \vec{a}_i = 1$.
The other part of the identity matrix, the off-diagonal zeros, would imply that we would have
to have columns that are mutually orthogonal: $\vec{a}_i \cdot \vec{a}_j = 0$ for $i \neq j$.
Does the same hold true for $AA^T = I$? Yes: by the same argument the rows of $A$ must also be unit length and mutually orthogonal.
So we can say that if we have a matrix whose rows are orthonormal vectors and whose columns are also orthonormal vectors, then we have an orthogonal matrix!
Okay, that's great n' all, but why should we care? Why are orthogonal matrices useful? It turns out that orthogonal matrices preserve angles and lengths of vectors. This can be useful in graphics, to rotate vectors but keep the shape they construct, or in numerical analysis, because they do not amplify errors.
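A small NumPy sketch of this, using a 2D rotation matrix (which is orthogonal) and an arbitrary vector:

import numpy as np

theta = np.radians(30.0)
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation by 30 degrees

print(np.allclose(Q.T @ Q, np.eye(2)))            # True: Q^T Q = I

v = np.array([3.0, 4.0])
print(np.linalg.norm(v), np.linalg.norm(Q @ v))   # both 5.0: length preserved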
Determinants
The Fundamentals
If $A$ is a $2 \times 2$ matrix, $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, then its determinant is $\det(A) = |A| = ad - bc$.
A matrix is invertible iff its determinant is non-zero: $\det(A) \neq 0$.
If $A$ is a $3 \times 3$ matrix, we can compute the determinant by expanding along a row: each entry in the row is multiplied by the determinant of the $2 \times 2$ submatrix left when that entry's row and column are deleted, with alternating signs.
We've used the first row (i.e., expanded along the first row), but we could use any row or column:
select the row or column with the most zeros and go through it. Use each element, with its sign, multiplied by the determinant of the submatrix obtained by deleting that element's row and column.
This generalises to an $n \times n$ matrix as follows: let $A_{ij}$ denote the $(n-1) \times (n-1)$ submatrix obtained by deleting row $i$ and column $j$ of $A$.
Then we can define the determinant recursively as follows: $\det(A) = \sum_{j=1}^{n} (-1)^{1+j}\, a_{1j} \det(A_{1j})$ (expansion along the first row).
There is a pattern to the "+"'s and "-"'s: the sign attached to entry $a_{ij}$ is $(-1)^{i+j}$, which forms a checkerboard pattern over the matrix.
A quick example: take a concrete $3 \times 3$ matrix and expand along its first row. Each of the three terms is an entry of the first row times the determinant of the $2 \times 2$ submatrix left when that entry's row and column are removed, with alternating signs; summing them gives the determinant.
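A minimal sketch of this in sympy, assuming arbitrary example values, comparing a recursive first-row cofactor expansion against the built-in det():

from sympy import Matrix

def det(M):
    # Recursive cofactor expansion along the first row.
    n = M.rows
    if n == 1:
        return M[0, 0]
    total = 0
    for j in range(n):
        sub = M.copy()
        sub.row_del(0)   # delete the first row...
        sub.col_del(j)   # ...and column j to get the submatrix
        total += (-1) ** j * M[0, j] * det(sub)
    return total

A = Matrix([[1, 2, 3],
            [4, 5, 6],
            [7, 8, 10]])   # arbitrary example values

print(det(A), A.det())     # -3 -3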