Vector Spaces

Caution

This section is a little more abstract than those that preceded it, and is filled with definitions that need to be read very carefully.

Definition: Vector Space

A vector space \(V\) is a set satisfying the condition that if \(\mathbf{u}, \mathbf{v} \in V\), then any linear combination \(c_1\mathbf{u} + c_2\mathbf{v} \in V\) also. As usual, here \(c_1\) and \(c_2\) represent scalars that for this course are assumed to be real numbers.

In the above definition, note that the elements of a vector space \(V\) don’t necessarily have to be Euclidean vectors; that is, \(n\times 1\) arrays of real numbers. Spaces of Euclidean vectors will be the focus in this book, but there are many mathematical objects that satisfy this definition and lead to interesting vector spaces.

Definition: Euclidean Vector Space

Given a positive integer \(n\), we use \(\mathbb{R}^n\) to denote the vector space consisting of Euclidean vectors with \(n\) components. This space is called \(n\)-dimensional Euclidean space, but we usually just say ‘R-N’.

Example:

\[ \mathbb{R}^2 = \{ \mathbf{v} | \mathbf{v} = [v_1\hspace{1em}v_2]^T, v_1, v_2\text{ are real numbers }\}. \]

This is the vector space corresponding to the \(x-y\) plane.

Example: \(\mathbb{R} = \mathbb{R}^1\) is the vector space consisting of vectors with a single real-valued component. This space is indistinguishable from the set of real numbers aside from whether we write brackets around the number or not. Usually we ignore the brackets and just think of \(\mathbb{R}\) as the set of real numbers.

The definition of vector space has a couple of important implications:

  • Most vector spaces will contain infinitely many vectors, because we can form infinitely many linear combinations of any vectors they contain.

  • The zero vector \(\mathbf{0}\) is contained in every vector space, because given any vector \(\mathbf{v}\) in the space, the scalar multiple \(0\cdot\mathbf{v} = \mathbf{0}\) is a ‘trivial’ linear combination of \(\mathbf{v}\) and so must also be in the space.

Furthermore, many vector spaces exist that are not Euclidean; in fact, much of mathematics is done in vector spaces of one sort or another.

Example: Let \(\mathcal{F} = \{f:\mathbb{R}\to\mathbb{R}\}\) (that is, the set of all real-valued functions). This is a vector space, because given any two real-valued functions \(f(x), g(x) \in \mathcal{F}\) and scalars \(c_1, c_2\), the linear combination \(c_1f(x) + c_2g(x)\) is also a real-valued function and thus a member of \(\mathcal{F}\). To take a specific example, \(f(x) = e^x\) and \(g(x)=x^2\) are in \(\mathcal{F}\), and any arbitrary linear combination of them such as \(2e^x - 7x^2\) is still a real-valued function and thus also a member of \(\mathcal{F}\).
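
To make the closure property of \(\mathcal{F}\) concrete, here is a minimal Python sketch in which functions are ordinary Python callables. The helper name `linear_combination` is our own choice for illustration; it simply builds the function \(x \mapsto c_1f(x) + c_2g(x)\) from the example above.

```python
import math

def linear_combination(c1, f, c2, g):
    """Return the function x -> c1*f(x) + c2*g(x)."""
    def h(x):
        return c1 * f(x) + c2 * g(x)
    return h

f = math.exp              # f(x) = e^x
g = lambda x: x ** 2      # g(x) = x^2

# h(x) = 2e^x - 7x^2, the linear combination used in the example.
h = linear_combination(2, f, -7, g)

# h is again an ordinary real-valued function that can be evaluated anywhere.
print(h(0.0))
print(h(1.0))
```

The result `h` can itself be passed back into `linear_combination`, which mirrors the fact that linear combinations of members of \(\mathcal{F}\) never leave \(\mathcal{F}\).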

Definition: Subspace

\(S\) is a subspace of a vector space \(V\) if \(S\) is a subset of \(V\) and if \(\mathbf{s}_1, \mathbf{s}_2 \in S\), then any linear combination \(c_1\mathbf{s}_1 + c_2\mathbf{s}_2 \in S\) also.

Example, continued: Let \(\mathcal{P}_2 = \{p:\mathbb{R}\to\mathbb{R} | p\text{ is a polynomial of degree at most 2}\}\). This set is a subspace of the vector space \(\mathcal{F}\) from the previous example. (Verify that \(\mathcal{P}_2\) is closed under linear combination as an exercise!)

Example: Let \(P = \{\mathbf{v} \in \mathbb{R}^3| v_3 = 0\}\). Then \(P\) is a subspace of \(\mathbb{R}^3\): if we take any two vectors \(\mathbf{u}, \mathbf{v} \in P\) and arbitrary scalars \(c_1, c_2\), the linear combination

\[\begin{split} c_1\begin{bmatrix} u_1 \\ u_2 \\ 0 \end{bmatrix} + c_2\begin{bmatrix} v_1 \\ v_2 \\ 0 \end{bmatrix} = \begin{bmatrix} c_1u_1 + c_2v_1 \\ c_1u_2 + c_2v_2 \\ 0 \end{bmatrix} \end{split}\]

has third component equal to zero and thus meets the requirement to be a member of \(P\). Thus, \(P\) is closed under linear combination and is a vector subspace of \(\mathbb{R}^3\); in fact \(P\) is the \(x-y\) plane, viewed now as a subspace of \(\mathbb{R}^3\).

Example: Let \(S =\{ \mathbf{v}\in\mathbb{R}^3| v_3 = 1\}\). Although this set is a subset of \(\mathbb{R}^3\), it is not a vector space: take any two vectors \(\mathbf{u}, \mathbf{v} \in S\). Their sum will be

\[\begin{split} \begin{bmatrix} u_1 \\ u_2 \\ 1 \end{bmatrix} + \begin{bmatrix} v_1 \\ v_2 \\ 1 \end{bmatrix} = \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \\ 2 \end{bmatrix}, \end{split}\]

and the third component of this vector is not \(1\), so it doesn’t satisfy the condition it needs to in order to be a member of \(S\). Thus \(S\) is not closed under linear combination.
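
The two examples above can be spot-checked numerically. The sketch below uses sample vectors and scalars of our own choosing, and the helper names `combine`, `in_P`, and `in_S` are hypothetical. A check on a few samples is not a proof of closure, but a single failing combination is enough to show that a set is not a subspace.

```python
def combine(c1, u, c2, v):
    """Componentwise linear combination c1*u + c2*v of two vectors."""
    return tuple(c1 * a + c2 * b for a, b in zip(u, v))

def in_P(v):
    """Membership test for P = {v in R^3 : v_3 = 0}."""
    return v[2] == 0

def in_S(v):
    """Membership test for S = {v in R^3 : v_3 = 1}."""
    return v[2] == 1

# Two sample members of P and one sample linear combination of them.
u, v = (1, 2, 0), (-3, 5, 0)
w = combine(4, u, -2, v)
print(w, in_P(w))

# Two sample members of S; their plain sum (c1 = c2 = 1) leaves S.
s1, s2 = (1, 2, 1), (-3, 5, 1)
t = combine(1, s1, 1, s2)
print(t, in_S(t))
```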

Spans and Bases

The examples given above define vector spaces as sets on which some constraints are imposed. This is fine, but defining a vector space this way (a) requires us to verify somewhat tediously that the set does in fact meet the requirements to be a vector space, and (b) doesn’t provide much insight into what the vector space really looks like. Another way to specify a vector space is by defining it as the span of a set of vectors: given vectors \(\mathbf{v}_1,\dots,\mathbf{v}_k\), the set \(V=span(\mathbf{v}_1,\dots,\mathbf{v}_k)\) is by definition the set of all possible linear combinations of \(\mathbf{v}_1,\dots,\mathbf{v}_k\), and therefore is necessarily closed under linear combination.

Example: Let \(V = span([1\hspace{1em} 0\hspace{1em} 0]^T, [0\hspace{1em} 1\hspace{1em} 0]^T)\). Then \(V \subseteq \mathbb{R}^3\), but \(V\neq \mathbb{R}^3\) because neither of the vectors spanning \(V\) has a nonzero third component, so their linear combinations cannot have a nonzero third component either. In fact, \(V\) corresponds to the \(x-y\) plane, a subspace of \(\mathbb{R}^3\) that is equivalent (‘isomorphic’) to \(\mathbb{R}^2\).

Using spans to specify a vector space is an improvement over imposing conditions on a set because there is no need to verify that the span forms a vector space, but it still isn’t ideal because a spanning set can contain redundancies.

Example, continued: The set

\[\begin{split} \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 2/3 \\ 0 \\ 0 \end{bmatrix} \end{split}\]

spans the same subspace of \(\mathbb{R}^3\) as \(V\) in the previous example (the \(x-y\) plane). To see that this is the case, note that

\[\begin{split} \dfrac{1}{2} \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} + \dfrac{1}{2} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \end{split}\]

and

\[\begin{split} -\dfrac{1}{2} \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} + \dfrac{1}{2} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \end{split}\]

so the vectors spanning \(V\) are linear combinations of these vectors. Conversely, we can easily produce each of these three vectors as a linear combination of the vectors spanning \(V\) (verify this!), so it must be the case that the span of these three vectors equals \(V\) also. But then we have two different sets of vectors that span \(V\), one set consisting of two vectors and a second set consisting of three vectors. The latter contains a redundancy: the three vectors given are linearly dependent, since

\[\begin{split} \dfrac{1}{3} \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} + \dfrac{1}{3} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 2/3 \\ 0 \\ 0 \end{bmatrix}. \end{split}\]
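
The linear combinations in this example are easy to check with exact rational arithmetic. The following is a minimal sketch using Python's standard `fractions` module; the helper `lin_comb` and the variable names are our own and are not notation used elsewhere in this book.

```python
from fractions import Fraction as F

def lin_comb(coeffs, vectors):
    """Return sum_i coeffs[i] * vectors[i], computed componentwise."""
    n = len(vectors[0])
    return tuple(sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(n))

e1, e2 = (1, 0, 0), (0, 1, 0)                          # the two vectors spanning V
w1, w2, w3 = (1, -1, 0), (1, 1, 0), (F(2, 3), 0, 0)    # the three-vector spanning set

# The vectors spanning V as combinations of w1 and w2.
assert lin_comb([F(1, 2), F(1, 2)], [w1, w2]) == e1
assert lin_comb([F(-1, 2), F(1, 2)], [w1, w2]) == e2

# Conversely, each of the three vectors as a combination of e1 and e2.
assert lin_comb([1, -1], [e1, e2]) == w1
assert lin_comb([1, 1], [e1, e2]) == w2
assert lin_comb([F(2, 3), 0], [e1, e2]) == w3

# The dependency among the three vectors: (1/3) w1 + (1/3) w2 - w3.
dependency = lin_comb([F(1, 3), F(1, 3), -1], [w1, w2, w3])
print(dependency)
```

Using `Fraction` rather than floating point keeps the comparison with \(2/3\) exact, so the equality checks do not depend on rounding.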

Definition: Basis

A basis for a vector space is a linearly independent set of vectors that span the space.

A basis is in a sense a minimal spanning set: remove a vector and you no longer span the space, but add a vector and you no longer have linear independence. In the preceding example, the first spanning set given is a basis, while the second is not because the three vectors are not linearly independent.

We say a basis for a vector space rather than the basis because bases are not unique. For example, both

\[\begin{split} \{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \end{bmatrix} \} \end{split}\]

and

\[\begin{split} \{ \begin{bmatrix} 2 \\ 3 \end{bmatrix}, \begin{bmatrix} 4 \\ 2 \end{bmatrix} \} \end{split}\]

are bases for \(\mathbb{R}^2\). While bases are not unique, the number of vectors in a basis is: every basis for a given vector space contains the same number of vectors.

The proof of this relies on the Steinitz Exchange Lemma.

The Steinitz Exchange Lemma

If \(\{\mathbf{v}_1, \dots, \mathbf{v}_k\}\) is a set of linearly independent vectors in a vector space \(V\) and \(\{\mathbf{w}_1, \dots, \mathbf{w}_m\}\) spans \(V\), then \(k \leq m\).

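Proof: Since \(\{\mathbf{w}_1, \dots, \mathbf{w}_m\}\) spans \(V\), we can write \(\mathbf{v}_1\) as a linear combination of the \(\mathbf{w}\)’s. Because \(\mathbf{v}_1 \neq \mathbf{0}\) (it belongs to a linearly independent set), at least one coefficient must be nonzero (say for \(\mathbf{w}_1\), relabeling if necessary), so we can solve for \(\mathbf{w}_1\) in terms of \(\mathbf{v}_1\) and the other \(\mathbf{w}\)’s. This means \(\{\mathbf{v}_1, \mathbf{w}_2, \dots, \mathbf{w}_m\}\) still spans \(V\). We repeat this process for \(\mathbf{v}_2, \dots, \mathbf{v}_k\). At each step, the next \(\mathbf{v}_j\) is a linear combination of the earlier \(\mathbf{v}\)’s and the remaining \(\mathbf{w}\)’s, and some remaining \(\mathbf{w}\) must have a nonzero coefficient, since otherwise \(\mathbf{v}_j\) would be a linear combination of \(\mathbf{v}_1, \dots, \mathbf{v}_{j-1}\), contradicting linear independence. We can therefore replace \(k\) of the \(\mathbf{w}\)’s with the \(\mathbf{v}\)’s while maintaining a spanning set, which is only possible if \(k \leq m\). \(\blacksquare\)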

Theorem: Invariant Basis Number

Any two bases of a finite-dimensional vector space have the same cardinality.

(Cardinality is a term for the number of elements in a set).

Proof: Suppose that \(A = \{\mathbf{a}_1, \dots, \mathbf{a}_n\}\) and \(B=\{\mathbf{b}_1, \dots, \mathbf{b}_m\}\) are both bases for a vector space \(V\). Because \(A\) is a basis, its vectors are linearly independent, and because \(B\) is a basis, its vectors span \(V\); therefore, according to the Steinitz Exchange Lemma, \(n \leq m\). Conversely, because \(B\) is a basis, its vectors are linearly independent, and because \(A\) is a basis, its vectors span \(V\); therefore, according to the Steinitz Exchange Lemma, \(m \leq n\). Thus \(m=n\). \(\blacksquare\)

Definition: Dimension

The dimension of a vector space \(V\) is written \(dim(V)\) and is the number of vectors in a basis for \(V\).

Note: This definition only makes sense because the previous theorem guarantees that the cardinality of any basis for \(V\) is the same as the cardinality of any other basis for \(V\).

Determining if a Set is a Basis

It is often the case that we need to determine if a given set is a basis for a vector space. If the set is not linearly independent, we are done: a set cannot be a basis if it is linearly dependent. If a set does not span the space, we are also done: a set cannot be a basis for a vector space unless it spans the space. Unfortunately, whether a set is linearly dependent or not and whether a set spans a space or not may not be obvious. The easiest way to determine the answers to these questions is to turn them into questions about linear systems, because linear combinations always lead to linear systems thanks to our definition of matrix-vector multiplication.

Let’s start with linear independence. Suppose that I am given some vectors \(A_1,\dots, A_n\) and asked to determine if they are a basis for some vector space. These vectors will be linearly dependent if there exist scalars \(x_1,\dots,x_n\) not all equal to \(0\) such that the linear combination \(x_1A_1 + \cdots + x_nA_n = \mathbf{0}\), but this is equivalent to the matrix equation \(A\mathbf{x}=\mathbf{0}\), where

\[ A = \begin{bmatrix} A_1 & \cdots & A_n \end{bmatrix} \]

is the matrix with \(i^{th}\) column \(A_i\). If this equation has any solution other than \(\mathbf{x} = \mathbf{0}\), then the \(A_i\)’s are linearly dependent and the set cannot form a basis. If \(\mathbf{x} = \mathbf{0}\) is the unique solution, the \(A_i\)’s are linearly independent.

Now let’s turn to span, and suppose again that I am given some vectors \(A_1,\dots, A_n\) and asked to determine if they are a basis for some vector space. These vectors will span the vector space if given any vector \(\mathbf{b}\) in the space, I can produce scalars \(x_1,\dots,x_n\) to realize \(\mathbf{b}\) as a linear combination of the \(A_i\)’s; that is, such that \(x_1A_1 + \cdots + x_nA_n = \mathbf{b}\). But this is equivalent to being able to solve the matrix equation \(A\mathbf{x} = \mathbf{b}\) for any vector \(\mathbf{b}\) in the vector space.

These two questions - whether \(A\mathbf{x} = \mathbf{0}\) has a unique solution and whether \(A\mathbf{x} = \mathbf{b}\) has a solution for any vector \(\mathbf{b}\) in some vector space - can be viewed as questions about the matrix itself. If \(A\) has the right number of pivots (\(n\) exactly, one in each column and row), then for any \(\mathbf{b}\), including \(\mathbf{b}=\mathbf{0}\), the equation \(A\mathbf{x} = \mathbf{b}\) has a unique solution that we can obtain by augmenting \(A\) with \(\mathbf{b}\) and chasing through Gaussian elimination. In the case where \(\mathbf{b} = \mathbf{0}\), the unique solution will be \(\mathbf{0}\) itself, and that proves linear independence in the columns of \(A\). This implies, then, that the \(A_i\)’s are a basis for the vector space containing the \(\mathbf{b}\)’s.

Having exactly one pivot in each column and row of a matrix constrains the matrix’s shape: it cannot have any ‘extra’ columns or rows. In particular, if we are given \(n\) vectors that have \(m\) components and \(m < n\), we know immediately that the vectors will be linearly dependent (there are more columns than rows, so some column has no pivot) and cannot be a basis for any vector space (though they may span the space). Conversely, if \(m > n\), we may have linear independence, but the space we span will be unclear; if the space is \(\mathbb{R}^m\), we certainly won’t have enough vectors to span it.
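
The pivot-counting test described above can be sketched in a few lines of Python. This is an illustrative implementation under our own assumptions, not a prescribed method: the function names `count_pivots` and `is_basis_of_Rm` are hypothetical, the target space is taken to be \(\mathbb{R}^m\), and exact `Fraction` arithmetic is used so that zero tests are reliable.

```python
from fractions import Fraction

def count_pivots(columns):
    """Count pivots of the matrix whose columns are the given vectors,
    using forward Gaussian elimination with exact rational arithmetic."""
    m, n = len(columns[0]), len(columns)
    # Build the m x n matrix row by row from the column vectors.
    rows = [[Fraction(columns[j][i]) for j in range(n)] for i in range(m)]
    pivots, r = 0, 0
    for c in range(n):
        if r == m:
            break                         # every row already holds a pivot
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, m) if rows[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, m):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots += 1
        r += 1
    return pivots

def is_basis_of_Rm(columns):
    """n vectors with m components form a basis of R^m exactly when
    n == m and there is a pivot in every column (and hence every row)."""
    m, n = len(columns[0]), len(columns)
    return n == m and count_pivots(columns) == n

print(is_basis_of_Rm([(1, 1), (-1, 1)]))                              # square, independent
print(is_basis_of_Rm([(2, 3), (4, 2)]))                               # square, independent
print(is_basis_of_Rm([(1, -1, 0), (1, 1, 0), (Fraction(2, 3), 0, 0)]))  # square, dependent
print(is_basis_of_Rm([(1, 0, 0), (0, 1, 0)]))                         # too few vectors for R^3
```

The shape check `n == m` comes first because, as noted above, a matrix with extra rows or columns cannot have a pivot in every row and every column.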

Bases for \(\mathbb{R}^n\)

Any linearly independent set of \(n\) vectors in \(\mathbb{R}^n\) is a basis for \(\mathbb{R}^n\). In particular, the \(n\) vectors

\[\begin{split} \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \dots, \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \end{split}\]

are linearly independent and span the space; these vectors are often referred to as the standard basis vectors.

Given another set of \(n\) linearly independent vectors in \(\mathbb{R}^n\), we can show that they span \(\mathbb{R}^n\) also, because Gaussian elimination will reduce them to these vectors.

Reduced Row Echelon Form of a Matrix

Below we are going to consider several vector spaces associated with a matrix. In order to analyze these vector spaces, and in particular, to determine bases for them, it is helpful to put the matrix into a reduced form that we can obtain from Gaussian elimination called reduced row echelon form.

Definition: Reduced Row Echelon Form

A matrix \(R\) is said to be in reduced row echelon form if:

  • All rows containing only zeros are below any row containing a nonzero entry.

  • The first nonzero entry in each row is 1.

  • The first nonzero entry in each row occurs strictly to the right of the first nonzero entry in the row above.

  • Each column containing the first nonzero entry in a row contains no other nonzero entries.

We can use Gaussian elimination to find a reduced row echelon form matrix \(R\) that is row equivalent to \(A\), and we write \(R=rref(A)\) to indicate this matrix.

Example: The matrix

\[\begin{split} R = \begin{bmatrix} 1 & 0 & 2 & 3 & 0 \\ 0 & 1 & 4 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \end{split}\]

is in reduced row echelon form.

Example: The matrix

\[\begin{split} S = \begin{bmatrix} 1 & 0 & 2 & 3 & 0 \\ 0 & 1 & 4 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \end{split}\]

is not in reduced row echelon form, because the first nonzero entry in row three has other nonzero entries in its column.
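
The four conditions in the definition translate directly into a small checker. The sketch below is one possible way to write it; the function names `leading_index` and `is_rref` are our own, and matrices are represented as plain lists of rows.

```python
def leading_index(row):
    """Column index of the first nonzero entry in a row, or None for a zero row."""
    return next((j for j, a in enumerate(row) if a != 0), None)

def is_rref(M):
    """Check the four conditions of the definition on a list-of-rows matrix."""
    seen_zero_row = False
    previous = -1
    for i, row in enumerate(M):
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True
            continue
        if seen_zero_row:
            return False      # a nonzero row sits below a zero row
        if row[lead] != 1:
            return False      # the first nonzero entry is not 1
        if lead <= previous:
            return False      # not strictly to the right of the row above
        if any(M[k][lead] != 0 for k in range(len(M)) if k != i):
            return False      # another nonzero entry in this pivot column
        previous = lead
    return True

R = [[1, 0, 2, 3, 0],
     [0, 1, 4, 1, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 0, 0]]

S = [[1, 0, 2, 3, 0],
     [0, 1, 4, 1, 0],
     [0, 0, 0, 1, 1],
     [0, 0, 0, 0, 0]]

print(is_rref(R))
print(is_rref(S))
```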

Column Space

Matrices have several vector spaces associated with them. The first of these that we will study is called the column space.

Definition: Column Space

The column space of a matrix \(A\), written \(col(A)\), is the vector space spanned by its columns.

The column space is important because whenever a matrix multiplies a vector, the result is a linear combination of the columns of the matrix: \(A\mathbf{x} = \mathbf{b}\) is a terse way of saying that \(\mathbf{b}\) is a linear combination of the columns of \(A\).

Often the columns of \(A\) do not form a basis for \(col(A)\) because they are not linearly independent, and in order to better understand the column space we want to find a basis for it. To do this, we can reduce \(A\) to \(rref(A)\) using Gaussian elimination. In the reduced row echelon form of \(A\), the columns that contain pivots are independent of the other columns containing pivots, and we can map the column indices back to the original columns of \(A\) to identify a suitable basis for \(col(A)\).

Example: Suppose that we wish to find a basis for the column space of \(A\) given below:

\[\begin{split} A = \begin{bmatrix} 1 & 1 & 2 & 4 & 0 \\ 1 & 1 & 3 & 6 & 1 \\ 2 & 2 & 4 & 8 & 0 \end{bmatrix}. \end{split}\]

First, note that the columns of \(A\) have three components, so \(col(A) \subseteq \mathbb{R}^3\). Let’s go through Gaussian elimination to the furthest extent possible:

\[\begin{split} \begin{bmatrix} 1 & 1 & 2 & 4 & 0 \\ 1 & 1 & 3 & 6 & 1 \\ 2 & 2 & 4 & 8 & 0 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 & 4 & 0 \\ 0 & 0 & 1 & 2 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 0 & 0 & -2 \\ 0 & 0 & 1 & 2 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}. \end{split}\]

In \(rref(A)\), we find pivots in columns 1 and 3. The other columns are ‘free’ columns. Mapping back to the original columns of \(A\), we can say that the column vectors

\[\begin{split} A_1 = \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}, A_3 = \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix} \end{split}\]

are a basis for \(col(A)\). Geometrically, \(col(A)\) is a plane in \(\mathbb{R}^3\). Algebraically, note that \(A_2 = A_1\), \(A_4 = 2A_3\), and \(A_5 = A_3 - 2A_1\). When we reduce \(A\) to \(rref(A)\), these dependencies lead to the lack of pivots in columns 2, 4, and 5, and this is why we can map the indices of the pivot columns in \(rref(A)\) back to \(A\) to identify the linearly independent basis vectors of \(col(A)\).
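
The procedure in this example (reduce, locate the pivot columns, then read the corresponding columns off the original matrix) can be sketched as follows. This is a minimal illustration rather than a production routine: the function name `rref` mirrors the notation \(rref(A)\) but is our own implementation, column indices inside the code are 0-based, and `Fraction` is used to avoid rounding.

```python
from fractions import Fraction

def rref(M):
    """Return (R, pivot_columns), where R is the reduced row echelon form of M.
    Uses exact rational arithmetic; pivot column indices are 0-based."""
    R = [[Fraction(a) for a in row] for row in M]
    m, n = len(R), len(R[0])
    pivot_columns = []
    r = 0
    for c in range(n):
        if r == m:
            break
        p = next((i for i in range(r, m) if R[i][c] != 0), None)
        if p is None:
            continue                              # free column
        R[r], R[p] = R[p], R[r]                   # move the pivot row into place
        pivot = R[r][c]
        R[r] = [a / pivot for a in R[r]]          # scale so the pivot is 1
        for i in range(m):
            if i != r and R[i][c] != 0:           # clear the rest of the column
                factor = R[i][c]
                R[i] = [a - factor * b for a, b in zip(R[i], R[r])]
        pivot_columns.append(c)
        r += 1
    return R, pivot_columns

A = [[1, 1, 2, 4, 0],
     [1, 1, 3, 6, 1],
     [2, 2, 4, 8, 0]]

R, pivot_columns = rref(A)

# Basis for col(A): the columns of the ORIGINAL matrix at the pivot positions.
basis = [[row[c] for row in A] for c in pivot_columns]

for row in R:
    print([str(a) for a in row])
print("pivot columns (1-based):", [c + 1 for c in pivot_columns])
print("basis for col(A):", basis)
```

Note that the basis vectors are taken from `A`, not from `R`: row operations preserve the dependencies among columns but generally change the column space itself.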

Example: Here is an example that is simpler in some ways but leads to a subtle and important idea. Let

\[\begin{split} A = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 1 & 2 \\ -1 & 0 & 1 \end{bmatrix}. \end{split}\]

Note that again \(col(A) \subseteq \mathbb{R}^3\). As an exercise, go through the Gaussian elimination to find \(rref(A)\). You will find that it is simply \(I\), the \(3\times 3\) identity matrix. That means every column of \(A\) is a pivot column, and thus \(col(A)\) is a three-dimensional ‘subspace’ of \(\mathbb{R}^3\). That implies that \(col(A) = \mathbb{R}^3\), and we can use the pivot columns of \(A\) as a basis (as in the previous example) or we can just use the standard basis of \(\mathbb{R}^3\); that is, we can change the basis if we find that one is easier to use than another for downstream tasks.
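
For readers who want to check their work on the exercise, here is one possible sequence of row operations (other orders work equally well). The steps are: subtract twice row 1 from row 2 and add row 1 to row 3; add row 2 to row 3; scale row 2 by \(-1\) and row 3 by \(1/4\); clear column 3 above its pivot; and finally subtract row 2 from row 1.

\[\begin{split} \begin{bmatrix} 1 & 1 & -1 \\ 2 & 1 & 2 \\ -1 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & -1 \\ 0 & -1 & 4 \\ 0 & 1 & 0 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & -1 \\ 0 & -1 & 4 \\ 0 & 0 & 4 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & -1 \\ 0 & 1 & -4 \\ 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \end{split}\]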

The column space of a matrix is so important that its dimension gets its own definition as a key characteristic of the matrix:

Definition: Rank

The rank of a matrix \(A\), written \(rank(A)\), is the dimension of the column space of \(A\): \(rank(A) = dim(col(A))\).

Null Space

The null space of a matrix is the vector space consisting of vectors \(\mathbf{x}\) such that \(A\mathbf{x} = \mathbf{0}\).

Definition: Null Space

Given an \(m\times n\) matrix \(A\), the null space of \(A\) is written \(N(A)\) and defined as follows:

\[ N(A) = \{\mathbf{x} \in \mathbb{R}^n | A\mathbf{x} = \mathbf{0}\}. \]

The null space may initially seem like a somewhat uninteresting vector space, but it turns out that many applications lead us into the null space of some matrix. To verify that the null space is indeed a vector space, note that the distributive property of matrix-vector multiplication ensures that the null space is closed under linear combination. More precisely, if we let \(\mathbf{u},\mathbf{v} \in N(A)\) and let \(c_1, c_2\) be arbitrary scalars, then the linear combination \(c_1\mathbf{u} + c_2\mathbf{v} \in N(A)\), because

\[ A(c_1\mathbf{u} + c_2\mathbf{v}) = c_1A\mathbf{u} + c_2A\mathbf{v} = c_1\mathbf{0} + c_2\mathbf{0} = \mathbf{0}. \]

Finding a basis for \(N(A)\) revolves around \(rref(A)\) also. First, note that if \(rref(A) = I\) (that is, every column of a square matrix \(A\) is a pivot column), then \(N(A)\) is what is known as the trivial vector space: the vector space containing only the zero vector. We are generally interested in the null space when \(A\) doesn’t have enough pivots. In this case, we have non-pivot columns referred to as free columns, and we can use these to construct a basis for \(N(A)\).

Example: Take the \(3\times 5\) matrix \(A\) from the earlier example, for which we have already obtained \(rref(A)\):

\[\begin{split} rref(A) = \begin{bmatrix} 1 & 1 & 0 & 0 & -2 \\ 0 & 0 & 1 & 2 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}. \end{split}\]

Suppose that above we had tried to solve \(A\mathbf{x} = \mathbf{0}\) instead of finding \(rref(A)\): we would have performed the same Gaussian elimination steps, but our matrix would have been augmented with a column of 0’s, which would not change the calculations at all (none of the row operations ever change the column of zeros). So let’s write the augmented reduced \(A\):

\[\begin{split} [rref(A) | \mathbf{0}] = \begin{bmatrix} 1 & 1 & 0 & 0 & -2 & | & 0 \\ 0 & 0 & 1 & 2 & 1 & | & 0 \\ 0 & 0 & 0 & 0 & 0 & | & 0 \end{bmatrix}. \end{split}\]

Now, consider that we are looking at an augmented matrix that represents a linear system with infinitely many solutions. If we write down the equations explicitly, we have

\[\begin{split} \begin{align*} x_1 + x_2 - 2x_5 &= 0 \\ x_3 + 2x_4 + x_5 &= 0 \\ \end{align*} \end{split}\]

We will refer to the variables \(x_i\) corresponding to the free columns (the non-pivot columns in \(rref(A)\)) as free variables, and the variables in pivot columns will be called pivot variables. The terminology here comes from the fact that we have the freedom to choose any values we want for free variables and then we can calculate the pivot variable values that lead to a solution to the system.

Although we have the freedom to choose the free variable values, some choices are better than others. With the goal in mind of obtaining a set of basis vectors, we will do the following: for each free variable \(x_i\):

  1. Set \(x_i = 1\) and set every other free variable equal to \(0\).

  2. Solve for the pivot variables.

  3. Put the values into a vector. This vector is a basis vector for \(N(A)\).

Continuing the example, we first set \(x_2 = 1\) and \(x_4 = x_5 = 0\). Then \(x_1 = -1\) and \(x_3 = 0\), and we have a basis vector

\[\begin{split} \mathbf{x}_1 = \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}. \end{split}\]

Now set \(x_4 = 1\) and \(x_2 = x_5 = 0\). Then \(x_1 = 0\) and \(x_3 = -2\), and we have a second basis vector

\[\begin{split} \mathbf{x}_2 = \begin{bmatrix} 0 \\ 0 \\ -2 \\ 1 \\ 0 \end{bmatrix}. \end{split}\]

Finally, set \(x_5 = 1\) and \(x_2 = x_4 = 0\). Then \(x_1 = 2\) and \(x_3 = -1\), and we have a third basis vector

\[\begin{split} \mathbf{x}_3 = \begin{bmatrix} 2 \\ 0 \\ -1 \\ 0 \\ 1 \end{bmatrix}. \end{split}\]

There are no more free variables, so we stop here and assert that the three vectors \(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3\) form a basis for \(N(A)\).
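
The three-step recipe for each free variable can be sketched in Python as follows. Several things here are assumptions made for illustration: the function names `null_space_basis` and `matvec` are our own, the reduced matrix `R` is typed in by hand rather than computed, and the code relies on `R` already being in reduced row echelon form. Each vector produced is checked against the original matrix \(A\), so a mistake in `R` would show up as a nonzero product.

```python
from fractions import Fraction

def null_space_basis(R):
    """Build a basis for the null space from a matrix R that is assumed to be
    in reduced row echelon form, following the free-variable recipe."""
    n = len(R[0])
    # The pivot column of each nonzero row is the position of its leading entry.
    pivot_of_row = []
    for row in R:
        lead = next((j for j, a in enumerate(row) if a != 0), None)
        if lead is not None:
            pivot_of_row.append(lead)
    free_columns = [j for j in range(n) if j not in pivot_of_row]

    basis = []
    for f in free_columns:
        x = [Fraction(0)] * n
        x[f] = Fraction(1)                # step 1: this free variable is 1, the others 0
        for i, p in enumerate(pivot_of_row):
            x[p] = -Fraction(R[i][f])     # step 2: solve row i for its pivot variable
        basis.append(x)                   # step 3: collect the values into a vector
    return basis

def matvec(A, x):
    """Matrix-vector product A x for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

A = [[1, 1, 2, 4, 0],
     [1, 1, 3, 6, 1],
     [2, 2, 4, 8, 0]]

# Reduced form entered by hand: starting from the echelon form
# [[1, 1, 2, 4, 0], [0, 0, 1, 2, 1], [0, 0, 0, 0, 0]],
# subtract twice row 2 from row 1.
R = [[1, 1, 0, 0, -2],
     [0, 0, 1, 2, 1],
     [0, 0, 0, 0, 0]]

basis = null_space_basis(R)
for x in basis:
    print([str(a) for a in x], "A x =", [str(a) for a in matvec(A, x)])
```

Step 2 works in one line because, in reduced row echelon form, each nonzero row involves exactly one pivot variable, so that row's equation determines it directly from the chosen free variable.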

The previous statement is as yet unsupported. As an exercise, it is worthwhile to go back and verify that \(A\mathbf{x}_i = \mathbf{0}\) for the three vectors we just produced, so they are definitely in \(N(A)\). Building on that, note that the choices we made in constructing these three vectors force them to be linearly independent (observe the pattern of zeros in the free variable positions). The only remaining question is whether these vectors span \(N(A)\), or more precisely, is \(dim(N(A)) = 3\)? We can see that \(N(A) \subseteq \mathbb{R}^5\), because each \(\mathbf{x}\) has five components, but how do we know that it is a three-dimensional subspace of \(\mathbb{R}^5\) and not larger, perhaps all of \(\mathbb{R}^5\) itself?

There is an important theorem that will answer this question precisely.

Theorem

Let \(A\) be an \(m\times n\) matrix. Then \(dim(col(A)) + dim(N(A)) = n\). Equivalently using rank, \(rank(A) = n - dim(N(A))\).

Note that for the preceding example, the theorem establishes that \(dim(N(A)) = 3\), as we know already that \(rank(A) = 2\), answering the only remaining question about our proposed basis for \(N(A)\).
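
As a quick arithmetic check of the theorem on the two matrices used in the column space examples (the \(3\times 5\) matrix with two pivot columns and the \(3\times 3\) matrix whose reduced form is \(I\)), taking \(n\) to be the number of columns in each case:

\[\begin{split} \begin{align*} dim(N(A)) &= n - rank(A) = 5 - 2 = 3 \quad (3\times 5 \text{ example}), \\ dim(N(A)) &= n - rank(A) = 3 - 3 = 0 \quad (3\times 3 \text{ example}). \end{align*} \end{split}\]

The second line agrees with the earlier remark that \(N(A)\) is the trivial vector space when \(rref(A) = I\).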

We will prove this theorem in a subsequent section after we have considered two other vector spaces associated with a matrix.