The Matrix Inverse#
In the previous section we introduced the idea of an ‘inverse’ elimination matrix; that is, a matrix \(E_{ij}^{-1}\) that would reverse whatever row operation \(E_{ij}\) performed on a matrix \(A\):
In fact, many matrices, not just elimination matrices, have inverses that reverse whatever action those matrices perform when they multiply other matrices or vectors.
Definition
Given a square matrix \(A\), if there exists a matrix \(A^{-1}\) such that \(A^{-1}A = AA^{-1} = I\), then we call \(A^{-1}\) the inverse of A, and we say that \(A\) is an invertible or nonsingular matrix. If no such matrix exists, we say that $A is noninvertible or singular.
Don’t Use the Inverse to Solve a Linear System#
Suppose that one had a linear system \(A\mathbf{x} = \mathbf{b}\). If \(A\) is invertible, then we could in theory use \(A^{-1}\) to solve the system:
This is theoretically correct and is fine for small examples done by hand, but it is almost never actually done in practice for reasons of efficiency and accuracy; for example, Intel’s LAPACK documentation of the matrix inversion routine states the following:
… do not attempt to solve a system of equations \(A\mathbf{x} = \mathbf{b}\) by first computing \(A^{-1}\) and then forming the matrix-vector product \(\mathbf{x} = A^{-1}\mathbf{b}\). Call a solver routine instead (see Routines for Solving Systems of Linear Equations); this is more efficient and more accurate.
They also note that:
It is seldom necessary to compute an explicit inverse of a matrix.
Indeed, while many of the applications that we are going to consider include inverse matrix calculations, we will almost always seek methods that avoid doing the inverse matrix calculations out in full for efficiency and numeric stability.
Matrix Inverse - Theory#
The matrix inverse does play an important part in the theory of linear algebra, so while we will aim to exclude it from actual calculations and algorithms, it is important to know its properties and pathologies.
Not All Matrices Have Inverses
First, all matrices have inverses. Our definition already excludes non-square matrices; but many square matrices are singular also: if the columns of the matrix are linearly dependent, the matrix is singular.
With small examples, this can often be spotted right away:
Example:
is singular: the columns are scalar multiples of each other, and hence linearly dependent.
Equivalently, notice that if \(A\) is invertible, then the equation \(A\mathbf{x} = \mathbf{0}\) can only have one solution: \(\mathbf{x} = \mathbf{0}\), because
But linearly dependence of a matrix’s columns means exactly that there are nonzero solutions to \(A\mathbf{x} = \mathbf{0}\); for example, in the example above we can find
as one nonzero solution (there are infinitely many others).
Inverses of Products Reverse Order
Inverses of products reverse order: \((AB)^{-1} = B^{-1}A^{-1}\). We saw this earlier with elimination matrices and their inverses, but it is a general property of invertible matrices, which is illustrated in the following \(2\times2\) example.
Example: Let
Then
(Take a moment here to verify that \(AA^{-1} = A^{-1}A = I = BB^{-1} = B^{-1}B\)). Now,
and you can further verify that
Calculating a Matrix Inverse#
Although after this we will try to avoid ever calculating an inverse matrix for anything larger than \(2\times2\) in full, it is worthwhile to go through the computation once as it illustrates the important point that almost everything in linear algebra boils down to solving a linear system, and here we can exploit the \(A=LU\) factorization for efficiency on the way. For an \(n\times n\) matrix \(A\), the inverse \(A^{-1}\) must have \(k^{th}\) column \(A^{-1}_k\) such that \(AA^{-1}_k\) is the \(k^{th}\) column of the \(n\times n\) identity matrix. So it is necessary to solve \(n\) linear systems of the form
where the 1 in the vector above occurs in the \(k^{th}\) component. \(A = LU\) is commonly used to speed up computation when we need to solve \(A\mathbf{x} = \mathbf{b}\) for one \(A\) but many \(\mathbf{b}\), and that is exactly the situation we find ourselves in here. Recall the approach:
First obtain the factorization \(A = LU\) (this is usually the most expensive part). Then, for each \(\mathbf{b}\) for which we need to solve \(A\mathbf{x} = \mathbf{b}\),
Set \(U\mathbf{x} = \mathbf{y};\)
Solve \(L\mathbf{y} = \mathbf{b}\) for \(\mathbf{y};\)
Solve \(U\mathbf{x} = \mathbf{y}\) for \(\mathbf{x}.\)
Example: Calculate the inverse of
\(A\) has the following \(LU\) factorization (verify this!):
Start with
We need to solve \(L\mathbf{y} = \mathbf{b}\):
and thanks to the triangularity of \(L\) we can back-substitute to obtain \(y_1 = 1\), \(y_2 = -2\), and \(y_3 = -1\). Now solve \(U\mathbf{x} = \mathbf{y}\):
and again using back-substitution we can obtain \(x_3 = -1/3\), \(x_2=4/9\), and \(x_1=-2/9\). So the first column of \(A^{-1}\) is
If we repeat this process two more times, using next \(\mathbf{b} = [0\hspace{0.5em}1\hspace{0.5em}0]^T\) and finally \(\mathbf{b}=[0\hspace{0.5em}0\hspace{0.5em}1]^T\), then the three solutions obtained will be the columns of the inverse of \(A\).
Exercise: Complete the computation of the remaining columns of \(A^{-1}\) in the example above. Use the \(LU\) factorization provided. You should end up with
Verify that \(AA^{-1} = A^{-1}A = I\).
\(A=LU\) vs. Gaussian Elimination
Students often are first taught to calculate the inverse using Gaussian elimination on the augmented matrix \([A | I]\). Briefly, if you augment \(A\) with the \(n\times n\) identity matrix and perform Gaussian elimination you can transform the augmented matrix to \([I | A^{-1}]\), from which you can then just extract \(A^{-1}\). This is another example of an approach which is fine for small examples done by hand but isn’t actually used in practice; \(A=LU\) is both more efficient and more numerically stable.
A \(2\times2\) Shortcut#
Many examples in this book use \(2\times2\) matrices, and it is helpful to know that there exists a convenient shortcut formula for finding the inverse of a \(2\times2\) matrix.
\(2\times2\) Inverse Formula
Let
Then
provided that \(ad-bc \neq 0\); otherwise \(A\) is singular.
Example: Refer back to the second example in this section, where we verified that \((AB)^{-1} = B^{-1}A^{-1}\) for a specific \(A\) and \(B\). Above I provided \(A^{-1}\) and \(B^{-1}\), so here let’s simply confirm that the \(2\times2\) shortcut formula works:
so \(a=1\), \(b=-1\), \(c=2\), and \(d=-1\). According to the formula
where the scalar in front of the matrix has been omitted because it simply equals 1. This is same \(A^{-1}\) as above, which you have (or should have!) already confirmed satisfies \(AA^{-1} = A^{-1}A = I\).
Exercise: For the \(B\) in the second example, calculate \(B^{-1}\) using the shortcut formula and confirm that it matches what you already verified satisfied \(BB^{-1} = B^{-1}B = I\).