Up to this point, we have used the Gaussian elimination algorithm to find solutions to linear systems, which is equivalent to solving matrix equations of the form \(A \xvec = \bvec\text{,}\) where \(\bvec\) is a vector of constants, \(A\) a matrix of coefficients, and \(\xvec\) a vector of variables, or unknowns. We now investigate another way to find solutions to the equation \(A\xvec=\bvec\) when the matrix \(A\) is square. To get started, let’s look at some familiar examples.
Preview Activity 4.1.1.
Explain how you would solve the equation \(3x = 5\) using multiplication rather than division.
Find the \(2\by2\) matrix \(A\) that rotates vectors counterclockwise by \(90^\circ\text{.}\)
Find the \(2\by2\) matrix \(B\) that rotates vectors clockwise by \(90^\circ\text{.}\)
What do you expect the product \(AB\) to be? Explain the reasoning behind your expectation and then compute \(AB\) to verify it.
Solve the equation \(A\xvec = \twovec{3}{-2}\) using Gaussian elimination.
Explain why your solution may also be found by computing \(\xvec = B\twovec{3}{-2}\text{.}\)
As we have seen a few times, the matrix is \(A=\left[\begin{array}{rr}
0 \amp -1 \\
1 \amp 0 \\
\end{array}\right]
\text{.}\)
Here, the matrix is \(B=\left[\begin{array}{rr}
0 \amp 1 \\
-1 \amp 0 \\
\end{array}\right]
\text{.}\)
We should expect that \(AB=I\) since the effect of rotating by \(90^\circ\) clockwise followed by rotating \(90^\circ\) counterclockwise is to leave a vector unchanged. We can verify this by performing the matrix multiplication.
Augmenting \(A\) by \(\twovec{3}{-2}\) and row reducing gives
\begin{equation*}
\left[\begin{array}{rr|r}
0 \amp -1 \amp 3 \\
1 \amp 0 \amp -2 \\
\end{array}\right]
\sim
\left[\begin{array}{rr|r}
1 \amp 0 \amp -2 \\
0 \amp 1 \amp -3 \\
\end{array}\right]\text{,}
\end{equation*}
so the solution is \(\xvec=\twovec{-2}{-3}\text{.}\)
The equation \(A\xvec=\twovec{3}{-2}\) is asking us to find the vector that becomes \(\twovec{3}{-2}\) after being rotated by \(90^\circ\text{.}\) If we rotate \(\twovec{3}{-2}\) by \(90^\circ\) in the opposite direction, it will have this property. That is, if \(\xvec = B\twovec{3}{-2}\text{,}\) then \(A\xvec = AB\twovec{3}{-2} = \twovec{3}{-2}\text{.}\) Computing \(B\twovec{3}{-2} = \twovec{-2}{-3}\) reproduces the solution found by Gaussian elimination.
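These computations are easy to check numerically. Here is a short sketch, assuming NumPy is available:

    import numpy as np

    # A rotates vectors counterclockwise by 90 degrees; B rotates clockwise.
    A = np.array([[0, -1],
                  [1,  0]])
    B = np.array([[0,  1],
                  [-1, 0]])

    print(A @ B)                  # the 2x2 identity matrix
    x = B @ np.array([3, -2])
    print(x)                      # [-2 -3]
    print(A @ x)                  # [ 3 -2], so x solves the equation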
The preview activity began with a familiar type of equation, \(3x = 5\text{,}\) and asked for a strategy to solve it. One possible response is to divide both sides by 3. Instead, let’s rephrase this as multiplying by \(3^{-1} = \frac13\text{,}\) the multiplicative inverse of 3.
Now that we are interested in solving equations of the form \(A\xvec = \bvec\text{,}\) we might try to find a similar approach. Is there a matrix \(A^{-1}\) that plays the role of the multiplicative inverse of \(A\text{?}\) Of course, the real number \(0\) does not have a multiplicative inverse so we probably shouldn’t expect every matrix to have a multiplicative inverse. We will see, however, that many do.
Definition 4.1.1.
Let \(A\) be an \(m \by n\) matrix. If \(LA = I_n\text{,}\) then we call \(L\) a left inverse of \(A\text{.}\) If \(AR = I_m\text{,}\) then we call \(R\) a right inverse of \(A\text{.}\)
Activity 4.1.2.
If \(A\) is \(m \by n\) and has a left inverse \(L\text{,}\) what shape must \(L\) be?
If \(A\) is \(m \by n\) and has a right inverse \(R\text{,}\) what shape must \(R\) be?
In both cases, \(n \by m\text{,}\) the reverse of the shape of \(A\text{.}\) For example, a left inverse \(L\) must have \(m\) columns because \(A\) has \(m\) rows. And the product \(LA\) will have \(n\) columns because \(A\) has \(n\) columns. Since an identity matrix is square, the product also has \(n\) rows. This implies that \(L\) has \(n\) rows.
A similar argument shows that \(R\) must be \(n \by m\text{.}\)
Activity 4.1.3.
Show that if a matrix \(A\) has left inverse \(L\) and right inverse \(R \text{,}\) then \(L = R \text{.}\) Hint: Consider the product \(LAR\text{.}\)
Show that if an \(m \by n\) matrix \(A\) has right inverse \(R\text{,}\) then \(n \ge m\text{.}\)
Show that if an \(m \by n\) matrix \(A\) has left inverse \(L\text{,}\) then \(m \ge n\text{.}\) Hint: What does \(LA\xvec = L\bvec\) tell you about the equation \(A \xvec = \bvec\text{?}\)
Show that if an \(m \by n\) matrix \(A\) has both a left and a right inverse, then \(m = n\text{,}\) so \(A\) is a square matrix.
\begin{align*}
LAR \amp = (LA)R = I R = R\\
LAR \amp = L(AR) = L I = L
\end{align*}
so \(L = R\text{.}\)
If \(AR = I\text{,}\) then for any \(\bvec \in \real^m\text{,}\) \(AR\bvec = I \bvec = \bvec\text{.}\) This means that \(A \xvec = \bvec\) is consistent for every \(\bvec \in \real^m\text{.}\) That means every row of the RREF of \(A\) has a pivot position. So there are \(m\) pivots. There can be at most one pivot in each column, so there must be at least as many columns as pivots: \(n \ge m\text{.}\)
If \(A\) has a left inverse \(L\text{,}\) then \(L\) has a right inverse (namely \(A\)). So \(L\) must have at least as many columns as rows, which means \(A\) must have at least as many rows as columns: \(m \ge n\text{.}\)
Combine the two previous parts.
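A quick numeric sketch may help make the shape constraints concrete. The matrices below are hypothetical, chosen only for illustration; \(A\) is \(2 \by 3\) and \(R\) is a \(3 \by 2\) right inverse:

    import numpy as np

    # A is 2x3, so a right inverse R must be 3x2.
    A = np.array([[1, 0, 0],
                  [0, 1, 0]])
    R = np.array([[1, 0],
                  [0, 1],
                  [0, 0]])

    print(A @ R)   # the 2x2 identity, so R is a right inverse of A
    print(R @ A)   # a 3x3 matrix that is not the identity

Note that this \(A\) cannot have a left inverse, since that would require \(m \ge n\text{.}\)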
Activity 4.1.3 tells us two important pieces of information about matrices that have both a left and a right inverse.
Proposition 4.1.2.
If a matrix \(A\) has left inverse \(L\) and right inverse \(R \text{,}\) then
\(L = R\text{.}\)
\(A\) is square.
This leads to the following definition.
Definition 4.1.3.
An \(n \by n\) square matrix \(A\) is called invertible if there is a matrix \(B\) that is both a left and a right inverse for \(A\text{.}\) That is, \(BA = I_n\) and \(AB = I_n\text{.}\) The matrix \(B\) is called the inverse of \(A\) and denoted \(A^{-1}\text{.}\)
Proposition 4.1.4.
If \(A\) is an \(n\by n\) invertible matrix with inverse \(B\text{,}\) then \(B\) is also invertible and the inverse of \(B\) is \(A\text{.}\) In other words,
\begin{equation*}
(A^{-1})^{-1} = A.
\end{equation*}
If \(B\) is the inverse of \(A\text{,}\) then \(AB = I\) and \(BA = I\text{.}\) But this means that \(A\) is the inverse of \(B\) as well.
We have seen that if a matrix has a left and a right inverse, then these must be the same matrix. In fact, more is true. For square matrices, any right inverse is also a left inverse, and any left inverse is also a right inverse.
It is important to remember that the product of two matrices depends on the order in which they are multiplied. That is, if \(C\) and \(D\) are matrices, then it sometimes happens that \(CD \neq DC\text{,}\) even if the matrices are square. So it is not immediate that \(RA = I\) just because \(AR = I\text{.}\) For matrices that are not square, \(AR\) and \(RA\) won’t even be the same shape, and at most one of the two products will be an identity matrix. So Proposition 4.1.5 establishes something special about one-sided inverses of square matrices.
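A quick numeric example (with matrices chosen only for illustration) shows how the order of multiplication matters:

    import numpy as np

    # Two square matrices whose products differ depending on the order.
    C = np.array([[1, 2],
                  [0, 1]])
    D = np.array([[1, 0],
                  [3, 1]])

    print(C @ D)   # [[7 2], [3 1]]
    print(D @ C)   # [[1 2], [3 7]]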
Proposition 4.1.5.
If \(A\) is an \(n \by n\) square matrix and \(AR = I_n\text{,}\) then \(RA = I_n\text{.}\) Similarly, if \(LA = I_n\text{,}\) then \(AL = I_n\text{.}\)
Suppose \(A\) is \(n \by n\) and has a right inverse \(R\text{.}\) Then for any \(\bvec \in \real^n\text{,}\) \(AR\bvec = \bvec\text{.}\) So \(A \xvec = \bvec\) is consistent for every \(\bvec\text{.}\) This implies that the RREF of \(A\) has a pivot in each row, and therefore also in each column. In other words, \(A \sim I\text{.}\)
Notice that if \(R\xvec = R \yvec\text{,}\) then \(\xvec = AR \xvec = AR \yvec = \yvec\text{.}\) This means the matrix transformation associated with \(R\) is one-to-one, so the RREF for \(R\) has a pivot in every column, and therefore in every row. In other words, \(R \sim I\text{.}\)
Now consider \(RA \xvec\) for any \(\xvec \in \real^n\text{.}\) Because \(R \sim I\text{,}\) there must be a \(\yvec\) with \(R \yvec = \xvec\text{.}\) So
\begin{equation*}
RA \xvec = RAR \yvec = R \yvec = \xvec
\end{equation*}
But this means that \(RA\) is the identity matrix since the transformation associated with it is the identity transformation. In other words, \(RA = I\) and \(R\) is also a left inverse of \(A\text{.}\)
Now suppose that \(L\) is a left inverse for \(A\text{.}\) That means \(A\) is a right inverse for \(L\text{,}\) and we just showed that this means \(A\) is also a left inverse for \(L\text{.}\) So \(AL = I\) and \(L\) is a right inverse for \(A\text{.}\)
Proposition 4.1.5 means that to show that a square matrix is invertible, it suffices to find a one-sided inverse -- it will always also be a full inverse.
Example 4.1.6.
Proposition 4.1.5 fails for matrices that are not square. It can happen that a non-square matrix \(A\) has a right inverse \(B\) as well as a second, different matrix \(B_2\) satisfying
\begin{equation*}
A B_2 = I_2\text{,}
\end{equation*}
so \(A\) has more than one right inverse. In fact, there are infinitely many, and none of them can be a left inverse: if \(A\) had both a left and a right inverse, they would have to be the same by Proposition 4.1.2, so a left inverse would have to equal both \(B\) and \(B_2\text{,}\) which is not possible.
Example 4.1.7.
Suppose that \(A\) is the matrix that rotates two-dimensional vectors counterclockwise by \(90^\circ\) and that \(B\) rotates vectors by \(-90^\circ\text{.}\) We have \(AB = I\) and \(BA = I\text{,}\)
which says that (a) \(B\) is also the left inverse of \(A\text{,}\) and (b) \(A\) is the inverse of \(B\) as well. Inverses always come in pairs like this. If \(A\) is invertible with inverse \(B\text{,}\) then \(B\) is also invertible and its inverse is \(A\text{.}\) In other words, \(A\) and \(B\) are inverses of each other.
If we think about the matrix transformations associated with \(A\) and \(B\text{,}\) this makes geometric sense. Rotating clockwise "undoes" a counterclockwise rotation. Rotating counterclockwise "undoes" a clockwise rotation.
Subsection 4.1.2 Solving equations with an inverse
If \(A\) is an invertible matrix, then for any \(\bvec \in \real^n\text{,}\)
\begin{equation*}
A (A^{-1} \bvec) = (A A^{-1}) \bvec = \bvec\text{,}
\end{equation*}
so \(A \xvec = \bvec\) is consistent for every \(\bvec\text{.}\) Furthermore,
\begin{equation*}
A \xvec = \bvec \Rightarrow A^{-1} A \xvec = A^{-1} \bvec \Rightarrow \xvec = A^{-1} \bvec\text{,}
\end{equation*}
so solutions are unique.
This result is important enough to capture in a proposition.
Proposition 4.1.8.
If \(A\) is an invertible \(n \by n\) matrix with inverse \(A^{-1}\text{,}\) and \(\bvec \in \real^n\text{,}\) then the equation \(A\xvec = \bvec\) is consistent and \(\xvec = A^{-1} \bvec\) is the unique solution.
Notice that this is similar to saying that the solution to \(3x=5\) is \(x = \frac13\cdot 5\text{,}\) as we saw in the preview activity.
Proposition 4.1.8 also implies that \(A \sim I\) for any invertible matrix \(A\) since \(A\) must have a pivot position in each row and in each column.
You may have noticed that Proposition 4.1.8 says that the solution to the equation \(A\xvec = \bvec\) is \(\xvec = A^{-1}\bvec\text{.}\) Indeed, we know that this equation has a unique solution because \(A\) has a pivot position in every column.
Proposition 4.1.8 shows us how to use \(A^{-1}\) to solve equations of the form \(A\xvec = \bvec\text{.}\) In Subsection 4.1.3 we will learn how to find an inverse for any square matrix that has an inverse. But before we get to that, let’s see some examples of how to use an inverse once we have it.
Activity 4.1.4.
Now use \(A^{-1}\) to solve the equation \(A\xvec = \threevec343\) and verify that your result agrees with what you found in part a.
We’ll learn a way to compute the inverse of a matrix shortly. NumPy knows how to do this, of course. If you have defined a matrix B in Python, you can find its inverse as np.linalg.inv(B). Use Python to find the inverse of the matrix
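Since the matrix for this part is not reproduced here, the sketch below uses a small stand-in matrix just to illustrate the pattern; it also shows how the inverse solves an equation, per Proposition 4.1.8:

    import numpy as np

    # A stand-in invertible matrix, for illustration only.
    B = np.array([[1, 2],
                  [1, 1]])
    Binv = np.linalg.inv(B)
    print(Binv)          # [[-1.  2.], [ 1. -1.]]

    b = np.array([3, 5])
    x = Binv @ b         # the unique solution of B x = b
    print(B @ x)         # recovers [3. 5.]

In practice, np.linalg.solve(B, b) is usually preferred over computing the inverse explicitly when the goal is just to solve one equation.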
Now that we have seen one use of inverses -- solving equations of the form \(A \xvec = \bvec\) -- we turn our attention to the task of computing the inverse of a matrix, or demonstrating that no inverse exists.
Activity 4.1.5.
This activity demonstrates a procedure for finding the (right) inverse of a matrix \(A\text{.}\)
Suppose that \(A = \begin{bmatrix}
3 \amp -2 \\
1 \amp -1 \\
\end{bmatrix}
\text{.}\) To find a right inverse \(B\text{,}\) we write its columns as \(B = \begin{bmatrix}\bvec_1 \amp \bvec_2
\end{bmatrix}\) and require that
\begin{equation*}
A\bvec_1 = \twovec10 \qquad \text{and} \qquad A\bvec_2 = \twovec01\text{.}
\end{equation*}
Solve these equations to find \(\bvec_1\) and \(\bvec_2\text{.}\) Then write the matrix \(B\) and verify that \(AB=I\text{.}\)
By Proposition 4.1.5 this is enough for us to conclude that \(B\) is the inverse of \(A\text{.}\) But let’s compute the product \(BA\) just to confirm.
What happens when you try to find the inverse of \(C = \begin{bmatrix}
-2 \amp 1 \\
4 \amp -2 \\
\end{bmatrix}\text{?}\)
We now develop a condition that must be satisfied by an invertible matrix. Suppose that \(A\) is an invertible \(n\by n\) matrix with inverse \(B\) and suppose that \(\bvec\) is any \(n\)-dimensional vector. Since \(AB=I\text{,}\) we have
\begin{equation*}
A(B\bvec) = (AB)\bvec = I\bvec = \bvec\text{.}
\end{equation*}
In other words, \(\xvec = B\bvec\) is a solution of the equation \(A\xvec = \bvec\text{.}\)
Solving the two equations for \(\bvec_1\) and \(\bvec_2\) gives \(B = \begin{bmatrix}
1 \amp -2 \\
1 \amp -3 \\
\end{bmatrix}\text{.}\) We can verify that, as we expect, \(AB=I\text{.}\)
We find that \(BA=I\text{,}\) which confirms that \(B\) is the inverse of \(A\text{.}\)
Seeking the first column of \(C^{-1}\text{,}\) we see that the equation \(C\xvec = \twovec10\) is not consistent. This means that \(C\) is not invertible.
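The column-by-column computation in this activity is easy to carry out with NumPy. A sketch using the activity’s matrices:

    import numpy as np

    A = np.array([[3, -2],
                  [1, -1]])
    # Solve for the columns of the inverse: A b1 = e1 and A b2 = e2.
    b1 = np.linalg.solve(A, np.array([1, 0]))
    b2 = np.linalg.solve(A, np.array([0, 1]))
    B = np.column_stack([b1, b2])
    print(B)        # [[ 1. -2.], [ 1. -3.]]
    print(A @ B)    # the identity matrix

    C = np.array([[-2, 1],
                  [ 4, -2]])
    try:
        np.linalg.solve(C, np.array([1, 0]))
    except np.linalg.LinAlgError:
        print("C is singular, so it has no inverse")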
Since the equation \(A\xvec = \bvec\) is consistent for every \(\bvec\text{,}\) we know that the span of the columns of \(A\) is \(\real^n\text{.}\)
Because the span of the columns of \(A\) is \(\real^n\text{,}\) there is a pivot position in every row. Since \(A\) is square, there is also a pivot position in every column. This means that the reduced row echelon form of \(A\) must be the identity matrix \(I_n\text{.}\)
For the matrices in the preceding activity, the reduced row echelon form of \(A\) is the identity while the reduced row echelon form of \(C\) is not, which shows that \(A\) is invertible and \(C\) is not.
Example 4.1.9.
We can reformulate this procedure for finding the inverse of a matrix. For the sake of convenience, suppose that \(A\) is a \(2\by2\) invertible matrix with inverse \(B=\begin{bmatrix} \bvec_1 \amp \bvec_2 \end{bmatrix}\text{.}\) Rather than solving the equations
\begin{equation*}
A\bvec_1 = \twovec10 \qquad \text{and} \qquad A\bvec_2 = \twovec01
\end{equation*}
separately, we can solve them at the same time by augmenting \(A\) by both vectors \(\twovec10\) and \(\twovec01\) and finding the reduced row echelon form.
For example, if \(A =
\begin{bmatrix}
1 \amp 2 \\
1 \amp 1 \\
\end{bmatrix}\text{,}\) we form
\begin{equation*}
\left[\begin{array}{rr|rr}
1 \amp 2 \amp 1 \amp 0 \\
1 \amp 1 \amp 0 \amp 1 \\
\end{array}\right]
\sim
\left[\begin{array}{rr|rr}
1 \amp 0 \amp -1 \amp 2 \\
0 \amp 1 \amp 1 \amp -1 \\
\end{array}\right]\text{.}
\end{equation*}
This shows that the matrix \(B =
\begin{bmatrix}
-1 \amp 2 \\
1 \amp -1 \\
\end{bmatrix}\) is the inverse of \(A\text{.}\)
In other words, beginning with \(A\text{,}\) we augment by the identity and find the reduced row echelon form to determine \(A^{-1}\text{:}\)
\begin{equation*}
\left[
\begin{array}{r|r}
A \amp I \\
\end{array}
\right]
\sim
\left[
\begin{array}{r|r}
I \amp A^{-1} \\
\end{array}
\right].
\end{equation*}
This reformulation will always work.
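To make the procedure concrete, here is a minimal Gauss-Jordan sketch in Python, assuming NumPy; it is meant to illustrate the row reduction of \(\left[\begin{array}{r|r} A \amp I \end{array}\right]\text{,}\) not to replace np.linalg.inv:

    import numpy as np

    def inverse_by_row_reduction(A, tol=1e-12):
        # Row reduce the augmented matrix [A | I] to [I | A^{-1}].
        A = np.asarray(A, dtype=float)
        n = A.shape[0]
        M = np.hstack([A, np.eye(n)])
        for i in range(n):
            # Partial pivoting: bring the largest entry in column i up.
            p = i + np.argmax(np.abs(M[i:, i]))
            if abs(M[p, i]) < tol:
                raise ValueError("matrix is not invertible")
            M[[i, p]] = M[[p, i]]
            M[i] = M[i] / M[i, i]        # scale the pivot row
            for j in range(n):           # clear column i in the other rows
                if j != i:
                    M[j] = M[j] - M[j, i] * M[i]
        return M[:, n:]                  # the right half is the inverse

    A = np.array([[1, 2], [1, 1]])
    print(inverse_by_row_reduction(A))   # [[-1.  2.], [ 1. -1.]]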
Proposition 4.1.10.
The matrix \(A\) is invertible if and only if the reduced row echelon form of \(A\) is the identity matrix: \(A\sim I\text{.}\) In addition, we can find the inverse by augmenting \(A\) by the identity and finding the reduced row echelon form:
\begin{equation*}
\left[
\begin{array}{r|r}
A \amp I \\
\end{array}
\right]
\sim
\left[
\begin{array}{r|r}
I \amp A^{-1} \\
\end{array}
\right].
\end{equation*}
The next proposition summarizes much of what we have found about invertible matrices.
Proposition 4.1.11. Properties of invertible matrices.
An \(n\by n\) matrix \(A\) is invertible if and only if \(A\sim I\text{.}\)
If \(A\) is invertible, then the solution to the equation \(A\xvec = \bvec\) is given by \(\xvec = A^{-1}\bvec\text{.}\)
We can find \(A^{-1}\) by finding the reduced row echelon form of \(\left[\begin{array}{r|r} A \amp I \end{array}\right]\text{;}\) namely,
\begin{equation*}
\left[\begin{array}{r|r} A \amp I \end{array}\right]
\sim
\left[\begin{array}{r|r} I \amp A^{-1} \end{array}\right]\text{.}
\end{equation*}
If \(A\) and \(B\) are two invertible \(n\by n\) matrices, then their product \(AB\) is also invertible and \((AB)^{-1} = B^{-1}A^{-1}\text{.}\)
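The last property is easy to spot-check numerically. A sketch using randomly generated matrices, which are invertible with probability 1:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 3))
    B = rng.standard_normal((3, 3))

    # (AB)^{-1} equals B^{-1} A^{-1}, with the order reversed.
    lhs = np.linalg.inv(A @ B)
    rhs = np.linalg.inv(B) @ np.linalg.inv(A)
    print(np.allclose(lhs, rhs))   # True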
Remark 4.1.12. Formulas for inverse matrices.
There is a simple formula for finding the inverse of a \(2\by2\) matrix:
\begin{equation*}
\left[\begin{array}{rr}
a \amp b \\
c \amp d \\
\end{array}\right]^{-1}
=
\frac{1}{ad-bc}
\left[\begin{array}{rr}
d \amp -b \\
-c \amp a \\
\end{array}\right]\text{,}
\end{equation*}
which can be easily checked. The condition that \(A\) be invertible is, in this case, reduced to the condition that \(ad-bc\neq 0\text{.}\) We will understand this condition better once we have explored determinants in Section 4.5. There is a similar formula for the inverse of a \(3\by 3\) matrix, but there is not a good reason to write it here.
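The \(2\by2\) formula is simple enough to implement directly. In this sketch the function name inverse_2x2 is ours, and the result is compared against np.linalg.inv:

    import numpy as np

    def inverse_2x2(a, b, c, d):
        # Inverse of [[a, b], [c, d]] via the formula above.
        det = a * d - b * c
        if det == 0:
            raise ValueError("not invertible since ad - bc = 0")
        return np.array([[d, -b], [-c, a]]) / det

    print(inverse_2x2(1, 2, 1, 1))   # [[-1.  2.], [ 1. -1.]]
    print(np.linalg.inv(np.array([[1., 2.], [1., 1.]])))   # agrees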
Subsection 4.1.4 Summary
In this section, we found conditions guaranteeing that a matrix has an inverse. When these conditions hold, we also found an algorithm for finding the inverse.
A square matrix \(A\) is invertible if there is a matrix \(B\text{,}\) known as the inverse of \(A\text{,}\) such that \(AB = I\text{.}\) We usually write \(A^{-1} = B\text{.}\)
The \(n\by n\) matrix \(A\) is invertible if and only if it is row equivalent to \(I_n\text{,}\) the \(n\by n\) identity matrix.
If a matrix \(A\) is invertible, we can use Gaussian elimination to find its inverse:
\begin{equation*}
\left[\begin{array}{r|r} A \amp I \end{array}\right] \sim
\left[\begin{array}{r|r} I \amp A^{-1} \end{array}\right]\text{.}
\end{equation*}
If a matrix \(A\) is invertible, then the solution to the equation \(A\xvec = \bvec\) is \(\xvec = A^{-1}\bvec\text{.}\)
1.
Find the inverse of \(A\) by augmenting by the identity \(I\) to form \(\left[\begin{array}{r|r}A \amp I \end{array}\right]\text{.}\)
Use your inverse to solve the equation \(A\xvec = \fourvec{3}{2}{-3}{-1}\text{.}\)
2.
In this exercise, we will consider \(2\by 2\) matrices as defining matrix transformations.
Write the matrix \(A\) that performs a \(45^\circ\) rotation. What geometric operation undoes this rotation? Find the matrix that performs this operation and verify that it is \(A^{-1}\text{.}\)
Write the matrix \(A\) that performs a \(180^\circ\) rotation. Verify that \(A^2 = I\) so that \(A^{-1} = A\text{,}\) and explain geometrically why this is the case.
Find three more matrices \(A\) that satisfy \(A^2 = I\text{.}\)
3.
Inverses for certain types of matrices can be found in a relatively straightforward fashion.
The matrix \(D=\begin{bmatrix}
2 \amp 0 \amp 0 \\
0 \amp -1 \amp 0 \\
0 \amp 0 \amp -4 \\
\end{bmatrix}\) is called diagonal since the only nonzero entries are on the diagonal of the matrix.
Find \(D^{-1}\) by augmenting \(D\) by the identity and finding its reduced row echelon form.
Under what conditions is a diagonal matrix invertible?
Explain why the inverse of an invertible diagonal matrix is also diagonal and explain the relationship between the diagonal entries in \(D\) and \(D^{-1}\text{.}\)
Find \(L^{-1}\) by augmenting \(L\) by the identity and finding its reduced row echelon form.
Explain why the inverse of an invertible lower triangular matrix is also lower triangular.
How can we tell whether a triangular matrix is invertible?
4.
Our definition of an invertible matrix requires that \(A\) be a square \(n\by n\) matrix. Let’s examine what happens when \(A\) is not square. For instance, suppose that
Verify that \(BA = I_2\text{.}\) In this case, we say that \(B\) is a left inverse of \(A\text{.}\)
If \(A\) has a left inverse \(B\text{,}\) we can still use it to find solutions to linear equations. If we know there is a solution to the equation \(A\xvec = \bvec\text{,}\) we can multiply both sides of the equation by \(B\) to find \(\xvec = B\bvec\text{.}\)
Suppose you know there is a solution to the equation \(A\xvec = \threevec{-1}{-3}{6}\text{.}\) Use the left inverse \(B\) to find \(\xvec\) and verify that it is a solution.
and verify that \(C\) is also a left inverse of \(A\text{.}\) This shows that the matrix \(A\) may have more than one left inverse.
5.
Suppose that \(A\) is an \(n\by n\) matrix.
Suppose that \(A^2 = AA\) is invertible with inverse \(B\text{.}\) This means that \(A^2B = AAB = I\text{.}\) Explain why \(A\) must be invertible with inverse \(AB\text{.}\)
Suppose that \(A^{100}\) is invertible with inverse \(B\text{.}\) Explain why \(A\) is invertible. What is \(A^{-1}\) in terms of \(A\) and \(B\text{?}\)
6.
Determine whether the following statements are true or false and explain your reasoning.
If \(A\) is invertible, then the columns of \(A\) are linearly independent.
If \(A\) is a square matrix whose diagonal entries are all nonzero, then \(A\) is invertible.
If \(A\) is an invertible \(n\by n\) matrix, then the span of the columns of \(A\) is \(\real^n\text{.}\)
If \(A\) is invertible, then there is a nonzero solution to the homogeneous equation \(A\xvec = \zerovec\text{.}\)
If \(A\) is an \(n\by n\) matrix and the equation \(A\xvec = \bvec\) has a solution for every vector \(\bvec\text{,}\) then \(A\) is invertible.
7.
Provide a justification for your response to the following questions.
Suppose that \(A\) is a square matrix with two identical columns. Can \(A\) be invertible?
Suppose that \(A\) is a square matrix with two identical rows. Can \(A\) be invertible?
Suppose that \(A\) is an invertible matrix and that \(AB = AC\text{.}\) Can you conclude that \(B = C\text{?}\)
Suppose that \(A\) is an invertible \(n\by n\) matrix. What can you say about the span of the columns of \(A^{-1}\text{?}\)
Suppose that \(A\) is an invertible matrix and that \(B\) is row equivalent to \(A\text{.}\) Can you guarantee that \(B\) is invertible?
8.
We say that two square matrices \(A\) and \(B\) are similar if there is an invertible matrix \(P\) such that \(B = PAP^{-1}\text{.}\)
If \(A\) and \(B\) are similar, explain why \(A^2\) and \(B^2\) are similar as well. In particular, if \(B = PAP^{-1}\text{,}\) explain why \(B^2 = PA^2P^{-1}\text{.}\)
If \(A\) and \(B\) are similar and \(A\) is invertible, explain why \(B\) is also invertible.
If \(A\) and \(B\) are similar and both are invertible, explain why \(A^{-1}\) and \(B^{-1}\) are similar.
If \(A\) is similar to \(B\) and \(B\) is similar to \(C\text{,}\) explain why \(A\) is similar to \(C\text{.}\) To begin, you may wish to assume that \(B = PAP^{-1}\) and \(C = QBQ^{-1}\text{.}\)
9.
Suppose that \(A\) and \(B\) are two \(n\by n\) matrices and that \(AB\) is invertible. We would like to explain why both \(A\) and \(B\) are invertible.
We first explain why \(B\) is invertible.
Since \(AB\) is invertible, explain why any solution to the homogeneous equation \(AB\xvec = \zerovec\) is \(\xvec=\zerovec\text{.}\)
Use this fact to explain why any solution to \(B\xvec = \zerovec\) must be \(\xvec=\zerovec\text{.}\)
Explain why \(B\) must be invertible.
Now we explain why \(A\) is invertible.
Since \(AB\) is invertible, explain why the equation \(AB\xvec=\bvec\) is consistent for every vector \(\bvec\text{.}\)
Using the fact that \(AB\xvec = A(B\xvec) = \bvec\) is consistent for every \(\bvec\text{,}\) explain why every equation \(A\xvec = \bvec\) is consistent.