In this section, we will revisit the theory of eigenvalues and eigenvectors for the special class of matrices that are symmetric, meaning that the matrix equals its transpose. Recall from Section 7.1 that covariance matrices are an important example of symmetric matrices. Understanding symmetric matrices will enable us to form singular value decompositions later in the chapter.
To begin, remember that if \(A\) is a square matrix, we say that \(\vvec\) is an eigenvector of \(A\) with associated eigenvalue \(\lambda\) if \(A\vvec=\lambda\vvec\text{.}\) In other words, for these special vectors, the operation of matrix multiplication simplifies to scalar multiplication.
Preview Activity 7.2.1.
This preview activity reminds us how a basis of eigenvectors can be used to relate a square matrix to a diagonal one.
Suppose that \(D=\begin{bmatrix}
3 \amp 0 \\
0 \amp -1
\end{bmatrix}\) and that \(\evec_1 = \twovec10\) and \(\evec_2=\twovec01\text{.}\)
Sketch the vectors \(\evec_1\) and \(D\evec_1\) on the left side of Figure 7.2.1.
Sketch the vectors \(\evec_2\) and \(D\evec_2\) on the left side of Figure 7.2.1.
Sketch the vectors \(\evec_1+2\evec_2\) and \(D(\evec_1+2\evec_2)\) on the left side.
Give a geometric description of the matrix transformation defined by \(D\text{.}\)
Now suppose we have vectors \(\vvec_1=\twovec11\) and \(\vvec_2=\twovec{-1}1\) and that \(A\) is a \(2\by2\) matrix such that \(A\vvec_1 = 3\vvec_1\) and \(A\vvec_2 = -\vvec_2\text{;}\) that is, \(\vvec_1\) and \(\vvec_2\) are eigenvectors of \(A\) with associated eigenvalues \(3\) and \(-1\text{.}\)
\(D\) stretches vectors horizontally by a factor of 3 and reflects them in the horizontal axis.
\(A\) stretches vectors in the direction of \(\vvec_1\) by a factor of 3 and reflects them in the line defined by \(\vvec_1\text{.}\)
The effects of the two transformations are the same when viewed in the coordinate systems given by the appropriate sets of vectors.
The preview activity asks us to compare the matrix transformations defined by two matrices, a diagonal matrix \(D\) and a matrix \(A\) whose eigenvectors are given to us. The transformation defined by \(D\) stretches horizontally by a factor of 3 and reflects in the horizontal axis, as shown in Figure 7.2.2.
By contrast, the transformation defined by \(A\) stretches the plane by a factor of 3 in the direction of \(\vvec_1\) and reflects in the line defined by \(\vvec_1\text{,}\) as seen in Figure 7.2.3.
In this way, we see that the matrix transformations defined by these two matrices are equivalent after a \(45^\circ\) rotation. This notion of equivalence is what we called similarity in Section 5.3. There we considered a square \(m\by m\) matrix \(A\) that provides enough eigenvectors to form a basis of \(\real^m\text{.}\) For example, suppose we can construct a basis for \(\real^m\) using eigenvectors \(\vvec_1,\vvec_2,\ldots,\vvec_m\) having associated eigenvalues \(\lambda_1,\lambda_2,\ldots,\lambda_m\text{.}\) Forming the matrices
\begin{equation*}
P = \begin{bmatrix} \vvec_1 \amp \vvec_2 \amp \cdots \amp \vvec_m \end{bmatrix},
\hspace{24pt}
D = \begin{bmatrix}
\lambda_1 \amp 0 \amp \cdots \amp 0 \\
0 \amp \lambda_2 \amp \cdots \amp 0 \\
\vdots \amp \vdots \amp \ddots \amp \vdots \\
0 \amp 0 \amp \cdots \amp \lambda_m
\end{bmatrix},
\end{equation*}
we have \(A = PDP^{-1}\text{.}\) In our example, the eigenvectors \(\vvec_1\) and \(\vvec_2\) and eigenvalues \(3\) and \(-1\) give
\begin{equation*}
P = \begin{bmatrix} 1 \amp -1 \\ 1 \amp 1 \end{bmatrix},
\hspace{24pt}
D = \begin{bmatrix} 3 \amp 0 \\ 0 \amp -1 \end{bmatrix},
\end{equation*}
which tells us that \(A=PDP^{-1} =
\begin{bmatrix}
1 \amp 2 \\
2 \amp 1
\end{bmatrix}
\text{.}\)
Notice that the matrix \(A\) has eigenvectors \(\vvec_1\) and \(\vvec_2\) that not only form a basis for \(\real^2\) but, in fact, form an orthogonal basis for \(\real^2\text{.}\) Given the prominent role played by orthogonal bases in the last chapter, we would like to understand what conditions on a matrix enable us to form an orthogonal basis of eigenvectors.
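To check this computation numerically, here is a minimal sketch (an addition, assuming NumPy is available) that rebuilds \(A\) from the matrices \(P\) and \(D\) above and confirms that \(A\) acts as scalar multiplication on the eigenvectors.

import numpy as np

# Eigenvectors v1 = (1,1) and v2 = (-1,1) as the columns of P,
# and the eigenvalues 3 and -1 on the diagonal of D.
P = np.array([[1, -1],
              [1,  1]])
D = np.diag([3, -1])

A = P @ D @ np.linalg.inv(P)
print(A)                        # [[1. 2.] [2. 1.]]
print(A @ np.array([1, 1]))     # [3. 3.], which is  3*v1
print(A @ np.array([-1, 1]))    # [ 1. -1.], which is -1*v2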
Subsection 7.2.1 Symmetric matrices and orthogonal diagonalization
Let’s begin by looking at some examples in the next activity.
Activity 7.2.2.
Remember that the Python command scipy.linalg.eig(A) attempts to find a basis for \(\real^m\) consisting of eigenvectors of \(A\text{.}\) If successful, e, E = linalg.eig(A) provides a vector of eigenvalues e and a matrix E containing the associated eigenvectors as columns.
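For instance, a short sketch of this command (assuming NumPy and SciPy are installed) applied to the matrix \(A=\begin{bmatrix} 1 \amp 2 \\ 2 \amp 1 \end{bmatrix}\) from earlier might look like the following.

import numpy as np
from scipy import linalg

A = np.array([[1, 2],
              [2, 1]])

# e holds the eigenvalues; the columns of E are unit eigenvectors.
e, E = linalg.eig(A)
print(e)   # eigenvalues 3 and -1, reported as complex numbers (order may vary)
print(E)   # unit eigenvectors in the directions of (1,1) and (-1,1), up to sign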
For each of the following matrices, determine whether there is a basis for \(\real^2\) consisting of eigenvectors of that matrix. When there is such a basis, form the matrices \(P\) and \(D\) such that \(A = PDP^{-1}\text{.}\)
The eigenvalues of this matrix are complex so there is no such basis.
There is one eigenvalue \(\lambda=2\) with multiplicity two. The associated eigenspace \(E_2\) is one-dimensional so there is not a basis of \(\real^2\) consisting of eigenvectors.
Only the last matrix, \(A=\begin{bmatrix}
9 \amp 2 \\
2 \amp 6
\end{bmatrix}\text{,}\) has eigenvectors that form an orthogonal basis of \(\real^2\text{.}\)
We form an orthonormal basis by scaling the eigenvectors to have length 1. This gives \(Q = \begin{bmatrix}
2/\sqrt{5} \amp 1/\sqrt{5} \\
1/\sqrt{5} \amp -2/\sqrt{5} \\
\end{bmatrix}\text{,}\) which is orthogonal since the columns form an orthonormal basis of \(\real^2\text{.}\)
Orthogonal matrices are invertible and have \(Q^{-1}
= Q^{\transpose}\text{.}\)
If \(A=QDQ^{\transpose}\text{,}\) we have \(A^{\transpose}=(QDQ^{\transpose})^{\transpose} =
(Q^{\transpose})^{\transpose}D^{\transpose}Q^{\transpose} = QDQ^{\transpose} = A\text{.}\) This means that the matrix is symmetric.
The examples in this activity illustrate a range of possibilities. First, a matrix may have complex eigenvalues, in which case it will not be diagonalizable. Second, even if all the eigenvalues are real, there may not be a basis of eigenvectors if the dimension of one of the eigenspaces is less than the algebraic multiplicity of the associated eigenvalue.
We are interested in matrices for which there is an orthogonal basis of eigenvectors. When this happens, we can create an orthonormal basis of eigenvectors by scaling each eigenvector in the basis so that its length is 1. Putting these orthonormal vectors into a matrix \(Q\) produces an orthogonal matrix, which means that \(Q^{\transpose}=Q^{-1}\text{.}\) We then have
\begin{equation*}
A = QDQ^{-1} = QDQ^{\transpose}.
\end{equation*}
In this case, we say that \(A\) is orthogonally diagonalizable.
Definition 7.2.4.
If there is an orthonormal basis of \(\real^n\) consisting of eigenvectors of the matrix \(A\text{,}\) we say that \(A\) is orthogonally diagonalizable. In particular, we can write \(A=QDQ^{\transpose}\) where \(Q\) is an orthogonal matrix.
When \(A\) is orthogonally diagonalizable, notice that
\begin{equation*}
A^{\transpose}=(QDQ^{\transpose})^{\transpose} = (Q^{\transpose})^{\transpose}D^{\transpose}Q^{\transpose} = QDQ^{\transpose} = A.
\end{equation*}
That is, when \(A\) is orthogonally diagonalizable, \(A=A^{\transpose}\) and we say that \(A\) is symmetric.
Definition 7.2.5.
A symmetric matrix \(A\) is one for which \(A=A^{\transpose}\text{.}\)
Example 7.2.6.
Consider the matrix \(A =
\begin{bmatrix}
-2 \amp 36 \\
36 \amp -23
\end{bmatrix}
\text{,}\) which has eigenvectors \(\vvec_1 = \twovec43\text{,}\) with associated eigenvalue \(\lambda_1=25\text{,}\) and \(\vvec_2=\twovec{3}{-4}\text{,}\) with associated eigenvalue \(\lambda_2=-50\text{.}\) Notice that \(\vvec_1\) and \(\vvec_2\) are orthogonal so we can form an orthonormal basis of eigenvectors \(\uvec_1 = \twovec{4/5}{3/5}\) and \(\uvec_2 = \twovec{3/5}{-4/5}\text{.}\) This gives the orthogonal diagonalization \(A=QDQ^{\transpose}\) where
\begin{equation*}
Q = \begin{bmatrix}
4/5 \amp 3/5 \\
3/5 \amp -4/5
\end{bmatrix},
\hspace{24pt}
D = \begin{bmatrix}
25 \amp 0 \\
0 \amp -50
\end{bmatrix}.
\end{equation*}
Notice also that, as expected, \(A\) is symmetric; that is, \(A=A^{\transpose}\text{.}\)
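As a quick numerical check of this example, the following sketch (an addition, assuming NumPy is available) verifies that \(Q\) is orthogonal and that \(QDQ^{\transpose}\) reproduces \(A\text{.}\)

import numpy as np

Q = np.array([[4/5,  3/5],
              [3/5, -4/5]])      # normalized eigenvectors as columns
D = np.diag([25, -50])

print(Q.T @ Q)       # the 2x2 identity, so Q is orthogonal
print(Q @ D @ Q.T)   # [[ -2. 36.] [ 36. -23.]], which is A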
Example 7.2.7.
If \(A = \begin{bmatrix}
1 \amp 2 \\
2 \amp 1 \\
\end{bmatrix}
\text{,}\) then there is an orthogonal basis of eigenvectors \(\vvec_1 = \twovec11\) and \(\vvec_2 =
\twovec{-1}1\) with eigenvalues \(\lambda_1=3\) and \(\lambda_2=-1\text{.}\) Using these eigenvectors, we form the orthogonal matrix \(Q\) consisting of eigenvectors and the diagonal matrix \(D\text{,}\) where
\begin{equation*}
Q = \begin{bmatrix}
1/\sqrt{2} \amp -1/\sqrt{2} \\
1/\sqrt{2} \amp 1/\sqrt{2}
\end{bmatrix},
\hspace{24pt}
D = \begin{bmatrix}
3 \amp 0 \\
0 \amp -1
\end{bmatrix}.
\end{equation*}
Notice that the matrix transformation represented by \(Q\) is a \(45^\circ\) rotation while that represented by \(Q^{\transpose}=Q^{-1}\) is a \(-45^\circ\) rotation. Therefore, if we multiply a vector \(\xvec\) by \(A\text{,}\) we can decompose the multiplication as
\begin{equation*}
A\xvec = Q(D(Q^{\transpose}\xvec)).
\end{equation*}
That is, we first rotate \(\xvec\) by \(-45^\circ\text{,}\) then apply the diagonal matrix \(D\text{,}\) which stretches and reflects, and finally rotate by \(45^\circ\text{.}\) We may visualize this factorization as in Figure 7.2.8.
In fact, a similar picture holds any time the matrix \(A\) is orthogonally diagonalizable.
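The rotate-stretch-rotate description can also be traced numerically. The sketch below (an illustration added here, assuming NumPy) applies \(Q^{\transpose}\text{,}\) \(D\text{,}\) and \(Q\) in turn to a sample vector and compares the result with \(A\xvec\text{.}\)

import numpy as np

Q = np.array([[1, -1],
              [1,  1]]) / np.sqrt(2)   # a 45-degree rotation
D = np.diag([3, -1])
A = np.array([[1, 2],
              [2, 1]])

x = np.array([2.0, 1.0])   # any sample vector
step1 = Q.T @ x            # rotate by -45 degrees
step2 = D @ step1          # stretch by 3 and reflect in the horizontal axis
step3 = Q @ step2          # rotate back by 45 degrees

print(step3)               # [4. 5.]
print(A @ x)               # [4. 5.], the same result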
We have seen that a matrix that is orthogonally diagonalizable must be symmetric. In fact, it turns out that any symmetric matrix is orthogonally diagonalizable. We record this fact in the next theorem.
Theorem 7.2.9. The Spectral Theorem.
The matrix \(A\) is orthogonally diagonalizable if and only if \(A\) is symmetric.
Activity 7.2.3.
Each of the following matrices is symmetric so the Spectral Theorem tells us that each is orthogonally diagonalizable. The point of this activity is to find an orthogonal diagonalization for each matrix.
To begin, find a basis for each eigenspace. Use this basis to find an orthogonal basis for each eigenspace and put these bases together to find an orthogonal basis for \(\real^m\) consisting of eigenvectors. Use this basis to write an orthogonal diagonalization of the matrix.
Consider the matrix \(A = B^{\transpose}B\) where \(B = \begin{bmatrix}
0 \amp 1 \amp 2 \\
2 \amp 0 \amp 1
\end{bmatrix}
\text{.}\) Explain how we know that \(A\) is symmetric and then find an orthogonal diagonalization of \(A\text{.}\)
We have eigenvectors \(\vvec_1=\twovec12\) and \(\vvec_2 = \twovec{2}{-1}\) with associated eigenvalues \(\lambda_1 = 4\) and \(\lambda_2=-1\text{.}\) We form an orthonormal basis of eigenvectors, \(\uvec_1=\twovec{1/\sqrt{5}}{2/\sqrt{5}}\) and \(\uvec_2=\twovec{2/\sqrt{5}}{-1/\sqrt{5}}\text{.}\) This gives \(A=QDQ^{\transpose}\) where
\begin{equation*}
Q = \begin{bmatrix}
1/\sqrt{5} \amp 2/\sqrt{5} \\
2/\sqrt{5} \amp -1/\sqrt{5}
\end{bmatrix},
\hspace{24pt}
D = \begin{bmatrix}
4 \amp 0 \\
0 \amp -1
\end{bmatrix}.
\end{equation*}
We have eigenvalues \(\lambda_1=10\) with associated eigenvector \(\vvec_1=\threevec221\) and \(\lambda_2 = 1\) with associated eigenvectors \(\vvec_2=\threevec10{-2}\) and \(\vvec_3=\threevec01{-2}\text{.}\) Notice that \(\vvec_1\) is orthogonal to both \(\vvec_2\) and \(\vvec_3\text{,}\) but \(\vvec_2\) and \(\vvec_3\) are not orthogonal to one another. We can, however, apply Gram-Schmidt to create an orthogonal basis of the eigenspace \(E_1\text{.}\) We can then form an orthonormal basis of eigenvectors and use it to write \(A=QDQ^{\transpose}\text{.}\)
We have \(A^{\transpose} = (B^{\transpose}B)^{\transpose} = B^{\transpose}(B^{\transpose})^{\transpose}=B^{\transpose}B = A\) so \(A\) must be symmetric. Then we find the orthogonal diagonalization \(A = QDQ^{\transpose}\) where
\begin{equation*}
Q = \begin{bmatrix}
1/\sqrt{21} \amp -2/\sqrt{6} \amp 2/\sqrt{14} \\
4/\sqrt{21} \amp 1/\sqrt{6} \amp 1/\sqrt{14} \\
-2/\sqrt{21} \amp 1/\sqrt{6} \amp 3/\sqrt{14}
\end{bmatrix},
\hspace{24pt}
D = \begin{bmatrix}
0 \amp 0 \amp 0 \\
0 \amp 3 \amp 0 \\
0 \amp 0 \amp 7
\end{bmatrix}.
\end{equation*}
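If we want to check work like this numerically, NumPy's routine numpy.linalg.eigh is designed for symmetric matrices and returns an orthonormal set of eigenvectors directly, so no separate Gram-Schmidt step is needed. The sketch below (added here as an illustration, not the text's own solution, and assuming NumPy is available) applies it to the matrix \(A=B^{\transpose}B\) from the last part.

import numpy as np

B = np.array([[0, 1, 2],
              [2, 0, 1]])
A = B.T @ B                    # symmetric, since (B^T B)^T = B^T B

evals, Q = np.linalg.eigh(A)   # columns of Q form an orthonormal basis of eigenvectors
D = np.diag(evals)

print(evals)                   # approximately [0, 3, 7]; one eigenvalue is 0 since B has rank 2
print(Q @ D @ Q.T)             # reproduces A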
As the examples in Activity 7.2.3 illustrate, the Spectral Theorem implies a number of things. Namely, if \(A\) is a symmetric \(m\by m\) matrix, then
the eigenvalues of \(A\) are real.
there is a basis of \(\real^m\) consisting of eigenvectors.
two eigenvectors that are associated to different eigenvalues are orthogonal.
We won’t justify the first two facts here since that would take us rather far afield. However, it will be helpful to explain the third fact. To begin, notice that if \(A\) is symmetric, then for any vectors \(\vvec\) and \(\wvec\) we have
\begin{equation*}
(A\vvec)\cdot\wvec = (A\vvec)^{\transpose}\wvec = \vvec^{\transpose}A^{\transpose}\wvec = \vvec^{\transpose}(A\wvec) = \vvec\cdot(A\wvec).
\end{equation*}
Suppose a symmetric matrix \(A\) has eigenvectors \(\vvec_1\text{,}\) with associated eigenvalue \(\lambda_1=3\text{,}\) and \(\vvec_2\text{,}\) with associated eigenvalue \(\lambda_2 = 10\text{.}\) Notice that
\begin{equation*}
3\vvec_1\cdot\vvec_2 = (A\vvec_1)\cdot\vvec_2 = \vvec_1\cdot(A\vvec_2) = 10\,\vvec_1\cdot\vvec_2,
\end{equation*}
which can only happen if \(\vvec_1\cdot\vvec_2 = 0\text{.}\) Therefore, \(\vvec_1\) and \(\vvec_2\) are orthogonal.
More generally, the same argument shows that any two eigenvectors of any symmetric matrix associated to distinct eigenvalues must be orthogonal.
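A quick numerical spot check (not a proof, and assuming NumPy is available) illustrates this fact: the eigenvectors that numpy.linalg.eigh returns for a randomly generated symmetric matrix are mutually orthogonal.

import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
A = M + M.T                          # symmetrize a random matrix

evals, vecs = np.linalg.eigh(A)      # columns of vecs are eigenvectors of A
print(np.round(vecs.T @ vecs, 10))   # the 4x4 identity matrix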
Proposition 7.2.12.
If \(A\) is symmetric, then any pair of eigenvectors for \(A\) with distinct eigenvalues are orthogonal.
That is, if \(\vvec_1\) and \(\vvec_2\) are eigenvectors associated with distinct eigenvalues \(\lambda_1 \neq \lambda_2\text{,}\) then \(\vvec_1 \perp \vvec_2\text{.}\)
Subsection 7.2.2 Summary
This section explored both symmetric and orthogonally diagonalizable matrices. In particular, we saw that
A matrix \(A\) is orthogonally diagonalizable if there is an orthonormal basis of eigenvectors. In particular, we can write \(A=QDQ^{\transpose}\text{,}\) where \(D\) is a diagonal matrix of eigenvalues and \(Q\) is an orthogonal matrix of eigenvectors.
The Spectral Theorem tells us that a matrix \(A\) is orthogonally diagonalizable if and only if it is symmetric; that is, \(A=A^{\transpose}\text{.}\)
Since covariance matrices are symmetric, they are orthogonally diagonalizable.
Exercises 7.2.3
1.
For each of the following matrices, find the eigenvalues and a basis for each eigenspace. Determine whether the matrix is diagonalizable and, if so, find a diagonalization. Determine whether the matrix is orthogonally diagonalizable and, if so, find an orthogonal diagonalization.
Suppose that \(B = A^{\transpose}A\) for some matrix \(A\text{.}\) If \(\uvec\) is an eigenvector of \(B\) with associated eigenvalue \(\lambda\) and \(\uvec\) has unit length, explain why \(\lambda =
\len{A\uvec}^2\text{.}\)
Explain why the eigenvalues of \(B\) are nonnegative.
If \(S\) is the covariance matrix associated with a column-variate data matrix \(X\text{,}\) explain why the eigenvalues of \(S\) are nonnegative.
5.
Determine whether the following statements are true or false and explain your thinking.
If \(A\) is an invertible, orthogonally diagonalizable matrix, then so is \(A^{-1}\text{.}\)
If \(\lambda=2+i\) is an eigenvalue of \(A\text{,}\) then \(A\) cannot be orthogonally diagonalizable.
If there is a basis for \(\real^m\) consisting of eigenvectors of \(A\text{,}\) then \(A\) is orthogonally diagonalizable.
If \(\uvec\) and \(\vvec\) are eigenvectors of a symmetric matrix associated to eigenvalues -2 and 3, then \(\uvec\cdot\vvec=0\text{.}\)
If \(A\) is a square matrix, then \(\uvec\cdot(A\vvec) = (A\uvec)\cdot\vvec\text{.}\)
6.
Suppose that \(A\) is a noninvertible, symmetric \(3\by3\) matrix having eigenvectors
and associated eigenvalues \(\lambda_1 = 20\) and \(\lambda_2 = -4\text{.}\) Find matrices \(Q\) and \(D\) such that \(A =
QDQ^{\transpose}\text{.}\)
7.
Suppose that \(W\) is a plane in \(\real^3\) and that \(P\) is the \(3\by3\) matrix that projects vectors orthogonally onto \(W\text{.}\)
Explain why \(P\) is orthogonally diagonalizable.
What are the eigenvalues of \(P\text{?}\)
Explain the relationship between the eigenvectors of \(P\) and the plane \(W\text{.}\)
8.
Prove Proposition 7.2.12. That is, show that if \(\vvec_1\) and \(\vvec_2\) are eigenvectors for a symmetric matrix \(A\) associated with distinct eigenvalues \(\lambda_1 \neq \lambda_2\text{,}\) then \(\vvec_1 \perp \vvec_2\text{.}\)