The preview activity illustrates the main idea behind an algorithm, known as
Gram-Schmidt orthogonalization, that begins with a basis for some subspace of
\(\real^m\) and produces an orthogonal or orthonormal basis. The algorithm relies on our construction of the orthogonal projection. Remember that we formed the orthogonal projection
\(\bhat\) of
\(\bvec\) onto a subspace
\(W\) by requiring that
\(\bvec-\bhat\) is orthogonal to
\(W\) as shown in
Figure 6.4.3.
This observation guides our construction of an orthogonal basis, for it allows us to create a vector that is orthogonal to a given subspace. Let’s see how the Gram-Schmidt algorithm works.
Activity 6.4.2.
Suppose that \(W\) is a three-dimensional subspace of \(\real^4\) with basis:
\begin{equation*}
\vvec_1 = \fourvec1111,\hspace{24pt}
\vvec_2 = \fourvec1322,\hspace{24pt}
\vvec_3 = \fourvec1{-3}{-3}{-3}\text{.}
\end{equation*}
We can see that this basis is not orthogonal by noting that \(\vvec_1\cdot\vvec_2 = 8\text{.}\) Our goal is to create an orthogonal basis \(\wvec_1\text{,}\) \(\wvec_2\text{,}\) and \(\wvec_3\) for \(W\text{.}\)
To begin, we declare that \(\wvec_1=\vvec_1\text{,}\) and we call \(W_1\) the line defined by \(\wvec_1\text{.}\) We will construct \(\wvec_2\) from \(\vvec_2\) by subtracting its orthogonal projection onto \(W_1\text{:}\)
\begin{equation*}
\wvec_2 = \vvec_2 - \proj{\vvec_2}{\wvec_1}\text{.}
\end{equation*}
Find the vector \(\vhat_2 = \proj{\vvec_2}{\wvec_1}\text{.}\)
Form the vector \(\wvec_2 = \vvec_2-\vhat_2\) and verify that it is orthogonal to \(\wvec_1\text{.}\)
Explain why \(\laspan{\wvec_1,\wvec_2} =
\laspan{\vvec_1,\vvec_2}\) by showing that any linear combination of \(\vvec_1\) and \(\vvec_2\) can be written as a linear combination of \(\wvec_1\) and \(\wvec_2\) and vice versa.
The vectors \(\wvec_1\) and \(\wvec_2\) are an orthogonal basis for a two-dimensional subspace \(W_2\) of \(\real^4\text{.}\) Find the vector \(\vhat_3\) that is the orthogonal projection of \(\vvec_3\) onto \(W_2\text{.}\)
Verify that \(\wvec_3 = \vvec_3-\vhat_3\) is orthogonal to both \(\wvec_1\) and \(\wvec_2\text{.}\)
Explain why \(\wvec_1\text{,}\) \(\wvec_2\text{,}\) and \(\wvec_3\) form an orthogonal basis for \(W\text{.}\)
Now find an orthonormal basis for \(W\text{.}\)
Answer.
\(\widehat{\vvec}_2 = \fourvec2222\text{.}\)
\(\displaystyle \wvec_2 = \fourvec{-1}100\)
We have \(\vvec_1 = \wvec_1\) and \(\vvec_2 =
\wvec_2 + 2\wvec_1\text{,}\) so a linear combination of \(\vvec_1\) and \(\vvec_2\) can be rewritten as a linear combination of \(\wvec_1\) and \(\wvec_2\text{.}\)
\(\widehat{\vvec}_3 =
\fourvec0{-4}{-2}{-2}\text{.}\)
\(\displaystyle \wvec_3 = \fourvec11{-1}{-1}\)
Since \(\vvec_i\) can be written in terms of \(\wvec_j\) and vice-versa, these new vectors form a basis for \(W\text{.}\)
\begin{equation*}
\uvec_1=\fourvec{1/2}{1/2}{1/2}{1/2},\hspace{24pt}
\uvec_2=\fourvec{-1/\sqrt{2}}{1/\sqrt{2}}00,\hspace{24pt}
\uvec_3=\fourvec{1/2}{1/2}{-1/2}{-1/2}
\end{equation*}
Solution.
\(\widehat{\vvec}_2 = \frac{\vvec_2\cdot\wvec_1}
{\wvec_1\cdot\wvec_1}\wvec_1=\fourvec2222\text{.}\)
\(\displaystyle \wvec_2 = \fourvec{-1}100\)
We have \(\vvec_1 = \wvec_1\) and \(\vvec_2 =
\wvec_2 + 2\wvec_1\text{.}\) Therefore, a linear combination of \(\vvec_1\) and \(\vvec_2\) can be rewritten as
\begin{equation*}
c_1\vvec_1+c_2\vvec_2 =
c_1\wvec_1+c_2(\wvec_2+2\wvec_1) =
(c_1+2c_2)\wvec_1 + c_2\wvec_2\text{.}
\end{equation*}
In the same way, \(\wvec_1=\vvec_1\) and \(\wvec_2
= \vvec_2-2\vvec_1\text{,}\) so any linear combination of \(\wvec_1\) and \(\wvec_2\) can be rewritten as a linear combination of \(\vvec_1\) and \(\vvec_2\text{.}\)
By the Projection Formula, \(\widehat{\vvec}_3 =
\frac{\vvec_3\cdot\wvec_1}{\wvec_1\cdot\wvec_1}\wvec_1 +
\frac{\vvec_3\cdot\wvec_2}{\wvec_2\cdot\wvec_2}\wvec_2 =
\fourvec0{-4}{-2}{-2}\text{.}\)
\(\displaystyle \wvec_3 = \fourvec11{-1}{-1}\)
We can check that \(\wvec_1\text{,}\) \(\wvec_2\text{,}\) and \(\wvec_3\) form an orthogonal set. Since \(\vvec_i\) can be written in terms of \(\wvec_j\) and vice-versa, these new vectors form a basis for \(W\text{.}\)
\begin{equation*}
\uvec_1=\fourvec{1/2}{1/2}{1/2}{1/2},\hspace{24pt}
\uvec_2=\fourvec{-1/\sqrt{2}}{1/\sqrt{2}}00,\hspace{24pt}
\uvec_3=\fourvec{1/2}{1/2}{-1/2}{-1/2}
\end{equation*}
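Though we found this basis by hand, the computations are easy to confirm numerically. Here is a quick sketch, assuming numpy is available, that checks the pairwise dot products and produces the orthonormal vectors:

import numpy as np

# orthogonal basis found above
w1 = np.array([1, 1, 1, 1])
w2 = np.array([-1, 1, 0, 0])
w3 = np.array([1, 1, -1, -1])

# every pairwise dot product should be zero
print(w1 @ w2, w1 @ w3, w2 @ w3)

# dividing each vector by its length gives the orthonormal basis
for w in (w1, w2, w3):
    print(w / np.linalg.norm(w))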
As this activity illustrates, Gram-Schmidt orthogonalization begins with a basis \(\vvec_1, \vvec_2,\ldots,\vvec_n\) for a subspace \(W\) of \(\real^m\) and creates an orthogonal basis for \(W\text{.}\) Let’s work through a second example.
Example 6.4.4.
Let’s start with the basis
\begin{equation*}
\vvec_1=\threevec{2}{-1}2,\hspace{24pt}
\vvec_2=\threevec{-3}{3}0,\hspace{24pt}
\vvec_3=\threevec{-2}71\text{,}
\end{equation*}
which is a basis for \(\real^3\text{.}\)
To get started, we’ll simply set \(\wvec_1=\vvec_1=\threevec{2}{-1}2\text{.}\) We construct \(\wvec_2\) from \(\vvec_2\) by subtracting its orthogonal projection onto \(W_1\text{,}\) the line defined by \(\wvec_1\text{:}\)
\begin{equation*}
\wvec_2 = \vvec_2 - \proj{\vvec_2}{\wvec_1}
\end{equation*}
This gives
\begin{equation*}
\wvec_2 = \vvec_2 -
\frac{\vvec_2\cdot\wvec_1}{\wvec_1\cdot\wvec_1}\wvec_1 =
\vvec_2 + \wvec_1 = \threevec{-1}22\text{.}
\end{equation*}
Notice that we found \(\vvec_2 = -\wvec_1 + \wvec_2\text{.}\) Therefore, we can rewrite any linear combination of \(\vvec_1\) and \(\vvec_2\) as
\begin{equation*}
c_1\vvec_1 + c_2\vvec_2 = c_1\wvec_1 + c_2(-\wvec_1+\wvec_2)
= (c_1-c_2)\wvec_1 + c_2\wvec_2\text{,}
\end{equation*}
a linear combination of \(\wvec_1\) and \(\wvec_2\text{.}\) This tells us that
\begin{equation*}
W_2 = \laspan{\wvec_1,\wvec_2} =
\laspan{\vvec_1,\vvec_2}\text{.}
\end{equation*}
In other words, \(\wvec_1\) and \(\wvec_2\) form an orthogonal basis for \(W_2\text{,}\) the 2-dimensional subspace that is the span of \(\vvec_1\) and \(\vvec_2\text{.}\)
Finally, we form \(\wvec_3\) from \(\vvec_3\) by subtracting its orthogonal projection onto \(W_2\text{:}\)
\begin{equation*}
\wvec_3 = \vvec_3 -
\frac{\vvec_3\cdot\wvec_1}{\wvec_1\cdot\wvec_1}\wvec_1 -
\frac{\vvec_3\cdot\wvec_2}{\wvec_2\cdot\wvec_2}\wvec_2
= \vvec_3 + \wvec_1 - 2\wvec_2
= \threevec22{-1}\text{.}
\end{equation*}
We can now check that
\begin{equation*}
\wvec_1=\threevec2{-1}2,\hspace{24pt}
\wvec_2=\threevec{-1}22,\hspace{24pt}
\wvec_3=\threevec22{-1}
\end{equation*}
is an orthogonal set. Furthermore, we have, as before, \(\laspan{\wvec_1,\wvec_2,\wvec_3}
= \laspan{\vvec_1,\vvec_2,\vvec_3}\text{,}\) which says that we have found a new orthogonal basis for \(\real^3\text{.}\)
To create an orthonormal basis, we form unit vectors parallel to each of the vectors in the orthogonal basis:
\begin{equation*}
\uvec_1 = \threevec{2/3}{-1/3}{2/3},\hspace{24pt}
\uvec_2 = \threevec{-1/3}{2/3}{2/3},\hspace{24pt}
\uvec_3 = \threevec{2/3}{2/3}{-1/3}\text{.}
\end{equation*}
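As a quick numerical check, a short sketch assuming numpy (the variable names are ours, chosen for illustration) confirms that these vectors are orthonormal:

import numpy as np

# orthonormal basis from the example
u1 = np.array([2, -1, 2]) / 3
u2 = np.array([-1, 2, 2]) / 3
u3 = np.array([2, 2, -1]) / 3

print(u1 @ u2, u1 @ u3, u2 @ u3)   # each dot product is 0
print(u1 @ u1, u2 @ u2, u3 @ u3)   # each vector has length 1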
More generally, if we have a basis \(\vvec_1,\vvec_2,\ldots,\vvec_n\) for a subspace \(W\) of \(\real^m\text{,}\) the Gram-Schmidt algorithm creates an orthogonal basis for \(W\) in the following way:
\begin{align*}
\wvec_1 \amp = \vvec_1\\
\wvec_2 \amp = \vvec_2 -
\frac{\vvec_2\cdot\wvec_1}{\wvec_1\cdot\wvec_1}\wvec_1\\
\wvec_3 \amp = \vvec_3 -
\frac{\vvec_3\cdot\wvec_1}{\wvec_1\cdot\wvec_1}\wvec_1 -
\frac{\vvec_3\cdot\wvec_2}{\wvec_2\cdot\wvec_2}\wvec_2\\
\amp \vdots\\
\wvec_n \amp = \vvec_n -
\frac{\vvec_n\cdot\wvec_1}{\wvec_1\cdot\wvec_1}\wvec_1 -
\frac{\vvec_n\cdot\wvec_2}{\wvec_2\cdot\wvec_2}\wvec_2 -
\ldots -
\frac{\vvec_n\cdot\wvec_{n-1}}
{\wvec_{n-1}\cdot\wvec_{n-1}}\wvec_{n-1}\text{.}
\end{align*}
From here, we may form an orthonormal basis by constructing a unit vector parallel to each vector in the orthogonal basis: \(\uvec_j = \frac{1}{\len{\wvec_j}}\wvec_j\text{.}\)
Activity 6.4.3.
Python can automate these computations for us, and we’ll learn a standard numpy way of approaching this shortly. But first let’s see how we can code up the Gram-Schmidt process directly. The code below allows vectors to be given as 1-d arrays or as 1-column matrices; flatten()
is used to convert either to the 1-d representation. The basis is provided as a Python list of vectors (in either representation).
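A minimal sketch of such a function, assuming only numpy (the name gram_schmidt is illustrative, not prescribed):

import numpy as np

def gram_schmidt(basis):
    # basis: a Python list of vectors, each a 1-d array or a 1-column matrix;
    # flatten() converts either representation to a 1-d array
    orthogonal = []
    for v in basis:
        w = np.array(v, dtype=float).flatten()
        for u in orthogonal:
            # subtract the orthogonal projection of w onto each
            # previously constructed vector
            w = w - (w @ u) / (u @ u) * u
        orthogonal.append(w)
    return orthogonal

For instance, gram_schmidt([v1, v2, v3]) returns a list containing the orthogonal vectors \(\wvec_1\text{,}\) \(\wvec_2\text{,}\) and \(\wvec_3\text{.}\)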
Let’s now consider \(W\text{,}\) the subspace of \(\real^5\) having basis
\begin{equation*}
\vvec_1 = \fivevec{14}{-6}{8}2{-6},\hspace{24pt}
\vvec_2 = \fivevec{5}{-3}{4}3{-7},\hspace{24pt}
\vvec_3 = \fivevec{2}30{-2}1.
\end{equation*}
Apply the Gram-Schmidt algorithm to find an orthogonal basis \(\wvec_1\text{,}\) \(\wvec_2\text{,}\) and \(\wvec_3\) for \(W\text{.}\)
Find \(\bhat\text{,}\) the orthogonal projection of \(\bvec =
\fivevec{-5}{11}0{-1}5\) onto \(W\text{.}\)
Explain why we know that \(\bhat\) is a linear combination of the original vectors \(\vvec_1\text{,}\) \(\vvec_2\text{,}\) and \(\vvec_3\) and then find weights so that
\begin{equation*}
\bhat = c_1\vvec_1 + c_2\vvec_2 + c_3\vvec_3.
\end{equation*}
Find an orthonormal basis \(\uvec_1\text{,}\) \(\uvec_2\text{,}\) and \(\uvec_3\) for \(W\) and form the matrix \(Q\) whose columns are these vectors.
Find the product \(Q^{\transpose}Q\) and explain the result.
Find the matrix \(P\) that projects vectors orthogonally onto \(W\) and verify that \(P\bvec\) gives \(\bhat\text{,}\) the orthogonal projection that you found earlier.
Answer.
\begin{equation*}
\wvec_1=\fivevec{14}{-6}82{-6},\hspace{24pt}
\wvec_2=\fivevec{-2}{0}02{-4},\hspace{24pt}
\wvec_3=\fivevec130{-1}{-1}
\end{equation*}
\(\displaystyle \bhat=\fivevec{-4}9{-4}{-4}3\)
\(\bhat=-\frac34\vvec_1+\frac12\vvec_2+2\vvec_3\text{.}\)
\begin{equation*}
\uvec_1=\fivevec{14/\sqrt{336}}{-6/\sqrt{336}}{8/\sqrt{336}}
{2/\sqrt{336}}{-6/\sqrt{336}},\hspace{24pt}
\uvec_2=\fivevec{-2/\sqrt{24}}{0}0{2/\sqrt{24}}
{-4/\sqrt{24}},\hspace{24pt}
\uvec_3=\fivevec{1/\sqrt{12}}{3/\sqrt{12}}0
{-1/\sqrt{12}}{-1/\sqrt{12}}
\end{equation*}
\(\displaystyle Q^{\transpose}Q=I\)
\(\displaystyle P = QQ^{\transpose} = \begin{bmatrix}
5/6 \amp 0 \amp 1/3 \amp -1/6 \amp 0 \\
0 \amp 6/7 \amp -1/7 \amp -2/7 \amp -1/7 \\
1/3 \amp -1/7 \amp 4/21 \amp 1/21 \amp -1/7 \\
-1/6 \amp -2/7 \amp 1/21 \amp 11/42 \amp -2/7 \\
0 \amp -1/7 \amp -1/7 \amp -2/7 \amp 6/7
\end{bmatrix}\)
Solution.
\begin{equation*}
\wvec_1=\fivevec{14}{-6}82{-6},\hspace{24pt}
\wvec_2=\fivevec{-2}{0}02{-4},\hspace{24pt}
\wvec_3=\fivevec130{-1}{-1}
\end{equation*}
\(\displaystyle \bhat=\fivevec{-4}9{-4}{-4}3\)
We know that \(\bhat\) is in \(W\) and \(\vvec_1\text{,}\) \(\vvec_2\text{,}\) and \(\vvec_3\) is a basis for \(W\text{.}\) We find \(\bhat=-\frac34\vvec_1+\frac12\vvec_2+2\vvec_3\text{.}\)
\begin{equation*}
\uvec_1=\fivevec{14/\sqrt{336}}{-6/\sqrt{336}}{8/\sqrt{336}}
{2/\sqrt{336}}{-6/\sqrt{336}},\hspace{24pt}
\uvec_2=\fivevec{-2/\sqrt{24}}{0}0{2/\sqrt{24}}{-4/\sqrt{24}},\hspace{24pt}
\uvec_3=\fivevec{1/\sqrt{12}}{3/\sqrt{12}}0{-1/\sqrt{12}}{-1/\sqrt{12}}
\end{equation*}
\(Q^{\transpose}Q=I\) since the matrix product computes the dot products of the columns of \(Q\text{.}\)
\(\displaystyle P = QQ^{\transpose} = \begin{bmatrix}
5/6 \amp 0 \amp 1/3 \amp -1/6 \amp 0 \\
0 \amp 6/7 \amp -1/7 \amp -2/7 \amp -1/7 \\
1/3 \amp -1/7 \amp 4/21 \amp 1/21 \amp -1/7 \\
-1/6 \amp -2/7 \amp 1/21 \amp 11/42 \amp -2/7 \\
0 \amp -1/7 \amp -1/7 \amp -2/7 \amp 6/7
\end{bmatrix}\)
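These results can be reproduced numerically. The sketch below assumes numpy and the gram_schmidt function from the earlier cell; exact fractions appear as floating-point approximations.

import numpy as np

v1 = np.array([14, -6, 8, 2, -6])
v2 = np.array([5, -3, 4, 3, -7])
v3 = np.array([2, 3, 0, -2, 1])
b  = np.array([-5, 11, 0, -1, 5])

# orthogonal basis for W
w1, w2, w3 = gram_schmidt([v1, v2, v3])

# orthonormal basis as the columns of Q
Q = np.column_stack([w / np.linalg.norm(w) for w in (w1, w2, w3)])
print(Q.T @ Q)               # the 3x3 identity matrix

# matrix that projects orthogonally onto W, and the projection of b
P = Q @ Q.T
print(P @ b)                 # bhat = (-4, 9, -4, -4, 3)

# weights expressing bhat as a combination of v1, v2, v3
A = np.column_stack((v1, v2, v3))
c, *_ = np.linalg.lstsq(A, P @ b, rcond=None)
print(c)                     # approximately (-0.75, 0.5, 2)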