Computational Linear Algebra

Section 2.3 Computational Linear Algebra

Linear algebra owes its prominence as a powerful scientific tool to the ever-growing power of computers. Carl Cowen, a former president of the Mathematical Association of America, has said, “No serious application of linear algebra happens without a computer.” Indeed, Cowen notes that, in the 1950s, working with a system of 100 equations in 100 variables was difficult. Today, scientists (including data scientists) routinely work on problems that are vastly larger. This is only possible because of today’s computing power.

For learning the principles of linear algebra, small examples in 2 or 3 dimensions are useful because (a) we can visualize things geometrically, and (b) we can see all the numbers and computations involved. But as we build our understanding and intuition with these small examples, we also want to be learning how to do linear algebra computationally so that we can work with much larger examples.

There are many computational tools that could be used. One of them, Matlab (and its open source analog, Octave), was specifically designed (and named) with matrix algebra as its primary use case. Linear algebra is important for statistics and machine learning, but so are other things, like data wrangling and visualing data and models. For this reason, languages like R and Python are much more important in the data science community. Although linear algebra is used extensively in both languages, end users of Python are more likely to manipulate data in vector and matrix format than are users of R, where this is often hidden behind an additional layer of abstraction. For this reason, we’ll focus primarily on Python in this text, with some references to how things differ in R thrown in along the way.

Subsection 2.3.1 Reduced row echelon form in Python

When we encounter a matrix, Theorem 2.2.6 tells us that there is exactly one reduced row echelon matrix that is row equivalent to it.

In fact, the uniqueness of this reduced row echelon matrix is what motivates us to define this particular form. When solving a system of linear equations using Gaussian elimination, there are other row equivalent matrices that reveal the structure of the solution space. The reduced row echelon matrix is simply a convenience as it is an agreement we make with one another to seek the same matrix.

An added benefit is that we can ask a computer to find reduced row echelon matrices for us. We will learn how to do this now that we have a little familiarity with Python.

In addition to numpy, we will use an additional package called sympy.

Note 2.3.1. Augmented matrices in numpy.

Notice that in numpy there is no way of specifying whether a matrix is an augmented matrix or a coefficient matrix, so it will be up to us to interpret our results appropriately.

Python syntax.

Some common mistakes when entering a matrix in Python include

forgetting the square brackets around the list of entries,
forgetting an entry from the list or to add an extra one,
forgetting to separate the rows, and entries within a row with commas, and
forgetting the closing parenthesis (because you are so worried about getting the correct number of square brackets, perhaps).

If you see an error message, carefully proofread your input and try again.

Activity 2.3.1. Using Python to find row reduced echelon matrices.

Enter the following matrix into Python.

\begin{equation*} \left[ \begin{array}{rrrr} -1 \amp -2 \amp 2 \amp -1 \\ 2 \amp 4 \amp -1 \amp 5 \\ 1 \amp 2 \amp 0 \amp 3 \end{array} \right] \end{equation*}
Give the matrix the name \(A\) by entering
```
A = np.array([ ... ])
	      
```
numpy does not have a built-in method for computing the reduced row echelon form of a matrix, so we will use another package, sympy. We may then find its reduced row echelon form by entering
```
import sympy 
A = np.array([ ... ])
sympy.Matrix(A).rref()
	      
```
Note that we must first convert our numpy array into a sympy matrix. A common mistake is to forget the parentheses after rref.
Use Python to find the reduced row echelon form of the matrix from Item a of this activity.
Notice that rref() returns a tuple with two elements. The first element is the reduced row echelon form of our matrix. See if you can figure out what the second element is.
Use Python to describe the solution space of the system of linear equations

\begin{equation*} \begin{alignedat}{5} -x_1 \amp \amp \amp \amp \amp {}+{} \amp 2x_4 \amp {}={} \amp 4 \\ \amp \amp 3x_2 \amp {}+{} \amp x_3 \amp {}+{} \amp 2x_4 \amp {}={} \amp 3 \\ 4x_1 \amp {}-{} \amp 3x_2 \amp \amp \amp {}+{} \amp x_4 \amp {}={} \amp 14 \\ \amp \amp 2x_2 \amp {}+{} \amp 2x_3 \amp {}+{} \amp x_4 \amp {}={} \amp 1 \\ \end{alignedat} \end{equation*}
Consider the two matrices:

\begin{equation*} \begin{array}{rcl} A \amp = \amp \left[ \begin{array}{rrrr} 1 \amp -2 \amp 1 \amp -3 \\ -2 \amp 4 \amp 1 \amp 1 \\ -4 \amp 8 \amp -1 \amp 7 \\ \end{array}\right] \\ B \amp = \amp \left[ \begin{array}{rrrrrr} 1 \amp -2 \amp 1 \amp -3 \amp 0 \amp 3 \\ -2 \amp 4 \amp 1 \amp 1 \amp 1 \amp -1 \\ -4 \amp 8 \amp -1 \amp 7 \amp 3 \amp 2 \\ \end{array}\right] \\ \end{array} \end{equation*}

We say that \(B\) is an augmentation of \(A\) because it is obtained from \(A\) by adding some more columns.

We can use np.column_stack() to augment matrices as shown below. Now compute the reduced row echelon forms for \(A\) and \(B\text{.}\) What do you notice about the relationship between the two reduced row echelon forms?
Using the system of equations in Item d, write the augmented matrix corresponding to the system of equations. What did you find for the reduced row echelon form of the augmented matrix?

Now write the coefficient matrix of this system of equations. What does Item e of this activity tell you about its reduced row echelon form?

Answer.

np.array([[-1,-2, 2,-1],
          [ 2, 4,-1, 5],
          [ 1, 2, 0, 3]])

The reduced row echelon form of the matrix is

\begin{equation*} \begin{bmatrix} 1 \amp 2 \amp 0 \amp 3 \\ 0 \amp 0 \amp 1 \amp 1 \\ 0 \amp 0 \amp 0 \amp 0 \end{bmatrix}. \end{equation*}
There is a unique solution \((x_1,x_2,x_3,x_4) = (-2,-1,0,3)\text{.}\)
The first four columns of the reduced row echelon form of \(B\) form the reduced row echelon form of \(A\text{.}\)
The reduced row echelon form of the coefficient matrix is

\begin{equation*} \begin{bmatrix} 1 \amp 0 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \amp 0 \\ 0 \amp 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 0 \amp 1 \\ \end{bmatrix}. \end{equation*}

Solution.

np.array([[-1,-2, 2,-1],
          [ 2, 4,-1, 5],
          [ 1, 2, 0, 3]])

The reduced row echelon form of the matrix is

\begin{equation*} \begin{bmatrix} 1 \amp 2 \amp 0 \amp 3 \\ 0 \amp 0 \amp 1 \amp 1 \\ 0 \amp 0 \amp 0 \amp 0 \end{bmatrix}. \end{equation*}
Python tells us that the reduced row echelon form of the corresponding augmented matrix is

\begin{equation*} \begin{bmatrix} 1 \amp 0 \amp 0 \amp 0 \amp -2 \\ 0 \amp 1 \amp 0 \amp 0 \amp -1 \\ 0 \amp 0 \amp 1 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp 1 \amp 3 \\ \end{bmatrix} \end{equation*}

so there is a unique solution \((x_1,x_2,x_3,x_4) = (-2,-1,0,3)\text{.}\)
The first four columns of the reduced row echelon form of \(B\) form the reduced row echelon form of \(A\text{.}\)
The reduced row echelon form of the coefficient matrix is

\begin{equation*} \begin{bmatrix} 1 \amp 0 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \amp 0 \\ 0 \amp 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 0 \amp 1 \\ \end{bmatrix}. \end{equation*}

This can be computed with
(0,2) tells us that the leading 1 in the row 0 is in column 0 and that the leading 1 in row 1 is in column 2. This uses 0-based indexing. We might more naturally say that the leading 1 in the first row is in the first column and the leading 1 in the second row is in the third column.

The last part of the previous activity, Item e, demonstrates something that will be helpful for us in the future. In that activity, we started with a matrix \(A\text{,}\) which we augmented by adding some columns to obtain a matrix \(B\text{.}\) We then noticed that the reduced row echelon form of \(B\) is itself an augmentation of the reduced row echelon form of \(A\text{.}\)

To illustrate, we can consider the reduced row echelon form of the augmented matrix:

\begin{equation*} \left[ \begin{array}{ccc|c} -2 \amp 3 \amp 0 \amp 2 \\ -1 \amp 4 \amp 1 \amp 3 \\ 3 \amp 0 \amp 2 \amp 2 \\ 1 \amp 5 \amp 3 \amp 7 \\ \end{array} \right] \sim \left[ \begin{array}{ccc|c} 1 \amp 0 \amp 0 \amp -4 \\ 0 \amp 1 \amp 0 \amp -2 \\ 0 \amp 0 \amp 1 \amp 7 \\ 0 \amp 0 \amp 0 \amp 0 \\ \end{array} \right] \end{equation*}

We can then determine the reduced row echelon form of the coefficient matrix by looking inside the augmented matrix.

\begin{equation*} \left[ \begin{array}{ccc} -2 \amp 3 \amp 0 \\ -1 \amp 4 \amp 1 \\ 3 \amp 0 \amp 2 \\ 1 \amp 5 \amp 3 \\ \end{array} \right] \sim \left[ \begin{array}{ccc} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \\ 0 \amp 0 \amp 0 \\ \end{array} \right] \end{equation*}

If we trace through the steps in the Gaussian elimination algorithm carefully, we see that this is a general principle, which we now state.

Proposition 2.3.2. Augmentation Principle.

If matrix \(B\) is an augmentation of matrix \(A\text{,}\) then the reduced row echelon form of \(B\) is an augmentation of the reduced row echelon form of \(A\text{.}\)

Subsection 2.3.2 np.linalg.solve()

We mentioned that numpy (and scipy) do not include methods to compute the reduced row echelon form of a matrix. There are other utilities in those packages that can help us acheive many of the same goals. One example is np.linalg.solve() which will find the solution to an equation \(A \xvec = \bvec\) provided there is a unique solution. (Otherwise it throws an error.)

Subsection 2.3.3 Computational effort

At the beginning of this section, we indicated that linear algebra has become more prominent as computers have grown more powerful. Computers, however, still have limits. Let’s consider how much effort is expended when we ask to find the reduced row echelon form of a matrix. We will measure, very roughly, the effort by the number of times the algorithm requires us to multiply or add two numbers.

We will assume that our matrix has the same number of rows as columns, which we call \(n\text{.}\) We are mainly interested in the case when \(n\) is very large, which is when we need to worry about how much effort is required.

Let’s first consider the effort required for each of our row operations.

Scaling a row multiplies each of the \(n\) entries in a row by some number, which requires \(n\) operations.
Interchanging two rows requires no multiplications or additions so we won’t worry about the effort required by an interchange.
A replacement requires us to multiply each entry in a row by some number, which takes \(n\) operations, and then add the resulting entries to another row, which requires another \(n\) operations. The total number of operations is \(2n\text{.}\)

Our goal is to transform a matrix to its reduced row echelon form, which looks something like this:

\begin{equation*} \left[ \begin{array}{cccc} 1 \amp 0 \amp \ldots \amp 0 \\ 0 \amp 1 \amp \ldots \amp 0 \\ \vdots \amp \vdots \amp \ddots \amp 0 \\ 0 \amp 0 \amp \ldots \amp 1 \\ \end{array} \right]\text{.} \end{equation*}

We roughly perform one replacement operation for every 0 entry in the reduced row echelon matrix. When \(n\) is very large, most of the \(n^2\) entries in the reduced row echelon form are 0 so we need roughly \(n^2\) replacements. Since each replacement operation requires \(2n\) operations, the number of operations resulting from the needed replacements is roughly \(n^2(2n) = 2n^3\text{.}\)

Each row is scaled roughly one time so there are roughly \(n\) scaling operations, each of which requires \(n\) operations. The number of operations due to scaling is roughly \(n^2\text{.}\)

Therefore, the total number of operations is roughly

\begin{equation*} 2n^3 + n^2\text{.} \end{equation*}

When \(n\) is very large, the \(n^2\) term is much smaller than the \(n^3\) term. We therefore state that

Observation 2.3.3.

The number of operations required to find the reduced row echelon form of an \(n\by n\) matrix is roughly proportional to \(n^3\text{.}\)

This is a very rough measure of the effort required to find the reduced row echelon form; a more careful accounting shows that the number of arithmetic operations is roughly \(\frac23 n^3\text{.}\) As we have seen, some matrices require more effort than others, but the upshot of this observation is that the effort is proportional to \(n^3\text{.}\) We can think of this in the following way: If the size of the matrix grows by a factor of 10, then the effort required grows by a factor of \(10^3 = 1000\text{.}\)

While today’s computers are powerful, they cannot handle every problem we might ask of them. Eventually, we would like to be able to consider matrices that have \(n=10^{12}\) (a trillion) rows and columns. In very broad terms, the effort required to find the reduced row echelon matrix will require roughly \((10^{12})^3 = 10^{36}\) operations.

To put this into context, imagine we need to solve a linear system with a trillion equations and a trillion variables and that we have a computer that can perform a trillion, \(10^{12}\text{,}\) operations every second. Finding the reduced row echelon form would take about \(10^{16}\) years. At this time, the universe is estimated to be approximately \(10^{10}\) years old. If we had started the calculation \(10^{10}\) years ago, then we’d be about one-millionth of the way through.

This may seem like an absurd situation, but we’ll see in Subsection 5.5.3 how we use the results of such a computation every day. Clearly, we will need some better tools to deal with really big problems like this one.

Subsection 2.3.4 Summary

We learned some basic more features of Python with an emphasis on finding the reduced row echelon form of a matrix.

Our preference is to use numpy.array() to create matrices.

We can find the reduced row echelon form using the rref() method after converting to a sympy.Matrix. The output is a tuple of length 2. The first element is the reduced row echelon form of the matrix and the second is a tuple identifying the location of the pivots -- the topic of the next section.
We saw an example of the Augmentation Principle, which we then stated as a general principle.
We saw that the computational effort required to find the reduced row echelon form of an \(n\by n\) matrix is proportional to \(n^3\text{.}\)

Appendix A contains a reference outlining the Python commands that we have encountered.

Exercises 2.3.5 Exercises

1.

Consider the linear system

\begin{equation*} \begin{alignedat}{4} x \amp {}+{} \amp 2y \amp {}-{} \amp z \amp {}={} \amp 1 \\ 3x \amp {}+{} \amp 2y \amp {}+{} \amp 2z \amp {}={} \amp 7 \\ -x \amp \amp \amp {}+{} \amp 4z \amp {}={} \amp -3 \\ \end{alignedat} \end{equation*}

Write this system as an augmented matrix and use Python to find a description of the solution space.

2.

Consider the vectors

\begin{equation*} \vvec_1 = \left[\begin{array}{r} 2 \\ 1 \end{array} \right], \vvec_2 = \left[\begin{array}{r} -1 \\ 1 \end{array} \right], \vvec_3 = \left[\begin{array}{r} -2 \\ 0 \end{array} \right] \end{equation*}

Find the linear combination with weights \(c_1 = 2\text{,}\) \(c_2=-3\text{,}\) and \(c_3=1\text{.}\)
Can you write the vector \({\mathbf 0} = \left[\begin{array}{r} 0 \\ 0 \end{array}\right]\) as a linear combination of \(\vvec_1\text{,}\) \(\vvec_2\text{,}\) and \(\vvec_3\text{?}\) If so, describe all the ways in which you can do so.
Can you write the vector \({\mathbf 0} = \left[\begin{array}{r} 0 \\ 0 \end{array}\right]\) as a linear combination using just the first two vectors \(\vvec_1\) \(\vvec_2\text{?}\) If so, describe all the ways in which you can do so.
Can you write \(\vvec_3\) as a linear combination of \(\vvec_1\) and \(\vvec_2\text{?}\) If so, in how many ways?

3.

Consider the vectors

\begin{equation*} \vvec_1 = \left[\begin{array}{r} 2 \\ -1 \\ -2 \end{array}\right], \vvec_2 = \left[\begin{array}{r} 0 \\ 3 \\ 1 \end{array}\right], \vvec_3 = \left[\begin{array}{r} 4 \\ 4 \\ -2 \end{array}\right]\text{.} \end{equation*}

Can you express the vector \(\bvec=\left[\begin{array}{r} 10 \\ 1 \\ -8 \end{array}\right]\) as a linear combination of \(\vvec_1\text{,}\) \(\vvec_2\text{,}\) and \(\vvec_3\text{?}\) If so, describe all the ways in which you can do so.
Can you express the vector \(\bvec=\left[\begin{array}{r} 3 \\ 7 \\ 1 \end{array}\right]\) as a linear combination of \(\vvec_1\text{,}\) \(\vvec_2\text{,}\) and \(\vvec_3\text{?}\) If so, describe all the ways in which you can do so.
Show that \(\vvec_3\) can be written as a linear combination of \(\vvec_1\) and \(\vvec_2\text{.}\)
Explain why any linear combination of \(\vvec_1\text{,}\) \(\vvec_2\text{,}\) and \(\vvec_3\text{,}\)

\begin{equation*} a\vvec_1 + b\vvec_2 + c\vvec_3, \end{equation*}

can be rewritten as a linear combination of just \(\vvec_1\) and \(\vvec_2\text{.}\)

4.

Consider the vectors

\begin{equation*} \vvec_1=\left[\begin{array}{r} 3 \\ -1 \\ 1 \end{array}\right], \vvec_2=\left[\begin{array}{r} 1 \\ 1 \\ 2 \end{array}\right]. \end{equation*}

For what value(s) of \(k\text{,}\) if any, can the vector \(\left[\begin{array}{r} k \\ -2 \\ 5 \end{array}\right]\) be written as a linear combination of \(\vvec_1\) and \(\vvec_2\text{?}\)

5.

A theme that will later unfold concerns the use of coordinate systems. We can identify the point \((x,y)\) with the tip of the vector \(\left[\begin{array}{r}x\\y\end{array}\right]\text{,}\) drawn emanating from the origin. We can then think of the usual Cartesian coordinate system in terms of linear combinations of the vectors

\begin{equation*} \evec_1 = \left[\begin{array}{r} 1 \\ 0 \end{array}\right], \evec_2 = \left[\begin{array}{r} 0 \\ 1 \end{array}\right]. \end{equation*}

For instance, the point \((2,-3)\) is identified with the vector

\begin{equation*} \left[\begin{array}{r} 2 \\ -3 \end{array}\right] = 2\evec_1 - 3\evec_2, \end{equation*}

as shown on the left in Figure 2.3.4.

Figure 2.3.4. The usual Cartesian coordinate system, defined by the vectors \(\evec_1\) and \(\evec_2\text{,}\) is shown on the left along with the representation of the point \((2,-3)\text{.}\) The right shows a nonstandard coordinate system defined by vectors \(\vvec_1\) and \(\vvec_2\text{.}\)

If instead we have vectors

\begin{equation*} \vvec_1 = \left[\begin{array}{r} 2 \\ 1 \end{array}\right], \vvec_2 = \left[\begin{array}{r} 1 \\ 2 \end{array}\right]\text{,} \end{equation*}

we may define a new coordinate system in which a point \(\{c,d\}\) will correspond to the vector

\begin{equation*} c\vvec_1 + d\vvec_2\text{.} \end{equation*}

For instance, the point \(\{2,-3\}\) is shown on the right side of Figure 2.3.4.

Write the point \(\{2,-3\}\) in standard coordinates; that is, find \(x\) and \(y\) such that

\begin{equation*} (x,y) = \{2,-3\}\text{.} \end{equation*}
Write the point \((2,-3)\) in the new coordinate system; that is, find \(c\) and \(d\) such that

\begin{equation*} \{c,d\} = (2,-3)\text{.} \end{equation*}
Convert a general point \(\{c,d\}\text{,}\) expressed in the new coordinate system, into standard Cartesian coordinates \((x,y)\text{.}\)
What is the general strategy for converting a point from standard Cartesian coordinates \((x,y)\) to the new coordinates \(\{c,d\}\text{?}\) Actually implementing this strategy in general may take a bit of work so just describe the strategy. We will study this in more detail later.

6.

Shown below are some traffic patterns in the downtown area of a large city. The figures give the number of cars per hour traveling along each road. Any car that drives into an intersection must also leave the intersection. This means that the number of cars entering an intersection in an hour is equal to the number of cars leaving the intersection.

Let’s begin with the following traffic pattern.
1. How many cars per hour enter the upper left intersection? How many cars per hour leave this intersection? Use this to form a linear equation in the variables \(x\text{,}\) \(y\text{,}\) \(z\text{,}\) and \(w\text{.}\)
2. Form three more linear equations from the other three intersections to form a linear system having four equations in four variables. Then use Python to find the solution space to this system.
3. Is there exactly one solution or infinitely many solutions? Explain why you would expect this given the information provided.
Another traffic pattern is shown below.
1. Once again, write a linear system for the quantities \(x\text{,}\) \(y\text{,}\) \(z\text{,}\) and \(w\) and solve the system using the Python cell below.
2. What can you say about the solution of this linear system? Is there exactly one solution or infinitely many solutions? Explain why you would expect this given the information provided.
3. What is the smallest possible amount of traffic flowing through \(x\text{?}\)

7.

A typical problem in thermodynamics is to find the steady-state temperature distribution inside a thin plate if we know the temperature around the boundary. Let \(T_1, T_2, \ldots, T_6\) be the temperatures at the six nodes inside the plate as shown below.

The temperature at a node is approximately the average of the four nearest nodes: for instance,

\begin{equation*} T_1 = (10 + 15 + T_2 + T_4)/4\text{,} \end{equation*}

which we may rewrite as

\begin{equation*} 4T_1 - T_2 - T_4 = 25\text{.} \end{equation*}

Set up a linear system to find the temperature at these six points inside the plate. Then use Python to solve the linear system.

In the real world, the approximation becomes better as we add more and more points into the grid. This is a situation where we may want to solve a linear system having millions of equations and millions of variables.

8.

The fuel inside model rocket motors is a black powder mixture that ideally consists of 60% charcoal, 30% potassium nitrate, and 10% sulfur by weight.

Suppose you work at a company that makes model rocket motors. When you come into work one morning, you learn that yesterday’s first shift made a perfect batch of fuel. The second shift, however, misread the recipe and used 50% charcoal, 20% potassium nitrate and 30% sulfur. Then the two batches were mixed together. A chemical analysis shows that there are 100.3 pounds of charcoal in the mixture and 46.9 pounds of potassium nitrate.

Assuming the first shift produced \(x\) pounds of fuel and the second \(y\) pounds, set up a linear system in terms of \(x\) and \(y\text{.}\) How many pounds of fuel did the first shift produce and how many did the second shift produce?
How much sulfur would you expect to find in the mixture?

9.

This exercise is about balancing chemical reactions.

Chemists denote a molecule of water as \(\text{H}_2\text{O}\text{,}\) which means it is composed of two atoms of hydrogen (H) and one atom of oxygen (O). The process by which hydrogen burns is described by the chemical reaction

\begin{equation*} x\thinspace \text{H}_2 + y\thinspace\text{O}_2 \to z\thinspace \text{H}_2\text{O} \end{equation*}

This means that \(x\) molecules of hydrogen \(\text{H}_2\) combine with \(y\) molecules of oxygen \(\text{O}_2\) to produce \(z\) water molecules. The number of hydrogen atoms is the same before and after the reaction; the same is true of the oxygen atoms.
1. In terms of \(x\text{,}\) \(y\text{,}\) and \(z\text{,}\) how many hydrogen atoms are there before the reaction? How many hydrogen atoms are there after the reaction? Find a linear equation in \(x\text{,}\) \(y\text{,}\) and \(z\) by equating these quantities.
2. Find a second linear equation in \(x\text{,}\) \(y\text{,}\) and \(z\) by equating the number of oxygen atoms before and after the reaction.
3. Find the solutions of this linear system. Why are there infinitely many solutions?
4. In this chemical setting, \(x\text{,}\) \(y\text{,}\) and \(z\) should be positive integers. Find the solution where \(x\text{,}\) \(y\text{,}\) and \(z\) are the smallest possible positive integers.
Now consider the reaction where potassium permanganate and manganese sulfate combine with water to produce manganese dioxide, potassium sulfate, and sulfuric acid:

\begin{equation*} x_1\thinspace \text{K}\text{Mn}\text{O}_4 + x_2\thinspace \text{Mn}\text{S}\text{O}_4 + x_3\thinspace \text{H}_2\text{O} \to x_4\thinspace \text{Mn}\text{O}_2 + x_5\thinspace \text{K}_2\text{S}\text{O}_4 + x_6\thinspace \text{H}_2\text{S}\text{O}_4. \end{equation*}

As in the previous exercise, find the appropriate values for \(x_1, x_2, \ldots, x_6\) to balance the chemical reaction.

10.

We began this section by stating that increasing computational power has helped linear algebra assume a prominent role as a scientific tool. Later, we looked at one computational limitation: once a matrix gets to be too big, it is not reasonable to apply Gaussian elimination to find its reduced row echelon form.

In this exercise, we will see another limitation: computer arithmetic with real numbers is only an approximation because computers represent real numbers with only a finite number of bits. For instance, the number pi

\begin{equation*} \pi = 3.141592653589793238462643383279502884197169399\ldots \end{equation*}

would be approximated inside a computer by, say,

\begin{equation*} \pi\approx 3.141592653589793 \end{equation*}

Most of the time, this is not a problem. However, when we perform millions or even billions of arithmetic operations, the error in these approximations starts to accumulate and can lead to results that are wildly inaccurate. Here are two examples demonstrating this.

Let’s first see an example showing that computer arithmetic really is an approximation. Consider the linear system

\begin{equation*} \begin{alignedat}{4} x \amp {}+{} \amp \frac12y \amp {}+{} \amp \frac13z \amp {}={} \amp 1 \\ \frac12x \amp {}+{} \amp \frac13y \amp {}+{} \amp \frac14z \amp {}={} \amp 0 \\ \frac13x \amp {}+{} \amp \frac14y \amp {}+{} \amp \frac15z \amp {}={} \amp 0 \\ \end{alignedat} \end{equation*}

Let’s solve this three ways, taking advantage of some features of sympy. As the name suggests, sympy is capable of doing symbolic mathematics as well as numerical mathematics. In particular, sympy can work with rational numbers (fractions) and do exact arithmetic with them. Consider these three ways to represent \(\frac13\text{:}\)
The first is our usual floating point approximation. The second attempts to approximate that approximation with a rational number. (Sometimes this works better; try s.Rational(1/2), for example.) The third represents \(\frac13\) exactly by storing the numerator and denominator.

Create the augmented matrix for this linear system in the three corresponding ways. Call them A, B, and C. With each, have sympy compute the reduce row echelon form and use it to find the solutoin the original linear system.
Most computers do arithmetic using either 32 or 64 bits. We can exagerate the effect by rounding the matrix more heavily. Solve the linear system again using
What does Python give for the solution now? Compare this to the exact solution that you found previously.
Some types of linear systems are particularly sensitive to errors resulting from computers’ approximate arithmetic. For instance, suppose we are interested in the linear system

\begin{equation*} \begin{alignedat}{3} x \amp {}+{} \amp y \amp {}={} \amp 2 \\ x \amp {}+{} 1.001\amp y \amp {}={} \amp 2 \\ \end{alignedat} \end{equation*}

Find the solution to this linear system.
Suppose now that the computer has accumulated some error in one of the entries of this system so that it incorrectly stores the system as

\begin{equation*} \begin{alignedat}{3} x \amp {}+{} \amp y \amp {}={} \amp 2 \\ x \amp {}+{} 1.001\amp y \amp {}={} \amp 2.001 \\ \end{alignedat} \end{equation*}

Find the solution to this linear system.
Notice how a small error in one of the entries in the linear system leads to a solution that has a dramatically large error. Fortunately, this is an issue that has been well studied, and there are techniques that mitigate this type of behavior. (See Section 4.7.)

11.

Consider the system of linear equations

Find the matrix \(A\) and vector \(\bvec\) that expresses this linear system in the form \(A\xvec=\bvec\text{.}\)
Give a description of the solution space to the equation \(A\xvec = \bvec\text{.}\)

12.

Suppose that

\begin{equation*} A=\left[\begin{array}{rrr} 1 \amp 0 \amp 2 \\ 2 \amp 2 \amp 2 \\ -1 \amp -3 \amp 1 \end{array}\right]\text{.} \end{equation*}

Describe the solution space to the equation \(A\xvec = \zerovec\text{.}\)
Find a \(3\by2\) matrix \(B\) with no zero entries such that \(AB = 0\text{.}\)

13.

Consider the matrix

\begin{equation*} A=\left[\begin{array}{rrrr} 1 \amp 2 \amp -4 \amp -4 \\ 2 \amp 3 \amp 0 \amp 1 \\ 1 \amp 0 \amp 4 \amp 6 \\ \end{array}\right]\text{.} \end{equation*}

Find the product \(A\xvec\) where

\begin{equation*} \xvec = \fourvec{1}{-2}{0}{2}\text{.} \end{equation*}
Give a description of the vectors \(\xvec\) such that

\begin{equation*} A\xvec = \threevec{-1}{15}{17}\text{.} \end{equation*}
Find the reduced row echelon form of \(A\) and identify the pivot positions.
Can you find a vector \(\bvec\) such that \(A\xvec=\bvec\) is inconsistent?
For a general 3-dimensional vector \(\bvec\text{,}\) what can you say about the solution space of the equation \(A\xvec = \bvec\text{?}\)

14.

The operations that we perform in Gaussian elimination can be accomplished using matrix multiplication. This observation is the basis of an important technique that we will investigate in a subsequent chapter.

Let’s consider the matrix

\begin{equation*} A = \left[\begin{array}{rrr} 1 \amp 2 \amp -1 \\ 2 \amp 0 \amp 2 \\ -3 \amp 2 \amp 3 \\ \end{array}\right]\text{.} \end{equation*}

Suppose that

\begin{equation*} S = \left[\begin{array}{rrr} 1 \amp 0 \amp 0 \\ 0 \amp 7 \amp 0 \\ 0 \amp 0 \amp 1 \\ \end{array}\right]\text{.} \end{equation*}

Verify that \(SA\) is the matrix that results when the second row of \(A\) is scaled by a factor of 7. What matrix \(S\) would scale the third row by -3?
Suppose that

\begin{equation*} P = \left[\begin{array}{rrr} 0 \amp 1 \amp 0 \\ 1 \amp 0 \amp 0 \\ 0 \amp 0 \amp 1 \\ \end{array}\right]\text{.} \end{equation*}

Verify that \(PA\) is the matrix that results from interchanging the first and second rows. What matrix \(P\) would interchange the first and third rows?
Suppose that

\begin{equation*} L_1 = \left[\begin{array}{rrr} 1 \amp 0 \amp 0 \\ -2 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \\ \end{array}\right]\text{.} \end{equation*}

Verify that \(L_1A\) is the matrix that results from multiplying the first row of \(A\) by \(-2\) and adding it to the second row. What matrix \(L_2\) would multiply the first row by 3 and add it to the third row?
When we performed Gaussian elimination, our first goal was to perform row operations that brought the matrix into a triangular form. For our matrix \(A\text{,}\) find the row operations needed to find a row equivalent matrix \(U\) in triangular form. By expressing these row operations in terms of matrix multiplication, find a matrix \(L\) such that \(LA = U\text{.}\)

15.

In this exercise, you will construct the inverse of a matrix, a subject that we will investigate more fully in the Chapter 4. Suppose that \(A\) is the \(2\by2\) matrix:

\begin{equation*} A = \left[\begin{array}{rr} 3 \amp -2 \\ -2 \amp 1 \\ \end{array}\right]\text{.} \end{equation*}

Find the vectors \(\bvec_1\) and \(\bvec_2\) such that the matrix \(B=\left[\begin{array}{rr} \bvec_1 \amp \bvec_2 \end{array}\right]\) satisfies

\begin{equation*} AB = I = \left[\begin{array}{rr} 1 \amp 0 \\ 0 \amp 1 \\ \end{array}\right]\text{.} \end{equation*}
In general, it is not true that \(AB = BA\text{.}\) Check that it is true, however, for the specific \(A\) and \(B\) that appear in this problem.
Suppose that \(\xvec = \twovec{x_1}{x_2}\text{.}\) What do you find when you evaluate \(I\xvec\text{?}\)
Suppose that we want to solve the equation \(A\xvec = \bvec\text{.}\) We know how to do this using Gaussian elimination; let’s use our matrix \(B\) to find a different way:

\begin{equation*} \begin{aligned} A\xvec \amp {}={} \bvec \\ B(A\xvec) \amp {}={} B\bvec \\ (BA)\xvec \amp {}={} B\bvec \\ I\xvec \amp {}={} B\bvec \\ \xvec \amp {}={} B\bvec \\ \end{aligned}\text{.} \end{equation*}

In other words, the solution to the equation \(A\xvec=\bvec\) is \(\xvec = B\bvec\text{.}\)

Consider the equation \(A\xvec = \twovec{5}{-2}\text{.}\) Find the solution in two different ways, first using Gaussian elimination and then as \(\xvec = B\bvec\text{,}\) and verify that you have found the same result.

16.

Determine whether the following statements are true or false and provide a justification for your response.

If \(A\xvec\) is defined, then the number of components of \(\xvec\) equals the number of rows of \(A\text{.}\)
The solution space to the equation \(A\xvec = \bvec\) is equivalent to the solution space to the linear system whose augmented matrix is \(\left[\begin{array}{r|r} A \amp \bvec \end{array}\right]\text{.}\)
If a linear system of equations has 8 equations and 5 unknowns, then the shape of the matrix \(A\) in the corresponding equation \(A\xvec = \bvec\) is \(5\by8\text{.}\)
If \(A\) has a pivot position in every row, then every equation \(A\xvec = \bvec\) is consistent.
If \(A\) is a \(9\by5\) matrix, then \(A\xvec=\bvec\) is inconsistent for some vector \(\bvec\text{.}\)

17.

Suppose that \(A\) is an \(4\by4\) matrix and that the equation \(A\xvec = \bvec\) has a unique solution for some vector \(\bvec\text{.}\)

What does this say about the pivot positions of the matrix \(A\text{?}\) Write the reduced row echelon form of \(A\text{.}\)
Can you find another vector \(\cvec\) such that \(A\xvec = \cvec\) is inconsistent?
What can you say about the solution space to the equation \(A\xvec = \zerovec\text{?}\)
Suppose \(A=\left[\begin{array}{rrrr} \vvec_1 \amp \vvec_2 \amp \vvec_3 \amp \vvec_4 \end{array}\right]\text{.}\) Explain why every four-dimensional vector can be written as a linear combination of the vectors \(\vvec_1\text{,}\) \(\vvec_2\text{,}\) \(\vvec_3\text{,}\) and \(\vvec_4\) in exactly one way.

18.

Define the matrix

\begin{equation*} A = \left[\begin{array}{rrr} 1 \amp 2 \amp 4 \\ -2 \amp 1 \amp -3 \\ 3 \amp 1 \amp 7 \\ \end{array}\right]\text{.} \end{equation*}

Describe the solution space to the homogeneous equation \(A\xvec = \zerovec\) using a parametric description, if appropriate. What does this solution space represent geometrically?
Describe the solution space to the equation \(A\xvec=\bvec\) where \(\bvec = \threevec{-3}{-4}{1}\text{.}\) What does this solution space represent geometrically and how does it compare to the previous solution space?
We will now explain the relationship between the previous two solution spaces. Suppose that \(\xvec_h\) is a solution to the homogeneous equation; that is \(A\xvec_h=\zerovec\text{.}\) Suppose also that \(\xvec_p\) is a solution to the equation \(A\xvec = \bvec\text{;}\) that is, \(A\xvec_p=\bvec\text{.}\)

Use the Linearity Principle expressed in Proposition 1.4.6 to explain why \(\xvec_h+\xvec_p\) is a solution to the equation \(A\xvec = \bvec\text{.}\) You may do this by evaluating \(A(\xvec_h+\xvec_p)\text{.}\)

That is, if we find one solution \(\xvec_p\) to an equation \(A\xvec = \bvec\text{,}\) we may add any solution to the homogeneous equation to \(\xvec_p\) and still have a solution to the equation \(A\xvec = \bvec\text{.}\) In other words, the solution space to the equation \(A\xvec = \bvec\) is given by translating the solution space to the homogeneous equation by the vector \(\xvec_p\text{.}\)

19.

Suppose that a city is starting a bicycle sharing program with bicycles at locations \(B\) and \(C\text{.}\) Bicycles that are rented at one location may be returned to either location at the end of the day. Over time, the city finds that 80% of bicycles rented at location \(B\) are returned to \(B\) with the other 20% returned to \(C\text{.}\) Similarly, 50% of bicycles rented at location \(C\) are returned to \(B\) and 50% to \(C\text{.}\)

To keep track of the bicycles, we form a vector

\begin{equation*} \xvec_k = \twovec{B_k}{C_k} \end{equation*}

where \(B_k\) is the number of bicycles at location \(B\) at the beginning of day \(k\) and \(C_k\) is the number of bicycles at \(C\text{.}\) The information above tells us how to determine the distribution of bicycles the following day:

\begin{equation*} \begin{alignedat}{3} B_{k+1} \amp {}={} 0.8B_k \amp {}+{} \amp 0.5 C_k \\ C_{k+1} \amp {}={} 0.2B_k \amp {}+{} \amp 0.5 C_k. \\ \end{alignedat} \end{equation*}

Expressed in matrix-vector form, these expressions give

\begin{equation*} \xvec_{k+1} = A\xvec_k \end{equation*}

where

\begin{equation*} A = \left[\begin{array}{rr} 0.8 \amp 0.5 \\ 0.2 \amp 0.5 \\ \end{array}\right]\text{.} \end{equation*}

Let’s check that this makes sense.
1. Suppose that there are 1000 bicycles at location \(B\) and none at \(C\) on day 1. This means we have \(\xvec_1 = \twovec{1000}{0}\text{.}\) Find the number of bicycles at both locations on day 2 by evaluating \(\xvec_2 = A\xvec_1\text{.}\)
2. Suppose that there are 1000 bicycles at location \(C\) and none at \(B\) on day 1. Form the vector \(\xvec_1\) and determine the number of bicycles at the two locations the next day by finding \(\xvec_2 = A\xvec_1\text{.}\)
Suppose that one day there are 1050 bicycles at location \(B\) and 450 at location \(C\text{.}\) How many bicycles were there at each location the previous day?
Suppose that there are 500 bicycles at location \(B\) and 500 at location \(C\) on Monday. How many bicycles are there at the two locations on Tuesday? on Wednesday? on Thursday?

20.

This problem is a continuation of the previous problem.

Let us define vectors

\begin{equation*} \vvec_1 = \twovec{5}{2}, ~~~ \vvec_2 = \twovec{-1}{1}\text{.} \end{equation*}

Show that

\begin{equation*} A\vvec_1 = \vvec_1, ~~~A\vvec_2 = 0.3\vvec_2\text{.} \end{equation*}
Suppose that \(\xvec_1 = c_1 \vvec_1 + c_2 \vvec_2\) where \(c_1\) and \(c_2\) are scalars. Use the Linearity Principle expressed in Proposition 1.4.6 to explain why

\begin{equation*} \xvec_{2} = A\xvec_1 = c_1\vvec_1 + 0.3c_2\vvec_2\text{.} \end{equation*}
Continuing in this way, explain why

\begin{equation*} \begin{aligned} \xvec_{3} = A\xvec_2 \amp {}={} c_1\vvec_1 +0.3^2c_2\vvec_2 \\ \xvec_{4} = A\xvec_3 \amp {}={} c_1\vvec_1 +0.3^3c_2\vvec_2 \\ \xvec_{5} = A\xvec_4 \amp {}={} c_1\vvec_1 +0.3^4c_2\vvec_2 \\ \end{aligned}\text{.} \end{equation*}
Suppose that there are initially 500 bicycles at location \(B\) and 500 at location \(C\text{.}\) Write the vector \(\xvec_1\) and find the scalars \(c_1\) and \(c_2\) such that \(\xvec_1=c_1\vvec_1 + c_2\vvec_2\text{.}\)
Use the previous part of this problem to determine \(\xvec_2\text{,}\) \(\xvec_3\) and \(\xvec_4\text{.}\)
After a very long time, how are all the bicycles distributed?

Understanding Linear Algebra: Data Science Edition

Search Results:

Section 2.3 Computational Linear Algebra

Subsection 2.3.1 Reduced row echelon form in Python

Note 2.3.1. Augmented matrices in numpy.

Python syntax.

Activity 2.3.1. Using Python to find row reduced echelon matrices.

Proposition 2.3.2. Augmentation Principle.

Subsection 2.3.2 np.linalg.solve()

Subsection 2.3.3 Computational effort

Observation 2.3.3.

Subsection 2.3.4 Summary

Exercises 2.3.5 Exercises

1.

2.

3.

4.

5.

6.

7.

8.

9.

10.

11.

12.

13.

14.

15.

16.

17.

18.

19.

20.