Section 1.4 Matrices

Our goal in this section is to introduce matrices, operations on matrices, and their connection to linear combinations and linear systems.

Subsection 1.4.1 Matrices and their uses

A matrix is a rectangular array of numbers. If we say that the shape of a matrix is \(m\by n\text{,}\) we mean that it has \(m\) rows and \(n\) columns. For instance, the shape of the matrix below is \(3\by4\text{:}\)
\begin{equation*} \left[ \begin{array}{rrrr} 0 \amp 4 \amp -3 \amp 1 \\ 3 \amp -1 \amp 2 \amp 0 \\ 2 \amp 0 \amp -1 \amp 1 \\ \end{array} \right]\text{.} \end{equation*}
Matrices appear in many applications; here are just a few.
  1. Rectangular data.
    The typical organization places observational units (usually referred to as cases, or as subjects when the observational units are people) in rows and the variables recorded for each of them in columns. So if we have \(k\) numerical variables recorded for each of \(n\) observational units, we can store this information in an \(n \by k\) matrix. For categorical variables, we will need to convert the values into numbers.
  2. Images.
    Digital images can also be stored in matrices. For a black and white image, each value in the matrix represents the color on a gray scale with 0 representing black, 255 representing white, and values between these extremes representing different shades of gray.
    For color images, we can use several matrices. For an RGB image, for example, there will be a matrix for red, another for green, and a third for blue. The resulting images combine values from each of these layers. See Section 1.5 for a discussion of how to combine multiple layers into a single object, called a tensor.
  3. Coefficient matrices for linear systems.
    All the important information in a linear system of equations like
    \begin{equation*} \begin{alignedat}{4} -x \amp {}-{} \amp 2y \amp {}+{} \amp 2z \amp {}={} \amp -1 \\ 2x \amp {}+{} \amp 4y \amp {}-{} \amp z \amp {}={} \amp 5 \\ x \amp {}+{} \amp 2y \amp \amp \amp {}={} \amp 3 \\ \end{alignedat} \end{equation*}
    can be stored in a matrix of coefficients for the left-hand side and a vector for the right-hand side:
    \begin{equation*} A = \left[ \begin{array}{rrr} -1 \amp -2 \amp 2 \\ 2 \amp 4 \amp -1 \\ 1 \amp 2 \amp 0 \\ \end{array} \right]; \qquad \bvec = \left[ \begin{array}{r} -1 \\ 5 \\ 3 \\ \end{array} \right] \end{equation*}
    We could even combine both the matrix and vector into a single matrix by adding the vector \(\bvec\) as an additional column. We will call such a matrix an augmented matrix because we are augmenting the original matrix by adding an additional column (or columns):
    \begin{equation*} \left[ \begin{array}{rrr|r} -1 \amp -2 \amp 2 \amp -1 \\ 2 \amp 4 \amp -1 \amp 5 \\ 1 \amp 2 \amp 0 \amp 3 \\ \end{array} \right]. \end{equation*}
    As we will soon see, this matrix representation will allow us to solve these systems computationally.
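The construction of the augmented matrix can be sketched in Python. This is only an illustration, assuming NumPy is imported under its conventional alias `np`; the variable names `A`, `b`, and `augmented` are chosen here and are not fixed by the text.

```python
import numpy as np

# Coefficient matrix and right-hand side of the linear system above.
A = np.array([[-1, -2,  2],
              [ 2,  4, -1],
              [ 1,  2,  0]])
b = np.array([-1, 5, 3])

# Append b as an additional column to form the augmented matrix [A | b].
augmented = np.column_stack((A, b))

print(augmented)
print("shape of augmented matrix =", augmented.shape)
```

The augmented matrix has the same number of rows as \(A\) and one more column.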
Recalling our connection between linear combinations of vectors and linear systems of equations, the linear system above will have a solution if and only if
\begin{equation*} \bvec = \threevec{-1}{5}{3} \end{equation*}
is a linear combination of the vectors
\begin{equation*} \vvec_1 = \threevec{-1}{2}{1}, \vvec_2 = \threevec{-2}{4}{2}, \vvec_3 = \threevec{2}{-1}{0}. \end{equation*}
As shorthand, we can write the augmented matrix
\begin{equation*} \left[ \begin{array}{rrr|r} -1 \amp -2 \amp 2 \amp -1 \\ 2 \amp 4 \amp -1 \amp 5 \\ 1 \amp 2 \amp 0 \amp 3 \\ \end{array} \right] \end{equation*}
by replacing each column with its vector representation:
\begin{equation*} \left[ \begin{array}{rrr|r} \vvec_1 \amp \vvec_2 \amp \vvec_3 \amp \bvec \end{array} \right]\text{.} \end{equation*}
Using this shorthand, we can restate the connection between linear combinations of vectors and linear systems of equations as the following proposition.

Subsection 1.4.2 Scalar multiplication and addition of matrices

It is often useful to think of the columns of a matrix as vectors. For instance, the matrix
\begin{equation*} \left[ \begin{array}{rrrr} 0 \amp 4 \amp -3 \amp 1 \\ 3 \amp -1 \amp 2 \amp 0 \\ 2 \amp 0 \amp -1 \amp 1 \\ \end{array} \right] \end{equation*}
may be represented as
\begin{equation*} \left[ \begin{array}{rrrr} \vvec_1 \amp \vvec_2 \amp \vvec_3 \amp \vvec_4 \end{array} \right] \end{equation*}
where
\begin{equation*} \vvec_1=\left[\begin{array}{r}0\\3\\2\\ \end{array}\right], \vvec_2=\left[\begin{array}{r}4\\-1\\0\\ \end{array}\right], \vvec_3=\left[\begin{array}{r}-3\\2\\-1\\ \end{array}\right], \vvec_4=\left[\begin{array}{r}1\\0\\1\\ \end{array}\right]\text{.} \end{equation*}
In this way, we see that the \(3\by 4\) matrix is equivalent to an ordered list of 4 vectors in \(\real^3\text{.}\)
This means that we may define scalar multiplication and matrix addition operations using the corresponding column-wise vector operations. For instance,
\begin{equation*} \begin{aligned} c\left[\begin{array}{rrrr} \vvec_1 \amp \vvec_2 \amp \cdots \amp \vvec_n \end{array} \right] {}={} \amp \left[\begin{array}{rrrr} c\vvec_1 \amp c\vvec_2 \amp \cdots \amp c\vvec_n \end{array} \right] \\ \left[\begin{array}{rrrr} \vvec_1 \amp \vvec_2 \amp \cdots \amp \vvec_n \end{array} \right] {}+{} \amp \left[\begin{array}{rrrr} \wvec_1 \amp \wvec_2 \amp \cdots \amp \wvec_n \end{array} \right] \\ {}={} \amp \left[\begin{array}{rrrr} \vvec_1+\wvec_1 \amp \vvec_2+\wvec_2 \amp \cdots \amp \vvec_n+\wvec_n \end{array} \right]. \\ \end{aligned} \end{equation*}
In other words,
  • scalar multiplication with a matrix multiplies every element of the matrix by the scalar, and
  • two matrices of the same shape can be added componentwise.
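Both operations can be tried in Python. The sketch below assumes NumPy is imported as `np`; the matrix `A` is the \(3\by4\) matrix used above, while `B` is a hypothetical second matrix of the same shape introduced only for illustration.

```python
import numpy as np

A = np.array([[0,  4, -3, 1],
              [3, -1,  2, 0],
              [2,  0, -1, 1]])
B = np.ones((3, 4), dtype=int)   # hypothetical matrix with the same shape as A

scaled = 2 * A    # scalar multiplication: every entry is multiplied by 2
total = A + B     # matrix addition: corresponding entries are added

print(scaled)
print(total)
```

One caution: NumPy's broadcasting rules allow `+` between some arrays of different shapes. Such results are not matrix sums in the sense defined here, so it is worth checking that the two shapes agree before adding.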

Preview Activity 1.4.1. Matrix operations.

  1. Compute the scalar multiple
    \begin{equation*} -3\left[ \begin{array}{rrr} 3 \amp 1 \amp 0 \\ -4 \amp 3 \amp -1 \\ \end{array} \right]\text{.} \end{equation*}
  2. Find the sum
    \begin{equation*} \left[ \begin{array}{rr} 0 \amp -3 \\ 1 \amp -2 \\ 3 \amp 4 \\ \end{array} \right] + \left[ \begin{array}{rr} 4 \amp -1 \\ -2 \amp 2 \\ 1 \amp 1 \\ \end{array} \right]\text{.} \end{equation*}
  3. Suppose that \(A\) and \(B\) are two matrices. What do we need to know about their shapes before we can form the sum \(A+B\text{?}\) In this situation, is it always the case that \(A+B = B+A\text{?}\)
  4. The matrix \(I_n\text{,}\) which we call the identity matrix, is the \(n\by n\) matrix whose entries are zero except for the main diagonal entries, all of which are 1. (The main diagonal is the diagonal going from top left to bottom right. These are the elements for which the row index and column index are the same.) For instance,
    \begin{equation*} I_3 = \left[ \begin{array}{rrr} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \\ \end{array} \right]\text{.} \end{equation*}
    If we can form the sum \(A+I_n\text{,}\) what must be true about the matrix \(A\text{?}\)
  5. Find the matrix \(A - 2I_3\) where
    \begin{equation*} A = \left[ \begin{array}{rrr} 1 \amp 2 \amp -2 \\ 2 \amp -3 \amp 3 \\ -2 \amp 3 \amp 4 \\ \end{array} \right]\text{.} \end{equation*}
Answer.
  1. \(\displaystyle \left[\begin{array}{rrr} -9 \amp -3 \amp 0 \\ 12 \amp -9 \amp 3 \\ \end{array}\right]\)
  2. \(\displaystyle \left[\begin{array}{rr} 4 \amp -4 \\ -1 \amp 0 \\ 4 \amp 5 \\ \end{array}\right]\)
  3. The shapes must be the same. In that case, \(A+B = B+A\text{.}\)
  4. The shape of \(A\) must be \(n\by n\text{.}\)
  5. \(\displaystyle A-2I_3 = \left[\begin{array}{rrr} -1 \amp 2 \amp -2 \\ 2 \amp -5 \amp 3 \\ -2 \amp 3 \amp 2 \\ \end{array}\right]\)
Solution.
  1. \(\displaystyle \left[\begin{array}{rrr} -9 \amp -3 \amp 0 \\ 12 \amp -9 \amp 3 \\ \end{array}\right]\)
  2. \(\displaystyle \left[\begin{array}{rr} 4 \amp -4 \\ -1 \amp 0 \\ 4 \amp 5 \\ \end{array}\right]\)
  3. The shapes must be the same. In this case \(A+B\) and \(B+A\) will be the same because for any numbers \(a\) and \(b\text{,}\) \(a+b = b+a\text{,}\) so we get the same result for each element of the sum matrices.
  4. The shape of \(A\) must be \(n\by n\text{.}\)
  5. \(\displaystyle A-2I_3 = \left[\begin{array}{rrr} -1 \amp 2 \amp -2 \\ 2 \amp -5 \amp 3 \\ -2 \amp 3 \amp 2 \\ \end{array}\right]\)
As this preview activity shows, the operations of scalar multiplication and addition of matrices are natural extensions of their vector counterparts. Some care, however, is required when adding matrices. Since we need the same number of vectors to add and since those vectors must be of the same dimension, two matrices must have the same shape if we wish to form their sum.

Subsection 1.4.3 Matrix-vector multiplication and linear combinations

A more important operation will be matrix multiplication as it allows us to compactly express linear systems. We now introduce the product of a matrix and a vector with an example.

Example 1.4.2. Matrix-vector multiplication.

Suppose we have the matrix \(A\) and vector \(\xvec\text{:}\)
\begin{equation*} A = \left[\begin{array}{rr} -2 \amp 3 \\ 0 \amp 2 \\ 3 \amp 1 \\ \end{array}\right],~~~ \xvec = \left[\begin{array}{r} 2 \\ 3 \\ \end{array}\right]\text{.} \end{equation*}
Their product will be defined to be the linear combination of the columns of \(A\) using the components of \(\xvec\) as weights. This means that
\begin{equation*} \begin{aligned} A\xvec = \left[\begin{array}{rr} -2 \amp 3 \\ 0 \amp 2 \\ 3 \amp 1 \\ \end{array}\right] \left[\begin{array}{r} 2 \\ 3 \\ \end{array}\right] {}={} \amp 2 \left[\begin{array}{r} -2 \\ 0 \\ 3 \\ \end{array}\right] + 3 \left[\begin{array}{r} 3 \\ 2 \\ 1 \\ \end{array}\right] \\ \\ {}={} \amp \left[\begin{array}{r} -4 \\ 0 \\ 6 \\ \end{array}\right] + \left[\begin{array}{r} 9 \\ 6 \\ 3 \\ \end{array}\right] \\ \\ {}={} \amp \left[\begin{array}{r} 5 \\ 6 \\ 9 \\ \end{array}\right]. \\ \end{aligned} \end{equation*}
Because \(A\) has two columns, we need two weights to form a linear combination of those columns, which means that \(\xvec\) must have two components. In other words, the number of columns of \(A\) must equal the dimension of the vector \(\xvec\text{.}\)
Similarly, the columns of \(A\) are 3-dimensional so any linear combination of them is 3-dimensional as well. Therefore, \(A\xvec\) will be 3-dimensional.
We then see that if \(A\) is a \(3\by2\) matrix, \(\xvec\) must be a 2-dimensional vector and \(A\xvec\) will be 3-dimensional.
More generally, we have the following definition.

Definition 1.4.3. Matrix-vector multiplication.

The product of a matrix \(A\) by a vector \(\xvec\) will be the linear combination of the columns of \(A\) using the components of \(\xvec\) as weights. More specifically, if
\begin{equation*} A=\left[\begin{array}{rrrr} \vvec_1 \amp \vvec_2 \amp \ldots \amp \vvec_n \end{array}\right],~~~ \xvec = \left[\begin{array}{r} x_1 \\ x_2 \\ \vdots \\ x_n \end{array}\right], \end{equation*}
then
\begin{equation*} A\xvec = x_1\vvec_1 + x_2\vvec_2 + \ldots + x_n\vvec_n\text{.} \end{equation*}
If \(A\) is an \(m\by n\) matrix, then \(\xvec\) must be an \(n\)-dimensional vector, and the product \(A\xvec\) will be an \(m\)-dimensional vector.
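The definition translates almost directly into code. The function below is a minimal sketch, assuming NumPy is imported as `np`; the name `matvec_by_columns` is hypothetical and chosen for this illustration. It forms the linear combination of the columns explicitly and is compared with NumPy's built-in `@` operator, using the matrix and vector of Example 1.4.2.

```python
import numpy as np

def matvec_by_columns(A, x):
    """Return A x as the linear combination x_1 v_1 + ... + x_n v_n of the columns of A."""
    m, n = A.shape
    if x.shape != (n,):
        raise ValueError("x must have one component for each column of A")
    result = np.zeros(m, dtype=np.result_type(A, x))
    for j in range(n):
        result = result + x[j] * A[:, j]   # weight column j by component j of x
    return result

A = np.array([[-2, 3],
              [ 0, 2],
              [ 3, 1]])
x = np.array([2, 3])

print(matvec_by_columns(A, x))
print(A @ x)
```

Both lines print the same 3-dimensional vector found in Example 1.4.2. In practice one would simply write `A @ x`; the loop is spelled out only to mirror the definition.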
The next activity explores some properties of matrix-vector multiplication.

Activity 1.4.2. Matrix-vector multiplication.

  1. Find the matrix product
    \begin{equation*} \left[ \begin{array}{rrrr} 1 \amp 2 \amp 0 \amp -1 \\ 2 \amp 4 \amp -3 \amp -2 \\ -1 \amp -2 \amp 6 \amp 1 \\ \end{array} \right] \left[ \begin{array}{r} 3 \\ 1 \\ -1 \\ 1 \\ \end{array} \right]\text{.} \end{equation*}
  2. Suppose that \(A\) is the matrix
    \begin{equation*} \left[ \begin{array}{rrr} 3 \amp -1 \amp 0 \\ 0 \amp -2 \amp 4 \\ 2 \amp 1 \amp 5 \\ 1 \amp 0 \amp 3 \\ \end{array} \right]\text{.} \end{equation*}
    If \(A\xvec\) is defined, what is the dimension of the vector \(\xvec\) and what is the dimension of \(A\xvec\text{?}\)
  3. A vector whose entries are all zero is denoted by \(\zerovec\text{.}\) If \(A\) is a matrix, what is the product \(A\zerovec\text{?}\)
  4. Suppose that \(I = \left[\begin{array}{rrr} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \\ \end{array}\right]\) is the identity matrix and \(\xvec=\threevec{x_1}{x_2}{x_3}\text{.}\) Find the product \(I\xvec\) and explain why \(I\) is called the identity matrix.
  5. Suppose we write the matrix \(A\) in terms of its columns as
    \begin{equation*} A = \left[ \begin{array}{rrrr} \vvec_1 \amp \vvec_2 \amp \cdots \amp \vvec_n \\ \end{array} \right]\text{.} \end{equation*}
    If the vector \(\evec_1 = \left[\begin{array}{c} 1 \\ 0 \\ \vdots \\ 0 \end{array}\right]\text{,}\) what is the product \(A\evec_1\text{?}\)
  6. Suppose that
    \begin{equation*} A = \left[ \begin{array}{rr} 1 \amp 2 \\ -1 \amp 1 \\ \end{array} \right], \quad \xvec = \twovec{x_1}{x_2}, \quad \bvec = \left[ \begin{array}{r} 6 \\ 0 \end{array} \right]\text{.} \end{equation*}
    Express \(A \xvec = \bvec\) as a linear system of equations.
Answer.
  1. \(\threevec{4}{11}{-10} \text{.}\)
  2. The dimension of \(\xvec\) must be three, and the dimension of \(A\xvec\) must be four.
  3. \(A\zerovec = \zerovec\text{.}\)
  4. \(I\xvec = \xvec\text{.}\)
  5. \(A\evec_1 = 1\vvec_1+0\vvec_2+\ldots+0\vvec_n = \vvec_1\text{.}\)
  6. Carry out the multiplication and simplify to obtain a linear system.
Solution.
  1. We have
    \begin{equation*} \begin{alignedat}{2} \left[\begin{array}{rrrr} 1 \amp 2 \amp 0 \amp -1 \\ 2 \amp 4 \amp -3 \amp -2 \\ -1 \amp -2 \amp 6 \amp 1 \\ \end{array}\right] \amp \fourvec{3}{1}{-1}{1} \\ \amp = \threevec{3(1)+1(2)-1(0)+1(-1)} {3(2)+1(4)-1(-3)+1(-2)} {3(-1)+1(-2)-1(6)+1(1)} \\ \amp =\threevec{4}{11}{-10} \end{alignedat}\text{.} \end{equation*}
  2. The dimension of \(\xvec\) must be the same as the number of columns of \(A\) so \(\xvec\) is three-dimensional. The dimension of \(A\xvec\) equals the number of rows of \(A\) so \(A\xvec\) is four-dimensional.
  3. We have \(A\zerovec = \zerovec\text{.}\)
  4. We have \(I\xvec=\xvec\text{;}\) that is, multiplying a vector by \(I\) produces the same vector.
  5. The product \(A\evec_1 = 1\vvec_1+0\vvec_2+\ldots+0\vvec_n = \vvec_1\text{.}\)
  6. If \(A\xvec=\bvec\text{,}\) then we have
    \begin{equation*} \begin{alignedat}{3} x_1 \amp {}+{} \amp 2x_2 \amp {}={} \amp 6 \\ -x_1 \amp {}+{} \amp x_2 \amp {}={} \amp 0 \\ \end{alignedat}\text{.} \end{equation*}
    \(\xvec=\twovec{2}{2}\) is the unique solution.
Multiplication of a matrix \(A\) and a vector is defined as a linear combination of the columns of \(A\text{.}\) However, there is another way to compute such a product. Let’s look at our previous example and focus on the first row of the product.
\begin{equation*} \left[\begin{array}{rr} -2 \amp 3 \\ 0 \amp 2 \\ 3 \amp 1 \\ \end{array}\right] \left[\begin{array}{r} 2 \\ 3 \\ \end{array}\right] = 2 \left[\begin{array}{r} -2 \\ * \\ * \\ \end{array}\right] + 3 \left[\begin{array}{r} 3 \\ * \\ * \\ \end{array}\right] = \left[\begin{array}{c} 2(-2)+3(3) \\ * \\ * \\ \end{array}\right] = \left[\begin{array}{r} 5 \\ * \\ * \\ \end{array}\right]\text{.} \end{equation*}
To find the first component of the product, we consider the first row of the matrix. We then multiply the first entry in that row by the first component of the vector, the second entry by the second component of the vector, and so on, and add the results. In this way, we see that the third component of the product would be obtained from the third row of the matrix by computing \(2(3) + 3(1) = 9\text{.}\)
You are encouraged to evaluate the product in Item 1 of Activity 1.4.2 using this new method and compare the result to what you found while completing that activity.
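The row-by-row method can be sketched in Python as follows, assuming NumPy is imported as `np`. The name `by_rows` is chosen here for illustration; the matrix and vector are those of Example 1.4.2.

```python
import numpy as np

A = np.array([[-2, 3],
              [ 0, 2],
              [ 3, 1]])
x = np.array([2, 3])

# Component i of A x: multiply the entries of row i by the
# corresponding components of x and add the results.
by_rows = np.array([sum(A[i, j] * x[j] for j in range(A.shape[1]))
                    for i in range(A.shape[0])])

print(by_rows)
print(np.array_equal(by_rows, A @ x))
```

The row-based and column-based computations are two ways of organizing the same arithmetic, so they always agree.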

Subsection 1.4.4 Matrix-vector multiplication and linear systems

The connections among matrix-vector multiplication, linear combinations of vectors, and linear systems of equations are so important that they are worth repeating. So far, we have begun with a matrix \(A\) and a vector \(\xvec\) and formed their product \(A\xvec = \bvec\text{.}\) We would now like to turn this around: Suppose we know \(A\) and \(\bvec\) but don’t know \(\xvec\text{.}\) Can we find a vector \(\xvec\) such that \(A\xvec = \bvec\text{?}\) This question will naturally lead back to linear systems.
To see the connection between the matrix equation \(A\xvec = \bvec\) and linear systems, let’s write the matrix \(A\) in terms of its columns \(\vvec_i\) and \(\xvec\) in terms of its components.
\begin{equation*} A = \left[ \begin{array}{rrrr} \vvec_1 \amp \vvec_2 \amp \ldots \amp \vvec_n \end{array} \right], \xvec = \left[ \begin{array}{c} x_1 \\ x_2 \\ \vdots \\ x_n \\ \end{array} \right]\text{.} \end{equation*}
We know that the matrix product \(A\xvec\) forms a linear combination of the columns of \(A\text{.}\) Therefore, the equation \(A\xvec = \bvec\) is merely a compact way of writing the equation for the weights \(x_i\text{:}\)
\begin{equation*} x_1\vvec_1 + x_2\vvec_2 + \ldots + x_n\vvec_n = \bvec\text{.} \end{equation*}
We have seen this equation before: Remember that Proposition 1.4.1 says that the solutions of this equation are the same as the solutions to the linear system whose augmented matrix is
\begin{equation*} \left[\begin{array}{rrrr|r} \vvec_1 \amp \vvec_2 \amp \ldots \amp \vvec_n \amp \bvec \end{array}\right]\text{.} \end{equation*}
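This gives us three different ways of looking at the same situation: the matrix equation \(A\xvec = \bvec\text{,}\) the vector equation \(x_1\vvec_1 + x_2\vvec_2 + \ldots + x_n\vvec_n = \bvec\text{,}\) and the linear system whose augmented matrix is shown above.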
When the matrix \(A = \left[\begin{array}{rrrr} \vvec_1\amp\vvec_2\amp\cdots\amp\vvec_n\end{array}\right]\text{,}\) we will frequently write
\begin{equation*} \left[\begin{array}{rrrr|r} \vvec_1\amp\vvec_2\amp\cdots\amp\vvec_n\amp\bvec\end{array}\right] = \left[ \begin{array}{r|r} A \amp \bvec \end{array}\right] \end{equation*}
and say that the matrix \(A\) is augmented by the vector \(\bvec\text{.}\)
The equation \(A\xvec = \bvec\) gives a notationally compact way to write a linear system. Moreover, this notation will allow us to focus on important features of the system that determine its solutions.
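To make the equivalence concrete, the sketch below checks a candidate solution in each of the three forms, using the matrix and right-hand side from item 6 of Activity 1.4.2. It assumes NumPy is imported as `np`; it verifies a given vector rather than solving the system.

```python
import numpy as np

A = np.array([[ 1, 2],
              [-1, 1]])
b = np.array([6, 0])
x = np.array([2, 2])   # candidate solution

# The matrix equation A x = b.
print(np.array_equal(A @ x, b))

# The linear combination x_1 v_1 + x_2 v_2 = b of the columns of A.
print(np.array_equal(x[0] * A[:, 0] + x[1] * A[:, 1], b))

# The linear system, one equation per row of A.
print(1 * x[0] + 2 * x[1] == 6 and -1 * x[0] + 1 * x[1] == 0)
```

All three checks print `True`, since they express the same condition on \(\xvec\text{.}\)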

Subsection 1.4.5 Matrices in Python

We may ask Python to create matrices by using nested lists (a list of lists). But just as we did with vectors, we will convert these nested lists to a numpy n-dimensional array using np.array().
The shape of the resulting array tells us the number of rows and columns in the matrix.

Example 1.4.5.

We may ask Python to create the \(2\by4\) matrix
\begin{equation*} \left[ \begin{array}{rrrr} -1 \amp 0 \amp 2 \amp 7 \\ 2 \amp 1 \amp -3 \amp -1 \\ \end{array} \right] \end{equation*}
by entering
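A minimal sketch of such an entry, assuming NumPy is imported under its conventional alias `np`:

```python
import numpy as np

# Each inner list is one row of the matrix.
A = np.array([[-1, 0,  2,  7],
              [ 2, 1, -3, -1]])

print(A)
print("shape of A =", A.shape)
```

The shape is reported as a pair giving the number of rows followed by the number of columns.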

Activity 1.4.3.

Python can find the product of a matrix and vector using the @ operator. For example,
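As a small illustration, the sketch below multiplies the matrix from Example 1.4.5 by a vector. It assumes NumPy is imported as `np`; the vector `x` is chosen here only for illustration and does not come from the text.

```python
import numpy as np

A = np.array([[-1, 0,  2,  7],
              [ 2, 1, -3, -1]])
x = np.array([1, 0, 1, 0])   # illustrative vector with one component per column of A

product = A @ x
print(product)
```

Because `A` has four columns, `x` needs four components, and the product has two components, one for each row of `A`.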
  1. Use Python to evaluate the product
    \begin{equation*} \left[ \begin{array}{rrrr} 1 \amp 2 \amp 0 \amp -1 \\ 2 \amp 4 \amp -3 \amp -2 \\ -1 \amp -2 \amp 6 \amp 1 \\ \end{array} \right] \left[ \begin{array}{r} 3 \\ 1 \\ -1 \\ 1 \\ \end{array} \right] \end{equation*}
  2. In Python, define the matrix and vectors
    \begin{equation*} A = \left[ \begin{array}{rr} -2 \amp 0 \\ 3 \amp 1 \\ 4 \amp 2 \\ \end{array} \right], \zerovec = \left[ \begin{array}{r} 0 \\ 0 \end{array} \right], \vvec = \left[ \begin{array}{r} -2 \\ 3 \end{array} \right], \wvec = \left[ \begin{array}{r} 1 \\ 2 \end{array} \right]\text{.} \end{equation*}
  3. What do you find when you evaluate \(A\zerovec\text{?}\)
  4. What do you find when you evaluate \(A(3\vvec)\) and \(3(A\vvec)\) and compare your results?
  5. What do you find when you evaluate \(A(\vvec+\wvec)\) and \(A\vvec + A\wvec\) and compare your results?
Answer.
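  1. We define
    import numpy as np

    A = np.array([[1, 2, 0, -1],
                  [2, 4, -3, -2],
                  [-1, -2, 6, 1]])
    v = np.array([3, 1, -1, 1])
    A @ v    # evaluates to the vector (4, 11, -10)

  2. We define
    # A has three rows and two columns, matching the matrix in the activity
    A = np.array([[-2, 0],
                  [3, 1],
                  [4, 2]])
    zero = np.array([0, 0])
    v = np.array([-2, 3])
    w = np.array([1, 2])
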
  3. \(A\zerovec = \zerovec\text{.}\)
  4. \(A(3\vvec) = 3(A\vvec)\text{.}\)
  5. \(\displaystyle A(\vvec+\wvec) = A\vvec + A\wvec\)
This activity demonstrates several general properties satisfied by matrix multiplication that we record here.
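The properties in question are \(A\zerovec = \zerovec\text{,}\) \(A(c\vvec) = c(A\vvec)\text{,}\) and \(A(\vvec+\wvec) = A\vvec + A\wvec\text{.}\) The sketch below checks them numerically for the matrix and vectors of Activity 1.4.3 with \(c = 3\text{,}\) assuming NumPy is imported as `np`. A check on one example illustrates the properties but does not prove them.

```python
import numpy as np

A = np.array([[-2, 0],
              [ 3, 1],
              [ 4, 2]])
zero = np.array([0, 0])
v = np.array([-2, 3])
w = np.array([1, 2])

print(A @ zero)                                     # zero vector with three components
print(np.array_equal(A @ (3 * v), 3 * (A @ v)))     # scalar multiples pass through A
print(np.array_equal(A @ (v + w), A @ v + A @ w))   # sums pass through A
```

Note that the two zero vectors in \(A\zerovec = \zerovec\) have different dimensions when \(A\) is not square: here the input has two components and the output has three.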

Python practices.

Here are some practices that you may find helpful when working with matrices in Python.
  • Break the matrix entries across lines, one for each row, for better readability by pressing Enter between rows.
    A = np.array([[ 1, 2, -1, 0],
                  [-3, 0,  4, 3]])
    	
  • For small matrices, you can print your original matrix to check that you have entered it correctly. For larger matrices, you should at least confirm that the shape matches your expectation.
  • You may want to also print labels or to include a dividing line to separate different parts of your output.
    A = np.array([ 1, 2, 2, 2])
    print(A)
    print("---------")
    print("shape of A =", A.shape)
    	

Subsection 1.4.6 Matrix-matrix products

In this section, we have developed some algebraic operations on matrices and seen how they can be used to simplify our description of linear systems. We now introduce a final operation, the product of two matrices, that will become important when we study linear transformations in Section 3.3.

Definition 1.4.7. Matrix-matrix multiplication.

Given matrices \(A\) and \(B\text{,}\) we form their product \(AB\) by first writing \(B\) in terms of its columns
\begin{equation*} B = \left[\begin{array}{rrrr} \vvec_1 \amp \vvec_2 \amp \cdots \amp \vvec_p \end{array}\right] \end{equation*}
and then defining
\begin{equation*} AB = \left[\begin{array}{rrrr} A\vvec_1 \amp A\vvec_2 \amp \cdots \amp A\vvec_p \end{array}\right]. \end{equation*}

Example 1.4.8.

Given the matrices
\begin{equation*} A = \left[\begin{array}{rr} 4 \amp 2 \\ 0 \amp 1 \\ -3 \amp 4 \\ 2 \amp 0 \\ \end{array}\right],~~~ B = \left[\begin{array}{rrr} -2 \amp 3 \amp 0 \\ 1 \amp 2 \amp -2 \\ \end{array}\right]\text{,} \end{equation*}
we have
\begin{equation*} AB = \left[\begin{array}{rrr} A \twovec{-2}{1} \amp A \twovec{3}{2} \amp A \twovec{0}{-2} \end{array}\right] = \left[\begin{array}{rrr} -6 \amp 16 \amp -4 \\ 1 \amp 2 \amp -2 \\ 10 \amp -1 \amp -8 \\ -4 \amp 6 \amp 0 \end{array}\right]\text{.} \end{equation*}
That is, we multiply \(A\) by each of the column vectors in \(B\) to obtain the column vectors of the product.
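The column-by-column definition can be sketched in Python and compared with the built-in `@` operator, using the matrices of Example 1.4.8. This assumes NumPy is imported as `np`; the names `columns` and `AB` are chosen for illustration.

```python
import numpy as np

A = np.array([[ 4, 2],
              [ 0, 1],
              [-3, 4],
              [ 2, 0]])
B = np.array([[-2, 3,  0],
              [ 1, 2, -2]])

# Multiply A by each column of B, then place the resulting vectors side by side.
columns = [A @ B[:, j] for j in range(B.shape[1])]
AB = np.column_stack(columns)

print(AB)
print(np.array_equal(AB, A @ B))
```

The second line printed is `True`: building the product one column at a time gives the same matrix as `A @ B`.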

Observation 1.4.9.

It is important to note that we can only multiply matrices if the shapes of the matrices are compatible. More specifically, when constructing the product \(AB\text{,}\) the matrix \(A\) multiplies the columns of \(B\text{.}\) Therefore, the number of columns of \(A\) must equal the number of rows of \(B\text{.}\) When this condition is met, the number of rows of \(AB\) is the number of rows of \(A\text{,}\) and the number of columns of \(AB\) is the number of columns of \(B\text{.}\)
This can be visualized by writing out the shapes of the two matrices, one after the other. The "middle" values must match and "cancel", leaving the number of rows in the first matrix and the number of columns of the second matrix as the shape of the product. In Example 1.4.8, we have
\begin{equation*} (4, 2) (2, 3) \to (4, \cancel{2}) (\cancel{2}, 3) \to (4, 3) \end{equation*}
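The shape rule can be observed directly in Python. The sketch below assumes NumPy is imported as `np` and uses matrices of zeros as placeholders, since only the shapes matter here.

```python
import numpy as np

A = np.zeros((4, 2))   # placeholder entries; shape (4, 2) as in Example 1.4.8
B = np.zeros((2, 3))   # placeholder entries; shape (2, 3)

print((A @ B).shape)   # (4, 2)(2, 3) -> (4, 3)

try:
    B @ A              # (2, 3)(4, 2): the middle values 3 and 4 do not match
except ValueError as err:
    print("BA is not defined:", err)
```

When the middle values disagree, NumPy raises a `ValueError` instead of returning a result.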

Activity 1.4.4.

Consider the matrices
\begin{equation*} A = \left[\begin{array}{rrr} 1 \amp 3 \amp 2 \\ -3 \amp 4 \amp -1 \\ \end{array}\right],~~~ B = \left[\begin{array}{rr} 3 \amp 0 \\ 1 \amp 2 \\ -2 \amp -1 \\ \end{array}\right]\text{.} \end{equation*}
  1. Before computing, first explain why the shapes of \(A\) and \(B\) enable us to form the product \(AB\text{.}\) Then describe the shape of \(AB\text{.}\)
  2. Compute the product \(AB\) by hand.
  3. Python can multiply matrices using the @ operator. Define the matrices \(A\) and \(B\) in the Python cell below and check your work by computing \(AB\text{.}\)
  4. Are we able to form the matrix product \(BA\text{?}\) If so, use the Python cell above to find \(BA\text{.}\) Is it generally true that \(AB = BA\text{?}\)
  5. Suppose we form the three matrices.
    \begin{equation*} A = \left[\begin{array}{rr} 1 \amp 2 \\ 3 \amp -2 \\ \end{array}\right], B = \left[\begin{array}{rr} 0 \amp 4 \\ 2 \amp -1 \\ \end{array}\right], C = \left[\begin{array}{rr} -1 \amp 3 \\ 4 \amp 3 \\ \end{array}\right]\text{.} \end{equation*}
    Compare what happens when you compute \(A(B+C)\) and \(AB + AC\text{.}\) State your finding as a general principle.
  6. Compare the results of evaluating \(A(BC)\) and \((AB)C\) and state your finding as a general principle.
  7. When we are dealing with real numbers, we know if \(a\neq 0\) and \(ab = ac\text{,}\) then \(b=c\text{.}\) Define matrices
    \begin{equation*} A = \left[\begin{array}{rr} 1 \amp 2 \\ -2 \amp -4 \\ \end{array}\right], B = \left[\begin{array}{rr} 3 \amp 0 \\ 1 \amp 3 \\ \end{array}\right], C = \left[\begin{array}{rr} 1 \amp 2 \\ 2 \amp 2 \\ \end{array}\right] \end{equation*}
    and compute \(AB\) and \(AC\text{.}\)
    If \(AB = AC\text{,}\) is it necessarily true that \(B = C\text{?}\)
  8. Again, with real numbers, we know that if \(ab = 0\text{,}\) then either \(a = 0\) or \(b=0\text{.}\) Define
    \begin{equation*} A = \left[\begin{array}{rr} 1 \amp 2 \\ -2 \amp -4 \\ \end{array}\right], B = \left[\begin{array}{rr} 2 \amp -4 \\ -1 \amp 2 \\ \end{array}\right] \end{equation*}
    and compute \(AB\text{.}\)
    If \(AB = 0\text{,}\) is it necessarily true that either \(A=0\) or \(B=0\text{?}\)
Answer.
  1. The product \(AB\) exists because the number of columns of \(A\) equals the number of rows of \(B\text{.}\) The dimensions of \(AB\) are \(2\by 2\text{.}\)
  2. We have \(AB = \left[\begin{array}{rr} 2 \amp 4 \\ -3 \amp 9 \end{array}\right] \text{.}\)
  3. It is not generally true that \(AB=BA\text{.}\)
  4. We find that \(A(B+C)=AB + AC\text{.}\)
  5. We find that \(A(BC) = (AB)C\text{.}\)
  6. It is not generally true that \(B=C\) if \(AB=AC\text{.}\)
  7. It is not generally true that \(A=0\) or \(B=0\) if \(AB=0\text{.}\)
Solution.
  1. The product \(AB\) exists because the number of columns of \(A\) equals the number of rows of \(B\text{.}\) The dimensions of \(AB\) are \(2\by 2\text{.}\)
  2. We have \(AB = \left[\begin{array}{rr} 2 \amp 4 \\ -3 \amp 9 \end{array}\right] \text{.}\)
  3. Yes, we can form the product \(BA\) because the number of columns of \(B\) equals the number of rows of \(A\text{.}\) This product \(BA\) will be \(3\by3\text{,}\) whereas \(AB\) is \(2\by2\text{,}\) so it must be true that \(AB\neq BA\text{.}\)
  4. We find that \(A(B+C)=AB + AC\text{.}\)
  5. We find that \(A(BC) = (AB)C\text{.}\)
  6. It is not generally true that \(B=C\) if \(AB=AC\text{,}\) as illustrated by this example.
  7. It is not generally true that \(A=0\) or \(B=0\) if \(AB=0\text{,}\) as illustrated by this example.
This activity demonstrated some general properties about products of matrices, which mirror some properties about operations with real numbers.

Properties of Matrix-matrix Multiplication.

If \(A\text{,}\) \(B\text{,}\) and \(C\) are matrices such that the following operations are defined, it follows that
Associativity:
\(A(BC) = (AB)C\text{.}\)
Distributivity:
\(A(B+C) = AB+AC\text{.}\)
\((A+B)C = AC+BC\text{.}\)
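These identities can be spot-checked numerically with the three matrices from item 5 of Activity 1.4.4. The sketch assumes NumPy is imported as `np`; agreement on one example illustrates the properties but is not a proof.

```python
import numpy as np

A = np.array([[ 1,  2], [3, -2]])
B = np.array([[ 0,  4], [2, -1]])
C = np.array([[-1,  3], [4,  3]])

print(np.array_equal(A @ (B @ C), (A @ B) @ C))     # associativity
print(np.array_equal(A @ (B + C), A @ B + A @ C))   # distributivity on the left
print(np.array_equal((A + B) @ C, A @ C + B @ C))   # distributivity on the right
```

Integer entries are used so that the comparisons are exact; with floating-point entries, a tolerance-based comparison such as `np.allclose` is usually more appropriate.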
At the same time, there are a few properties that hold for real numbers that do not hold for matrices.

Caution.

The following properties hold for real numbers but not for matrices.
Commutativity:
It is not generally true that \(AB = BA\text{.}\)
Cancellation:
It is not generally true that \(AB = AC\) implies that \(B = C\text{.}\)
Zero divisors:
It is not generally true that \(AB = 0\) implies that either \(A=0\) or \(B=0\text{.}\)
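Each of these failures can be seen with the matrices from items 7 and 8 of Activity 1.4.4. The sketch assumes NumPy is imported as `np`; the name `D` is introduced here for the matrix called \(B\) in item 8, to avoid reusing a name.

```python
import numpy as np

A = np.array([[ 1,  2], [-2, -4]])
B = np.array([[ 3,  0], [ 1,  3]])
C = np.array([[ 1,  2], [ 2,  2]])
D = np.array([[ 2, -4], [-1,  2]])   # called B in item 8 of Activity 1.4.4

print(np.array_equal(A @ B, B @ A))   # False: AB and BA differ
print(np.array_equal(A @ B, A @ C))   # True, even though B and C differ
print(A @ D)                          # the zero matrix, although neither A nor D is zero
```

A single counterexample is enough to show that a property does not hold in general, which is why one example settles each of these cautions.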

Subsection 1.4.7 Some special types of matrices

We have already encountered the \(n\)-dimensional identity matrix \(I_n\text{.}\) This \(n \by n\) square matrix has ones along the main diagonal (where the row index and column index are the same) and zeros everywhere else. For example,
\begin{equation*} I_3 = \left[\begin{array}{rrr} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \\ \end{array}\right] . \end{equation*}
More generally, a matrix is called diagonal if the only non-zero values are along the main diagonal. Here are some examples of diagonal matrices.
\begin{equation*} \left[\begin{array}{rrr} -2 \amp \lgray{0} \amp \lgray{0} \\ \lgray{0} \amp 4 \amp \lgray{0} \\ \lgray{0} \amp \lgray{0} \amp 3 \\ \end{array}\right], \qquad \left[\begin{array}{rrr} 0 \amp \lgray{0} \amp \lgray{0} \\ \lgray{0} \amp 4 \amp \lgray{0} \\ \lgray{0} \amp \lgray{0} \amp 3 \\ \end{array}\right], \quad \left[\begin{array}{rrr} 1 \amp \lgray{0} \amp \lgray{0} \\ \lgray{0} \amp 4 \amp \lgray{0} \\ \lgray{0} \amp \lgray{0} \amp 3 \\ \lgray{0} \amp \lgray{0} \amp \lgray{0} \\ \end{array}\right], \quad \left[\begin{array}{rrrr} 1 \amp \lgray{0} \amp \lgray{0} \amp \lgray{0} \\ \lgray{0} \amp 4 \amp \lgray{0} \amp \lgray{0} \\ \lgray{0} \amp \lgray{0} \amp 3 \amp \lgray{0} \\ \end{array}\right] \end{equation*}
Note that
  1. A diagonal matrix is allowed to have 0’s along the diagonal.
  2. A diagonal matrix might not be square, but in that case all elements in the "extra" rows or columns must be 0’s. Square diagonal matrices are especially important, however.
A square matrix \(M \) for which \(M_{ij} = M_{ji}\) for every \(i\) and \(j\) is called a symmetric matrix. Here are some examples.
\begin{equation*} \left[\begin{array}{rrr} -2 \amp 1 \amp 2 \\ 1 \amp 4 \amp 5 \\ 2 \amp 5 \amp 3 \\ \end{array}\right], \quad \left[\begin{array}{rrr} 0 \amp 2 \amp 0 \\ 2 \amp 4 \amp 0 \\ 0 \amp 0 \amp 3 \\ \end{array}\right], \quad \left[\begin{array}{rrr} 1 \amp 0 \amp 0 \\ 0 \amp 4 \amp 0 \\ 0 \amp 0 \amp 3 \\ \end{array}\right] \end{equation*}
The symmetry of a symmetric matrix is symmetry across the main diagonal: If we exchange the rows and the columns, we get the same matrix. The matrix obtained by exchanging rows and columns of a matrix \(A \) is called the transpose of \(A\text{,}\) written \(A^\top\text{.}\) More formally, the transpose satisfies
\begin{equation*} A^\top_{ij} = A_{ji} \end{equation*}
for every \(i\) and \(j\text{.}\) For example,
\begin{equation*} \left[\begin{array}{rrr} -2 \amp -1 \amp 3 \\ 1 \amp 4 \amp 0 \\ 2 \amp 5 \amp 3 \\ \end{array}\right]^\top = \left[\begin{array}{rrr} -2 \amp 1 \amp 2 \\ -1 \amp 4 \amp 5 \\ 3 \amp 0 \amp 3 \\ \end{array}\right]\text{.} \end{equation*}
Square diagonal matrices are always symmetric and symmetric matrices are always square. So we will use the terms square diagonal matrix and symmetric diagonal matrix interchangeably.
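These special matrices are easy to build and test in Python. The sketch below assumes NumPy is imported as `np` and uses `np.eye`, `np.diag`, and the `.T` attribute; the variable names are chosen here for illustration, and the matrices are taken from the examples above.

```python
import numpy as np

I3 = np.eye(3, dtype=int)     # the 3 x 3 identity matrix
D = np.diag([-2, 4, 3])       # square diagonal matrix with the given diagonal entries
A = np.array([[-2, -1, 3],
              [ 1,  4, 0],
              [ 2,  5, 3]])
S = np.array([[-2, 1, 2],
              [ 1, 4, 5],
              [ 2, 5, 3]])

print(A.T)                          # transpose: rows and columns exchanged
print(np.array_equal(I3 @ A, A))    # True: multiplying by the identity changes nothing
print(np.array_equal(S, S.T))       # True: S is symmetric
print(np.array_equal(D, D.T))       # True: square diagonal matrices are symmetric
print(np.array_equal(A, A.T))       # False: A is not symmetric
```

Comparing a square matrix with its transpose is a direct way to test the condition \(M_{ij} = M_{ji}\) for every \(i\) and \(j\text{.}\)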

Subsection 1.4.8 Summary

In this section, we have found an especially simple way to express linear systems using matrix multiplication.
  • If \(A\) is an \(m\by n\) matrix and \(\xvec\) an \(n\)-dimensional vector, then \(A\xvec\) is the linear combination of the columns of \(A\) using the components of \(\xvec\) as weights. The vector \(A\xvec\) is \(m\)-dimensional.
  • The solutions to the equation \(A\xvec = \bvec\) are the same as the solutions to the linear system using coefficients from \(A\) and \(\bvec\text{.}\) We can represent all these coefficients in a single augmented matrix: \(\left[ \begin{array}{r|r} A \amp \bvec \end{array}\right]\text{.}\)
  • If \(A\) is an \(m\by n\) matrix and \(B\) is an \(n\by p\) matrix, we can form the product \(AB\text{.}\) \(AB\) will be an \(m\by p\) matrix. The columns of \(AB\) are the products of \(A\) and the columns of \(B\text{.}\)

Exercises 1.4.9 Exercises

1.

Suppose that \(A\) is a \(135\by2201\) matrix, and that \(\xvec\) is a vector. If \(A\xvec\) is defined, what is the dimension of \(\xvec\text{?}\) What is the dimension of \(A\xvec\text{?}\)

2.

Suppose that \(A \) is a \(3\by2\) matrix whose columns are \(\vvec_1\) and \(\vvec_2\text{;}\) that is,
\begin{equation*} A = \left[\begin{array}{rr} \vvec_1 \amp \vvec_2 \end{array} \right]\text{.} \end{equation*}
  1. What is the dimension of the vectors \(\vvec_1\) and \(\vvec_2\text{?}\)
  2. What is the product \(A\twovec{1}{0}\) in terms of \(\vvec_1\) and \(\vvec_2\text{?}\) What is the product \(A\twovec{0}{1}\text{?}\) What is the product \(A\twovec{2}{3}\text{?}\)
  3. If we know that
    \begin{equation*} A\twovec{1}{0} = \threevec{3}{-2}{1},~~~ A\twovec{0}{1} = \threevec{0}{3}{2}, \end{equation*}
    what is the matrix \(A\text{?}\)

3.

Suppose that the matrix \(A = \left[\begin{array}{rr} \vvec_1 \amp \vvec_2 \end{array}\right]\) where \(\vvec_1\) and \(\vvec_2\) are shown in Figure 1.4.10.
Figure 1.4.10. Two vectors \(\vvec_1\) and \(\vvec_2\) that form the columns of the matrix \(A\text{.}\)
  1. What is the shape of the matrix \(A\text{?}\)
  2. On Figure 1.4.10, indicate the vectors
    \begin{equation*} A\twovec{1}{0}, ~~~A\twovec{2}{3}, ~~~A\twovec{0}{-3}\text{.} \end{equation*}
  3. Find all vectors \(\xvec\) such that \(A\xvec=\bvec\text{.}\)
  4. Find all vectors \(\xvec\) such that \(A\xvec = \zerovec\text{.}\)