Preface
What’s different in the data science edition?
David Austin generously made the source of his Understanding Linear Algebra publicly available and licensed so that others could make modifications. The Data Science Edition is one example. You may be wondering how the two compare and which is the right version for you. Here is a summary of differences between the two editions.
Python instead of Sage.
This is probably the biggest and most noticeable change. This edition uses Python and important Python libraries for data science, like numpy, scipy, and pandas. Data scientists are much more likely to encounter these tools than they are to use Sage.
A different starting point.
The original version motivates linear algebra through an attempt to solve linear systems of equations. That is an important application, but why do we want to solve linear systems in the first place? And how is that related to data science?
This edition begins by exploring how vectors and matrices can be used to store data and emphasizing three complementary ways to think about vectors, which we might call the data science perspective, the geometry perspective, and the mathematical perspective. Right from the start, we want to develop skill in moving among these three ways of thinking.
Some additional data science applications.
Linear models make their first appearance much earlier, as an example of linear combinations and matrix-vector multiplication, even though least squares methods don’t come until later. Tensors (multi-dimensional arrays) are introduced immediately after matrices, in part because they help demystify how numpy approaches things like aggregation with matrices, and in part because many data science applications make use of higher-dimensional arrays.
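To give a flavor of these ideas, here is a minimal sketch using numpy. The data, the coefficient values, and all variable names are made up for illustration and are not taken from the text. It shows a linear model's predictions computed as a matrix-vector product (equivalently, a linear combination of the columns), followed by aggregation along an axis of a matrix and of a 3-dimensional array.

```python
import numpy as np

# Hypothetical data: 3 observations (rows) of 2 predictors (columns).
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Hypothetical coefficients for a linear model with no intercept.
beta = np.array([0.5, -1.0])

# Predictions as a matrix-vector product ...
predictions = X @ beta

# ... which is the same as a linear combination of the columns of X.
combination = beta[0] * X[:, 0] + beta[1] * X[:, 1]

# Aggregation along an axis: axis=0 collapses the rows (one value per
# column), axis=1 collapses the columns (one value per row).
column_means = X.mean(axis=0)
row_sums = X.sum(axis=1)

# The same axis idea extends to a 3-dimensional array (a tensor).
T = np.arange(24).reshape(2, 3, 4)
layer_totals = T.sum(axis=0)   # collapses the first axis; shape (3, 4)

print(predictions)
print(column_means, row_sums)
print(layer_totals.shape)
```

Thinking of `axis` as "the dimension that gets collapsed" is one way to read these calls; it applies unchanged whether the array has two dimensions or more.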
Over time, I hope to add more data science examples.
A different approach to the dot product.
The traditional approach begins with a computational form and links this to geometry using the Law of Cosines, a result that is unfamiliar to many students. The result is that the computational and geometric forms can seem unrelated, and projections can feel a bit mysterious.
Here we motivate dot products from a desire to compute projections and establish the connection to the computational formula without citing the Law of Cosines. Our hope is that this will make these topics feel more natural and intuitive. In the end, we arrive at the same place, relying on the important interplay between three ways of thinking: data-centric, geometric, and mathematical.
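As a small numerical illustration of how projections and the computational form of the dot product fit together, consider the sketch below. The vectors and variable names are chosen only for this example, and the code uses the standard projection formula rather than reproducing the development in the text.

```python
import numpy as np

# Two example vectors, chosen only for illustration.
a = np.array([3.0, 4.0])
b = np.array([5.0, 0.0])

# Computational form of the dot product: sum of componentwise products.
dot_by_hand = np.sum(a * b)
dot_numpy = a @ b

# Projection of b onto the line spanned by a: a scalar multiple of a,
# where the scalar is (a . b) / (a . a).
scale = (a @ b) / (a @ a)
projection = scale * a

# What is left over after projecting is orthogonal to a, so its dot
# product with a is zero (up to floating-point rounding).
residual = b - projection

print(dot_by_hand, dot_numpy)
print(projection)
print(np.isclose(a @ residual, 0.0))
```

The check on `residual` reflects the geometric side of the story: the projection is the multiple of `a` for which the leftover part of `b` is perpendicular to `a`.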