Chapter 2: Solving Linear Systems and Matrix Methods
This chapter introduces the fundamental problem of linear algebra: solving systems of linear equations. You'll learn two complementary ways to understand these systems—through rows and columns—and discover how to express them using matrices. The chapter builds toward the powerful elimination method that systematically transforms any solvable system into a form where the answer becomes obvious.
2.1 Vectors and Linear Equations
The Central Problem
Linear algebra begins with a practical question: how do we solve systems of linear equations? Let's start with something concrete—two equations in two unknowns:
x − 2y = 1
3x + 2y = 11
We're looking for a point (x,y) that satisfies both equations simultaneously.
Two Pictures of the Same System
There are two fundamentally different ways to visualize what's happening. Each reveals important insights.
The Row Picture
In the row picture, each equation represents a straight line in the plane. When you plot both lines, their intersection point is the solution. For our example, the lines meet at (3,1). This geometric view is intuitive—in 3D you'd see planes, and in higher dimensions you'd see hyperplanes.
Row Picture: Each equation $a_i \cdot x = b_i$ (the dot product of row i with the unknown vector x) describes a line (in 2D), a plane (in 3D), or a hyperplane (in higher dimensions). The solution is where all these objects meet.
The Column Picture
In the column picture, we rearrange the same system as a vector equation:
$$x\begin{bmatrix}1\\3\end{bmatrix}+y\begin{bmatrix}-2\\2\end{bmatrix}=\begin{bmatrix}1\\11\end{bmatrix}$$
Now we're asking: what combination of the column vectors on the left produces the vector on the right? This is a linear combination problem. With x=3 and y=1, we get exactly the right side:
$$3\begin{bmatrix}1\\3\end{bmatrix}+1\begin{bmatrix}-2\\2\end{bmatrix}=\begin{bmatrix}3\\9\end{bmatrix}+\begin{bmatrix}-2\\2\end{bmatrix}=\begin{bmatrix}1\\11\end{bmatrix}$$
Column Picture: The equation Ax=b asks for a linear combination of the columns of A that produces b. We multiply the first column by x1, the second by x2, and so on, then add them up.
The column picture is less intuitive at first, but it's more powerful. It naturally extends to higher dimensions and reveals deep structure in the problem.
Basic Operations on Vectors
To work with vector equations, we need two operations:
Scalar Multiplication scales a vector by a number. Multiplying the vector $\begin{bmatrix}1\\3\end{bmatrix}$ by 3 gives $\begin{bmatrix}3\\9\end{bmatrix}$.
Vector Addition combines two vectors component by component. Adding $\begin{bmatrix}3\\9\end{bmatrix}$ and $\begin{bmatrix}-2\\2\end{bmatrix}$ gives $\begin{bmatrix}1\\11\end{bmatrix}$.
These two operations together produce linear combinations: expressions like x1a1+x2a2+⋯+xnan.
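These two operations are easy to check numerically. A minimal sketch using the chapter's vectors (NumPy is an assumption of this example, not of the text):

```python
import numpy as np

# The two basic operations, then a linear combination, on the chapter's vectors.
v = np.array([1, 3])
w = np.array([-2, 2])

scaled = 3 * v            # scalar multiplication: (3, 9)
summed = scaled + w       # vector addition: (1, 11)

# The same result as the single linear combination 3*v + 1*w.
combo = 3 * v + 1 * w
```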
Matrix Notation
When we collect the coefficients of our system into a rectangular array, we get the coefficient matrix:
$$A=\begin{bmatrix}1&-2\\3&2\end{bmatrix}$$
This is a 2×2 matrix (2 rows, 2 columns). The unknown vector is $x=\begin{bmatrix}x\\y\end{bmatrix}$ and the right-hand side is $b=\begin{bmatrix}1\\11\end{bmatrix}$.
Matrix Equation: The system Ax=b is a compact way to write all m equations at once. When interpreted by rows, each equation is a dot product: (rowi)⋅x=bi. When interpreted by columns, we have a linear combination: Ax=x1(column 1)+⋯+xn(column n)=b.
Matrix Multiplication: Two Interpretations
Understanding matrix multiplication is crucial. There are two equally valid ways to compute Ax:
By rows: Each component of Ax comes from the dot product of that row with x.
$$Ax=\begin{bmatrix}(\text{row } 1)\cdot x\\(\text{row } 2)\cdot x\\\vdots\\(\text{row } m)\cdot x\end{bmatrix}$$
By columns: Ax is a linear combination of the columns of A, with coefficients from x.
Ax=x1(column 1)+x2(column 2)+⋯+xn(column n)
For our example with x=3,y=1:
- Row interpretation: (1)(3)+(−2)(1)=1 and (3)(3)+(2)(1)=11.
- Column interpretation: $3\begin{bmatrix}1\\3\end{bmatrix}+1\begin{bmatrix}-2\\2\end{bmatrix}=\begin{bmatrix}1\\11\end{bmatrix}$.
Both give the same answer. The row interpretation is computational. The column interpretation is conceptual.
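Both interpretations can be sketched in a few lines on the chapter's 2×2 example (NumPy is assumed for illustration):

```python
import numpy as np

# The chapter's system: A x = b with x = (3, 1).
A = np.array([[1, -2],
              [3,  2]])
x = np.array([3, 1])
b = np.array([1, 11])

# By rows: each component of Ax is a dot product of a row of A with x.
by_rows = np.array([A[i] @ x for i in range(A.shape[0])])

# By columns: Ax is a linear combination of the columns, weighted by x.
by_cols = x[0] * A[:, 0] + x[1] * A[:, 1]
```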
Three Dimensions and Beyond
With three equations and three unknowns, the row picture shows three planes in 3D space. They normally intersect at a single point (if a solution exists). The column picture asks whether the right-hand side b can be written as a combination of three column vectors.
When we move to higher dimensions, the row picture becomes impossible to visualize, but the column picture's logic remains unchanged: we're still asking whether a vector lives in the space spanned by certain columns.
📝 Section Recap: A system of linear equations can be visualized in two ways. The row picture shows hyperplanes in n-dimensional space meeting at a point. The column picture asks whether a vector b is a linear combination of the columns of matrix A. Matrix notation Ax=b encodes the entire system compactly. Matrix-vector multiplication can be computed row-by-row (dot products) or column-by-column (linear combination)—both give the same result. This dual perspective is essential to understanding linear algebra.
2.2 The Idea of Elimination
The Elimination Method
The most practical way to solve linear systems is elimination. The goal is to transform the system into upper triangular form—where unknowns drop out one by one, and the last equation contains only the last unknown.
Let's see how this works on our original system:
Before elimination:
x − 2y = 1
3x + 2y = 11
To eliminate x from the second equation, we subtract 3 times the first equation from the second. Since the coefficient of x in the first equation is 1, the multiplier is 3/1 = 3.
After elimination:
x − 2y = 1
8y = 8 (gives y = 1)
Now we solve by back substitution: from the second equation, y=1. Substituting into the first: x−2=1, so x=3.
This process generalizes beautifully. For a system with n equations and n unknowns:
- Use the first equation to eliminate the first unknown from all equations below it.
- Use the new second equation to eliminate the second unknown from all equations below it.
- Continue until you reach upper triangular form.
- Solve by back substitution.
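The steps above can be sketched as a small routine. This is an illustrative NumPy implementation, not the book's code, and it assumes every pivot turns out nonzero (no row exchanges):

```python
import numpy as np

def solve_by_elimination(A, b):
    """Forward elimination to upper triangular form, then back substitution.
    A sketch that assumes every pivot is nonzero (no row exchanges)."""
    A = A.astype(float)
    b = b.astype(float)
    n = len(b)
    # Forward elimination: zero out the entries below each pivot.
    for j in range(n - 1):
        for i in range(j + 1, n):
            m = A[i, j] / A[j, j]          # the multiplier l_ij
            A[i, j:] -= m * A[j, j:]
            b[i] -= m * b[j]
    # Back substitution, starting from the last equation.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x
```

For the 2×2 system above, `solve_by_elimination(np.array([[1, -2], [3, 2]]), np.array([1, 11]))` returns approximately `[3, 1]`.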
The Pivot and the Multiplier
The pivot is the entry we use to eliminate—the leading nonzero coefficient in the equation we're using as the "driver." In our example, the first pivot is 1 (the coefficient of x in the first equation).
The multiplier is the factor by which we scale the pivot equation before subtracting. It's computed as:
$$\text{Multiplier}=\frac{\text{entry to eliminate}}{\text{pivot}}$$
In our example, the multiplier is $\ell_{21} = 3/1 = 3$.
Pivot and Multiplier: The pivot is the first nonzero entry in the row used for elimination. The multiplier is the ratio: (entry to eliminate) / (pivot). Pivots must be nonzero—if the pivot position contains zero, we exchange equations to move a nonzero entry into that position.
Three Equations in Three Unknowns
Consider:
2x + 4y − 2z = 2
4x + 9y − 3z = 8
−2x − 3y + 7z = 10
Step 1: The first pivot is 2. Eliminate x from equations 2 and 3.
- Multiplier for row 2: ℓ21=4/2=2. Subtract 2 times row 1 from row 2.
- Multiplier for row 3: ℓ31=−2/2=−1. Subtract -1 times row 1 from row 3 (i.e., add row 1 to row 3).
After Step 1:
2x + 4y − 2z = 2
y + z = 4
y + 5z = 12
Step 2: The second pivot is 1 (the coefficient of y in the new row 2). Eliminate y from row 3.
- Multiplier for row 3: ℓ32=1/1=1. Subtract row 2 from row 3.
After Step 2 (upper triangular form):
2x + 4y − 2z = 2
y + z = 4
4z = 8
Back substitution: From row 3: z=2. From row 2: y+2=4, so y=2. From row 1: 2x+8−4=2, so x=−1.
The solution is (x,y,z)=(−1,2,2).
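As a check, the worked example can be handed to a library solver. NumPy's `np.linalg.solve` is used here purely for verification:

```python
import numpy as np

# The worked 3x3 system from this section.
A = np.array([[ 2,  4, -2],
              [ 4,  9, -3],
              [-2, -3,  7]])
b = np.array([2, 8, 10])

x = np.linalg.solve(A, b)   # should agree with the hand computation (-1, 2, 2)
```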
When Elimination Fails
Elimination is rock-solid when every pivot position contains a nonzero entry. But three things can go wrong:
1. No solution occurs when the equations are inconsistent. For example:
x − 2y = 1
0y = 8
There's no way to satisfy the second equation. The row picture shows two parallel lines that never meet.
2. Infinitely many solutions occur when one equation is redundant (a multiple of another). For example:
x − 2y = 1
0y = 0
The second equation places no constraint. Any point on the line x−2y=1 is a solution. The value of y is "free"—we can choose it arbitrarily, and x adjusts to match.
3. Zero in the pivot position prevents immediate elimination, but we can often fix it with a row exchange. If row i has a zero in the pivot position but a nonzero entry appears below it, we swap those rows.
Elimination that reaches a full set of pivots is called nonsingular, and it guarantees a unique solution.
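The zero-pivot repair in item 3 can be sketched directly. The array and index names below are illustrative, not from the text:

```python
import numpy as np

# A zero sits in the first pivot position; a row exchange repairs it.
A = np.array([[0., 2.],
              [3., 4.]])

j = 0
if A[j, j] == 0:
    # Find a nonzero entry below the pivot position and swap those rows.
    below = np.nonzero(A[j + 1:, j])[0]
    if below.size:                      # nonsingular case: a swap fixes it
        k = j + 1 + below[0]
        A[[j, k]] = A[[k, j]]           # exchange rows j and k
```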
📝 Section Recap: The elimination method transforms a system into upper triangular form through systematic row operations. The pivot is the entry used for elimination; the multiplier is the ratio of the entry to eliminate to the pivot. Back substitution then solves the triangular system efficiently. Elimination can fail if a pivot position contains zero (solved by row exchange), if the system is inconsistent (no solution), or if equations are redundant (infinitely many solutions). The geometric view: in the row picture, planes either meet at a point (unique solution), along a line (infinitely many), or don't meet (no solution). In the column picture, b either lies uniquely as a combination of the columns, lies in the span with multiple combinations, or lies outside the span entirely.
2.3 Elimination Using Matrices
Matrix Representation of Elimination Steps
Each step of elimination can be represented by multiplying by a matrix. This transforms the system Ax=b into an upper triangular system Ux=c, where we can apply back substitution.
The elimination matrix Eij performs a single elimination step: it subtracts ℓ times row j from row i. The matrix has the form:
- Start with the identity matrix I.
- Change the (i,j) entry from 0 to −ℓ.
For example, to subtract 2 times row 1 from row 2:
$$E_{21}=\begin{bmatrix}1&0&0\\-2&1&0\\0&0&1\end{bmatrix}$$
When we multiply E21A, the result is A with 2 times row 1 subtracted from row 2, so the x-entry in row 2 becomes zero.
The Power of Matrix Multiplication
Once we think in terms of matrices, elimination becomes elegant. If E1,E2,E3,… are the elimination matrices for successive steps, then:
EnEn−1⋯E2E1A=U
where U is the upper triangular result. And simultaneously:
EnEn−1⋯E2E1b=c
This tells us that the overall transformation is a single matrix multiplication by the product E=EnEn−1⋯E2E1.
Elimination as Matrix Multiplication: Each row operation can be expressed as left-multiplication by an elementary matrix. Stringing together all the elimination steps gives EA=U and Eb=c, so we've transformed Ax=b into the triangular system Ux=c.
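The boxed statement can be replayed on the worked 3×3 example: building each elimination matrix from the identity and multiplying on the left should reproduce the upper triangular U. A NumPy sketch (the helper `E` is a name invented for this example):

```python
import numpy as np

A = np.array([[ 2,  4, -2],
              [ 4,  9, -3],
              [-2, -3,  7]], dtype=float)

def E(n, i, j, l):
    """Elementary matrix: the n x n identity with -l in position (i, j), 0-based.
    Left-multiplying by it subtracts l times row j from row i."""
    M = np.eye(n)
    M[i, j] = -l
    return M

# Multipliers from the worked example: l21 = 2, l31 = -1, l32 = 1.
E21 = E(3, 1, 0,  2)
E31 = E(3, 2, 0, -1)
E32 = E(3, 2, 1,  1)

U = E32 @ E31 @ E21 @ A    # upper triangular result
```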
The Augmented Matrix
In practice, we don't track the elimination matrices separately. Instead, we work with the augmented matrix [A∣b], which combines the coefficient matrix and right-hand side into one rectangular array:
$$[A\mid b]=\left[\begin{array}{ccc|c}2&4&-2&2\\4&9&-3&8\\-2&-3&7&10\end{array}\right]$$
When we perform row operations, both the left (matrix A) and right (vector b) sides are transformed together. This ensures that we're solving the correct system throughout.
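A short sketch of this bookkeeping, applying the worked example's multipliers to the augmented array (NumPy assumed for illustration):

```python
import numpy as np

A = np.array([[ 2,  4, -2],
              [ 4,  9, -3],
              [-2, -3,  7]], dtype=float)
b = np.array([2., 8., 10.])

aug = np.hstack([A, b[:, None]])    # the augmented matrix [A | b]

# Row operations act on A and b together.
aug[1] -= 2 * aug[0]                # l21 = 2
aug[2] -= -1 * aug[0]               # l31 = -1 (i.e., add row 1 to row 3)
aug[2] -= 1 * aug[1]                # l32 = 1
```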
Permutation Matrices and Row Exchanges
Sometimes we need to rearrange the order of equations to move a nonzero entry into the pivot position. A permutation matrix P does exactly this.
For example, to exchange rows 2 and 3 of any 3×3 matrix, multiply by:
$$P_{23}=\begin{bmatrix}1&0&0\\0&0&1\\0&1&0\end{bmatrix}$$
Multiplying P23 times a matrix swaps its rows 2 and 3.
Row Exchange Matrix: A permutation matrix Pij is the identity matrix with rows i and j exchanged. Multiplying Pij on the left exchanges rows i and j of any matrix.
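A quick illustration: building P23 by permuting the rows of the identity, then applying it on the left (NumPy assumed):

```python
import numpy as np

# P23: the identity with rows 2 and 3 exchanged (1-based numbering, as in the text).
P23 = np.eye(3)[[0, 2, 1]]

M = np.arange(9).reshape(3, 3)      # any 3x3 matrix
swapped = P23 @ M                    # rows 2 and 3 of M are exchanged
```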
Identity, Elementary, and Permutation Matrices
Three special matrices appear constantly in elimination:
Identity Matrix: The n×n matrix I=[e1,…,en] has 1's on the diagonal and 0's elsewhere. Ix=x for any vector x.
Elementary Matrix or Elimination Matrix: Eij is the identity with one additional nonzero entry −ℓ in position (i,j). It subtracts ℓ times row j from row i. Eij subtracts a multiple of one row from another.
Permutation Matrix: Pij is the identity with rows i and j exchanged. Pij swaps rows i and j of any matrix it multiplies.
Matrix Multiplication and Associativity
A key property is that matrix multiplication is associative:
(AB)C=A(BC)
This is why we can combine products of elimination matrices. If we first apply E1 and then E2, the combined effect is the single matrix E2E1: the parentheses don't matter, since E2(E1A) = (E2E1)A.
Warning: Matrix multiplication is NOT commutative. Usually AB ≠ BA. When E subtracts 2 times row 1 from row 2, we write EA. If we multiply on the right instead (AE), the result is different: now columns are modified, not rows. To apply row operations, we always multiply on the left.
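The warning is easy to demonstrate. With the chapter's 2×2 matrix A and an elimination matrix E, EA and AE differ (NumPy assumed):

```python
import numpy as np

# E subtracts 2 times row 1 from row 2 of whatever it multiplies on the left.
E = np.array([[ 1, 0],
              [-2, 1]])
A = np.array([[1, -2],
              [3,  2]])

left  = E @ A     # row operation: row 2 becomes row 2 - 2 * row 1
right = A @ E     # a column operation instead: column 1 changes, not the rows
```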
📝 Section Recap: Elimination can be expressed as matrix multiplication. Each row operation is an elementary matrix multiplying on the left: subtracting ℓ times row j from row i corresponds to EijA. All elimination steps combine into a single matrix E such that EA=U (upper triangular). We solve Ax=b by computing Ux=Eb, then using back substitution. The augmented matrix [A∣b] tracks both the coefficient matrix and right-hand side together. Permutation matrices exchange rows when needed to place nonzero pivots on the diagonal. Matrix multiplication is associative (so (AB)C=A(BC)) but not commutative (usually AB ≠ BA).
2.4 Rules for Matrix Operations
Basic Facts About Matrices
A matrix is a rectangular array of numbers arranged in rows and columns. An m×n matrix has m rows and n columns.
Matrix Addition: Matrices of the same size can be added entry-by-entry. We can scale any matrix by multiplying each entry by a constant c. The zero matrix (all entries zero) plus any matrix gives that matrix back. Multiplying a matrix by −1 reverses all signs.
Matrix addition works like vector addition—intuitive and component-wise. Matrix multiplication is trickier.
When Can We Multiply Matrices?
To multiply matrices A and B, the number of columns of A must equal the number of rows of B:
$$A_{m\times n}\;B_{n\times p}=C_{m\times p}$$
The resulting matrix C has m rows and p columns.
Fundamental Law of Matrix Multiplication: For matrices to be compatible for multiplication: A has n columns, B has n rows, and AB=C has the same number of rows as A and the same number of columns as B. We can compute AB in three ways: (1) rows of A times B, (2) A times columns of B, or (3) columns of A times rows of B.
Computing Matrix Products
Let's compute AB where $A=\begin{bmatrix}2&3\\1&4\end{bmatrix}$ and $B=\begin{bmatrix}1&0&2\\3&1&0\end{bmatrix}$.
By rows: Each row of AB is that row of A times all of B:
- Row 1 of AB: $(2,3)\begin{bmatrix}1&0&2\\3&1&0\end{bmatrix}=(11,3,4)$
- Row 2 of AB: $(1,4)\begin{bmatrix}1&0&2\\3&1&0\end{bmatrix}=(13,4,2)$
By columns: Each column of AB is A times that column of B:
- Column 1 of AB: $A\begin{bmatrix}1\\3\end{bmatrix}=\begin{bmatrix}11\\13\end{bmatrix}$
- Column 2 of AB: $A\begin{bmatrix}0\\1\end{bmatrix}=\begin{bmatrix}3\\4\end{bmatrix}$
- Column 3 of AB: $A\begin{bmatrix}2\\0\end{bmatrix}=\begin{bmatrix}4\\2\end{bmatrix}$
Both give: $AB=\begin{bmatrix}11&3&4\\13&4&2\end{bmatrix}$
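All three recipes can be checked on this example; the third, columns of A times rows of B, is a sum of outer products (NumPy assumed):

```python
import numpy as np

A = np.array([[2, 3],
              [1, 4]])
B = np.array([[1, 0, 2],
              [3, 1, 0]])

# (1) Rows of A times B.
by_rows = np.vstack([A[i] @ B for i in range(2)])
# (2) A times columns of B.
by_cols = np.column_stack([A @ B[:, j] for j in range(3)])
# (3) Columns of A times rows of B: a sum of outer products.
by_outer = sum(np.outer(A[:, k], B[k]) for k in range(2))
```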
Key Properties
Associative Law: (AB)C=A(BC). The grouping doesn't matter: you can move the parentheses freely (though the left-to-right order of the factors is fixed). This is crucial for elimination, where we compose multiple row operations.
Non-Commutativity: In general, AB ≠ BA. Even if both products are defined, they usually give different answers. This is fundamentally different from arithmetic with numbers.
Distributive Law: A(B+C)=AB+AC and (A+B)C=AC+BC. These work as you'd expect.
Block Multiplication: Matrices can be partitioned into blocks and multiplied as though the blocks were entries. If $A=[A_1\mid A_2]$ (columns split into two blocks) and $B=\begin{bmatrix}B_1\\B_2\end{bmatrix}$ (rows split to match), then $AB=A_1B_1+A_2B_2$, provided the block dimensions are compatible.
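A sketch of block multiplication on small integer matrices, splitting A by columns and B by matching rows (NumPy assumed; the 2/2 split is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(2, 4))
B = rng.integers(-3, 4, size=(4, 3))

# Partition A into column blocks and B into matching row blocks.
A1, A2 = A[:, :2], A[:, 2:]
B1, B2 = B[:2, :], B[2:, :]

# Multiply blockwise: AB = A1 B1 + A2 B2.
blockwise = A1 @ B1 + A2 @ B2
```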
An Important Warning
The identity matrix I is special:
Identity Matrix: I is the matrix with 1's on the diagonal and 0's elsewhere. For any matrix A, we have IA=A and AI=A. It's the multiplicative identity for matrices, just as 1 is for numbers.
Unlike nonzero numbers, however, many matrices have no inverse at all, which makes matrix algebra fundamentally different from ordinary arithmetic. A matrix that has an inverse is called invertible or nonsingular.
📝 Section Recap: Matrix addition is entry-wise and works only for matrices of the same size. Matrix multiplication requires the number of columns of the left matrix to equal the number of rows of the right matrix. The (i,j) entry of AB is the dot product of row i of A with column j of B. Matrix multiplication can be computed row-by-row, column-by-column, or block-by-block. Multiplication is associative (allowing us to rearrange parentheses) but not commutative (usually AB ≠ BA). The identity matrix I satisfies IA=A and AI=A. Many matrices lack inverses, a key difference from arithmetic with numbers.
Key Concepts Summary
Two Views of Linear Systems:
- Row picture: n equations give n hyperplanes; the solution is where they meet.
- Column picture: Combine columns of A to produce the vector b.
Elimination Method:
- Transform Ax=b into upper triangular form Ux=c via row operations.
- Solve by back substitution, starting from the last equation.
- Pivots must be nonzero; exchange rows if needed.
Matrix Language:
- Each elimination step is a matrix multiplication by an elementary matrix.
- The augmented matrix [A∣b] keeps the system and right-hand side in sync.
- Permutation matrices exchange rows; the identity matrix I is the multiplicative identity.
Matrix Multiplication:
- AB requires columns of A = rows of B.
- Compute by rows (dot products), by columns (linear combinations), or by blocks.
- Multiplication is associative but not commutative; most matrices lack inverses.
Study Tips
- Visualize both pictures: For 2D systems, sketch the lines. Each row's equation describes a line in the plane; the solution is their intersection.
- Practice elimination by hand: Perform elimination on 2×2 and 3×3 systems until the process becomes automatic. Identify the pivots and multipliers.
- Understand matrix notation: Rewrite each system as Ax=b. Recognize the columns of A and see how they combine to give b.
- Verify your solutions: After solving, substitute back into the original equations to confirm.
- Connect rows and columns: For every statement about the row picture, there's a dual statement about the column picture. This duality is the heart of linear algebra.
- Remember the failure modes: No solution (inconsistent system, parallel planes), infinitely many solutions (redundant equation, free variables), and zero pivot (exchange rows).
- Use matrix products correctly: When solving Ax=b using elimination matrices, remember that we multiply on the left: EA=U, not on the right.