
Chapter 2: Solving Linear Systems and Matrix Methods

This chapter introduces the fundamental problem of linear algebra: solving systems of linear equations. You'll learn two complementary ways to understand these systems—through rows and columns—and discover how to express them using matrices. The chapter builds toward the powerful elimination method that systematically transforms any solvable system into a form where the answer becomes obvious.


2.1 Vectors and Linear Equations

The Central Problem

Linear algebra begins with a practical question: how do we solve systems of linear equations? Let's start with something concrete—two equations in two unknowns:

$$\begin{aligned} x - 2y &= 1 \\ 3x + 2y &= 11 \end{aligned}$$

We're looking for a point $(x, y)$ that satisfies both equations simultaneously.

Two Pictures of the Same System

There are two fundamentally different ways to visualize what's happening. Each reveals important insights.

The Row Picture

In the row picture, each equation represents a straight line in the plane. When you plot both lines, their intersection point is the solution. For our example, the lines meet at $(3, 1)$. This geometric view is intuitive—in 3D you'd see planes, and in higher dimensions you'd see hyperplanes.

Row Picture: Each equation $a_i \cdot x = b_i$ (the dot product of row $i$ with the unknown vector $x$) describes a line (in 2D), a plane (in 3D), or a hyperplane. The solution is where all these objects meet.

The Column Picture

In the column picture, we rearrange the same system as a vector equation:

$$x \begin{bmatrix} 1 \\ 3 \end{bmatrix} + y \begin{bmatrix} -2 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 11 \end{bmatrix}$$

Now we're asking: what combination of the column vectors on the left produces the vector on the right? This is a linear combination problem. With $x = 3$ and $y = 1$, we get exactly the right side:

$$3 \begin{bmatrix} 1 \\ 3 \end{bmatrix} + 1 \begin{bmatrix} -2 \\ 2 \end{bmatrix} = \begin{bmatrix} 3 \\ 9 \end{bmatrix} + \begin{bmatrix} -2 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 11 \end{bmatrix}$$

Column Picture: The equation $Ax = b$ asks for a linear combination of the columns of $A$ that produces $b$. We multiply the first column by $x_1$, the second by $x_2$, and so on, then add them up.

The column picture is less intuitive at first, but it's more powerful. It naturally extends to higher dimensions and reveals deep structure in the problem.

Basic Operations on Vectors

To work with vector equations, we need two operations:

Scalar Multiplication scales a vector by a number. Multiplying the vector $\begin{bmatrix} 1 \\ 3 \end{bmatrix}$ by 3 gives $\begin{bmatrix} 3 \\ 9 \end{bmatrix}$.

Vector Addition combines two vectors component by component. Adding $\begin{bmatrix} 3 \\ 9 \end{bmatrix}$ and $\begin{bmatrix} -2 \\ 2 \end{bmatrix}$ gives $\begin{bmatrix} 1 \\ 11 \end{bmatrix}$.

These two operations together produce linear combinations: expressions like $x_1 \mathbf{a}_1 + x_2 \mathbf{a}_2 + \cdots + x_n \mathbf{a}_n$.
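These two operations are easy to internalize by coding them. A minimal sketch in plain Python, using lists as vectors (the helper names are illustrative, not from the text):

```python
def scale(c, v):
    """Scalar multiplication: multiply every component of v by c."""
    return [c * x for x in v]

def add(u, v):
    """Vector addition: combine two vectors component by component."""
    return [a + b for a, b in zip(u, v)]

def linear_combination(coeffs, vectors):
    """Compute x1*a1 + x2*a2 + ... + xn*an from the two basic operations."""
    result = [0] * len(vectors[0])
    for c, v in zip(coeffs, vectors):
        result = add(result, scale(c, v))
    return result

# The chapter's example: 3*[1, 3] + 1*[-2, 2] = [1, 11]
print(linear_combination([3, 1], [[1, 3], [-2, 2]]))  # [1, 11]
```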

Matrix Notation

When we collect the coefficients of our system into a rectangular array, we get the coefficient matrix:

$$A = \begin{bmatrix} 1 & -2 \\ 3 & 2 \end{bmatrix}$$

This is a $2 \times 2$ matrix (2 rows, 2 columns). The unknown vector is $\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}$ and the right-hand side is $\mathbf{b} = \begin{bmatrix} 1 \\ 11 \end{bmatrix}$.

Matrix Equation: The system $Ax = b$ is a compact way to write all $m$ equations at once. When interpreted by rows, each equation is a dot product: $(\text{row } i) \cdot x = b_i$. When interpreted by columns, we have a linear combination: $Ax = x_1(\text{column } 1) + \cdots + x_n(\text{column } n) = b$.

Matrix Multiplication: Two Interpretations

Understanding matrix multiplication is crucial. There are two equally valid ways to compute $Ax$:

By rows: Each component of $Ax$ comes from the dot product of that row with $\mathbf{x}$.

$$Ax = \begin{bmatrix} (\text{row } 1) \cdot \mathbf{x} \\ (\text{row } 2) \cdot \mathbf{x} \\ \vdots \\ (\text{row } m) \cdot \mathbf{x} \end{bmatrix}$$

By columns: $Ax$ is a linear combination of the columns of $A$, with coefficients from $\mathbf{x}$.

$$Ax = x_1(\text{column } 1) + x_2(\text{column } 2) + \cdots + x_n(\text{column } n)$$

For our example with $x = 3, y = 1$:

  • Row interpretation: $(1)(3) + (-2)(1) = 1$ and $(3)(3) + (2)(1) = 11$.
  • Column interpretation: $3\begin{bmatrix} 1 \\ 3 \end{bmatrix} + 1\begin{bmatrix} -2 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 11 \end{bmatrix}$.

Both give the same answer. The row interpretation is computational. The column interpretation is conceptual.
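Both interpretations can be checked numerically. A small Python sketch (pure lists, no libraries; the function names are my own):

```python
def matvec_by_rows(A, x):
    """Row interpretation: each output component is (row i of A) dot x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matvec_by_columns(A, x):
    """Column interpretation: Ax as x1*(col 1) + x2*(col 2) + ..."""
    m, n = len(A), len(A[0])
    result = [0] * m
    for j in range(n):          # walk over columns of A
        for i in range(m):
            result[i] += x[j] * A[i][j]
    return result

A = [[1, -2], [3, 2]]
x = [3, 1]
print(matvec_by_rows(A, x))     # [1, 11]
print(matvec_by_columns(A, x))  # [1, 11]
```

The two loops touch the entries of $A$ in different orders but produce the same vector, which is exactly the point of the dual picture.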

Three Dimensions and Beyond

With three equations and three unknowns, the row picture shows three planes in 3D space. They normally intersect at a single point (if a solution exists). The column picture asks whether the right-hand side $\mathbf{b}$ can be written as a combination of three column vectors.

When we move to higher dimensions, the row picture becomes impossible to visualize, but the column picture's logic remains unchanged: we're still asking whether a vector lives in the space spanned by certain columns.

📝 Section Recap: A system of linear equations can be visualized in two ways. The row picture shows hyperplanes in $n$-dimensional space meeting at a point. The column picture asks whether a vector $\mathbf{b}$ is a linear combination of the columns of matrix $A$. Matrix notation $Ax = b$ encodes the entire system compactly. Matrix-vector multiplication can be computed row-by-row (dot products) or column-by-column (linear combination)—both give the same result. This dual perspective is essential to understanding linear algebra.


2.2 The Idea of Elimination

The Elimination Method

The most practical way to solve linear systems is elimination. The goal is to transform the system into upper triangular form—where unknowns drop out one by one, and the last equation contains only the last unknown.

Let's see how this works on our original system:

Before elimination:

$$\begin{aligned} x - 2y &= 1 \\ 3x + 2y &= 11 \end{aligned}$$

To eliminate $x$ from the second equation, we subtract 3 times the first equation from the second. Since the coefficient of $x$ in the first equation is 1, the multiplier is $\frac{3}{1} = 3$.

After elimination:

$$\begin{aligned} x - 2y &= 1 \\ 8y &= 8 \quad \text{(gives } y = 1\text{)} \end{aligned}$$

Now we solve by back substitution: from the second equation, $y = 1$. Substituting into the first: $x - 2 = 1$, so $x = 3$.

This process generalizes beautifully. For a system with $n$ equations and $n$ unknowns:

  1. Use the first equation to eliminate the first unknown from all equations below it.
  2. Use the new second equation to eliminate the second unknown from all equations below it.
  3. Continue until you reach upper triangular form.
  4. Solve by back substitution.
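The four steps above can be sketched directly in Python. This is a bare-bones illustration on small systems (exact arithmetic, row exchange only when a pivot is exactly zero), not production numerical code:

```python
def solve(A, b):
    """Gaussian elimination to upper triangular form, then back substitution.
    A is a list of n rows; b is the right-hand side. Returns x with Ax = b."""
    n = len(A)
    # Work on an augmented copy [A | b] so both sides transform together.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        # Exchange rows if the pivot position holds a zero.
        if M[k][k] == 0:
            swap = next(i for i in range(k + 1, n) if M[i][k] != 0)
            M[k], M[swap] = M[swap], M[k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]          # the multiplier l_ik
            for j in range(k, n + 1):
                M[i][j] -= m * M[k][j]     # subtract m times the pivot row
    # Back substitution, starting from the last equation.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

print(solve([[1, -2], [3, 2]], [1, 11]))  # [3.0, 1.0]
```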

The Pivot and the Multiplier

The pivot is the entry we use to eliminate—the leading nonzero coefficient in the equation we're using as the "driver." In our example, the first pivot is 1 (the coefficient of $x$ in the first equation).

The multiplier is the factor by which we scale the pivot equation before subtracting. It's computed as:

$$\text{Multiplier} = \frac{\text{entry to eliminate}}{\text{pivot}}$$

In our example, the multiplier is $\ell_{21} = \frac{3}{1} = 3$.

Pivot and Multiplier: The pivot is the first nonzero entry in the row used for elimination. The multiplier is the ratio: (entry to eliminate) / (pivot). Pivots must be nonzero—if the pivot position contains zero, we exchange equations to move a nonzero entry into that position.

Three Equations in Three Unknowns

Consider:

$$\begin{aligned} 2x + 4y - 2z &= 2 \\ 4x + 9y - 3z &= 8 \\ -2x - 3y + 7z &= 10 \end{aligned}$$

Step 1: The first pivot is 2. Eliminate $x$ from equations 2 and 3.

  • Multiplier for row 2: $\ell_{21} = 4/2 = 2$. Subtract 2 times row 1 from row 2.
  • Multiplier for row 3: $\ell_{31} = -2/2 = -1$. Subtract $-1$ times row 1 from row 3 (i.e., add row 1 to row 3).

After Step 1:

$$\begin{aligned} 2x + 4y - 2z &= 2 \\ y + z &= 4 \\ y + 5z &= 12 \end{aligned}$$

Step 2: The second pivot is 1 (the coefficient of $y$ in the new row 2). Eliminate $y$ from row 3.

  • Multiplier for row 3: $\ell_{32} = 1/1 = 1$. Subtract row 2 from row 3.

After Step 2 (upper triangular form):

$$\begin{aligned} 2x + 4y - 2z &= 2 \\ y + z &= 4 \\ 4z &= 8 \end{aligned}$$

Back substitution: From row 3: $z = 2$. From row 2: $y + 2 = 4$, so $y = 2$. From row 1: $2x + 8 - 4 = 2$, so $x = -1$.

The solution is $(x, y, z) = (-1, 2, 2)$.
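Verifying a solution only needs the row picture: dot each row with the candidate vector and compare with the right-hand side. A quick check in Python:

```python
# Substitute (x, y, z) = (-1, 2, 2) back into the original three equations.
A = [[2, 4, -2], [4, 9, -3], [-2, -3, 7]]
b = [2, 8, 10]
x = [-1, 2, 2]

# Each residual is (row i) . x - b_i; all must be zero for a true solution.
residuals = [sum(a * xi for a, xi in zip(row, x)) - bi
             for row, bi in zip(A, b)]
print(residuals)  # [0, 0, 0]
```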

When Elimination Fails

Elimination is rock-solid when every pivot position contains a nonzero entry. But three things can go wrong:

1. No solution occurs when the equations are inconsistent. For example:

$$\begin{aligned} x - 2y &= 1 \\ 0y &= 8 \end{aligned}$$

There's no way to satisfy the second equation. The row picture shows two parallel lines that never meet.

2. Infinitely many solutions occur when one equation is redundant (a multiple of another). For example:

$$\begin{aligned} x - 2y &= 1 \\ 0y &= 0 \end{aligned}$$

The second equation places no constraint. Any point on the line $x - 2y = 1$ is a solution. The value of $y$ is "free"—we can choose it arbitrarily, and $x$ adjusts to match.

3. Zero in the pivot position prevents immediate elimination, but we can often fix it with a row exchange. If row $i$ has a zero in the pivot position but a nonzero entry appears below it, we swap those rows.

A system whose elimination reaches a full set of $n$ nonzero pivots is called nonsingular, and it is guaranteed to have a unique solution.
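The three outcomes can be detected mechanically during elimination: look at what remains after the pivot entries are cleared. A hypothetical $2 \times 2$ classifier in Python (the argument layout and return strings are my own):

```python
def classify_2x2(a11, a12, b1, a21, a22, b2):
    """One elimination step on a 2x2 system, then classify the outcome.
    Assumes the first pivot a11 is nonzero."""
    m = a21 / a11                   # the multiplier l21
    a22_new = a22 - m * a12         # second pivot after elimination
    b2_new = b2 - m * b1            # right-hand side of the new equation 2
    if a22_new != 0:
        return "unique solution"          # full set of pivots: nonsingular
    if b2_new != 0:
        return "no solution"              # 0y = (nonzero): inconsistent
    return "infinitely many solutions"    # 0y = 0: redundant equation

print(classify_2x2(1, -2, 1, 3, 2, 11))   # unique solution
print(classify_2x2(1, -2, 1, 3, -6, 11))  # no solution (0y = 8)
print(classify_2x2(1, -2, 1, 3, -6, 3))   # infinitely many solutions (0y = 0)
```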

📝 Section Recap: The elimination method transforms a system into upper triangular form through systematic row operations. The pivot is the entry used for elimination; the multiplier is the ratio of the entry to eliminate to the pivot. Back substitution then solves the triangular system efficiently. Elimination can fail if a pivot position contains zero (solved by row exchange), if the system is inconsistent (no solution), or if equations are redundant (infinitely many solutions). The geometric view: in the row picture, planes either meet at a point (unique solution), along a line (infinitely many), or don't meet (no solution). In the column picture, $\mathbf{b}$ either can be written uniquely as a combination of the columns, lies in the span with multiple combinations, or lies outside the span entirely.


2.3 Elimination Using Matrices

Matrix Representation of Elimination Steps

Each step of elimination can be represented by multiplying by a matrix. This transforms the system $Ax = b$ into an upper triangular system $Ux = c$, where we can apply back substitution.

The elimination matrix $E_{ij}$ performs a single elimination step: it subtracts $\ell$ times row $j$ from row $i$. The matrix has the form:

  • Start with the identity matrix $I$.
  • Change the $(i, j)$ entry from 0 to $-\ell$.

For example, to subtract 2 times row 1 from row 2:

$$E_{21} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

When we multiply $E_{21}A$, the result is $A$ with 2 times row 1 subtracted from row 2: the entry below the first pivot becomes zero.
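Building $E_{21}$ and applying it is a two-line exercise. A sketch in plain Python (lists as matrices; helper names are illustrative, and rows are 0-indexed in the code even though the text counts from 1):

```python
def matmul(A, B):
    """Matrix product: entry (i, j) of AB is (row i of A) dot (column j of B)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def elimination_matrix(n, i, j, l):
    """Identity with entry (i, j) set to -l: subtracts l times row j from row i."""
    E = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    E[i][j] = -l
    return E

A = [[2, 4, -2], [4, 9, -3], [-2, -3, 7]]
E21 = elimination_matrix(3, 1, 0, 2)   # subtract 2 * (row 1) from (row 2)
print(E21)             # [[1, 0, 0], [-2, 1, 0], [0, 0, 1]]
print(matmul(E21, A))  # [[2, 4, -2], [0, 1, 1], [-2, -3, 7]]
```

The product reproduces Step 1 of the worked example: row 2 of $A$ becomes $(0, 1, 1)$.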

The Power of Matrix Multiplication

Once we think in terms of matrices, elimination becomes elegant. If $E_1, E_2, E_3, \ldots$ are the elimination matrices for successive steps, then:

$$E_n E_{n-1} \cdots E_2 E_1 A = U$$

where $U$ is the upper triangular result. And simultaneously:

$$E_n E_{n-1} \cdots E_2 E_1 b = c$$

This tells us that the overall transformation is a single matrix multiplication by the product $E = E_n E_{n-1} \cdots E_2 E_1$.

Elimination as Matrix Multiplication: Each row operation can be expressed as left-multiplication by an elementary matrix. Stringing together all the elimination steps gives $EA = U$ and $Eb = c$, so we've transformed $Ax = b$ into the triangular system $Ux = c$.

The Augmented Matrix

In practice, we don't track the elimination matrices separately. Instead, we work with the augmented matrix $[A \mid b]$, which combines the coefficient matrix and right-hand side into one rectangular array:

$$[A \mid b] = \left[\begin{array}{ccc|c} 2 & 4 & -2 & 2 \\ 4 & 9 & -3 & 8 \\ -2 & -3 & 7 & 10 \end{array}\right]$$

When we perform row operations, both the left (matrix AAA) and right (vector bbb) sides are transformed together. This ensures that we're solving the correct system throughout.
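Row-reducing this augmented matrix reproduces the worked example from Section 2.2 in three operations. A Python sketch (one row per list, with the right-hand side carried as a fourth entry; `subtract_rows` is my own helper):

```python
# The augmented matrix [A | b] from the text, one augmented row per list.
M = [[2, 4, -2, 2], [4, 9, -3, 8], [-2, -3, 7, 10]]

def subtract_rows(M, i, j, l):
    """Subtract l times row j from row i, across the whole augmented row."""
    M[i] = [a - l * b for a, b in zip(M[i], M[j])]

subtract_rows(M, 1, 0, 2)    # l21 = 4/2 = 2
subtract_rows(M, 2, 0, -1)   # l31 = -2/2 = -1
subtract_rows(M, 2, 1, 1)    # l32 = 1/1 = 1
print(M)  # [[2, 4, -2, 2], [0, 1, 1, 4], [0, 0, 4, 8]]
```

The final array is exactly the upper triangular system $Ux = c$, ready for back substitution.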

Permutation Matrices and Row Exchanges

Sometimes we need to rearrange the order of equations to move a nonzero entry into the pivot position. A permutation matrix $P$ does exactly this.

For example, to exchange rows 2 and 3 of any $3 \times 3$ matrix, multiply by:

$$P_{23} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$

Multiplying $P_{23}$ times a matrix swaps its rows 2 and 3.

Row Exchange Matrix: A permutation matrix $P_{ij}$ is the identity matrix with rows $i$ and $j$ exchanged. Multiplying by $P_{ij}$ on the left exchanges rows $i$ and $j$ of any matrix.
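One way to see the row-exchange effect is to multiply $P_{23}$ against a matrix with recognizable rows. A quick Python check (the sample matrix is arbitrary):

```python
def matmul(A, B):
    """Matrix product: entry (i, j) of AB is (row i of A) dot (column j of B)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# P23: the identity with rows 2 and 3 exchanged (1-indexed, as in the text).
P23 = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(matmul(P23, A))  # [[1, 2, 3], [7, 8, 9], [4, 5, 6]]
```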

Identity, Elementary, and Permutation Matrices

Three special matrices appear constantly in elimination:

Identity Matrix: The $n \times n$ matrix $I = [e_1, \ldots, e_n]$ has 1's on the diagonal and 0's elsewhere. $Ix = x$ for any vector $x$.

Elementary Matrix or Elimination Matrix: $E_{ij}$ is the identity with one additional nonzero entry $-\ell$ in position $(i, j)$. It subtracts $\ell$ times row $j$ from row $i$.

Permutation Matrix: $P_{ij}$ is the identity with rows $i$ and $j$ exchanged. $P_{ij}$ swaps rows $i$ and $j$ of any matrix it multiplies.

Matrix Multiplication and Associativity

A key property is that matrix multiplication is associative:

$$(AB)C = A(BC)$$

This is why we can rearrange products of elimination matrices. If we first apply $E_2$ and then $E_1$, the combined effect is $E_1 E_2$—and the parentheses don't matter.

Warning: Matrix multiplication is NOT commutative. Usually $AB \neq BA$. When $E$ subtracts 2 times row 1 from row 2, we write $EA$. If we multiply on the right instead ($AE$), the result is different—now columns are modified, not rows. To apply row operations, we always multiply on the left.
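The left-versus-right distinction is concrete enough to demonstrate in a few lines of Python (the sample matrix $A$ is arbitrary):

```python
def matmul(A, B):
    """Matrix product: entry (i, j) of AB is (row i of A) dot (column j of B)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

E = [[1, 0], [-2, 1]]   # subtracts 2 times row 1 from row 2 (when on the left)
A = [[1, 3], [2, 7]]

print(matmul(E, A))  # [[1, 3], [0, 1]]    -> row 2 of A changed
print(matmul(A, E))  # [[-5, 3], [-12, 7]] -> columns of A changed instead
```

$EA$ performs the row operation; $AE$ instead subtracts 2 times column 2 from column 1. Same matrices, entirely different results.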

📝 Section Recap: Elimination can be expressed as matrix multiplication. Each row operation is an elementary matrix multiplying on the left: subtracting $\ell$ times row $j$ from row $i$ corresponds to $E_{ij}A$. All elimination steps combine into a single matrix $E$ such that $EA = U$ (upper triangular). We solve $Ax = b$ by computing $c = Eb$, then using back substitution on $Ux = c$. The augmented matrix $[A \mid b]$ tracks both the coefficient matrix and right-hand side together. Permutation matrices exchange rows when needed to place nonzero pivots on the diagonal. Matrix multiplication is associative (so $(AB)C = A(BC)$) but not commutative (usually $AB \neq BA$).


2.4 Rules for Matrix Operations

Basic Facts About Matrices

A matrix is a rectangular array of numbers arranged in rows and columns. An $m \times n$ matrix has $m$ rows and $n$ columns.

Matrix Addition: Matrices of the same size can be added entry-by-entry. We can scale any matrix by multiplying each entry by a constant $c$. The zero matrix (all entries zero) plus any matrix gives that matrix back. Multiplying a matrix by $-1$ reverses all signs.

Matrix addition works like vector addition—intuitive and component-wise. Matrix multiplication is trickier.

When Can We Multiply Matrices?

To multiply matrices $A$ and $B$, the number of columns of $A$ must equal the number of rows of $B$:

$$A_{m \times n} \text{ times } B_{n \times p} = C_{m \times p}$$

The resulting matrix $C$ has $m$ rows and $p$ columns.

Fundamental Law of Matrix Multiplication: For matrices to be compatible for multiplication: $A$ has $n$ columns, $B$ has $n$ rows, and $AB = C$ has the same number of rows as $A$ and the same number of columns as $B$. We can compute $AB$ in three ways: (1) rows of $A$ times $B$, (2) $A$ times columns of $B$, or (3) columns of $A$ times rows of $B$.

Computing Matrix Products

Let's compute $AB$ where $A = \begin{bmatrix} 2 & 3 \\ 1 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 & 2 \\ 3 & 1 & 0 \end{bmatrix}$.

By rows: Each row of $AB$ is that row of $A$ times all of $B$:

  • Row 1 of $AB$: $(2, 3) \begin{bmatrix} 1 & 0 & 2 \\ 3 & 1 & 0 \end{bmatrix} = (11, 3, 4)$
  • Row 2 of $AB$: $(1, 4) \begin{bmatrix} 1 & 0 & 2 \\ 3 & 1 & 0 \end{bmatrix} = (13, 4, 2)$

By columns: Each column of $AB$ is $A$ times that column of $B$:

  • Column 1 of $AB$: $A \begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 11 \\ 13 \end{bmatrix}$
  • Column 2 of $AB$: $A \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$
  • Column 3 of $AB$: $A \begin{bmatrix} 2 \\ 0 \end{bmatrix} = \begin{bmatrix} 4 \\ 2 \end{bmatrix}$

Both give: $AB = \begin{bmatrix} 11 & 3 & 4 \\ 13 & 4 & 2 \end{bmatrix}$
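The row and column computations can be written as two separate Python functions and checked against each other (function names are illustrative):

```python
A = [[2, 3], [1, 4]]
B = [[1, 0, 2], [3, 1, 0]]

def matmul_by_rows(A, B):
    """Row i of AB is (row i of A) times B: dot products with each column of B."""
    return [[sum(a * B[k][j] for k, a in enumerate(row))
             for j in range(len(B[0]))] for row in A]

def matmul_by_columns(A, B):
    """Column j of AB is A times (column j of B): a combination of A's columns."""
    m, n, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(m)]
    for j in range(p):
        for k in range(n):          # column j of B supplies the coefficients
            for i in range(m):
                C[i][j] += B[k][j] * A[i][k]
    return C

print(matmul_by_rows(A, B))     # [[11, 3, 4], [13, 4, 2]]
print(matmul_by_columns(A, B))  # [[11, 3, 4], [13, 4, 2]]
```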

Key Properties

Associative Law: $(AB)C = A(BC)$. The order of operations doesn't matter; you can rearrange parentheses. This is crucial for elimination, where we compose multiple row operations.

Non-Commutativity: In general, $AB \neq BA$. Even if both products are defined, they usually give different answers. This is fundamentally different from arithmetic with numbers.

Distributive Law: $A(B + C) = AB + AC$ and $(A + B)C = AC + BC$. These work as you'd expect.

Block Multiplication: Matrices can be partitioned into blocks and multiplied as though the blocks were entries. If $A = [A_1 \mid A_2]$ (two column blocks) and $B = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix}$ (two row blocks), then $AB = A_1 B_1 + A_2 B_2$, provided the block dimensions are compatible.
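The block rule can be verified on the example from this section by splitting $A$ into its two columns and $B$ into its two rows; this is also the "columns of $A$ times rows of $B$" computation from the Fundamental Law. A Python sketch:

```python
def matmul(A, B):
    """Matrix product: entry (i, j) of AB is (row i of A) dot (column j of B)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    """Entry-wise sum of two same-size matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[2, 3], [1, 4]]
B = [[1, 0, 2], [3, 1, 0]]
A1, A2 = [[2], [1]], [[3], [4]]   # column blocks of A
B1, B2 = [B[0]], [B[1]]           # row blocks of B

# Block rule: AB = A1*B1 + A2*B2 (each term is a column times a row)
print(matadd(matmul(A1, B1), matmul(A2, B2)))  # [[11, 3, 4], [13, 4, 2]]
print(matmul(A, B))                            # same answer
```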

An Important Warning

The identity matrix III is special:

Identity Matrix: $I$ is the matrix with 1's on the diagonal and 0's elsewhere. For any matrix $A$, we have $IA = A$ and $AI = A$. It's the multiplicative identity for matrices, just as 1 is for numbers.

There are many matrices for which no inverse exists, making matrix algebra fundamentally different from arithmetic. A matrix that has an inverse is called invertible or nonsingular.

📝 Section Recap: Matrix addition is entry-wise and works only for matrices of the same size. Matrix multiplication requires the number of columns of the left matrix to equal the number of rows of the right matrix. The $(i, j)$ entry of $AB$ is the dot product of row $i$ of $A$ with column $j$ of $B$. Matrix multiplication can be computed row-by-row, column-by-column, or block-by-block. Multiplication is associative (allowing us to rearrange parentheses) but not commutative (usually $AB \neq BA$). The identity matrix $I$ satisfies $IA = A$ and $AI = A$. Many matrices lack inverses, a key difference from arithmetic with numbers.


Key Concepts Summary

Two Views of Linear Systems:

  • Row picture: $n$ equations give $n$ hyperplanes; the solution is where they meet.
  • Column picture: Combine columns of $A$ to produce the vector $\mathbf{b}$.

Elimination Method:

  • Transform $Ax = b$ into upper triangular form $Ux = c$ via row operations.
  • Solve by back substitution, starting from the last equation.
  • Pivots must be nonzero; exchange rows if needed.

Matrix Language:

  • Each elimination step is a matrix multiplication by an elementary matrix.
  • The augmented matrix $[A \mid b]$ keeps the system and right-hand side in sync.
  • Permutation matrices exchange rows; the identity matrix is multiplicative identity.

Matrix Multiplication:

  • $AB$ requires (columns of $A$) = (rows of $B$).
  • Compute by rows (dot products), by columns (linear combinations of columns), or by blocks.
  • Multiplication is associative but not commutative; most matrices lack inverses.

Study Tips

  1. Visualize both pictures: For 2D systems, sketch the lines. Each row's equation is a line in the plane, and the solution is the point where the lines cross.

  2. Practice elimination by hand: Perform elimination on $2 \times 2$ and $3 \times 3$ systems until the process becomes automatic. Identify the pivots and multipliers.

  3. Understand matrix notation: Rewrite each system as $Ax = b$. Recognize the columns of $A$ and see how they combine to give $\mathbf{b}$.

  4. Verify your solutions: After solving, substitute back into the original equations to confirm.

  5. Connect rows and columns: For every statement about the row picture, there's a dual statement about the column picture. This duality is the heart of linear algebra.

  6. Remember the failure modes: No solution (inconsistent system, parallel planes), infinitely many solutions (redundant equation, free variables), and zero pivot (exchange rows).

  7. Use matrix products correctly: When solving $Ax = b$ using elimination matrices, remember that we multiply on the left: $EA = U$, not on the right.