We will solve systems of linear equations algebraically using the elimination method. In other words, we will combine the equations in various ways to try to eliminate as many variables as possible from each equation. There are three valid operations we can perform on our system of equations:
Scaling: we can multiply both sides of an equation by a nonzero number.
Replacement: we can add a multiple of one equation to another, replacing the second equation with the result.
Swap: we can swap two equations.
Solving equations by elimination requires writing the variables and the equals sign over and over again, merely as placeholders: all that is changing in the equations is the coefficient numbers. We can make our life easier by extracting only the numbers, and putting them in a box:
This is called an augmented matrix. The word “augmented” refers to the vertical line, which we draw to remind ourselves where the equals sign belongs; a matrix is a grid of numbers without the vertical line.
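For instance (with coefficients chosen here purely for illustration), the system
\[
\begin{aligned}
x + 2y + 3z &= 6 \\
2x - 3y + 2z &= 14 \\
3x + y - z &= -2
\end{aligned}
\]
is recorded as the augmented matrix
\[
\left(\begin{array}{ccc|c}
1 & 2 & 3 & 6 \\
2 & -3 & 2 & 14 \\
3 & 1 & -1 & -2
\end{array}\right).
\]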
Augmented Matrices from Vector Equations
We already saw that systems of linear equations and vector equations are the same thing.
If you have a vector equation, such as the equation in (1.2.1),
then converting it to augmented matrix form is very easy. You can just read off directly that the augmented matrix is
That is to say, the columns of the augmented matrix are exactly the column-vectors appearing in the equation. The column of constants in the equation, on the right of the “$=$” sign, becomes the augmentation column of the matrix.
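For example (with numbers made up for illustration), the vector equation
\[
x_1\begin{pmatrix} 1 \\ 2 \end{pmatrix} + x_2\begin{pmatrix} 3 \\ 4 \end{pmatrix} = \begin{pmatrix} 5 \\ 6 \end{pmatrix}
\]
becomes the augmented matrix
\[
\left(\begin{array}{cc|c}
1 & 3 & 5 \\
2 & 4 & 6
\end{array}\right),
\]
whose first two columns are the vectors multiplying $x_1$ and $x_2,$ and whose augmentation column is the vector of constants.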
Row Operations
In augmented matrix notation, our three valid ways of manipulating our equations become row operations:
Scaling: multiply all entries in a row by a nonzero number.
Here the notation $R_1$ simply means “the first row”, and likewise for $R_2,$ $R_3,$ etc.
Replacement: add a multiple of one row to another, replacing the second row with the result.
Swap: swap two rows.
When we wrote our row operations above we used expressions like $R_2 = R_2 - 2R_1.$ Of course this does not mean that the second row is equal to the second row minus twice the first row. Instead it means that we are replacing the second row with the second row minus twice the first row. This kind of syntax is used frequently in computer programming when we want to change the value of a variable.
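For instance, here is a small sketch in Python (written just for this discussion, with made-up row values, using NumPy arrays to stand for the rows of an augmented matrix): the statement R2 = R2 - 2*R1 does not assert an equality; it overwrites the value stored in R2.

    import numpy as np

    # Two rows of an augmented matrix; the values are made up for illustration.
    R1 = np.array([1.0, 2.0, 3.0])
    R2 = np.array([2.0, 1.0, 4.0])

    # Replacement: overwrite R2 with "R2 minus twice R1".
    R2 = R2 - 2 * R1
    print(R2)  # [ 0. -3. -2.]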
In the previous subsection we saw how to translate a system of linear equations into an augmented matrix. We want to find an algorithm for “solving” such an augmented matrix. First we must decide what it means for an augmented matrix to be “solved”.
Definition
A matrix is in row echelon form if:
All zero rows are at the bottom.
The first nonzero entry of a row is to the right of the first nonzero entry of the row above.
Below the first nonzero entry of a row, all entries are zero.
Here is a picture of a matrix in row echelon form:
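In this sketch (drawn here generically), a boxed $\star$ denotes a nonzero entry and a plain $\star$ denotes an arbitrary number, possibly zero:
\[
\left(\begin{array}{ccccc}
\boxed{\star} & \star & \star & \star & \star \\
0 & \boxed{\star} & \star & \star & \star \\
0 & 0 & 0 & \boxed{\star} & \star \\
0 & 0 & 0 & 0 & 0
\end{array}\right)
\]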
Definition
A pivot is the first nonzero entry of a row of a matrix in row echelon form.
A matrix in row echelon form is generally easy to solve using back-substitution: the last nonzero row determines the value of one variable immediately, and substituting upward into the rows above determines the remaining variables, one row at a time. See this example.
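For instance (with entries chosen here just for illustration), the augmented matrix
\[
\left(\begin{array}{ccc|c}
1 & 1 & 2 & 9 \\
0 & 1 & 1 & 4 \\
0 & 0 & 1 & 1
\end{array}\right)
\]
is in row echelon form. The last row says $x_3 = 1$; substituting into the second row gives $x_2 + 1 = 4,$ so $x_2 = 3$; substituting both into the first row gives $x_1 + 3 + 2 = 9,$ so $x_1 = 4.$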
Definition
A matrix is in reduced row echelon form if it is in row echelon form, and in addition:
Each pivot is equal to 1.
Each pivot is the only nonzero entry in its column.
Here is a picture of a matrix in reduced row echelon form:
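In this sketch (again drawn generically), $\star$ denotes an arbitrary number; note that each pivot equals $1$ and is the only nonzero entry in its column:
\[
\left(\begin{array}{ccccc}
1 & 0 & \star & 0 & \star \\
0 & 1 & \star & 0 & \star \\
0 & 0 & 0 & 1 & \star \\
0 & 0 & 0 & 0 & 0
\end{array}\right)
\]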
A matrix in reduced row echelon form is in some sense completely solved. For example,
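consider the augmented matrix (entries again made up for illustration)
\[
\left(\begin{array}{ccc|c}
1 & 0 & 0 & 4 \\
0 & 1 & 0 & 3 \\
0 & 0 & 1 & 1
\end{array}\right),
\]
which is in reduced row echelon form. Its rows translate directly into the equations $x_1 = 4,$ $x_2 = 3,$ $x_3 = 1$: the matrix states its own solution, with no back-substitution needed. (This is in fact the reduced row echelon form of the matrix in the back-substitution illustration above.)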
When deciding if an augmented matrix is in (reduced) row echelon form, there is nothing special about the augmented column(s). Just ignore the vertical line.
If an augmented matrix is in reduced row echelon form, the corresponding linear system is viewed as solved. We will see below why this is the case, and we will show that any matrix can be put into reduced row echelon form using only row operations.
We can visualize a system of two linear equations in two variables as a pair of lines in $\mathbb{R}^2$ (red and blue, respectively, in the picture below) that intersect at the solution point. If we subtract the first equation from the second, we obtain a new second equation, and hence a new, equivalent system of equations. In terms of row operations on the augmented matrix, this is the replacement $R_2 = R_2 - R_1.$
What has happened geometrically is that the original blue line has been replaced with a new blue line. We can think of the blue line as rotating, or pivoting, around the solution. We used the pivot position in the matrix in order to make the blue line pivot like this. This is one possible explanation for the terminology “pivot”.
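As a concrete illustration (with equations chosen here for the sake of example), take the system $x + y = 4$ and $x - y = 2,$ whose lines intersect at $(3, 1).$ Subtracting the first equation from the second gives $-2y = -2,$ that is, $y = 1,$ so the new system is $x + y = 4$ and $y = 1.$ The line $x - y = 2$ has been replaced by the horizontal line $y = 1,$ which still passes through $(3, 1)$: it has pivoted about the solution. In matrix form,
\[
\left(\begin{array}{cc|c} 1 & 1 & 4 \\ 1 & -1 & 2 \end{array}\right)
\xrightarrow{R_2 = R_2 - R_1}
\left(\begin{array}{cc|c} 1 & 1 & 4 \\ 0 & -2 & -2 \end{array}\right).
\]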
Subsection 2.2.3 The Row Reduction Algorithm
Theorem
Every matrix is row equivalent to one and only one matrix in reduced row echelon form.
We will give an algorithm, called row reduction or Gaussian elimination, which demonstrates that every matrix is row equivalent to at least one matrix in reduced row echelon form.
The uniqueness statement is interesting—it means that, no matter how you row reduce, you always get the same matrix in reduced row echelon form.
This assumes, of course, that you only do the three legal row operations, and you don’t make any arithmetic errors.
We will not prove uniqueness, but maybe you can!
Algorithm (Row Reduction)
Step 1a: Swap the 1st row with a lower one so a leftmost nonzero entry is in the 1st row (if necessary).
Step 1b: Scale the 1st row so that its first nonzero entry is equal to 1.
Step 1c: Use row replacement so all entries below this 1 are 0.
Step 2a: Ignoring the 1st row, swap the 2nd row with a lower one so that the leftmost nonzero entry of the remaining rows is in the 2nd row (if necessary).
Step 2b: Scale the 2nd row so that its first nonzero entry is equal to 1.
Step 2c: Use row replacement so all entries below this 1 are 0.
Step 3a: Ignoring the rows above, swap the 3rd row with a lower one so that the leftmost nonzero entry of the remaining rows is in the 3rd row (if necessary).
etc.
Last Step: Use row replacement to clear all entries above the pivots, starting with the last pivot.
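To make the steps concrete, here is a minimal sketch of the algorithm in Python (written for this discussion; it stores the matrix as a list of rows of floats, follows the steps literally, and ignores questions of floating-point round-off):

    def row_reduce(matrix):
        """Return the reduced row echelon form of `matrix` (a list of lists of floats)."""
        A = [row[:] for row in matrix]   # work on a copy
        m, n = len(A), len(A[0])
        pivots = []                      # (row, column) position of each pivot
        r = 0                            # next row in which to place a pivot
        for c in range(n):
            # Step a: find a row at or below r with a nonzero entry in column c, and swap it up.
            i = next((i for i in range(r, m) if A[i][c] != 0), None)
            if i is None:
                continue                 # no pivot in this column; move one column to the right
            A[r], A[i] = A[i], A[r]
            # Step b: scale the pivot row so that the pivot equals 1.
            p = A[r][c]
            A[r] = [x / p for x in A[r]]
            # Step c: use row replacement to make every entry below the pivot zero.
            for i in range(r + 1, m):
                f = A[i][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
            pivots.append((r, c))
            r += 1
        # Last step: clear the entries above each pivot, starting with the last pivot.
        for r, c in reversed(pivots):
            for i in range(r):
                f = A[i][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        return A

    # Example: the augmented matrix of x + y = 4, x + 3y = 6 (values made up for illustration).
    print(row_reduce([[1.0, 1.0, 4.0], [1.0, 3.0, 6.0]]))
    # [[1.0, 0.0, 3.0], [0.0, 1.0, 1.0]]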