Section 9.5 Greatest divisors, Bézout and the Euclidean algorithm
In our (brief) discussion above of division in modular arithmetic we came across the problem of computing common divisors. You probably first came across the question of computing common divisors when trying to simplify fractions:
\begin{equation*}
\frac{6}{15} = \frac{3\times 2}{3 \times 5} = \frac{2}{5}
\end{equation*}
We can simplify this fraction because \(3\) is a divisor of both \(6\) and \(15\text{.}\) In this context we typically want to find the greatest common divisor of the numerator and denominator and then factor that out.
Definition 9.5.1.
Let \(a,b \in \mathbb{Z}\) be non-zero. Then
The greatest common divisor of \(a,b\text{,}\) denoted \(\gcd(a,b)\text{,}\) is the largest positive integer that divides both \(a\) and \(b\text{.}\)
If \(\gcd(a,b)=1\) then we say that \(a\) and \(b\) are coprime, or relatively prime.
The least common multiple of \(a,b\text{,}\) denoted \(\lcm(a,b)\text{,}\) is the smallest positive integer that is divisible by both \(a\) and \(b\text{.}\)
By symmetry, \(\gcd(a,b)=\gcd(b,a)\) and \(\lcm(a,b)=\lcm(b,a)\text{.}\)
Finally, “greatest common divisor” and “least common multiple” are frequently abbreviated to gcd and lcm.
When \(a,b\) are small numbers it is not too hard to compute the gcd by hand, typically by thinking about factors. However, there is a much better (and faster) way called Euclid’s algorithm. It is based on the following observation.
Lemma 9.5.4.
Let \(a,b \in \mathbb{N}\text{,}\) with \(a\geq b\)
\begin{align*}
\gcd(a,0) \amp= a \\
\gcd(a,b) \amp= \gcd(b, a-b).
\end{align*}
Finally, if \(a = q b + r\) for some \(q,r \in \mathbb{Z}\text{,}\) then
\begin{equation*}
\gcd(a,b) = gcd(b,r).
\end{equation*}
Proof of Lemma 9.5.4.
We prove each point in turn.
Since \(a=a \cdot 1\) and \(0 = a \cdot 0\text{,}\) it follows that \(a\) is a divisor of both \(a,0\text{.}\) No number larger than \(a\) can divide \(a\text{,}\) so \(a\) must be the greatest common divisor.
As noted above, it is sufficient to show that (say) \(\gcd(a,b) \geq \gcd(b,a-b)\) and \(\gcd(a,b) \leq \gcd(b,a-b)\) in order to prove that \(\gcd(a,b)=\gcd(b,a-b)\text{.}\)
-
We start by showing that \(\gcd(a,b) \leq \gcd(b,a-b)\text{.}\) To do this, we prove that \(\gcd(a,b)\) divides both \(b\) and \((a-b)\text{,}\) and hence is a common divisor of \(b\) and \((a-b)\text{.}\) Since it is a common divisor, it cannot be bigger than the largest common divisor (by definition).
Let \(d = \gcd(a,b)\text{.}\) Since \(d \mid a\) and \(d \mid
b\text{,}\) we can write \(a=kd, b=\ell d\) for some \(k,\ell
\in \mathbb{Z}\text{.}\) Thus \(b-a = d(k-\ell)\) and so it follows that \(d \mid (a-b)\text{.}\) Hence \(d\) is a divisor of \(b\) and \(a-b\text{,}\) and so it must be no bigger than the gcd of \(b, a-b\text{.}\)
\begin{equation*}
d \leq \gcd(b, a-b).
\end{equation*}
Now let us reuse this argument, but now starting from \(c =
\gcd(b,a-b)\text{.}\) Since \(c \mid b\) and \(c \mid
(b-a)\text{,}\) there are \(k,\ell \in \mathbb{Z}\) so that \(b=ck, (b-a)=c\ell\text{.}\) It follows that \(a = b-(b-a) =
c(k-\ell)\) and thus \(c \mid a\text{.}\) And because \(c
\mid a\) and \(c \mid b\) we have
\begin{equation*}
c \leq \gcd(a,b)
\end{equation*}
But now we have both
\begin{equation*}
\gcd(a,b) \leq \gcd(b,a-b) \qquad \text{and} \qquad
\gcd(b,a-b) \leq \gcd(a,b)
\end{equation*}
and hence \(\gcd(a,b)=\gcd(b,a-b)\text{.}\)
That \(\gcd(a,b)=\gcd(b,r)\) follows by a nearly identical argument. If \(d \mid a\) and \(d \mid b\text{,}\) then
\begin{equation*}
d \mid (a-qb) = r
\end{equation*}
Hence \(d\) divides both \(b\) and \(r\text{,}\) and hence
\begin{equation*}
d \leq \gcd(b,r).
\end{equation*}
And since \(c = \gcd(b,r)\) divides both \(b\) and \(r\text{,}\) it must also satisfy
\begin{equation*}
d \mid (qb+r) = a
\end{equation*}
Thus \(c\) divides both \(b\) and \(a\text{,}\) and so
\begin{equation*}
c \leq \gcd(a,b)
\end{equation*}
Since \(\gcd(a,b) \leq \gcd(b,r)\) and \(\gcd(b,r) \leq \gcd(a,b)\text{,}\) we must have that they are equal.
Now why is this useful? Well, say \(a \gt b\text{,}\) then we can compute
\begin{equation*}
\gcd(a,b) = \gcd(b,a-b)
\end{equation*}
and now the number \(a-b\) is smaller than \(a\text{.}\) If we keep repeating this, then the numbers in our gcd keep getting smaller and smaller until we have to compute a really simple gcd. This is even faster if we write \(a = qb+r\text{,}\) because then
\begin{equation*}
\gcd(a,b) = \gcd(b,r).
\end{equation*}
For example
\begin{align*}
\gcd(268, 120) \amp = \gcd(120,28)
\amp\text{since } 28 \amp=268-2\times120\\
\amp= \gcd(28,8) \amp \text{since } 8 \amp = 120 - 4 \times 28 \\
\amp = \gcd(8,4) \amp \text{since } 4 \amp = 28 - 3\times 8 \\
\amp = \gcd(4, 0) \amp \text{since } 0 \amp = 8 - 2\times 4 \\
\amp = 4
\end{align*}
This method of computing the gcd is known as the Euclidean algorithm.
Algorithm 9.5.6. The Euclidean algorithm.
Let \(a,b \in \mathbb{Z}\) with \(|a|\geq |b|\) and at least one of \(a,b\) non-zero. Then \(\gcd(a,b)\) can be computed using the following steps:
If \(b=0\) then the gcd is \(a\text{,}\) otherwise go on to (2)
Compute the remainder of \(a\) divided by \(b\text{,}\) that is \(a = qb + r\) with \(0 \leq r \lt |b|\text{.}\)
Go back to (1) with \((a,b)\) replaced by \((b,r)\text{.}\)
Proof.
Assume \(a,b \in \mathbb{Z}\) as stated.
If
\(b=0\) then we are done since
\(\gcd(a,0)=a\) by Lemma
Lemma 9.5.4.
So now we can assume that
\(b\neq 0\text{.}\) By
Fact 3.0.3 we know that
\begin{equation*}
a = qb+r \qquad \text{with } 0 \leq r \lt |b|.
\end{equation*}
By Lemma
Lemma 9.5.4 we know that
\(\gcd(a,b) = \gcd(a,r)\text{.}\) Hence computing
\(\gcd(b,r)\) is the same as computing
\(\gcd(a,b)\text{.}\)
Since \(0 \lt r \lt |b|\) it follows that the value of \(r\) will be strictly smaller in each iteration. Further, since \(r\) is an integer, it follows that \(r\) will become zero in a finite number of iterations.
Hence the algorithm described will terminate in a finite number of steps and give the gcd.
Let’s do another one:
\begin{align*}
\gcd(869, 442) \amp= \gcd(442, 427) \amp \text{since } 869=442+427\\
\amp=\gcd(427, 15) \amp \text{since } 442=427+15\\
\amp=\gcd(15, 7) \amp \text{since } 427 = 15\cdot 28 + 7 \\
\amp=\gcd(7,1) \amp \text{since } 15 = 2\cdot 7 + 1\\
\amp=\gcd(1,0) \amp \text{since } 7 = 7\cdot 1 + 0\\
\amp= 1
\end{align*}
Now notice that at each stage we take linear combinations of the arguments of the gcd to make new arguments.
\begin{align*}
427 \amp= 869-442 \\
15\amp=442-427 \\
7 \amp=427-28\cdot 15 \\
1 \amp= 15 - 2 \cdot 7
\end{align*}
If we substitute these equations into each other, then we can write our result as some linear combination of our starting arguments. Let \(a=869, b=442\text{,}\) then:
\begin{align*}
427 \amp = 869-442 = a-b\\
15 \amp = 442-427 = b - \underbrace{(a-b)}_{=427} = 2b-a \\
7 \amp = 427- 28 \cdot 15 = \underbrace{(a-b)}_{=427} - 28\underbrace{(2b-a)}_{=15} = 29a - 57b\\
1 \amp = 15-2\cdot 7 = \underbrace{(2b-a)}_{=15} - 2\underbrace{(29a-75b)}_{=7} = 116b - 59a
\end{align*}
That is
\begin{equation*}
\gcd(869,442) = 1 = 116\cdot 442 - 59\cdot 869.
\end{equation*}
That one can write the gcd in this way is quite general and is known as Bézout’s Lemma.
Lemma 9.5.7. Bézout’s Lemma.
Let \(a,b \in \mathbb{Z}\) so that at least one of \(a,b\) is non-zero. Then there exist \(x,y \in \mathbb{Z}\) so that
\begin{equation*}
\gcd(a,b) = ax + by.
\end{equation*}
Proof.
Essentially one follows the Euclidean algorithm backwards just as we have in the example above. However one can also prove this using an induction-like argument which is nearly the same thing.
Consider an execution of the Euclidean algorithm starting from \(a,b\text{.}\) It builds a finite sequence of equations of the form:
\begin{align*}
a \amp= q_1 b + r_1 \\
b \amp = q_2 r_1 + r_2 \\
r_1 \amp = q_3 r_2 + r_3 \\
r_2 \amp = q_4 r_3 + r_4 \\
\vdots \amp= \vdots \\
r_{n-1} \amp= q_{n+1} r_n + 0
\end{align*}
where the gcd will be \(r_n\text{.}\) We now prove that for each \(i\text{,}\) there exist \(x_i,y_i \in \mathbb{Z}\) so that \(r_i = ax_i + by_i\text{,}\) using an induction-style proof.
Notice that when \(i=1\) we have
\begin{equation*}
r_1 = a - q_1 b
\end{equation*}
so we have \(x_1=1, y_1 = -q_1\text{.}\) Similarly, when \(i=2\) we have
\begin{equation*}
r_2 = b - q_2 r_1 = b - q_2(a-q_1b) = (1 + q_1q_2) b - q_2 a
\end{equation*}
Hence \(x_2 =-q_2, y_2 = 1+q_1q_2\text{.}\)
Now for general \(i\) we have that
\begin{equation*}
r_{i+1} = r_{i-1} - q_{i+1} r_{i}
\end{equation*}
So if \(r_i = x_i a + y_i b\) and \(r_{i-1} = x_{i-1}a + y_{i-1}b\text{,}\) then
\begin{equation*}
r_{i+1} = (x_{i-1} a + y_{i-1} b) - q_{i+1}(x_{i}a + y_{i}b)
= a(x_{i-1} - q_{i+1}x_{i} ) + b(y_{i-1} - q_{i+1}y_{i}).
\end{equation*}
Hence
\begin{equation*}
x_{i+1} = x_{i-1} - q_{i+1} x_{i} \qquad
y_{i+1} = y_{i-1} - q_{i+1} y_{i}.
\end{equation*}
So using this recurrence and our starting values of \((x_1,y_1), (x_2, y_2)\) we can find integer \((x_3,y_3), (x_4, y_4)\) and so on up to \((x_n, y_n)\text{.}\) And hence we can write \(gcd(a,b) = x_n a + y_n b\) with \(x_n,y_n \in \mathbb{Z}\text{.}\)
Notice that if we set
\begin{equation*}
(x_0,y_0) = (0,1) \qquad \text{ and } (y_0, y_1)=(1,-q_1)
\end{equation*}
then our recurrence gives us
\begin{align*}
x_2 \amp = 0 - q_2 \cdot 1 = -q_2 \amp
y_2 \amp = 1 - q_2 \cdot(-q_1) = 1 + q_1q_2
\end{align*}
as required.
In the proof of the above lemma we give a construction of a recurrence that gives us the needed \(x,y\) to express \(\gcd(a,b) = ax + by\text{.}\) By combining that recurrence and the Euclidean algorithm we get the extended Euclidean algorithm.
Algorithm 9.5.8. The extended Euclidean algorithm.
Let \(a,b \in \mathbb{Z}\) with \(|a|\geq |b|\) and at least one of \(a,b\) non-zero. Then \(\gcd(a,b)\) can be computed using the following steps:
If \(b=0\) then the gcd is \(a\) else go on to (2)
Compute the remainder of \(a\) divided by \(b\text{,}\) that is \(a = qb + r\) with \(0 \leq r \lt |b|\text{.}\)
Go back to (1) with \((a,b)\) replaced by \((b,r)\text{.}\)
In the process of computing this we obtain a sequence of \(n\) quotients \(q_i\) and remainders \(r_i\text{.}\) Define sequences \(x_i,y_i\) by the inital values
\begin{equation*}
x_0=0, x_1=1 \qquad \text{and} \qquad y_0=0, y_1=-q_1
\end{equation*}
and then for \(k \geq 1\text{:}\)
\begin{gather*}
x_{k+1} = x_{k-1} - q_{k+1} x_{k}\\
y_{k+1} = y_{k-1} - q_{k+1} y_{k}
\end{gather*}
Then
\begin{equation*}
\gcd(a,b) = a x_n + b y_n.
\end{equation*}
Proof.
This follows quite directly from
Proof 9.5.7.1 of Bézout’s lemma above.
Returning to our example of \(\gcd(869, 442)\) above, we had
\begin{align*}
q_1\amp=1 \amp q_2 \amp=1 \amp q_3=\amp28 \amp q_4\amp=2
\end{align*}
which gives \(x_0=0, x_1=1\) and \(y_0=1, y_1=-1\)
\begin{align*}
x_2 \amp=0 - 1 = -1 \amp
x_3 \amp= 1 - 28 \cdot (-1) = 29 \amp
x_4 \amp= -1 - 2 \cdot 29 = - 59\\
y_2 \amp= 1 - 1\cdot(-1)= 2 \amp
y_3 \amp = -1 - 28 \cdot 2 = -57 \amp
y_4 \amp = 2 - 2\cdot(-57) = 116 \amp
\end{align*}
and we can verify that
\begin{equation*}
-59\cdot 869 + 116 \cdot 442 = -51271 + 51272 = 1
\end{equation*}
as required.
Bézout’s turns out to be a useful result because it allows one to express \(\gcd(a,b)\) as a simple equation in \(a,b\) which is then easier to manipulate. As an example of its utility, consider the following lemma about division which we attribute to Euclid. This feels like it should be “obvious”, but it turns out to be not so easy to prove without invoking unique prime-factorisations of integers — the Fundamental Theorem of Arithmetic. Unfortunately that result is harder to prove than this one. We’ll get around to proving the Fundamental Theorem of Arithmetic, but not just yet.
Lemma 9.5.9. Euclid’s lemma.
Let \(a,b,p \in \mathbb{Z}\) so that \(p\) is prime, and \(p \mid ab\text{.}\) If \(p \nmid b\) then \(p \mid a\text{.}\)
More generally let \(a,b,d \in \mathbb{Z}\) so that \(d \mid ab\text{.}\) If \(\gcd(d,b)=1\) then \(d \mid a\text{.}\)
Via Bézout.
Notice that the second statement implies the first statement, since by definition the only divisors of \(p\) are \(1,p\text{,}\) and so if \(p \nmid b\) then \(\gcd(p,b) = 1\text{.}\)
By assumption, we know that \(d \mid ab\text{,}\) so we can write \(ab = dk\) for some \(k \in \mathbb{Z}.\) Bézout’s lemma allows us to write an equation linking \(b,d\text{.}\) In particular, it guarantees that there are \(x,y \in \mathbb{Z}\) so that
\begin{equation*}
1 = dx + by
\end{equation*}
Multiply this expression by \(a\) to get
\begin{equation*}
a = adx + aby
\end{equation*}
But since \(ab = dk\text{,}\) we have
\begin{equation*}
a = d(ax + ky)
\end{equation*}
and so \(d \mid a\) as required.
Implicitly Bézout.
Here is an alternative proof — it does not use Bézout’s lemma explicitly, but some of the ideas are similar. Let \(p \mid ab\) and assume that \(p \nmid b\text{.}\) Then form the set
\begin{equation*}
S = \set{x \in \mathbb{N} \so p \mid xa}
\end{equation*}
We know that
\(S\) is non-empty since
\(b \in S\text{.}\) Further, we also know that
\(S \subseteq \mathbb{N}\) and so obeys the well-ordering principle that we saw back in Chapter
Chapter 7 on induction. It implies that
\(S\) must have a smallest element,
\(z\text{.}\) If we can show that
\(z=1\) we are done because we have shown that
\(p \mid a\text{.}\) We first that
\(z\) divides all elements in
\(S\text{.}\)
Let \(x \in S\text{,}\) then by Euclidean division we know that
\begin{equation*}
x = qz + r \qquad \text{with } 0 \leq r \lt z
\end{equation*}
But we know that \(p \mid xa\) and \(xa = aqz+ra\text{.}\) Rewrite this as
\begin{equation*}
ra = xa-azq
\end{equation*}
Now by definition of \(S\text{,}\) we know that \(p \mid xa\) and \(p \mid az\text{,}\) so \(p \mid ra\text{.}\) If \(r \neq 0\) we have a problem because now \(p \mid ra\) and \(r \lt z\) which contradicts \(z\) being minimal. Hence we must have \(r=0\) and so \(z \mid x\text{.}\)
How do we now use this to show that \(z=1\text{?}\) Since \(p \mid pa\text{,}\) we know that \(p \in S\text{.}\) Similarly, since \(p \mid ba\text{,}\) we know that \(b \in S\text{.}\) Hence \(z \mid p\) and \(z \mid b\text{.}\) The only divisors of \(p\) are \(1\) and \(p\) and since \(p \nmid b\text{,}\) we must have \(z=1\text{.}\) So finally, we know that \(p \mid a\) as required.
Closer to Euclid.
The proof we give here is closer to Euclid’s original proof. The key to that proof is the fact that if we have a fraction \(\frac{a}{b}\) and then we find the smallest fraction equivalent to that, call if \(\frac{x}{y}\) (ie where \(x\) is as small as possible), then there must be some \(n\) so that \(a=nx\) and \(b=ny\text{.}\) Now this feels quite obvious. Obvious until you have to prove it.
We follow the clever argument given
in this excellent article 118 by Barry Mazur. Since the fractions
\(\frac{a}{b}\) and
\(\frac{x}{y}\) are equal, we know that
\begin{equation*}
ay = bx.
\end{equation*}
If \(a=x\) then \(n=1\) and we are done. So choose \(n\) to be the largest integer so that \(a \gt nx\text{.}\) Indeed we must have \(a-nx \geq x\) (otherwise we would pick \(n\) to be larger). And since \(x = \frac{y}{b} a\)
\begin{equation*}
nx = \frac{ny}{b} a
\end{equation*}
and so we must also have that \(ny \gt b\text{.}\)
Now consider the fraction \(\ds \frac{a-nx}{b-ny}\text{.}\) One can quickly verify by cross-multiplication that
\begin{equation*}
\frac{a-nx}{b-ny} = \frac{x}{y}.
\end{equation*}
So we have a new fraction that is also equal to \(\frac{a}{b}\) and \({x}{y}\text{.}\) But now, since \(\frac{x}{y}\) was minimal, we must have \(a-nx \geq x\text{,}\) and at the same time, we choose \(n\) so that \(a-nx \leq x\text{.}\) Thus we must have \(a-nx = x\text{.}\) In other words
\begin{equation*}
a = x(n+1).
\end{equation*}
and so \(a\) is a multiple of \(x\text{.}\) Substituting this back into \(ay=bx\) shows that
\begin{equation*}
b = y(n+1)
\end{equation*}
also.
This result implies that if we have
\begin{equation*}
\frac{a}{b}=\frac{c}{d}=r
\end{equation*}
then there is a unique smallest fraction \(\frac{x}{y}=r\) so that
\begin{align*}
a \amp=kx \amp b\amp=ky \\
c \amp=nx \amp d\amp=ny
\end{align*}
for some integers
\(k,n\text{.}\) That is, the two fractions
\(\frac{a}{b}, \frac{c}{d}\) reduce to the same
119 fraction
\(\frac{x}{y}\text{.}\)
Now let us go back and use this to prove that if \(p \mid ab\) and \(p \nmid b\) then \(p \mid a\text{.}\) Since \(p \mid ab\text{,}\) we know that \(ab = pc\text{.}\) Hence we have
\begin{equation*}
\frac{a}{p} = \frac{c}{b}.
\end{equation*}
From the above, we know that there exist \(k,n,x,y\) so that
\begin{align*}
a \amp= kx \amp c\amp= nx \\
p \amp= ky \amp b\amp= ny
\end{align*}
But we know that \(p\) is prime, so either \((k,y)=(p,1)\) or \((k,y)=(1,p)\text{.}\) The first case implies that \(p \mid a\text{,}\) while the second implies that \(p \mid b\text{.}\) Since \(p \nmid b\) by assumption, we know that \(p \mid a\text{.}\) And we are done.
Here is another example of how Bézout’s lemma can be very useful to prove a very useful
120 result linking the gcd and lcm.
Lemma 9.5.11.
The gcd and lcm satisfy the equation
\begin{equation*}
\gcd(a,b) \cdot \lcm(a,b) = |a \cdot b|.
\end{equation*}
Proof.
Let \(d = \gcd(a,b)\) and let \(L = \frac{|ab|}{d}\text{.}\) We need to show that both
\(L\) is a common multiple of \(a,b\text{,}\) and
any other common multiple of \(a,b\) is at least as large as \(L\text{.}\)
We know that, by definition, \(d \mid a\) and \(d \mid b\text{.}\) Hence
\begin{equation*}
a = dk \qquad \text{ and } b = dn
\end{equation*}
which gives \(|ab| = |d^2 kn|\) and so
\begin{equation*}
L = \frac{|ab|}{d} = |dkn| = |a n| = |b k|.
\end{equation*}
Thus \(a \mid L\) and \(b \mid L\) as required.
Let \(M\) be a common multiple of \(a,b\text{.}\) It suffices to show that \(L \mid M\) since this means that \(|L| \leq |M|\text{.}\) Since \(M\) is a common multiple of \(a,b\) we know that
\begin{equation*}
M = a p = bq
\end{equation*}
for some integer \(p,q\text{.}\) Now since \(dL = ab\text{,}\) it is easier to compare \(dM\) and \(dL\text{,}\) and we do that via Bézout’s lemma. Recall from Bézout’s lemma that there are \(x,y \in \mathbb{Z}\) so that:
\begin{equation*}
d = ax + by
\end{equation*}
This means that
\begin{align*}
dM \amp = aMx + bMy\\
\amp = a(\underbrace{bq}_{=M}) x + b (\underbrace{ap}_{=M}) y\\
\amp = ab (qx + py)
\end{align*}
and thus \(dL \mid dM\text{.}\) Hence \(L \mid M\) and so \(|L| \leq |M|\text{.}\) This means that \(L\) is the least common multiple of \(a,b\text{.}\)
Many computer algebra systems define \(\gcd(0,0)=0\text{.}\)
web.nmsu.edu/~davidp/euclid.pdf
web.nmsu.edu/~davidp/fractions-monthly-final.pdf
people.math.harvard.edu/~mazur/preprints/Eva.pdf
people.math.harvard.edu/~mazur/preprints/Eva.pdf
This actually turns out to be a subtle point which is false in other contexts. In particular one can come up with perfectly well behaved sets of numbers, such as \(\mathbb{Z}[\sqrt{-5}] = \set{a+b\sqrt{-5} \so a,b \in \mathbb{Z}}\) where this result is no longer true. The interested reader should search-engine their way to discussions of “unique factorisation domains” and non-examples.
and obvious until you have to prove it!