Skip to main content

PLP: An introduction to mathematical proof

Section 9.5 Greatest divisors, Bézout and the Euclidean algorithm

In our (brief) discussion above of division in modular arithmetic we came across the problem of computing common divisors. You probably first came across the question of computing common divisors when trying to simplify fractions:
\begin{equation*} \frac{6}{15} = \frac{3\times 2}{3 \times 5} = \frac{2}{5} \end{equation*}
We can simplify this fraction because \(3\) is a divisor of both \(6\) and \(15\text{.}\) In this context we typically want to find the greatest common divisor of the numerator and denominator and then factor that out.

Definition 9.5.1.

Let \(a,b \in \mathbb{Z}\) be non-zero. Then
  • The greatest common divisor of \(a,b\text{,}\) denoted \(\gcd(a,b)\text{,}\) is the largest positive integer that divides both \(a\) and \(b\text{.}\)
  • If \(\gcd(a,b)=1\) then we say that \(a\) and \(b\) are coprime, or relatively prime.
  • The least common multiple of \(a,b\text{,}\) denoted \(\lcm(a,b)\text{,}\) is the smallest positive integer that is divisible by both \(a\) and \(b\text{.}\)
  • By symmetry, \(\gcd(a,b)=\gcd(b,a)\) and \(\lcm(a,b)=\lcm(b,a)\text{.}\)
  • Finally, “greatest common divisor” and “least common multiple” are frequently abbreviated to gcd and lcm.

Remark 9.5.2. What about zero?

Notice that in the above defintion if \(a,b\) are non-zero then the \(\gcd(a,b)\) and \(\lcm(a,b)\) are well-defined. We should note that since \(0 \times a = 0\text{,}\) the definition extends to
\begin{equation*} \gcd(a,0)=a. \end{equation*}
When both \(a,b\) are zero then there is no greatest common divisor since \(0 \times n = 0\) for any \(n\text{.}\) That being said, in some contexts 114  \(\gcd(0,0)\) is defined to be zero.
For non-zero \(a\text{,}\) the \(\lcm(a,0)\) is not well defined since we cannot divide by zero. However, in some contexts \(\lcm(a,0)\) is defined to be zero since the only common multiple of \(a\) and \(0\) is zero.
With all of that said, we will only consider \(\gcd(a,b)\) when at least one of \(a,b\) are non-zero, and only consider \(\lcm(a,b)\) when \(a,b\) are both non-zero.

Remark 9.5.3.

Notice that by definition if \(c\) divides both \(a,b\) then \(c \leq \gcd(a,b)\text{:}\)
\begin{equation*} (c \mid a) \land (c \mid b) \implies c \leq \gcd(a,b). \end{equation*}
Similarly, if \(a\) and \(b\) divide \(m\) then \(\lcm(a,b) \leq m\text{:}\)
\begin{equation*} (a \mid m) \land (b\mid m) \implies m \geq \lcm(a,b). \end{equation*}
When \(a,b\) are small numbers it is not too hard to compute the gcd by hand, typically by thinking about factors. However, there is a much better (and faster) way called Euclid’s algorithm. It is based on the following observation.

Remark 9.5.5. Antisymmetry helping equality.

To prove the remaining two points we take advantage of the fact that the relation “\(\leq\)” is antisymmetric. That is
\begin{equation*} (x \leq y) \land (y \leq x) \implies x=y \end{equation*}
This means that we can break down the proof of an equality into proofs of two (potentially easier) inequalities. We have already done this proving set equalities using
\begin{equation*} (A \subseteq B) \land (B \subseteq A) \implies A=B. \end{equation*}
We prove each point in turn.
  • Since \(a=a \cdot 1\) and \(0 = a \cdot 0\text{,}\) it follows that \(a\) is a divisor of both \(a,0\text{.}\) No number larger than \(a\) can divide \(a\text{,}\) so \(a\) must be the greatest common divisor.
As noted above, it is sufficient to show that (say) \(\gcd(a,b) \geq \gcd(b,a-b)\) and \(\gcd(a,b) \leq \gcd(b,a-b)\) in order to prove that \(\gcd(a,b)=\gcd(b,a-b)\text{.}\)
  • We start by showing that \(\gcd(a,b) \leq \gcd(b,a-b)\text{.}\) To do this, we prove that \(\gcd(a,b)\) divides both \(b\) and \((a-b)\text{,}\) and hence is a common divisor of \(b\) and \((a-b)\text{.}\) Since it is a common divisor, it cannot be bigger than the largest common divisor (by definition).
    Let \(d = \gcd(a,b)\text{.}\) Since \(d \mid a\) and \(d \mid b\text{,}\) we can write \(a=kd, b=\ell d\) for some \(k,\ell \in \mathbb{Z}\text{.}\) Thus \(b-a = d(k-\ell)\) and so it follows that \(d \mid (a-b)\text{.}\) Hence \(d\) is a divisor of \(b\) and \(a-b\text{,}\) and so it must be no bigger than the gcd of \(b, a-b\text{.}\)
    \begin{equation*} d \leq \gcd(b, a-b). \end{equation*}
    Now let us reuse this argument, but now starting from \(c = \gcd(b,a-b)\text{.}\) Since \(c \mid b\) and \(c \mid (b-a)\text{,}\) there are \(k,\ell \in \mathbb{Z}\) so that \(b=ck, (b-a)=c\ell\text{.}\) It follows that \(a = b-(b-a) = c(k-\ell)\) and thus \(c \mid a\text{.}\) And because \(c \mid a\) and \(c \mid b\) we have
    \begin{equation*} c \leq \gcd(a,b) \end{equation*}
    But now we have both
    \begin{equation*} \gcd(a,b) \leq \gcd(b,a-b) \qquad \text{and} \qquad \gcd(b,a-b) \leq \gcd(a,b) \end{equation*}
    and hence \(\gcd(a,b)=\gcd(b,a-b)\text{.}\)
  • That \(\gcd(a,b)=\gcd(b,r)\) follows by a nearly identical argument. If \(d \mid a\) and \(d \mid b\text{,}\) then
    \begin{equation*} d \mid (a-qb) = r \end{equation*}
    Hence \(d\) divides both \(b\) and \(r\text{,}\) and hence
    \begin{equation*} d \leq \gcd(b,r). \end{equation*}
    And since \(c = \gcd(b,r)\) divides both \(b\) and \(r\text{,}\) it must also satisfy
    \begin{equation*} d \mid (qb+r) = a \end{equation*}
    Thus \(c\) divides both \(b\) and \(a\text{,}\) and so
    \begin{equation*} c \leq \gcd(a,b) \end{equation*}
    Since \(\gcd(a,b) \leq \gcd(b,r)\) and \(\gcd(b,r) \leq \gcd(a,b)\text{,}\) we must have that they are equal.
Now why is this useful? Well, say \(a \gt b\text{,}\) then we can compute
\begin{equation*} \gcd(a,b) = \gcd(b,a-b) \end{equation*}
and now the number \(a-b\) is smaller than \(a\text{.}\) If we keep repeating this, then the numbers in our gcd keep getting smaller and smaller until we have to compute a really simple gcd. This is even faster if we write \(a = qb+r\text{,}\) because then
\begin{equation*} \gcd(a,b) = \gcd(b,r). \end{equation*}
For example
\begin{align*} \gcd(268, 120) \amp = \gcd(120,28) \amp\text{since } 28 \amp=268-2\times120\\ \amp= \gcd(28,8) \amp \text{since } 8 \amp = 120 - 4 \times 28 \\ \amp = \gcd(8,4) \amp \text{since } 4 \amp = 28 - 3\times 8 \\ \amp = \gcd(4, 0) \amp \text{since } 0 \amp = 8 - 2\times 4 \\ \amp = 4 \end{align*}
This method of computing the gcd is known as the Euclidean algorithm.
Assume \(a,b \in \mathbb{Z}\) as stated.
  • If \(b=0\) then we are done since \(\gcd(a,0)=a\) by Lemma Lemma 9.5.4.
  • So now we can assume that \(b\neq 0\text{.}\) By Fact 3.0.3 we know that
    \begin{equation*} a = qb+r \qquad \text{with } 0 \leq r \lt |b|. \end{equation*}
    By Lemma Lemma 9.5.4 we know that \(\gcd(a,b) = \gcd(a,r)\text{.}\) Hence computing \(\gcd(b,r)\) is the same as computing \(\gcd(a,b)\text{.}\)
  • Since \(0 \lt r \lt |b|\) it follows that the value of \(r\) will be strictly smaller in each iteration. Further, since \(r\) is an integer, it follows that \(r\) will become zero in a finite number of iterations.
Hence the algorithm described will terminate in a finite number of steps and give the gcd.
Let’s do another one:
\begin{align*} \gcd(869, 442) \amp= \gcd(442, 427) \amp \text{since } 869=442+427\\ \amp=\gcd(427, 15) \amp \text{since } 442=427+15\\ \amp=\gcd(15, 7) \amp \text{since } 427 = 15\cdot 28 + 7 \\ \amp=\gcd(7,1) \amp \text{since } 15 = 2\cdot 7 + 1\\ \amp=\gcd(1,0) \amp \text{since } 7 = 7\cdot 1 + 0\\ \amp= 1 \end{align*}
Now notice that at each stage we take linear combinations of the arguments of the gcd to make new arguments.
\begin{align*} 427 \amp= 869-442 \\ 15\amp=442-427 \\ 7 \amp=427-28\cdot 15 \\ 1 \amp= 15 - 2 \cdot 7 \end{align*}
If we substitute these equations into each other, then we can write our result as some linear combination of our starting arguments. Let \(a=869, b=442\text{,}\) then:
\begin{align*} 427 \amp = 869-442 = a-b\\ 15 \amp = 442-427 = b - \underbrace{(a-b)}_{=427} = 2b-a \\ 7 \amp = 427- 28 \cdot 15 = \underbrace{(a-b)}_{=427} - 28\underbrace{(2b-a)}_{=15} = 29a - 57b\\ 1 \amp = 15-2\cdot 7 = \underbrace{(2b-a)}_{=15} - 2\underbrace{(29a-75b)}_{=7} = 116b - 59a \end{align*}
That is
\begin{equation*} \gcd(869,442) = 1 = 116\cdot 442 - 59\cdot 869. \end{equation*}
That one can write the gcd in this way is quite general and is known as Bézout’s Lemma.
Essentially one follows the Euclidean algorithm backwards just as we have in the example above. However one can also prove this using an induction-like argument which is nearly the same thing.
Consider an execution of the Euclidean algorithm starting from \(a,b\text{.}\) It builds a finite sequence of equations of the form:
\begin{align*} a \amp= q_1 b + r_1 \\ b \amp = q_2 r_1 + r_2 \\ r_1 \amp = q_3 r_2 + r_3 \\ r_2 \amp = q_4 r_3 + r_4 \\ \vdots \amp= \vdots \\ r_{n-1} \amp= q_{n+1} r_n + 0 \end{align*}
where the gcd will be \(r_n\text{.}\) We now prove that for each \(i\text{,}\) there exist \(x_i,y_i \in \mathbb{Z}\) so that \(r_i = ax_i + by_i\text{,}\) using an induction-style proof.
Notice that when \(i=1\) we have
\begin{equation*} r_1 = a - q_1 b \end{equation*}
so we have \(x_1=1, y_1 = -q_1\text{.}\) Similarly, when \(i=2\) we have
\begin{equation*} r_2 = b - q_2 r_1 = b - q_2(a-q_1b) = (1 + q_1q_2) b - q_2 a \end{equation*}
Hence \(x_2 =-q_2, y_2 = 1+q_1q_2\text{.}\)
Now for general \(i\) we have that
\begin{equation*} r_{i+1} = r_{i-1} - q_{i+1} r_{i} \end{equation*}
So if \(r_i = x_i a + y_i b\) and \(r_{i-1} = x_{i-1}a + y_{i-1}b\text{,}\) then
\begin{equation*} r_{i+1} = (x_{i-1} a + y_{i-1} b) - q_{i+1}(x_{i}a + y_{i}b) = a(x_{i-1} - q_{i+1}x_{i} ) + b(y_{i-1} - q_{i+1}y_{i}). \end{equation*}
Hence
\begin{equation*} x_{i+1} = x_{i-1} - q_{i+1} x_{i} \qquad y_{i+1} = y_{i-1} - q_{i+1} y_{i}. \end{equation*}
So using this recurrence and our starting values of \((x_1,y_1), (x_2, y_2)\) we can find integer \((x_3,y_3), (x_4, y_4)\) and so on up to \((x_n, y_n)\text{.}\) And hence we can write \(gcd(a,b) = x_n a + y_n b\) with \(x_n,y_n \in \mathbb{Z}\text{.}\)
Notice that if we set
\begin{equation*} (x_0,y_0) = (0,1) \qquad \text{ and } (y_0, y_1)=(1,-q_1) \end{equation*}
then our recurrence gives us
\begin{align*} x_2 \amp = 0 - q_2 \cdot 1 = -q_2 \amp y_2 \amp = 1 - q_2 \cdot(-q_1) = 1 + q_1q_2 \end{align*}
as required.
In the proof of the above lemma we give a construction of a recurrence that gives us the needed \(x,y\) to express \(\gcd(a,b) = ax + by\text{.}\) By combining that recurrence and the Euclidean algorithm we get the extended Euclidean algorithm.
This follows quite directly from Proof 9.5.7.1 of Bézout’s lemma above.
Returning to our example of \(\gcd(869, 442)\) above, we had
\begin{align*} q_1\amp=1 \amp q_2 \amp=1 \amp q_3=\amp28 \amp q_4\amp=2 \end{align*}
which gives \(x_0=0, x_1=1\) and \(y_0=1, y_1=-1\)
\begin{align*} x_2 \amp=0 - 1 = -1 \amp x_3 \amp= 1 - 28 \cdot (-1) = 29 \amp x_4 \amp= -1 - 2 \cdot 29 = - 59\\ y_2 \amp= 1 - 1\cdot(-1)= 2 \amp y_3 \amp = -1 - 28 \cdot 2 = -57 \amp y_4 \amp = 2 - 2\cdot(-57) = 116 \amp \end{align*}
and we can verify that
\begin{equation*} -59\cdot 869 + 116 \cdot 442 = -51271 + 51272 = 1 \end{equation*}
as required.
Bézout’s turns out to be a useful result because it allows one to express \(\gcd(a,b)\) as a simple equation in \(a,b\) which is then easier to manipulate. As an example of its utility, consider the following lemma about division which we attribute to Euclid. This feels like it should be “obvious”, but it turns out to be not so easy to prove without invoking unique prime-factorisations of integers — the Fundamental Theorem of Arithmetic. Unfortunately that result is harder to prove than this one. We’ll get around to proving the Fundamental Theorem of Arithmetic, but not just yet.
Notice that the second statement implies the first statement, since by definition the only divisors of \(p\) are \(1,p\text{,}\) and so if \(p \nmid b\) then \(\gcd(p,b) = 1\text{.}\)
By assumption, we know that \(d \mid ab\text{,}\) so we can write \(ab = dk\) for some \(k \in \mathbb{Z}.\) Bézout’s lemma allows us to write an equation linking \(b,d\text{.}\) In particular, it guarantees that there are \(x,y \in \mathbb{Z}\) so that
\begin{equation*} 1 = dx + by \end{equation*}
Multiply this expression by \(a\) to get
\begin{equation*} a = adx + aby \end{equation*}
But since \(ab = dk\text{,}\) we have
\begin{equation*} a = d(ax + ky) \end{equation*}
and so \(d \mid a\) as required.
Here is an alternative proof — it does not use Bézout’s lemma explicitly, but some of the ideas are similar. Let \(p \mid ab\) and assume that \(p \nmid b\text{.}\) Then form the set
\begin{equation*} S = \set{x \in \mathbb{N} \so p \mid xa} \end{equation*}
We know that \(S\) is non-empty since \(b \in S\text{.}\) Further, we also know that \(S \subseteq \mathbb{N}\) and so obeys the well-ordering principle that we saw back in Chapter Chapter 7 on induction. It implies that \(S\) must have a smallest element, \(z\text{.}\) If we can show that \(z=1\) we are done because we have shown that \(p \mid a\text{.}\) We first that \(z\) divides all elements in \(S\text{.}\)
Let \(x \in S\text{,}\) then by Euclidean division we know that
\begin{equation*} x = qz + r \qquad \text{with } 0 \leq r \lt z \end{equation*}
But we know that \(p \mid xa\) and \(xa = aqz+ra\text{.}\) Rewrite this as
\begin{equation*} ra = xa-azq \end{equation*}
Now by definition of \(S\text{,}\) we know that \(p \mid xa\) and \(p \mid az\text{,}\) so \(p \mid ra\text{.}\) If \(r \neq 0\) we have a problem because now \(p \mid ra\) and \(r \lt z\) which contradicts \(z\) being minimal. Hence we must have \(r=0\) and so \(z \mid x\text{.}\)
How do we now use this to show that \(z=1\text{?}\) Since \(p \mid pa\text{,}\) we know that \(p \in S\text{.}\) Similarly, since \(p \mid ba\text{,}\) we know that \(b \in S\text{.}\) Hence \(z \mid p\) and \(z \mid b\text{.}\) The only divisors of \(p\) are \(1\) and \(p\) and since \(p \nmid b\text{,}\) we must have \(z=1\text{.}\) So finally, we know that \(p \mid a\) as required.

Remark 9.5.10. Euclid without Bézout.

Since Euclid lived around 300BC in ancient Greece and Egypt, and Bézout lived in 18th century France, it is pretty clear that there must be a way to prove the above without using Bézout’s lemma. The interested reader should look at this excellent discussion of Euclid’s proof 115  by David Pengelley and Fred Richman, and other good articles here 116  and here 117 .
The proof we give here is closer to Euclid’s original proof. The key to that proof is the fact that if we have a fraction \(\frac{a}{b}\) and then we find the smallest fraction equivalent to that, call if \(\frac{x}{y}\) (ie where \(x\) is as small as possible), then there must be some \(n\) so that \(a=nx\) and \(b=ny\text{.}\) Now this feels quite obvious. Obvious until you have to prove it.
We follow the clever argument given in this excellent article 118  by Barry Mazur. Since the fractions \(\frac{a}{b}\) and \(\frac{x}{y}\) are equal, we know that
\begin{equation*} ay = bx. \end{equation*}
If \(a=x\) then \(n=1\) and we are done. So choose \(n\) to be the largest integer so that \(a \gt nx\text{.}\) Indeed we must have \(a-nx \geq x\) (otherwise we would pick \(n\) to be larger). And since \(x = \frac{y}{b} a\)
\begin{equation*} nx = \frac{ny}{b} a \end{equation*}
and so we must also have that \(ny \gt b\text{.}\)
Now consider the fraction \(\ds \frac{a-nx}{b-ny}\text{.}\) One can quickly verify by cross-multiplication that
\begin{equation*} \frac{a-nx}{b-ny} = \frac{x}{y}. \end{equation*}
So we have a new fraction that is also equal to \(\frac{a}{b}\) and \({x}{y}\text{.}\) But now, since \(\frac{x}{y}\) was minimal, we must have \(a-nx \geq x\text{,}\) and at the same time, we choose \(n\) so that \(a-nx \leq x\text{.}\) Thus we must have \(a-nx = x\text{.}\) In other words
\begin{equation*} a = x(n+1). \end{equation*}
and so \(a\) is a multiple of \(x\text{.}\) Substituting this back into \(ay=bx\) shows that
\begin{equation*} b = y(n+1) \end{equation*}
also.
This result implies that if we have
\begin{equation*} \frac{a}{b}=\frac{c}{d}=r \end{equation*}
then there is a unique smallest fraction \(\frac{x}{y}=r\) so that
\begin{align*} a \amp=kx \amp b\amp=ky \\ c \amp=nx \amp d\amp=ny \end{align*}
for some integers \(k,n\text{.}\) That is, the two fractions \(\frac{a}{b}, \frac{c}{d}\) reduce to the same 119  fraction \(\frac{x}{y}\text{.}\)
Now let us go back and use this to prove that if \(p \mid ab\) and \(p \nmid b\) then \(p \mid a\text{.}\) Since \(p \mid ab\text{,}\) we know that \(ab = pc\text{.}\) Hence we have
\begin{equation*} \frac{a}{p} = \frac{c}{b}. \end{equation*}
From the above, we know that there exist \(k,n,x,y\) so that
\begin{align*} a \amp= kx \amp c\amp= nx \\ p \amp= ky \amp b\amp= ny \end{align*}
But we know that \(p\) is prime, so either \((k,y)=(p,1)\) or \((k,y)=(1,p)\text{.}\) The first case implies that \(p \mid a\text{,}\) while the second implies that \(p \mid b\text{.}\) Since \(p \nmid b\) by assumption, we know that \(p \mid a\text{.}\) And we are done.
Here is another example of how Bézout’s lemma can be very useful to prove a very useful 120  result linking the gcd and lcm.
Let \(d = \gcd(a,b)\) and let \(L = \frac{|ab|}{d}\text{.}\) We need to show that both
  • \(L\) is a common multiple of \(a,b\text{,}\) and
  • any other common multiple of \(a,b\) is at least as large as \(L\text{.}\)
We know that, by definition, \(d \mid a\) and \(d \mid b\text{.}\) Hence
\begin{equation*} a = dk \qquad \text{ and } b = dn \end{equation*}
which gives \(|ab| = |d^2 kn|\) and so
\begin{equation*} L = \frac{|ab|}{d} = |dkn| = |a n| = |b k|. \end{equation*}
Thus \(a \mid L\) and \(b \mid L\) as required.
Let \(M\) be a common multiple of \(a,b\text{.}\) It suffices to show that \(L \mid M\) since this means that \(|L| \leq |M|\text{.}\) Since \(M\) is a common multiple of \(a,b\) we know that
\begin{equation*} M = a p = bq \end{equation*}
for some integer \(p,q\text{.}\) Now since \(dL = ab\text{,}\) it is easier to compare \(dM\) and \(dL\text{,}\) and we do that via Bézout’s lemma. Recall from Bézout’s lemma that there are \(x,y \in \mathbb{Z}\) so that:
\begin{equation*} d = ax + by \end{equation*}
This means that
\begin{align*} dM \amp = aMx + bMy\\ \amp = a(\underbrace{bq}_{=M}) x + b (\underbrace{ap}_{=M}) y\\ \amp = ab (qx + py) \end{align*}
and thus \(dL \mid dM\text{.}\) Hence \(L \mid M\) and so \(|L| \leq |M|\text{.}\) This means that \(L\) is the least common multiple of \(a,b\text{.}\)
Many computer algebra systems define \(\gcd(0,0)=0\text{.}\)
web.nmsu.edu/~davidp/euclid.pdf
web.nmsu.edu/~davidp/fractions-monthly-final.pdf
people.math.harvard.edu/~mazur/preprints/Eva.pdf
people.math.harvard.edu/~mazur/preprints/Eva.pdf
This actually turns out to be a subtle point which is false in other contexts. In particular one can come up with perfectly well behaved sets of numbers, such as \(\mathbb{Z}[\sqrt{-5}] = \set{a+b\sqrt{-5} \so a,b \in \mathbb{Z}}\) where this result is no longer true. The interested reader should search-engine their way to discussions of “unique factorisation domains” and non-examples.
and obvious until you have to prove it!