Induction, once you get past some of the technicalities, is a very intuitive idea. This author likes to think of induction like climbing an infinite ladder and an induction argument is really just a proof that we can climb that ladder. The proof is in 3 parts:
If you can step onto the ladder, and
from the current step you can reach the next step, then
you can climb the ladder as high as you want.
As more concrete mathematical example, say we wish to prove
\begin{equation*}
\text{For all } n \in \mathbb{N}, n^2+5n-7 \text{ is odd}.
\end{equation*}
The dedicated reader may recall a very similar example in Chapter 5 and realise that we can do this using proof by cases, however, it is a good example for proof by “ladder idea”.
Step onto the ladder. When \(n=1\) the polynomial is \(1+5-7 = -1\) which is odd. So the assertion is true for the very smallest value of \(n\text{.}\)
Climb from one rung to the next. We wish to show that if the assertion is true for the current value of \(n=k\text{,}\) then it is also true for \(n=k+1\text{.}\) That is, if we are on the current step, we can move up to the next step. So, to be more precise, we want to show that
\begin{equation*}
(k^2+5k-7 \text{ is odd}) \implies ( (k+1)^2+5(k+1)-7 \text{ is odd})
\end{equation*}
So this is just a little direct proof hiding inside our argument.
Conclusion. So how do we put these facts together? The first fact guarantees that our result is true for \(n=1\text{.}\) Then the second fact tells us that since it is true for \(n=1\text{,}\) it is also true for \(n=2\text{.}\) But then since it is true for \(n=2\text{,}\) it true for \(n=3\) (by the same fact). Applying that fact again and again gives us \(n=4, n=5, n=6, \cdots\text{.}\)
Strictly speaking we have to be a bit more careful with that final “…”, but we’ll see how to do that below.
This proof method, mathematical induction, is formalised in the next theorem.
Theorem7.1.1.The principle of mathematical induction.
For every \(n \in \mathbb{N}\) let \(P(n)\) be a statement. If
\(P(1)\) is a true statement, and
the implication \(P(k) \implies P(k+1)\) is true for all \(k \in \mathbb{N}\)
then \(P(n)\) is true for all \(n \in \mathbb{N}\text{.}\)
We can’t quite give a complete proof of this theorem, though we can give almost everything that is needed. Call this a “proof sketch” — the core ideas are there, but we’ll skip over a technicality and also write it a little informally.
Start by assuming the hypotheses of the theorem are true. That is, we assume that
\(P(1)\) is true, and
the implication \(P(k) \implies P(k+1)\) is true for all natural numbers \(k\text{.}\)
Then the core idea of the proof is construct two sets
The “good” set:
\begin{equation*}
G = \set{n \in \mathbb{N} \mid P(n) \text{ is true} }
\end{equation*}
This is the set of all \(n\)-values for which our statement is true. By assumption, we know that \(1 \in G\text{,}\) and \(G \neq \es\text{.}\)
The “bad” set:
\begin{equation*}
B = \set{n \in \mathbb{N} \mid P(n) \text{ is false} }.
\end{equation*}
This is the set of all \(n\)-values for which our statement is false.
As is often the case, the “good” isn’t as interesting as the “bad”, and we’ll focus on the set \(B\text{.}\) It is either empty or non-empty.
If \(B\) is empty, then we are done, because then \(P(n)\) must be true for every single \(n\in \mathbb{N}\) as required.
As \(B\) is not empty, we can look for the smallest element in \(B\text{.}\) And since \(B\) is a non-empty set of natural numbers we can always find its smallest smallest element. However we do note that this is not possible for all sets — we’ll come back to this point shortly.
Let \(\ell\) be this smallest number in \(B\text{.}\)
From the hypothesis in the theorem, we know that \(\ell \gt 1\) (since \(P(1)\) is true).
By the way we constructed \(B\) and \(\ell\text{,}\) we know that \(P(n)\) is true for all \(1 \leq n \lt \ell\text{.}\)
Now we have a problem:
we know that \(P(\ell-1)\) is true, but
we also know that the implication \(P(\ell-1) \implies P(\ell)\) is true (since that is one of the hypotheses).
But if \(P(\ell-1)\) is true, and \(P(\ell)\) is false, then that implication is false.
This is a contradiction — our assumption is contradicted by \(P(\ell)\) being false.
The only way to resolve this situation is to conclude that no such \(\ell\) exists — that is, \(B\) is empty.
Hence we must have \(P(n)\) is true for all natural numbers \(n\)
There are two sneaky things going on in this proof. Firstly, we are doing a proof by contradiction by stealth; don’t worry too much about this yet as we’ll return to those proofs later in the text. Second, and more important in the short term, is that the proof relies on a delicate point
since \(B\) is just a non-empty set of natural numbers, we can always find the smallest number in that set.
We are using here something called the well ordering principle. This guarantees us that we can always find the smallest element in a set of natural numbers. This is very definitely false for other sets of numbers — if we build a set of integers, or rational numbers, or real numbers, then it might not have a minimum.
Example7.1.2.Sets without minima.
Consider the following sets
\begin{align*}
\set{2k \mid k \in \mathbb{Z} } &&
\set{\frac{1}{n} \mid n \in \mathbb{N}} &&
(0,1) = \set{x \in \mathbb{R} \mid 0 \lt x \lt 1}.
\end{align*}
None of these sets have minimum elements; you can always find another element that is smaller.
Note that this argument requires proof by contradiction and if the reader is confused by the details of this argument, we suggest that they wait until we have covered that topic and then return to this discussion. In fact, making this an example in that section is a good idea and the authors might do exactly that.
Consider the set \(\set{2k \mid k\in\mathbb{Z}}\text{.}\)
Either this set has a minimum or it does not.
If it does have a minimum, call that minimum \(\ell\text{.}\)
However, the number \(\ell-2\) is also in the set, and \(\ell-2 \lt \ell\text{.}\)
So \(\ell\) cannot be the minimum.
So the set cannot have a minimum.
Similarly, if \((0,1)\) has a minimum then call it \(q\text{.}\) But then since \(0 \lt q \lt 1\text{,}\) we know that \(0 \lt q/2 \lt q\text{.}\) So \(q/2\) is in the set and smaller than \(q\text{,}\) so \(q\) cannot be a minimum. Hence no such minimum exists. The same argument works for the set \(\set{1/n \mid n\in\mathbb{N}}\text{.}\)
Now that we are armed with a formal statement of mathematical induction, we should make use of it. The typical first example is the simple summation we saw in the introduction to this chapter. However (arguably) this can give the wrong idea of what induction is. To (hopefully) avoid this potential trap, let us do an example about divisibility.
Result7.1.3.
For every natural number \(n\text{,}\)\(3 | (4^{n}-1)\text{.}\)
So our statement is \(P(n)\) is “\(3 | (4^{n}-1)\)”. To prove this true for all naturals \(n\) using induction, we need to show
The statement \(P(1)\) is true, and
The implications \(P(k) \implies P(k+1)\) is true for all \(k \in \mathbb{N}\text{.}\)
In most cases (but definitely not all), showing that \(P(1)\) is true is easy, and proving the inductive step requires work.
Setting \(n=1\) in our statement gives
\begin{gather*}
3|(4^1-1)
\end{gather*}
which is true (since \(4-1=3\)).
Now the harder part — the inductive step: We have to prove that
for all \(k\in \mathbb{N}\text{.}\) That is, the inductive step requires us to prove an implication. As we have done many times before (and will do many times in the future), we assume the hypothesis is true and work our way to the conclusion. So we start by assuming that \(3|(4^k-1)\text{,}\) which means that
And since \(4\ell+1 \in \mathbb{Z}\) we have reached the conclusion we want.
Now that we know how to prove the base case (that was easy) and we know how to prove the inductive step (not too bad), we can put things together into a proof. Remember to tell the reader what we are doing.
By the principle of mathematical induction the statement is true for all naturals \(n\text{.}\)
Remark7.1.4.Make structure obvious to the reader.
Notice that we are making the structure of the proof very clear. We start by telling reader that we are using induction. We have then labelled the two conditions “base case” and “inductive hypothesis” or “inductive step”, so that it is very clear how the proof is being constructed. We then have a little summarising sentence at the end saying “by induction it is true!” It means that a reader who is familiar with induction can readily check off each part of the induction proof against what they know about induction as they read the proof. We make it easy for them to read and verify the proof. (This is an especially good idea when someone has to mark your work, not just read it!)
We can rewrite our above simple parity example with this same structure. It is good practice.
By mathematical induction the result holds for all \(n \in \mathbb{N}\text{.}\)
To avoid doing the very standard summation examples for a touch longer, we’ll do an inequality. This particular example is another classic induction problem 79 .
Result7.1.6.
Let \(n \in \mathbb{N}\) and let \(x\in\mathbb{R}\) with \(x \gt -1\text{.}\) Then
To prove this second implication, we do as we have done so many times, and will do many times in the future, we assume the hypothesis is true. So we assume \((x \gt -1)\text{.}\) From the first implication (and our assumption that \(x\gt -1\)) we know that \((1+x)^k \geq 1+kx\text{.}\) We now need to prove that \((1+x)^{k+1}\) is bigger than \(1+(k+1)x\text{.}\)
A good place to start is to take the inequality we know and try to make it look like the inequality we want.
But there is a problem: where have we used the assumption \(x \gt -1\text{?}\)
When you are doing mathematics and answering problems, you should keep your eyes open for this sort of thing. If you haven’t used one of the hypotheses in your proof then there is very likely to be an error somewhere!
Now — our work above is not actually wrong, but that is more by luck than by design 80 . Where we multiplied by \(1+x\) we didn’t examine whether quantity is positive or negative. When we multiply an inequality by something positive (say \(+2\)), then inequality keeps the same sign:
Thankfully since \(x \gt -1\text{,}\) we know that \(1+x \gt 0\) and so we are not multiplying by a negative number. We don’t have to go back and fix our scratch work, but we should make sure that we highlight this point when we write up the proof. We help our reader understand important subtle details by pointing them out.
Based case: when \(n=1\text{,}\) the statement is true since \((1+x)^1 = 1+1\cdot x\text{.}\)
Induction step. We assume that \(x \gt -1\) and that \((1+x)^k \geq 1+kx\text{.}\) Since \(x \gt -1\text{,}\) we know that \(1+x \gt 0\) and so multiplying through by \(1+x\) doesn’t change the sign of the inequality:
Like most proofs of implications, we start by assuming the hypothesis is true and try to work our way to the conclusion. In this case, it is pretty clear that we can get from the first statement to the second just by adding \((k+1)\) to both sides. Note that “adding \(k+1\) to both sides” is not the induction step. The induction step requires us to prove that \(P(k) \implies P(k+1)\text{,}\) and we prove that in this particular example by adding \(k+1\) to both sides. This is a very common error (and more than a little frustrating to mark 82 ).
This is a very wrong proof. Examples are not proof.
Surprisingly this polynomial does give you prime numbers for \(n=1,\cdots,40\text{.}\) This was first noticed by Euler 83 . However when \(n=41\) it is clearly not prime, since it gives \(41^2-41+41 = 41^2\text{.}\)
Example7.1.10.Everything you know about 2 is wrong?
For all \(n \in \mathbb{N}\text{,}\)\(n+1 \lt n\text{.}\)
We give the flawed proof and then explain what is wrong with it in the solutions below.
However this is a vacuous statement since the hypothesis is never true. This is the flaw in our proof — we did not check the base-case. When \(n=1\) the statement is false.
Since the base case fails, we cannot use mathematical induction. Both the base case and the inductive step must be true for an induction proof to work.
Example7.1.11.Pólya’s colourless horses.
This is George Pólya’s proof that “All horses have the same colour”.
Again, we’ll first give the flawed proof, and then show you where it goes wrong.
Base case: When there is only 1 horse, it has the same colour as itself. So the statement is true when there is exactly 1 horse.
Inductive step: We assume the statement is true when there are \(n=k\) horses. Now consider a set of \(n=k+1\) horses. Exclude one of those horses from the set to obtain a set of \(k\) horses. By assumption all those horses have the same colour. Put that excluded horse back into the set and remove another. Now the set again has \(k\) horses and so they all have the same colour. This means that all the \(k+1\) horses (including the two we temporarily removed) must have the same colour.
By induction the statement is true for all \(n \in \mathbb{N}\text{.}\)
So what is wrong here? There is a subtle error in the inductive step. Remember that the inductive step has to be true for all \(k \in \mathbb{N}\text{,}\) however it contains the tacit assumption that \(k\) is not too small. To see why, step through the argument carefully when \(k=1\text{.}\)
Assume that the statement is true when \(n=k=1\text{.}\) Then consider a set of \(n=k+1 = 2\) horses: call them Alice and Bob 84 . Now we remove one horse (Alice), and then every horse remaining in the set (which is just Bob) has the same colour. Now we put Alice back and take Bob out. So Alice has the same colour as every horse in the set — which is just Alice. At no point do we ever compare the colour of Alice or Bob with a third horse 85 . Consequently we cannot infer any relationship between the colours of equine Alice or Bob.
Like many of the authors “jokes” it is an antique and should be handled with care.
Of course the author actually designed the example to illustrate this luck. So perhaps it is designed luck?
A nag?
Be nice to your marker?
The interested reader should search-engine their way to “Euler’s lucky numbers”. There is actually some quite deep number theory lurking in side that polynomial.
Very traditional names for horses with side-interests in cryptography.