Skip to main content

PLP: An introduction to mathematical proof

Section 12.1 Finite sets

Consider the following set of place names from Australia:
Think about what you do when you count the elements of that set. Now obviously 163  there are 6 elements in the set. However, think about how you count up those elements. Indeed, think about how we learn to count when we are very young. Typically we count by pointing at each element in turn (either physically pointing, or just pointing in our mind’s eye), and counting off “one, two, three,…”.
So what we are really doing here is constructing a function that takes us from the set of objects that we are counting to a subset \(\set{1,2,3,\dots,n}\text{.}\) This function is both
  • injective — since different objects will be counted by different numbers, and
  • surjective — since every number is used to count an object.
Hence when we count the objects in a finite set \(A\) we are really constructing a bijection from \(A\) to a subset of the natural numbers, \(\set{1,2,3,\dots,n } \text{.}\)
Now notice also that if we consider the following two sets 164 :
Then we see that there are the same number of elements in each. We can do this in two ways, one is to count up as we did in the previous figure, but we could also simply establish a bijection between them.
These are actually equivalent. If there is a bijection, \(f\) from the set of place names to the set \(\set{1,2,3,4,5,6}\text{,}\) and another \(g\) from the set of colours to the set \(\set{1,2,3,4,5,6} \text{,}\) then \(g^{-1}\) is a bijection, and the composition \(g^{-1} \circ f\) will be a bijection from the set of place names to the set of colours. In this way, two sets have the same size when there is a bijection between them.
In the above discussion we have tried to leverage the language of functions in order to compare the sizes of sets. Using functions allows us to extend the idea of “these sets are the same size” from finite sets to infinite sets. That is the main aim of this part of the text.

Subsection 12.1.1 Equinumerous sets, bijections and pigeons

Oof! There are a lot of ideas above, so let’s set them down slowly, carefully and rigorously.

Definition 12.1.1.

Two sets \(A,B\) are said to have the same cardinality (or same cardinal number), written \(|A| = |B|\text{,}\) if either \(A=B=\es\) or there is a bijection from \(A\) to \(B\text{.}\) If \(A\) and \(B\) have the same cardinality then we say they are equinumerous 165 . Finally, if \(A\) and \(B\) are not equinumerous, we write \(|A| \neq |B|\text{,}\) and this is equivalent to saying that there is no bijection between them.
This definition allows us to understand cardinality in terms of functions; this is critical for our understanding the size of infinite sets.
  • When we write \(|A|=n \in \mathbb{N}\text{,}\) we are really stating that there is a bijection
    \begin{gather*} f: A \to \set{1,2,3,\dots,n} \end{gather*}
    From our work on functions (see Theorem 10.6.8), we know that this also implies that there is a bijection back from \(\set{1,2,\dots,n}\) to \(A\) — just the inverse function \(f^{-1}\text{.}\)
  • And when \(A = \es\) we write \(|A|=0\text{.}\) This is a special case that we have to treat separately; we cannot define a function with empty domain.

Example 12.1.2. Finite non-equinumerous sets.

Say we have two finite sets \(A,B\text{,}\) so that they are not equal in size, \(|A| \neq |B|\text{.}\) Then there cannot be a bijection between them. We will shortly prove this in general, but suppose (for the sake of this discussion) that \(|A|=5\) and \(|B|=3\text{.}\) So we can write our sets as
\begin{align*} A &= \set{a_1,a_2,a_3,a_4,a_5} & B &= \set{b_1,b_2,b_3} \end{align*}
First consider trying to construct a function \(f: A \to B\text{.}\) It is easy to construct a surjection:
\begin{equation*} f(a_1)=b_1, f(a_2)=b_2, f(a_3)=f(a_4)=f(a_5)=b_3. \end{equation*}
However it is impossible to construct an injection. Once we have assigned \(f(a_1), f(a_2), f(a_3)\) to three distinct values in \(B\text{,}\) it is impossible to assign \(f(a_4)\) without repeating one of the values from \(B\) — making our function non-injective.
Now try construct a function \(g: B \to A\text{.}\) It is easy to construct an injection:
\begin{equation*} g(b_1)=a_1, g(b_2)=a_2, g(b_3)=a_3 \end{equation*}
However it is impossible to construct a surjection. Once we have assigned \(g(b_1), g(b_2), g(b_3)\text{,}\) there will still be two elements of \(A\) that have no preimage in \(B\text{.}\)
The reasoning in the previous example is formalised by the Pigeonhole Principle 166 .
We prove the contrapositive of the first statement, and then prove the second by contradiction.
  • Assume each box has at least one element in it, then there must be at least \(k\) objects. Hence the number of objects is at least as large as the number of boxes.
  • Assume, to the contrary, that every box contains at most \(\left\lceil \frac{n}{k} \right\rceil -1\) objects, then the total number of objects is
    \begin{align*} k \cdot \left( \left\lceil \frac{n}{k} \right\rceil -1 \right) &= k \left\lceil \frac{n}{k} \right\rceil - k \lt n, \end{align*}
    where we have used the fact that \(\lceil x \rceil -1 \lt x \leq \lceil x \rceil\text{.}\) This gives a contradiction. Hence at least one box contains at least \(\left\lceil \frac{n}{k} \right\rceil\) objects.
An immediate corollary is the following result for finite sets that formalises the last example.
We prove each in turn. Assume that \(A,B\) are finite.
  • Assume that \(|A| \gt |B|\text{.}\) Then when the images of elements of \(A\) are placed into \(B\) by the function, there must, by the pigeonhole principle, be at least one element of \(B\) which is the image of \(\left\lceil \frac{|A|}{|B|} \right\rceil \gt 1\) elements of \(A\text{.}\) Hence \(f\) is not injective.
  • Assume that \(|A| \lt |B|\text{.}\) Then when the images of elements of \(A\) are placed into \(B\) by the function, there must, by the pigeonhole principle, be at least one element of \(B\) which is not the image of any element of \(A\text{.}\) Hence \(f\) is not surjective.
We should notice the contrapositive of the results in the above corollary. Let \(A,B\) be finite sets and \(f: A \to B\) be a function, then
  • if \(f\) is an injection then \(|A| \leq |B|\text{,}\) and
  • if \(f\) is a surjection then \(|A| \geq |B|\text{.}\)
Notice that one can do much more with the pigeonhole principle than just put objects in boxes — a little trip to your favourite search engine will turn up many many examples. We’ll give a few examples, most quite standard, and one definitely not.

Example 12.1.5.

Fix \(n \in \mathbb{N}\) and let \(S = \set{1,2,\cdots,2n-1}\) and let \(A \subseteq S\) so that \(|A|=n+1\text{.}\) Prove that there are two elements of \(A\) whose sum is \(2n\text{.}\)
Solution.
We can write \(2n\) as the following sums of pairs from \(S\text{:}\)
\begin{equation*} 2n = (2n-1)+1 = (2n-2)+2 = \cdots = (n+1)+(n-1) \end{equation*}
Notice that the number \(n\) cannot be used in such a pair.
So split the set \(S\) up into 2 element subsets and \(\set{n}\text{:}\)
\begin{equation*} \set{1,2n-1}, \set{2,2n-2}, \dots, \set{n-1, n+1}, \set{n} \end{equation*}
There are \(n\) sets in this list. Consequently, when we choose \(n\) elements it is possible to choose one element from each of the above subsets, however when we choose one more, we must choose a second element from one of those two element subsets. The two elements from the two element-subset sum to \(2n\) as required.

Example 12.1.6.

Let \(S = \set{1,2,\cdots,20}\) and let \(A \subseteq S\) so that \(|A|=11\text{.}\) That is, \(A\) contains 11 distinct integers from between 1 and 20 (inclusive). Then there exist \(a,b \in A\) so that \(a \mid b\text{.}\)
Solution.
First notice that we can write any natural number as the product of an odd number and a power of 2.
\begin{equation*} \forall n \in \mathbb{N}, \exists k,\ell \in \mathbb{N} \st n=(2k+1)\cdot 2^\ell \end{equation*}
So we can take any \(n \in A\) keep dividing by 2 until you get an odd number. When you do so the resulting odd number must be in the set \(\set{1,3,5,\cdots, 19}\text{.}\) Since this set contains 10 distinct elements, there must be two numbers in \(A\) that result the same odd number in \(\set{1,3,5,\cdots,19}\text{.}\) We can write these numbers as
\begin{equation*} a = (2k+1)2^i \qquad \text{and} \qquad b = (2k+1)2^j \end{equation*}
where \(i \neq j\text{.}\) If \(i \gt j\) then \(b\mid a\) and if \(i \lt j\) then \(a \mid b\text{.}\) In either case there must be a pair of integers in \(A\) for which one divides the other.

Example 12.1.7. Spurious correlations.

An excellent illustration of the pigeonhole principle is given by the database of “spurious correlations” — see the fantastic website and book by Tyler Vigen 167 .
It goes roughly like this — consider how many simple (ie not-too wiggly) graphs you can draw whose horizontal axis is the last 100 years. There might be (say) a hundred such “simple” graphs. Now build a big database of any statistics you can think of — diary production in Quebec, deaths by lightning, number of twins born in a particular city, etc etc. There are thousands and thousands of such statistics.
Now — associate each of these statistics to the curve that best approximates it. Since there are only (say) a hundred such curves, and many thousands of statistics, there must — by the pigeonhole principle — be at least one curve which corresponds to many statistics. In practice, there are many statistics for each curve. This means that those statistics are correlated. In this way you can see that, say, the number of mathematics PhD’s awarded is highly correlated with the quantity of uranium stored at US power plants. Is there any causal link — nope 168 .
This can be a source of fun (well, mathematician fun), but it can also create problems. The idea of making use of “Big data” to solve problems is getting a lot attention. People even announcing that analysis of huge data sets will replace 169  the scientific method! However, given enough data, you will find correlations everywhere. It is just a matter of putting objects in boxes — the pigeonhole principle at work.

Subsection 12.1.2 Comparing with functions

In the previous section we started to link cardinality of sets \(A,B\) and the types of functions that can be constructed between them. For example, we saw that
  • If \(f:A \to B\) is an injection then \(|A| \leq |B|\text{,}\)
  • If \(g:A \to B\) is a surjection then \(|A| \geq |B|\text{,}\) and
  • If \(h:A \to B\) is a bijection then \(|A| = |B|\)
In this section we demonstrate that this way of comparing sizes of sets is well-defined. For example, we should have that equality of cardinality behaves just like equality of integers:
\begin{equation*} |A| = |A| \end{equation*}
and
\begin{equation*} (|A|=|B|) \implies (|B|=|A|) \end{equation*}
and
\begin{equation*} (|A|=|B|) \land (|B|=|C|) \implies (|A|=|C|) \end{equation*}
That is, we need to show that equinumerous is an equivalence relation.
  • The identity function on \(A\text{,}\) \(i_A:A \to A\) is a bijection and thus \(|A|=|A|\text{.}\)
  • Assume \(|A| = |B|\text{.}\) Then there is a bijection \(f: A \to B\text{.}\) Since \(f\) is bijective the inverse function \(f^{-1}: B \to A\) exists and is bijective. Thus there is a bijection from \(B\) to \(A\) and so \(|B| = |A|\text{.}\)
  • Assume \(|A| = |B|\) and \(|B| = |C|\text{,}\) so there are bijections \(f: A \to B\) and \(g: B \to C\text{.}\) Since the composition of bijections is bijective, it follows that \(h:A \to C\) defined by \(h = g \circ f\) is bijective. Thus \(|A| = |C|\text{.}\)
This result is very useful. In order to show that two sets have the same size we can show they are equinumerous with a third set.

Remark 12.1.9. Cardinality inequalities.

In addition to proving the above properties of equality of cardinalities, we should also prove analogous results for inequalities of cardinalities. In particular, we should also show that
\begin{equation*} |A| \leq |A| \end{equation*}
and
\begin{equation*} (|A| \leq |B|) \land (|B| \leq |A|) \implies (|A|=|B|) \end{equation*}
and
\begin{equation*} (|A| \leq |B|) \land ( |B| \leq |C| ) \implies ( |A| \leq |C| ) \end{equation*}
Notice that the first follows immediately since the identity function on \(A\) is an injection. The last one follows because the composition of injections is itself an injection (this is exactly Theorem 10.5.3). The middle one is the Cantor-Schröder-Bernstein theorem and its proof is quite involved — see Section 12.5. It is reasonably easy to prove for finite sets, however. Maybe we’ll set that as an exercise.

Subsection 12.1.3 Infinite sets are strange.

When we deal with finite sets everything above is pretty clear cut — we can put things into a bijection with \(\set{1,2,\dots,n}\) for some \(n \in \mathbb{N}\text{.}\) We did exactly this with our place-names and colours example at the start of this chapter. However, when we get to non-finite sets, things get much more interesting and weird.
Here is a very telling example. Consider the positive even numbers
\begin{gather*} E = \{2,4,6,8,\dots\} \end{gather*}
Now it is clear that \(E\) is a proper subset of \(\mathbb{N}\text{;}\) it only contains ever second number. So it should 170  be half the size. But we have to be careful; consider the following function
\begin{align*} f: \mathbb{N} \to E && f(n) = 2n \end{align*}
This function is a bijection from \(\mathbb{N}\) to \(E\text{:}\)
  • Injective because if \(n_1 \neq n_2\) then \(f(n_1) = n_1 \neq n_2 = f(n_2)\text{.}\)
  • Surjective because for any \(b \in E\text{,}\) we know that \(b=2k\) for \(k \in \mathbb{N}\text{,}\) and \(f(k) = 2k=b \text{.}\)
So the set of natural numbers is equinumerous with a proper subset of the natural numbers! No finite set can do this because of the pigeonhole principle, but an infinite set can. This is actually one way of defining when a set is infinite:

Definition 12.1.10.

We say that a set \(A\) is infinite when there is a bijection between \(A\) and a proper subset of \(A\text{.}\)
This definition was introduced by Dedekind in the late 19th century. Notice that this way of defining infinity does not rely on the natural numbers. When we first learn of the existence of infinity as children it is usually in the context of having learned to count
\begin{gather*} 1,2,3,4,\dots \end{gather*}
and we ask ourselves (or our teacher, or our parent 171 ) — what is the biggest number? And we realise that we can go on counting forever without stopping. This idea of infinity as being somehow intimately related to the natural numbers is hard to shake. But we will a little later in this chapter.
The apparent paradox 172  in Dedekind’s definition of infinite sets goes back at least as far as Galileo, and is often called Galileo’s paradox. In his “Two new sciences” Galileo imagines a conversation between two people, Simplicio and Salviati 173  . Simplicio brings up the bijection 174  between the natural numbers at the squares:
\begin{align*} f: \set{1,2,3,\dots} \mapsto \set{1,4,9,\dots} && f(n) = n^2 \end{align*}
and the apparent paradox of something larger being equinumerous with something smaller. After some discussion Salviati concludes that one cannot compare cardinalities of infinite sets. That restriction stayed until the 19th century work of Cantor and others.
Georg Cantor developed modern set theory (especially between about 1874 and 1884) — he developed the finer understanding of infinite sets we now have. Before him, there were finite sets, and a somewhat loosely defined idea of infinity. Cantor’s work on infinity was not well received by the maths community of the 19th century and a number of big names in mathematics very publically criticised him — “corrupter of youth” was one of many insults 175  . It is worth search-engining your way to a biography of him and the intersection between his work on infinity and many ideas in the philosophy of mathematics and theology. Not light stuff to be summarised here in a quick paragraph.
While it is generally a good idea to avoid this word, it is probably safe to use it here. Clearly it is obviously safe to leave a decision on the use of this word to the reader who may edit their copy of the text accordingly.
The same set of place names, and a set of “interesting” colour names.
or “numerically equivalent”, but “equinumerous” is a nicer word.
While this is intuitively obvious, mathematicians like to make sure that obvious things are actually true. And if that obvious thing is really useful, then it should get a good name. Dirichlet was the first to formalise this idea in the 19th century and called it (in German) the drawer-principle — Schubfachprinzip. Since Dirichlet’s father was a postmaster, it is perhaps an imperfect translation of his idea that led to “pigeonholes” in the sense of a small open drawer or shelf used to sorting or storing letters (slightly antiquated in these days of tweeting (pun!), email, and social media updates). This imperfect English translation has been imported into other languages, including back into German. No pigeons are involved.
tylervigen.com/spurious-correlations
We hope not.
This was a big headline article in Wired magazine in 2008.
“Should” is another one of those dangerous words like “clearly” and “obviously” and “just”. It should (ha!) set off alarm bells.
Perhaps repeatedly?
It is a veridical paradox. That is, a paradox that seems absurd but is actually true. Another good example, for the paradoxically inclined, is the Monte Hall problem. The interested reader can (and should) search-engine their way to that particular example.
Salviati is named after one of Galileo’s friends, while Simplicio is likely modelled on one of his philosophical opponents, or perhaps is intended to represent Galileo’s beliefs early in his lief. Both characters appear in Galileo’s work “Dialogue Concerning the Two Chief World Systems” which discusses the relative merits of the Copernican (heliocentric) and Ptolemaic (geocentric) models of the solar system. That book was subsequently banned by the Catholic Church for over 200 years.
He doesn’t call it a bijection — that term didn’t come into mathematics until the middle of the 20th century with the work of Bourbaki.
Henri Poincaré called his work on infinities a “grave disease” infecting the discipline of mathematics, while Leopold Kronecker described Cantor as a charlatan and a renegade as well as the famous epithet “corrupter of youth” and “I don’t know what predominates in Cantor’s theory — philosophy or theology, but I am sure that there is no mathematics there.” Kronecker is reputed to have said that “God created the natural numbers; all else is the work of man.” — making him (arguably) a proponent of mathematical finitism. It’s an interesting topic and well worth a quick trip to your favourite search-engine.