Section 12.1 Finite sets
Consider the following set of place names from Australia:
Think about what you do when you
count the elements of that set. Now obviously
163 there are 6 elements in the set. However, think about how you count up those elements. Indeed, think about how we learn to count when we are very young. Typically we count by pointing at each element in turn (either physically pointing, or just pointing in our mind’s eye), and counting off “one, two, three,…”.
So what we are really doing here is constructing a function that takes us from the set of objects that we are counting to a subset \(\set{1,2,3,\dots,n}\text{.}\) This function is both
injective — since different objects will be counted by different numbers, and
surjective — since every number is used to count an object.
Hence when we count the objects in a finite set \(A\) we are really constructing a bijection from \(A\) to a subset of the natural numbers, \(\set{1,2,3,\dots,n } \text{.}\)
Now notice also that if we consider the following two sets
164 :
Then we see that there are the same number of elements in each. We can do this in two ways, one is to count up as we did in the previous figure, but we could also simply establish a bijection between them.
These are actually equivalent. If there is a bijection, \(f\) from the set of place names to the set \(\set{1,2,3,4,5,6}\text{,}\) and another \(g\) from the set of colours to the set \(\set{1,2,3,4,5,6} \text{,}\) then \(g^{-1}\) is a bijection, and the composition \(g^{-1} \circ f\) will be a bijection from the set of place names to the set of colours. In this way, two sets have the same size when there is a bijection between them.
In the above discussion we have tried to leverage the language of functions in order to compare the sizes of sets. Using functions allows us to extend the idea of “these sets are the same size” from finite sets to infinite sets. That is the main aim of this part of the text.
Subsection 12.1.1 Equinumerous sets, bijections and pigeons
Oof! There are a lot of ideas above, so let’s set them down slowly, carefully and rigorously.
Definition 12.1.1.
Two sets
\(A,B\) are said to have the same
cardinality (or same
cardinal number), written
\(|A| = |B|\text{,}\) if either
\(A=B=\es\) or there is a bijection from
\(A\) to
\(B\text{.}\) If
\(A\) and
\(B\) have the same cardinality then we say they are
equinumerous 165 . Finally, if
\(A\) and
\(B\) are not equinumerous, we write
\(|A| \neq |B|\text{,}\) and this is equivalent to saying that there is no bijection between them.
This definition allows us to understand cardinality in terms of functions; this is critical for our understanding the size of infinite sets.
Example 12.1.2. Finite non-equinumerous sets.
Say we have two finite sets \(A,B\text{,}\) so that they are not equal in size, \(|A| \neq |B|\text{.}\) Then there cannot be a bijection between them. We will shortly prove this in general, but suppose (for the sake of this discussion) that \(|A|=5\) and \(|B|=3\text{.}\) So we can write our sets as
\begin{align*}
A &= \set{a_1,a_2,a_3,a_4,a_5} &
B &= \set{b_1,b_2,b_3}
\end{align*}
First consider trying to construct a function \(f: A \to B\text{.}\) It is easy to construct a surjection:
\begin{equation*}
f(a_1)=b_1, f(a_2)=b_2, f(a_3)=f(a_4)=f(a_5)=b_3.
\end{equation*}
However it is impossible to construct an injection. Once we have assigned \(f(a_1), f(a_2), f(a_3)\) to three distinct values in \(B\text{,}\) it is impossible to assign \(f(a_4)\) without repeating one of the values from \(B\) — making our function non-injective.
Now try construct a function \(g: B \to A\text{.}\) It is easy to construct an injection:
\begin{equation*}
g(b_1)=a_1, g(b_2)=a_2, g(b_3)=a_3
\end{equation*}
However it is impossible to construct a surjection. Once we have assigned \(g(b_1), g(b_2), g(b_3)\text{,}\) there will still be two elements of \(A\) that have no preimage in \(B\text{.}\)
The reasoning in the previous example is formalised by the
Pigeonhole Principle 166 .
Theorem 12.1.3. Pigeonhole principle — Dirichlet’s Schubfachprinzip.
If \(n \gt 0\) objects are placed in \(k \gt 0\) boxes then
if \(n \lt k\) then at least one box has zero objects in it, and
if \(n \gt k\) then at least one box has at least two objects in it.
In the second case, we can be more precise: at least one box contains at least \(\left\lceil \frac{n}{k} \right\rceil\) objects, where \(\lceil x \rceil\) is the ceiling of \(x\) and denotes the smallest integer larger or equal to \(x\text{.}\)
Proof.
We prove the contrapositive of the first statement, and then prove the second by contradiction.
An immediate corollary is the following result for finite sets that formalises the last example.
Corollary 12.1.4.
Let \(A,B\) be finite sets and let \(f:A \to B\) be a function. Then
Proof.
We prove each in turn. Assume that \(A,B\) are finite.
Assume that \(|A| \gt |B|\text{.}\) Then when the images of elements of \(A\) are placed into \(B\) by the function, there must, by the pigeonhole principle, be at least one element of \(B\) which is the image of \(\left\lceil \frac{|A|}{|B|} \right\rceil \gt 1\) elements of \(A\text{.}\) Hence \(f\) is not injective.
Assume that \(|A| \lt |B|\text{.}\) Then when the images of elements of \(A\) are placed into \(B\) by the function, there must, by the pigeonhole principle, be at least one element of \(B\) which is not the image of any element of \(A\text{.}\) Hence \(f\) is not surjective.
We should notice the contrapositive of the results in the above corollary. Let \(A,B\) be finite sets and \(f: A \to B\) be a function, then
if \(f\) is an injection then \(|A| \leq |B|\text{,}\) and
if \(f\) is a surjection then \(|A| \geq |B|\text{.}\)
Notice that one can do much more with the pigeonhole principle than just put objects in boxes — a little trip to your favourite search engine will turn up many many examples. We’ll give a few examples, most quite standard, and one definitely not.
Example 12.1.5.
Fix \(n \in \mathbb{N}\) and let \(S = \set{1,2,\cdots,2n-1}\) and let \(A \subseteq S\) so that \(|A|=n+1\text{.}\) Prove that there are two elements of \(A\) whose sum is \(2n\text{.}\)
Solution.
Proof.
We can write \(2n\) as the following sums of pairs from \(S\text{:}\)
\begin{equation*}
2n = (2n-1)+1 = (2n-2)+2 = \cdots = (n+1)+(n-1)
\end{equation*}
Notice that the number \(n\) cannot be used in such a pair.
So split the set \(S\) up into 2 element subsets and \(\set{n}\text{:}\)
\begin{equation*}
\set{1,2n-1}, \set{2,2n-2}, \dots, \set{n-1, n+1}, \set{n}
\end{equation*}
There are \(n\) sets in this list. Consequently, when we choose \(n\) elements it is possible to choose one element from each of the above subsets, however when we choose one more, we must choose a second element from one of those two element subsets. The two elements from the two element-subset sum to \(2n\) as required.
Example 12.1.6.
Let \(S = \set{1,2,\cdots,20}\) and let \(A \subseteq S\) so that \(|A|=11\text{.}\) That is, \(A\) contains 11 distinct integers from between 1 and 20 (inclusive). Then there exist \(a,b \in A\) so that \(a \mid b\text{.}\)
Solution.
Proof.
First notice that we can write any natural number as the product of an odd number and a power of 2.
\begin{equation*}
\forall n \in \mathbb{N}, \exists k,\ell \in \mathbb{N} \st n=(2k+1)\cdot 2^\ell
\end{equation*}
So we can take any \(n \in A\) keep dividing by 2 until you get an odd number. When you do so the resulting odd number must be in the set \(\set{1,3,5,\cdots, 19}\text{.}\) Since this set contains 10 distinct elements, there must be two numbers in \(A\) that result the same odd number in \(\set{1,3,5,\cdots,19}\text{.}\) We can write these numbers as
\begin{equation*}
a = (2k+1)2^i \qquad \text{and} \qquad b = (2k+1)2^j
\end{equation*}
where \(i \neq j\text{.}\) If \(i \gt j\) then \(b\mid a\) and if \(i \lt j\) then \(a \mid b\text{.}\) In either case there must be a pair of integers in \(A\) for which one divides the other.
Example 12.1.7. Spurious correlations.
It goes roughly like this — consider how many simple (ie not-too wiggly) graphs you can draw whose horizontal axis is the last 100 years. There might be (say) a hundred such “simple” graphs. Now build a big database of any statistics you can think of — diary production in Quebec, deaths by lightning, number of twins born in a particular city, etc etc. There are thousands and thousands of such statistics.
Now — associate each of these statistics to the curve that best approximates it. Since there are only (say) a hundred such curves, and many thousands of statistics, there must — by the pigeonhole principle — be at least one curve which corresponds to many statistics. In practice, there are many statistics for each curve. This means that those statistics are correlated. In this way you can see that, say, the number of mathematics PhD’s awarded is highly correlated with the quantity of uranium stored at US power plants. Is there any causal link — nope
168 .
This can be a source of fun (well, mathematician fun), but it can also create problems. The idea of making use of “Big data” to solve problems is getting a lot attention. People even announcing that analysis of huge data sets will replace
169 the scientific method! However, given enough data, you will find correlations everywhere. It is just a matter of putting objects in boxes — the pigeonhole principle at work.
Subsection 12.1.2 Comparing with functions
In the previous section we started to link cardinality of sets \(A,B\) and the types of functions that can be constructed between them. For example, we saw that
If \(f:A \to B\) is an injection then \(|A| \leq |B|\text{,}\)
If \(g:A \to B\) is a surjection then \(|A| \geq |B|\text{,}\) and
If \(h:A \to B\) is a bijection then \(|A| = |B|\)
In this section we demonstrate that this way of comparing sizes of sets is well-defined. For example, we should have that equality of cardinality behaves just like equality of integers:
\begin{equation*}
|A| = |A|
\end{equation*}
and
\begin{equation*}
(|A|=|B|) \implies (|B|=|A|)
\end{equation*}
and
\begin{equation*}
(|A|=|B|) \land (|B|=|C|) \implies (|A|=|C|)
\end{equation*}
That is, we need to show that equinumerous is an equivalence relation.
Theorem 12.1.8.
Let \(A\text{,}\) \(B\) and \(C\) be sets. Then
\(|A| = |A|\) (reflexive).
If \(|A| = |B|\) then \(|B| = |A|\) (symmetric).
If \(|A| = |B|\) and \(|B| = |C|\) then \(|A| = |C|\) (transitive).
Proof.
The identity function on \(A\text{,}\) \(i_A:A \to A\) is a bijection and thus \(|A|=|A|\text{.}\)
Assume \(|A| = |B|\text{.}\) Then there is a bijection \(f: A \to B\text{.}\) Since \(f\) is bijective the inverse function \(f^{-1}: B \to A\) exists and is bijective. Thus there is a bijection from \(B\) to \(A\) and so \(|B| = |A|\text{.}\)
Assume \(|A| = |B|\) and \(|B| = |C|\text{,}\) so there are bijections \(f: A \to B\) and \(g: B \to C\text{.}\) Since the composition of bijections is bijective, it follows that \(h:A \to C\) defined by \(h = g \circ f\) is bijective. Thus \(|A| = |C|\text{.}\)
This result is very useful. In order to show that two sets have the same size we can show they are equinumerous with a third set.
Subsection 12.1.3 Infinite sets are strange.
When we deal with finite sets everything above is pretty clear cut — we can put things into a bijection with \(\set{1,2,\dots,n}\) for some \(n \in \mathbb{N}\text{.}\) We did exactly this with our place-names and colours example at the start of this chapter. However, when we get to non-finite sets, things get much more interesting and weird.
Here is a very telling example. Consider the positive even numbers
\begin{gather*}
E = \{2,4,6,8,\dots\}
\end{gather*}
Now it is clear that
\(E\) is a proper subset of
\(\mathbb{N}\text{;}\) it only contains ever second number. So it should
170 be half the size. But we have to be careful; consider the following function
\begin{align*}
f: \mathbb{N} \to E && f(n) = 2n
\end{align*}
This function is a bijection from \(\mathbb{N}\) to \(E\text{:}\)
Injective because if \(n_1 \neq n_2\) then \(f(n_1) = n_1 \neq n_2 = f(n_2)\text{.}\)
Surjective because for any \(b \in E\text{,}\) we know that \(b=2k\) for \(k \in \mathbb{N}\text{,}\) and \(f(k) = 2k=b \text{.}\)
So the set of natural numbers is equinumerous with a proper subset of the natural numbers! No finite set can do this because of the pigeonhole principle, but an infinite set can. This is actually one way of defining when a set is infinite:
Definition 12.1.10.
We say that a set \(A\) is infinite when there is a bijection between \(A\) and a proper subset of \(A\text{.}\)
This definition was introduced by Dedekind in the late 19th century. Notice that this way of defining infinity does not rely on the natural numbers. When we first learn of the existence of infinity as children it is usually in the context of having learned to count
\begin{gather*}
1,2,3,4,\dots
\end{gather*}
and we ask ourselves (or our teacher, or our parent
171 ) — what is the biggest number? And we realise that we can go on counting forever without stopping. This idea of infinity as being somehow intimately related to the natural numbers is hard to shake. But we will a little later in this chapter.
The apparent paradox
172 in Dedekind’s definition of infinite sets goes back at least as far as Galileo, and is often called Galileo’s paradox. In his “Two new sciences” Galileo imagines a conversation between two people, Simplicio and Salviati
173 . Simplicio brings up the bijection
174 between the natural numbers at the squares:
\begin{align*}
f: \set{1,2,3,\dots} \mapsto \set{1,4,9,\dots} && f(n) = n^2
\end{align*}
and the apparent paradox of something larger being equinumerous with something smaller. After some discussion Salviati concludes that one cannot compare cardinalities of infinite sets. That restriction stayed until the 19th century work of Cantor and others.
Georg Cantor developed modern set theory (especially between about 1874 and 1884) — he developed the finer understanding of infinite sets we now have. Before him, there were finite sets, and a somewhat loosely defined idea of infinity. Cantor’s work on infinity was not well received by the maths community of the 19th century and a number of big names in mathematics very publically criticised him — “corrupter of youth” was one of many insults
175 . It is worth search-engining your way to a biography of him and the intersection between his work on infinity and many ideas in the philosophy of mathematics and theology. Not light stuff to be summarised here in a quick paragraph.
While it is generally a good idea to avoid this word, it is probably safe to use it here. Clearly it is obviously safe to leave a decision on the use of this word to the reader who may edit their copy of the text accordingly.
The same set of place names, and a set of “interesting” colour names.
or “numerically equivalent”, but “equinumerous” is a nicer word.
While this is intuitively obvious, mathematicians like to make sure that obvious things are actually true. And if that obvious thing is really useful, then it should get a good name. Dirichlet was the first to formalise this idea in the 19th century and called it (in German) the drawer-principle — Schubfachprinzip. Since Dirichlet’s father was a postmaster, it is perhaps an imperfect translation of his idea that led to “pigeonholes” in the sense of a small open drawer or shelf used to sorting or storing letters (slightly antiquated in these days of tweeting (pun!), email, and social media updates). This imperfect English translation has been imported into other languages, including back into German. No pigeons are involved.
tylervigen.com/spurious-correlations
This was a big headline article in Wired magazine in 2008.
“Should” is another one of those dangerous words like “clearly” and “obviously” and “just”. It should (ha!) set off alarm bells.
It is a veridical paradox. That is, a paradox that seems absurd but is actually true. Another good example, for the paradoxically inclined, is the Monte Hall problem. The interested reader can (and should) search-engine their way to that particular example.
Salviati is named after one of Galileo’s friends, while Simplicio is likely modelled on one of his philosophical opponents, or perhaps is intended to represent Galileo’s beliefs early in his lief. Both characters appear in Galileo’s work “Dialogue Concerning the Two Chief World Systems” which discusses the relative merits of the Copernican (heliocentric) and Ptolemaic (geocentric) models of the solar system. That book was subsequently banned by the Catholic Church for over 200 years.
He doesn’t call it a bijection — that term didn’t come into mathematics until the middle of the 20th century with the work of Bourbaki.
Henri Poincaré called his work on infinities a “grave disease” infecting the discipline of mathematics, while Leopold Kronecker described Cantor as a charlatan and a renegade as well as the famous epithet “corrupter of youth” and “I don’t know what predominates in Cantor’s theory — philosophy or theology, but I am sure that there is no mathematics there.” Kronecker is reputed to have said that “God created the natural numbers; all else is the work of man.” — making him (arguably) a proponent of mathematical finitism. It’s an interesting topic and well worth a quick trip to your favourite search-engine.