Skip to main content

PLP: An introduction to mathematical proof

Section 12.4 Comparing cardinalities

Subsection 12.4.1 Extending to the infinite

So really all 183  we have been able to do so far is show that \(|A| = |B|\) and, after borrowing some cunning from Cantor, that \(|A| \neq |B|\text{.}\) Now we will try to understand and give meaning to \(|A| \leq |B|\) and \(|A| \lt |B|\text{.}\) As before, we will try to translate these statements whose meaning are quite simple for finite sets, into statements about functions. From there we can extend them to make sense of infinite sets. This, in turn, will enable us to prove that there is no biggest set and so there an infinite number of infinities!
Let us go back to finite sets for a moment. Consider the sets
\begin{align*} A = \set{a,b,c} && S =\set{x,y,z,w} \end{align*}
Of course we know \(|A| \lt |B|\text{.}\) How can we describe this in terms of functions.
  • Is there an injection from \(A\) to \(B\text{?}\) — Yes, for example:
    \begin{align*} f(a)&=x &f(b)&=y & f(c)&=z \end{align*}
  • Is there a surjection from \(A\) to \(B\text{?}\) — No — by the pigeonhole principle.
  • Since there is no surjection there cannot be a bijection.
In fact this was a corollary of the pigeonhole principle — Corollary 12.1.4.
In the same way that extended the equivalence between the existence of bijections and sets being equinumerous, from finite sets to infinite sets, we extend the above ideas from finite sets to all sets.

Definition 12.4.1.

Let \(A\) and \(B\) be sets.
  • We write \(|A| \leq |B|\) if there is an injection from \(A\) to \(B\text{.}\)
  • Further, we use \(|A| \lt |B|\) to mean that \(|A|\leq |B|\) and \(|A|\neq |B|\text{.}\)
  • That is, we write \(|A| \lt |B|\) and say A has a smaller cardinality than \(B\) if there is an injection from \(A\) to \(B\) but no bijection.
So with this definition we can now sensibly 184  state that
\begin{gather*} \aleph_0 = |\mathbb{N}| \lt |\mathbb{R}|=c \end{gather*}
There are two different infinities — the infinity of the integers, and the larger infinity of the reals.
This inequality prompts two questions.
  • Are there more 185  infinities?
  • Is there some infinity that lies between \(\alpha_0\) and \(c\text{?}\)
This first question we can answer and will do so shortly. The second question is due to Cantor and remains unanswered; it now stated as a conjecture called the “continuum hypothesis”.
This is generally believed to be true. It is still an active area of research and people have shown quite a bit about what might happen if this result is true and if it is false. In particular — it has been proved that one cannot disprove this using standard set theory (by Gödel 1931). It was then proved that one cannot prove it to be true either (by Cohen 1963).

Subsection 12.4.2 Cantor’s Theorem and infinite infinities

Back to the number of infinities. One relatively easy result shows that there is a set bigger than the reals. We do this by considering a set and its power set. This result, now known as Cantor’s theorem, has very interesting consequences.
For finite sets this result is quite easy. One can prove (using induction, for example) that if \(|A|=n\) then \(|\pow{A}|=2^n\text{,}\) and that \(n \lt 2^n\) for any integer \(n\text{.}\) For infinite sets things are much less obvious.
We start by showing that there is an injection from \(A\) to \(\pow{A}\text{.}\) An easy one is
\begin{align*} f:A \to \pow{A} && \text{defined by} && f(a) &= \set{a}\\ \end{align*}

another one is

\begin{align*} h:A \to \pow{A} \amp\amp \text{defined by } \amp\amp h(a)\amp = A-\set{a}. \end{align*}
We can then show there is no bijection by showing that no function from \(A\) to its power set can be surjective. We do this using a proof by contradiction.
Before we do that, let us study these two functions, \(f\) and \(h\text{,}\) a little more. Consider \(A=\set{1,2,3}\text{,}\) then we have
\begin{align*} f(1) \amp= \set{1} \amp f(2)\amp=\set{2} \amp f(3)\amp=\set{3} \\ h(1) \amp= \set{2,3} \amp h(2)\amp=\set{1,3} \amp h(3)\amp=\set{1,2} \end{align*}
Notice that \(f,h\) take elements of \(A\) and turn them into subsets of \(A\text{.}\)
Look a little more closely at \(f\) and you see that \(1 \in f(1)\text{,}\) \(2 \in f(2)\) and \(3 \in f(3)\text{.}\) That is
\begin{equation*} \forall a \in A, a \in f(a) \end{equation*}
Looking at \(h\) we see that \(1 \not\in h(1), 2 \not\in h(2)\) and \(3 \not\in h(3)\text{.}\) That is
\begin{equation*} \forall a \in A, a \not\in h(a) \end{equation*}
More generally, if we have some function, \(g\text{,}\) that takes us from \(A\) to its power set, then we can try to understand which elements map into their image and which do not. In particular, for which \(x \in A\) is \(x \in g(x)\text{,}\) and for which \(y \in A\) is \(y \not\in g(y)\text{.}\) Understanding those sets of elements will be key to the proof.
So, assume, to the contrary, that there is a surjection (call it \(g\)) from \(A\) to its power set. Hence the function \(g\) takes an element, \(x \in A\) and maps it to a subset of \(g(x) \subseteq A\text{.}\) As we noted above, it is possible that when we apply \(g\) to an element \(x\) we’ll get a subset of \(A\) that contains \(x\text{.}\) So now let us say that
  • an element \(x\) is “good” if \(x \in g(x) \text{,}\) and
  • an element \(x\) is “bad” if \(x \not\in g(x) \text{.}\)
So we can now define the set of all good elements, \(G\text{,}\) and the set of all bad elements, \(B\text{:}\)
\begin{align*} G &= \set{x \in A \so x \in g(x)}\\ B &= \set{x \in A \so x \not\in g(x)} \end{align*}
Now both of these are subsets of \(A\) and so \(G,B \in \pow{A}\text{.}\) Further we see that each element of \(A\) must be in exactly one of \(G,B\text{,}\) and thus
\begin{equation*} G \cap B = \es \qquad \text{ and } \qquad G \cup B = A \end{equation*}
As is often the case, the set of good things is not as interesting as the set of bad things. So though we have defined \(G\text{,}\) we won’t use it further. Instead concentrate on \(B\text{.}\) By assumption \(g\) is surjective, so there must be some element of \(A\) that maps to \(B\text{.}\) That is, there should be \(q \in A\) so that
\begin{align*} g(q) &= B \end{align*}
We get the contradiction 186  by examining whether or not \(q \in B\text{.}\)
  • If \(q \in B\text{,}\) then since \(g(q)=B\) we must have (by definition of \(G\)) that \(q \in G\text{.}\) However, this is a contradiction — \(q \in B\) and \(q \not\in B\text{.}\)
  • If \(q \not\in B\text{,}\) then since \(g(q)=B\) we must have (by definition of \(B\)) that \(q \in B\text{.}\) However, this is a contradiction — \(q \not\in B\) and \(q \in B\text{.}\)
In either case we get a contradiction, so no such surjection exists.
We split the proof into three steps.
  • We show that the result holds when \(A = \es\)
  • We show that \(|A| \leq |\pow{A}|\) by giving an injection from \(A\) to \(\pow{A}\text{.}\)
  • Finally, we show that there cannot be a surjection from \(A\) to \(\pow{A}\text{.}\)
Either \(A\) is empty or not. If \(A = \es\) then \(0=|A| \lt |\pow{A}|=1\text{.}\) So in what follows we can assume \(A \neq \es\text{.}\)
We now construct an injection \(f:A \to \pow{A}\text{.}\) Define
\begin{equation*} f(x) = \set{x} \qquad \text{for all } x \in A. \end{equation*}
To show that this function is injective, let \(x_1,x_2 \in A\) and assume \(f(x_1)=f(x_2)\text{.}\) Then \(\set{x_1} = \set{x_2}\) and thus \(x_1=x_2\text{.}\) So \(f\) is injective. Thus \(|A| \leq |\pow{A}|\text{.}\)
We prove there cannot be a bijection between \(A\) and \(\pow{A}\) by showing that there cannot be a surjection. We do this by contradiction. Assume, to the contrary, that there is a surjection \(g:A \to \pow{A}\text{.}\) We then partition \(A\) into two subsets
\begin{equation*} G = \set{x \in A \so x \in g(x)} \qquad \text{ and } \qquad B = \set{x \in A \so x \not\in g(x)}. \end{equation*}
Notice that \(B \subseteq A\) and so \(B \in \pow{A}\text{.}\)
Since \(g\) is, by assumption, a surjection, and \(B\) is an element of the codomain of \(g\text{,}\) there must be \(q \in A\) so that \(g(q)=B\text{.}\) Now either \(q \in B\) or \(q \not \in B\text{.}\)
  • If \(q \in B\text{,}\) then since \(g(q)=B\) we have that \(q \in g(q)\text{.}\) But then definition of \(B\) implies that \(q \not \in B\text{,}\) giving a contradiction.
  • Similarly, if \(q \not\in B\text{,}\) then since \(g(q)=B\) we have that \(q \not\in g(q)\text{.}\) But then definition of \(B\) implies that \(q \in B\text{,}\) again giving a contradiction.
In either case we get a contradiction, and so we must conclude that no surjection \(g\) can exist. Thus there is no surjection from \(A \to \pow{A}\text{,}\) and thus there is no bijection from \(A \to \pow{A}\text{.}\)
We can immediately apply Cantor’s theorem to the natural numbers to see that
\begin{equation*} |\mathbb{N}| \lt | \pow{\mathbb{N}}| \end{equation*}
so we have 2 infinities! But, of course, we can do it again to get another infinity:
\begin{equation*} |\mathbb{N}| \lt | \pow{\mathbb{N}}| \lt | \pow{\pow{\mathbb{N}}}| \end{equation*}
and again!
\begin{equation*} |\mathbb{N}| \lt | \pow{\mathbb{N}}| \lt | \pow{\pow{\mathbb{N}}}| \lt | \pow{\pow{\pow{\mathbb{N}}}}| \end{equation*}
And we can keep going and going to get the following corollary.
Starting with \(\mathbb{R}\text{,}\) we can form \(\pow{\mathbb{R}}\text{,}\) which is not equinumerous, by Cantor’s theorem. But then we can take the power set of that to obtain a yet larger infinite set \(\pow{\pow{\mathbb{R}}}\text{.}\) Continuing in this fashion gives an infinitely long sequence of infinite sets none of which are equinumerous.
So we have at least a denumerable number of infinites! In fact there are even more than that! But we’ll let you search-engine your way to discussions of that result.

Subsection 12.4.3 Congratulations are in order

At this point we invite the reader to take stock of what we — ie you! — have managed to do by following along the text. We started by looking at very basic ideas of sets, statements and truth-tables, and have now just proved something fundamental and highly-non-trivial about the nature of infinity! This is no small achievement!

Subsection 12.4.4 One more question

There is one little question left in this section. From Theorem 12.3.1 we know that
\begin{equation*} |\mathbb{N}| \lt |\mathbb{R}| \end{equation*}
and from Theorem 12.4.3 we know that
\begin{equation*} |\mathbb{N}| \lt |\pow{\mathbb{N}}| \end{equation*}
but we have not yet compared
\begin{equation*} |\pow{\mathbb{N}}| \overset{\text{?}}{=} | \mathbb{R} | \end{equation*}
One can show that these sets are equinumerous, but constructing an explicit bijection between them is quite difficult. Instead one can prove that
\begin{equation*} |\pow{\mathbb{N}}| \leq | \mathbb{R} | \qquad \text{ and } | \mathbb{R} | \leq |\pow{\mathbb{N}}| \end{equation*}
by finding injections between those sets. However in order to show that the existence of those injections implies the existence of a bijection, ie
\begin{equation*} (|A| \leq |B|) \land (|B| \leq |A|) \implies (|A|=|B|) \end{equation*}
we need the Cantor-Schröder-Bernstein theorem.
The proof of the Cantor-Schröder-Bernstein theorem is quite involved, so the proof is not typically covered in a first course in proof. That being said, the result itself is useful and we do make use of it in some of the exercises for this chapter. Accordingly we encourage the reader to read about the result, skip over the proof, and see how Cantor-Schröder-Bernstein is used to show that \(| \mathbb{R} | = |\pow{\mathbb{N}}|\text{.}\)
The interested reader is, of course, encouraged to read the proof — it is a nice piece of mathematics.
Mind you we did quite a lot with “just” this.
Well — perhaps “precisely” is a better word than “sensibly”.
Once you show that something isn’t unique, it is only reasonable to ask “Well - how many are there?”. This was no small part of the controversy that Cantor’s work generated. Some Christian theologians thought his results were challenging the notion that the Christian God is unique and absolute. You can certainly find much interesting reading on the implications of Cantor’s work.
This contradiction is reminiscent of Russell’s paradox (due to Russell in 1901 and also by Zermelo in 1899). Consider the set of all sets that do not contain themselves. That is \(R = \set{X \so X \not\in X}\text{.}\) One realises the paradox when you try to decide whether \(R \in R\) or not. If \(R \in R\text{,}\) then my definition of the set, it cannot be. While if \(R \not\in R\) then, by definition of the set, it must be. There is much of interest here for a reader armed with a good search-engine.