Skip to main content

PLP: An introduction to mathematical proof

Section 10.6 Inverse functions

We now have enough machinery to start working on inverse functions. Consider what an inverse function needs to do. If we start with \(a \in A\) and apply a function \(f: A \to B\) then we obtain some element \(b \in B\text{.}\) We want our inverse to “undo” this and take \(b\) back to \(a\text{.}\) We’ll start by defining functions that are nearly inverses and give a few examples.
But before we do that, we note that some of the material below is a bit technical. The reader who is interested in these details should continue on. The reader who wishes to just get to the main results should instead skip ahead to Definition 10.6.6 and Theorem 10.6.8

Definition 10.6.1.

Recall Definition 10.4.12, and let \(f:A \to B\) and \(g: B \to A\) be functions.
  • If \(g \circ f = i_A\) then we say that \(g\) is a left-inverse of \(f\text{.}\)
  • Similarly, if \(f \circ g = i_B\) then we say that \(g\) is a right-inverse of \(f\text{.}\)
So notice that if we start at some point \(a \in A\) and apply \(f\) to get \(b \in B\text{,}\) then a left-inverse \(g\) tells us how to get back from \(b\) to \(a\text{:}\)
\begin{equation*} g(b) = g(f(a)) = a \end{equation*}
On the other hand, if are trying to get to a particular point \(b \in B\) in the codomain using \(f\text{,}\) then the right-inverse tells you a possible starting point \(a \in A\text{:}\)
\begin{equation*} g(b) = a \qquad \text{ so that } \qquad f(a) = f(g(b)) = b \end{equation*}

Example 10.6.2.

Let \(A=\set{1,2}\) and \(B = \set{4,5,6}\) and define
\begin{align*} \amp f: A \to B \amp\amp f(1)=4, f(2)=5 \\ \amp g: B \to A \amp\amp g(4)=1, g(5)=2, g(6)=2 \end{align*}
as depicted below.
Then notice that \(g \circ f: A \to A\) satisfies
\begin{equation*} g(f(1))=g(4)=1 \qquad \text{ and } \qquad g(f(2))=g(5)=2. \end{equation*}
That is \(g \circ f=i_A\) and so is a left-inverse of \(f\text{.}\) At the same time, \(f \circ g: B \to B\) satisfies
\begin{equation*} f(g(4))=f(1)=4 \qquad f(g(5))=f(2)=5 \qquad \text{but} \qquad f(g(6))=f(2)=5 \end{equation*}
and so \(g\) is not a right-inverse of \(f\text{.}\)
By swapping the roles of \(f,g\) in the above we can construct a function that is a right-inverse but not a left-inverse. Consider the functions defined in the image below.
Then we see that
\begin{equation*} g(f(1))=g(4)=1 \qquad g(f(2))=g(5)=2 \qquad \text{but} \qquad g(f(3))=g(5)=2 \end{equation*}
and so \(g\) is not a left-inverse of \(f\text{.}\) And then
\begin{equation*} f(g(4))=f(1)=4 \qquad \text{and} \qquad f(g(5))=f(2)=5 \end{equation*}
making \(g\) a right-inverse of \(f\text{.}\)
Notice in the above example, that the function that has a left-inverse is injective, while the function with the right-inverse is surjective. This is not a coincidence as the following two lemmas prove.
We prove each implication in turn.
  • Assume that \(f\) has a left-inverse, \(g\text{.}\) Now let \(a_1, a_2 \in A\) so that \(f(a_1)=f(a_2)\text{.}\) Then \(g(f(a_1)) = g(f(a_2))\text{,}\) but since \(g\) is the left-inverse of \(f\text{,}\) we know that \(g(f(a_1))=a_1\) and \(g(f(a_2)) = a_2\text{.}\) Thus \(a_1=a_2\) and so \(f\) is injective.
  • Now let \(f\) be injective. We construct a left-inverse of \(f\) in two steps. First pick any \(\alpha \in A\text{.}\) Then consider the preimage of a given point \(b \in B\text{.}\) That preimage, \(f^{-1}(\set{b})\text{,}\) is empty or not.
    • If \(f^{-1}(\set{b}) = \es\text{,}\) then define \(g(b)=\alpha\)
    • Now assume that \(f^{-1}(\set{b}) \neq \es\text{.}\) Since \(f\) is injective, the preimage \(f^{-1}(\set{b})\) contains exactly 1 element. To see why, consider \(a,c\) both in the preimage. We must have \(f(c)=f(a)=b\) (since they are both in the preimage of \(b\)), but since \(f\) is an injection, we must have \(c=a\text{.}\) So define \(g(b)=a\) the unique element in the preimage.
    To summarise
    \begin{equation*} g(b) = \begin{cases} \alpha \amp \text{if preimage empty} \\ a \amp \text{otherwise take unique $a$ in the preimage} \end{cases}. \end{equation*}
    Now let \(a \in A\text{,}\) under \(f\) it maps to some \(b \in B\) with \(b=f(a)\text{.}\) Hence (as argued above) \(a\) is the unique element in the preimage of \(b\text{,}\) and so \(g(f(a))=g(b)=a\text{.}\) Thus \(g\) is a left-inverse of \(f\text{.}\)
We prove each implication in turn.
  • Assume that \(f\) has a right-inverse, \(g\text{.}\) Now let \(b \in B\) and set \(a = g(b)\text{.}\) Then \(f(a)=f(g(b))=b\) and so \(f\) is surjective.
  • Now let \(f\) be surjective and let \(b \in B\text{.}\) For the sake of this proof, let us denote the preimage of \(b\) as
    \begin{equation*} A_b = \set{a \in A \so f(a)=b}. \end{equation*}
    Since \(f\) is surjective, we know that \(A_b \neq \es\) for every \(b \in B\text{.}\) So now define \(g(b)\) to be any 140  element of \(A_b\text{.}\)
    Now let \(b \in B\text{,}\) then under \(g\) it maps to some \(a \in A\) so that (by construction) \(f(a)=b\text{.}\) Hence \(f(g(b)) = f(a) = b\) and thus \(g\) is a right-inverse as required.
These lemmas tell us that a function has both a left inverse and right inverse if and only if it is bijective. We can go further those one-sided inverses are actually the same function.
Let \(f,g,h\) be as stated. Then we know that
\begin{equation*} g \circ f = i_A \qquad \text{ and } \qquad f \circ h = i_B \end{equation*}
Now starting with \(g\) we can write:
\begin{align*} g \amp = g \circ i_B \\ \amp = g \circ (f \circ h) = (g \circ f) \circ h\\ \amp = i_A \circ h = h \end{align*}
and thus \(g=h\) as required.
The last part of the lemma follows by combining the previous two lemmas.
So this lemma tells us conditions under which a function will have both a left- and right-inverse, and that those one-sided inverses are actually the same function. A function that is a left- and right-inverse is a (usual) inverse.

Definition 10.6.6.

Let \(f:A \to B\) and \(g:B \to A\) be functions. If \(g \circ f=i_A\) and \(f \circ g = i_B\) then we say that \(g\) is an inverse of \(f\text{.}\)
Note that we will prove that the inverse is unique, and so we will be able to say that \(g\) is the inverse of \(f\) and denote it \(f^{-1}\text{.}\)
Also note that if a function is an inverse then it is also a left- and right-inverse.
This proof is very similar to the proof of Lemma 10.6.5. Let \(g:B \to A\) and \(h: B \to A\) both be inverses to the function \(f\text{.}\) Then
\begin{align*} g \amp = g \circ i_B \\ \amp = g\circ (f \circ h) = (g \circ f) \circ h \\ \amp = i_A \circ h = h \end{align*}
and so \(g=h\text{.}\) Thus the inverse is unique.
We can now state our main theorem about inverse functions.
We combine some of the lemmas above to prove this result.
  • Assume that \(f\) has an inverse. Then that inverse is both a left-inverse and a right-inverse. Lemma 10.6.5 then implies that \(f\) is both injective and surjective, and so is bijective.
    Now assume that \(f\) is bijective. Then Lemma 10.6.5 tells us there exists a function \(g\) that is a left-inverse and right-inverse for \(f\text{.}\) Then, by definition \(g\) is an inverse for \(f\text{.}\)
  • The uniqueness of the inverse is proven by Lemma 10.6.7.
Theorem 10.6.8 tells us under what circumstances a function has an inverse. However, it does not tell us if that inverse has a nice expression. If the original function is nice enough, then we may be able to state the inverse nicely. Here are a couple of such examples.

Example 10.6.9.

The function \(f: \mathbb{R} \to \mathbb{R}\) defined by \(f(x)=7x-3\) is bijective and so has an inverse function. We proved this in Result 10.4.2 and Result 10.4.2. The inverse is \(f^{-1}: \mathbb{R} \to \mathbb{R}\) and we can work out a formula for it by solving \(y=f(x)\) for \(x\) in terms of \(y\text{.}\) Notice that we did exactly that when we proved that \(f\) was surjective. In particular, we found that
\begin{equation*} f^{-1}: \mathbb{R} \to \mathbb{R} \qquad \text{defined by} \qquad f^{-1}(y)= \frac{y+3}{7}. \end{equation*}
To verify that this is correct it suffices to show that \(f^{-1}\circ f\) is the identity function:
\begin{align*} f^{-1} \circ f (x) \amp = f^{-1}(7x-3) \\ \amp = \dfrac{(7x-3)+3}{y}=x \end{align*}
as required.

Example 10.6.10. Möbius continued.

In Result 10.4.14 above we saw that a Möbius transformation \(h: \mathbb{R}-\set{\frac{d}{c}} \to \mathbb{R}-\set{\frac{a}{c}}\) defined by
\begin{equation*} h(x) = \frac{ax-b}{cx-d} \end{equation*}
is bijective. The above result tells us that it has an inverse function. We can verify that the inverse is \(h^{-1}: \mathbb{R}-\set{\frac{a}{c}} \to \mathbb{R}-\set{\frac{d}{c}}\) defined by
\begin{equation*} h^{-1}(y) = \frac{dy-b}{cy-a}. \end{equation*}
(which we computed while proving the result). We simply need to show that \(h^{-1} \circ h\) is the identity:
\begin{align*} h^{-1}(h(x))\amp = h^{-1}\left(\frac{ax-b}{cx-d} \right) \\ \amp = \dfrac{d \frac{ax-b}{cx-d} -b }{c \frac{ax-b}{cx-d} -a}\\ \amp = \dfrac{d(ax-b)-b(cx-d)}{c(ax-b)-a(cx-d)}\\ \amp = \dfrac{(ad-bc)x}{ad-bc} = x \end{align*}
as required.
That one can do this is not as obvious as it might seem. In particular, if \(B\) is infinite we need the Axiom of Choice in order to make this selection. The interested reader should search engine their way to more information on this. Now in the event that our function \(f\) is injective, then \(A_b\) contains exactly one element and we don’t need the Axiom of Choice to construct our function. Thankfully we apply this result when our function is bijective — phew.