Subsection 2.9.2 (Optional) — Derivation of the Chain Rule

First, let's review what our goal is. We have been given a function \(g(x)\text{,}\) that is differentiable at some point \(x=a\text{,}\) and another function \(f(u)\text{,}\) that is differentiable at the point \(u=b = g(a)\text{.}\) We have defined the composite function \(F(x) = f\big(g(x)\big)\) and we wish to show that

\begin{align*} F'(a) &= f'\big(g(a)\big) \cdot g'(a) \end{align*}

Before we can compute \(F'(a)\text{,}\) we need to set up some ground work, and in particular the definitions of our given derivatives:

\begin{align*} f'(b) &= \lim_{H \to 0} \frac{f(b+H)-f(b)}{H} & \text{and }&& g'(a) &= \lim_{h \to 0} \frac{g(a+h)-g(a)}{h}. \end{align*}

We are going to use manipulation tricks similar to those we used back in the proofs of the arithmetic of derivatives in Section 2.5. Unfortunately, we have already used up the symbols “\(F\)” and “\(H\)”, so we are going to make use of the Greek letters \(\gamma\) and \(\varphi\text{.}\)

As was the case in our derivation of the product rule, it is convenient to introduce a couple of new functions. Set

\begin{align*} \varphi(H) &= \frac{f(b+H)-f(b)}{H} \end{align*}

Then we have

\begin{align*} \lim_{H \to 0} \varphi(H) &= f'(b) = f'\big(g(a)\big) & \text{since } b=g(a), \end{align*}

and we can also write (with a little juggling)

\begin{align*} f(b+H) &= f(b) + H \varphi(H) \end{align*}

Similarly set

\begin{align*} \gamma(h) &= \frac{g(a+h)-g(a)}{h} \end{align*}

which gives us

\begin{align*} \lim_{h \to 0} \gamma(h) &= g'(a) & \text{ and } && g(a+h) &= g(a) + h \gamma(h). \end{align*}
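
To get a feel for \(\varphi\) and \(\gamma\text{,}\) it can help to evaluate them numerically. The following is a minimal sketch, not part of the derivation. The choices \(f(u)=\sin u\text{,}\) \(g(x)=x^2\) and \(a=1\) are assumptions made purely for illustration, as are the Python names `phi` and `gamma`.

```python
import math

# Hypothetical example functions (not from the text): f(u) = sin(u), g(x) = x**2.
def f(u):
    return math.sin(u)

def g(x):
    return x * x

a = 1.0
b = g(a)  # b = g(a)

def phi(H):
    """Difference quotient of f at b; only defined for H != 0."""
    return (f(b + H) - f(b)) / H

def gamma(h):
    """Difference quotient of g at a; only defined for h != 0."""
    return (g(a + h) - g(a)) / h

f_prime_b = math.cos(b)  # exact derivative of sin at b
g_prime_a = 2 * a        # exact derivative of x**2 at a

# As the step shrinks, phi and gamma should approach f'(b) and g'(a).
for step in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"step={step:g}  phi={phi(step):.6f}  gamma={gamma(step):.6f}")

print(f"f'(b)={f_prime_b:.6f}  g'(a)={g_prime_a:.6f}")
```

The printed values of `phi` and `gamma` should settle towards the two exact derivatives on the last line, which is just the statement that \(\lim_{H\to 0}\varphi(H) = f'(b)\) and \(\lim_{h\to 0}\gamma(h) = g'(a)\text{.}\)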

Now we can start computing

\begin{align*} F'(a) &= \lim_{h \to 0} \frac{F(a+h)-F(a)}{h}\\ &= \lim_{h \to 0} \frac{f\big(g(a+h)\big)-f\big(g(a)\big)}{h} \end{align*}

We know that \(g(a) = b\) and \(g(a+h) = g(a) + h \gamma(h)\text{,}\) so

\begin{align*} F'(a) &= \lim_{h \to 0} \frac{f\big(g(a) + h\gamma(h) \big)-f\big(g(a)\big)}{h}\\ &= \lim_{h \to 0} \frac{f(b + h\gamma(h) )-f(b)}{h} \end{align*}

Now for the sneaky bit. We can turn \(f(b + h\gamma(h) )\) into \(f(b+H)\) by setting

\begin{gather*} H = h\gamma(h) \end{gather*}

Now notice that as \(h \to 0\) we have

\begin{align*} \lim_{h \to 0} H &= \lim_{h \to 0} h \cdot \gamma(h)\\ &= \lim_{h \to 0} h \cdot \lim_{h \to 0} \gamma(h)\\ &= 0 \cdot g'(a) = 0 \end{align*}

So as \(h\to 0\) we also have \(H \to 0\text{.}\)
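
The claim that \(H = h\gamma(h)\) shrinks along with \(h\) can also be checked numerically. This is only a sketch under assumed choices: the function \(g(x)=x^3\text{,}\) the point \(a=2\) and the helper name `H_of` are hypothetical and do not come from the text.

```python
# Hypothetical example (not from the text): g(x) = x**3 at a = 2, so g'(a) = 12.
def g(x):
    return x ** 3

a = 2.0

def gamma(h):
    """Difference quotient of g at a; only defined for h != 0."""
    return (g(a + h) - g(a)) / h

def H_of(h):
    """H = h * gamma(h), which is just g(a + h) - g(a) in disguise."""
    return h * gamma(h)

# gamma(h) stays near g'(a) = 12 while H shrinks roughly like 12 * h.
for h in (1e-1, 1e-2, 1e-3, 1e-4, -1e-4):
    print(f"h={h:+.0e}  gamma(h)={gamma(h):.6f}  H={H_of(h):+.6e}")
```

Because \(\gamma(h)\) stays bounded near \(g'(a)\text{,}\) multiplying it by a shrinking \(h\) forces \(H\) towards zero, for both positive and negative \(h\text{.}\)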

We now have

\begin{align*} F'(a) &= \lim_{h \to 0} \frac{f\big(b + H\big)-f(b)}{h}\\ &= \lim_{h \to 0} \underbrace{\frac{f\big(b + H\big)-f(b)}{H}}_{= \varphi(H) } \cdot \underbrace{\frac{H}{h}}_{ = \gamma(h)} & \text{if } H= h \gamma(h) \ne 0\\ &= \lim_{h \to 0}\big( \varphi(H) \cdot \gamma(h) \big)\\ &= \lim_{h \to 0} \varphi(H) \cdot \lim_{h \to 0} \gamma(h) & \text{since $H\to0$ as $h\to 0$}\\ &= \lim_{H \to 0} \varphi(H) \cdot \lim_{h \to 0} \gamma(h)\\ &= f'(b) \cdot g'(a) \end{align*}

This is exactly the right hand side of the chain rule. The second line above divides by \(H\text{,}\) which is only legitimate when \(H \ne 0\text{,}\) and it is possible to have \(H=0\text{.}\) But that possibility is easy to deal with:

  • If \(g'(a)\ne 0\text{,}\) then, since \(\lim_{h \to 0} \gamma(h) = g'(a)\text{,}\) \(H= h \gamma(h)\) cannot be \(0\) for small nonzero \(h\text{.}\) Technically, there is an \(h_0\gt 0\) such that \(H= h \gamma(h)\ne 0\) for all \(0 \lt |h| \lt h_0\text{.}\) In taking the limit \(h\to 0\text{,}\) above, we need only consider \(0 \lt |h| \lt h_0\) and so, in this case, the above computation is completely correct.
  • If \(g'(a)=0\text{,}\) the above computation is still fine provided we exclude all \(h\)'s for which \(H= h \gamma(h) = 0\text{.}\) When \(g'(a)=0\text{,}\) the right hand side, \(f'\big(g(a)\big) \cdot g'(a)\text{,}\) of the chain rule is \(0\text{.}\) So the above computation gives
    \begin{equation*} \lim_{\genfrac{}{}{0pt}{}{h \to 0}{\gamma(h)\ne 0}} \frac{f\big(b + H\big)-f(b)}{h} =f'\big(g(a)\big) \cdot g'(a) = 0 \end{equation*}
    On the other hand, when \(H=0\text{,}\) we have \(f\big(b + H\big)-f(b)=0\text{.}\) So
    \begin{equation*} \lim_{\genfrac{}{}{0pt}{}{h \to 0}{\gamma(h) = 0}} \frac{f\big(b + H\big)-f(b)}{h} =0 \end{equation*}
    too. That's all we need.
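
The two cases above can be illustrated numerically. What follows is a sketch under assumptions, not part of the proof: the functions `f`, `g_regular` and `g_flat` are hypothetical examples chosen for illustration. In particular, `g_flat` is built so that \(g'(0)=0\) and so that \(H\) is exactly \(0\) for infinitely many \(h\) near \(0\text{.}\)

```python
import math

def f(u):
    return math.exp(u)  # f'(u) = exp(u)

def g_regular(x):
    # Used at a = 1, where g'(a) = 2 != 0.
    return x * x

def g_flat(x):
    # Used at a = 0. Since |g_flat(x)| <= x**2, we have g'(0) = 0.
    # The value is exactly zero wherever sin(1/x) <= 0.
    if x == 0.0:
        return 0.0
    return x * x * max(0.0, math.sin(1.0 / x))

def quotient(g, a, h):
    """Difference quotient of F(x) = f(g(x)) at a."""
    return (f(g(a + h)) - f(g(a))) / h

# Case g'(a) != 0: H is nonzero for small h, and the quotient
# approaches f'(g(a)) * g'(a).
a = 1.0
target = math.exp(g_regular(a)) * 2 * a
for h in (1e-2, 1e-3, 1e-4):
    H = g_regular(a + h) - g_regular(a)
    print(f"h={h:g}  H={H:.3e}  quotient={quotient(g_regular, a, h):.6f}")
print(f"f'(g(a)) * g'(a) = {target:.6f}")

# Case g'(a) = 0: some h give H == 0 exactly, others do not,
# but the quotient is small either way.
samples = [0.3 * 0.7 ** k for k in range(30)]
zero_count = sum(1 for h in samples if g_flat(h) == 0.0)
nonzero_count = len(samples) - zero_count
print(f"H == 0 for {zero_count} samples, H != 0 for {nonzero_count} samples")
for h in samples[::6]:
    print(f"h={h:.3e}  H={g_flat(h):.3e}  quotient={quotient(g_flat, 0.0, h):.3e}")
```

In the second case the quotient is exactly \(0\) whenever \(H=0\text{,}\) and of size comparable to \(h\) otherwise, so both sub-limits are \(0 = f'\big(g(a)\big)\cdot g'(a)\text{,}\) in line with the argument above.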