Subsection 2.9.2 (Optional) — Derivation of the Chain Rule
First, let's review what our goal is. We have been given a function \(g(x)\) that is differentiable at some point \(x=a\text{,}\) and another function \(f(u)\) that is differentiable at the point \(u=b = g(a)\text{.}\) We have defined the composite function \(F(x) = f\big(g(x)\big)\) and we wish to show that\begin{equation*} F'(a) = f'\big(g(a)\big) \cdot g'(a) \end{equation*}
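Before we can compute \(F'(a)\text{,}\) we need to set up some groundwork, in particular the definitions of our two given derivatives:\begin{equation*} f'(b) = \lim_{H \to 0} \frac{f(b+H)-f(b)}{H} \qquad \text{and} \qquad g'(a) = \lim_{h \to 0} \frac{g(a+h)-g(a)}{h} \end{equation*}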
We are going to use manipulation tricks similar to those we used back in the proofs of the arithmetic of derivatives in Section 2.5. Unfortunately, we have already used up the symbols “\(F\)” and “\(H\)”, so we are going to make use of the Greek letters \(\gamma, \varphi\text{.}\)
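As was the case in our derivation of the product rule, it is convenient to introduce a couple of new functions. Set\begin{equation*} \varphi(H) = \frac{f(b+H)-f(b)}{H} \end{equation*}
Then we have\begin{equation*} \lim_{H \to 0} \varphi(H) = f'(b) = f'\big(g(a)\big) \qquad \text{since } b=g(a)\text{,} \end{equation*}
and we can also write (with a little juggling)\begin{equation*} f(b+H) = f(b) + H \varphi(H) \end{equation*}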
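Similarly set\begin{equation*} \gamma(h) = \frac{g(a+h)-g(a)}{h} \end{equation*}
which gives us\begin{equation*} \lim_{h \to 0} \gamma(h) = g'(a) \qquad \text{and} \qquad g(a+h) = g(a) + h \gamma(h) \end{equation*}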
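Now we can start computing\begin{equation*} F'(a) = \lim_{h \to 0} \frac{F(a+h)-F(a)}{h} = \lim_{h \to 0} \frac{f\big(g(a+h)\big)-f\big(g(a)\big)}{h} \end{equation*}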
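We know that \(g(a) = b\) and \(g(a+h) = g(a) + h \gamma(h)\text{,}\) so\begin{equation*} F'(a) = \lim_{h \to 0} \frac{f\big(g(a) + h\gamma(h)\big)-f\big(g(a)\big)}{h} = \lim_{h \to 0} \frac{f\big(b + h\gamma(h)\big)-f(b)}{h} \end{equation*}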
Now for the sneaky bit. We can turn \(f(b + h\gamma(h) )\) into \(f(b+H)\) by setting\begin{equation*} H = h \gamma(h) \end{equation*}
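Now notice that as \(h \to 0\) we have\begin{equation*} \lim_{h \to 0} H = \lim_{h \to 0} h \cdot \gamma(h) = \lim_{h \to 0} h \cdot \lim_{h \to 0} \gamma(h) = 0 \cdot g'(a) = 0 \end{equation*}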
So as \(h\to 0\) we also have \(H \to 0\text{.}\)
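As a quick numerical sanity check, here is a minimal Python sketch of how \(\gamma(h)\) and \(H = h\gamma(h)\) behave as \(h\) shrinks. The inner function \(g(x)=x^2\) and the point \(a=1.5\) are our own illustrative choices, not part of the derivation; for them \(g'(a)=3\text{.}\)

```python
def g(x):
    # Hypothetical inner function, chosen only for illustration.
    return x * x

def gamma(h, a):
    # Difference quotient of g at a; only defined for h != 0.
    return (g(a + h) - g(a)) / h

a = 1.5  # assumed sample point; g'(a) = 2a = 3
steps = [10.0 ** (-k) for k in range(1, 7)]
gammas = [gamma(h, a) for h in steps]
Hs = [h * gamma(h, a) for h in steps]  # H = h * gamma(h)

for h, gam, H in zip(steps, gammas, Hs):
    print(f"h = {h:.0e}   gamma(h) = {gam:.8f}   H = {H:.3e}")
```

In the printed table \(\gamma(h)\) should settle near \(g'(a)=3\) while \(H\) shrinks towards \(0\text{,}\) which is the behaviour the limit computation above predicts.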
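We now have\begin{align*} F'(a) &= \lim_{h \to 0} \frac{f\big(b + H\big)-f(b)}{h}\\ &= \lim_{h \to 0} \frac{f\big(b + H\big)-f(b)}{H} \cdot \frac{H}{h} && \text{if } H \ne 0\\ &= \lim_{h \to 0} \varphi(H) \cdot \gamma(h) && \text{since } H = h\gamma(h)\\ &= \lim_{H \to 0} \varphi(H) \cdot \lim_{h \to 0} \gamma(h) && \text{since } H \to 0 \text{ as } h \to 0\\ &= f'(b) \cdot g'(a) = f'\big(g(a)\big) \cdot g'(a) \end{align*}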
This is exactly the RHS of the chain rule. It is possible to have \(H=0\) in the second line above. But that possibility is easy to deal with:
- If \(g'(a)\ne 0\text{,}\) then, since \(\lim_{h \to 0} \gamma(h) = g'(a)\text{,}\) \(H= h \gamma(h)\) cannot be \(0\) for small nonzero \(h\text{.}\) Technically, there is an \(h_0\gt 0\) such that \(H= h \gamma(h)\ne 0\) for all \(0 \lt |h| \lt h_0\text{.}\) In taking the limit \(h\to 0\text{,}\) above, we need only consider \(0 \lt |h| \lt h_0\) and so, in this case, the above computation is completely correct.
- If \(g'(a)=0\text{,}\) the above computation is still fine provided we exclude all \(h\)'s for which \(H= h \gamma(h) = 0\text{,}\) that is, all nonzero \(h\) with \(\gamma(h)=0\text{.}\) When \(g'(a)=0\text{,}\) the right-hand side, \(f'\big(g(a)\big) \cdot g'(a)\text{,}\) of the chain rule is \(0\text{.}\) So the above computation gives\begin{equation*} \lim_{\genfrac{}{}{0pt}{}{h \to 0}{\gamma(h)\ne 0}} \frac{f\big(b + H\big)-f(b)}{h} =f'\big(g(a)\big) \cdot g'(a) = 0 \end{equation*}On the other hand, when \(H=0\text{,}\) we have \(f\big(b + H\big)-f(b)=0\text{.}\) So\begin{equation*} \lim_{\genfrac{}{}{0pt}{}{h \to 0}{\gamma(h) = 0}} \frac{f\big(b + H\big)-f(b)}{h} =0 \end{equation*}too. That's all we need.
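To see the \(g'(a)=0\) case in action, the following Python sketch may help. It uses functions that we have made up for illustration: \(f(u)=e^u\) and an inner function \(g\) that equals \(x^2\) for \(x \gt 0\) and \(0\) otherwise, with \(a=0\text{.}\) Under these assumptions \(g'(0)=0\text{,}\) and \(H\) is exactly \(0\) for every negative \(h\text{,}\) so both kinds of \(h\) discussed above occur arbitrarily close to \(0\text{.}\)

```python
import math

def f(u):
    # Hypothetical outer function; f'(0) = 1.
    return math.exp(u)

def g(x):
    # Hypothetical inner function with g(0) = 0 and g'(0) = 0.
    # It is identically zero for x <= 0, so H = 0 whenever h < 0.
    return x * x if x > 0.0 else 0.0

a = 0.0
b = g(a)

def H_of(h):
    # H = h * gamma(h), which is the same as g(a + h) - g(a).
    return g(a + h) - g(a)

def quotient(h):
    # (f(b + H) - f(b)) / h, the quotient whose limit is F'(a).
    return (f(b + H_of(h)) - f(b)) / h

steps = [10.0 ** (-k) for k in range(1, 7)]
right = [quotient(h) for h in steps]   # h > 0, so H != 0
left = [quotient(-h) for h in steps]   # h < 0, so H == 0 exactly

for h, qr, ql in zip(steps, right, left):
    print(f"|h| = {h:.0e}   H != 0: {qr:.3e}   H == 0: {ql:.3e}")
```

For negative \(h\) the numerator vanishes identically, and for positive \(h\) the quotient shrinks with \(h\text{,}\) so both restricted limits agree with \(f'\big(g(a)\big)\cdot g'(a) = 0\text{.}\)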