We are now ready to define derivatives of functions of more than one variable. First, recall how we defined the derivative, \(f'(a)\text{,}\) of a function of one variable, \(f(x)\text{.}\) We imagined that we were walking along the \(x\)-axis, in the positive direction, measuring, for example, the temperature along the way. We denoted by \(f(x)\) the temperature at \(x\text{.}\) The instantaneous rate of change of temperature that we observed as we passed through \(x=a\) was
\begin{equation*}
f'(a) = \lim_{h\rightarrow 0}\frac{f(a+h) - f(a)}{h}
\end{equation*}
Next suppose that we are walking in the \(xy\)-plane and that the temperature at \((x,y)\) is \(f(x,y)\text{.}\) We can pass through the point \((x,y)=(a,b)\) moving in many different directions, and we cannot expect the measured rate of change of temperature if we walk parallel to the \(x\)-axis, in the direction of increasing \(x\text{,}\) to be the same as the measured rate of change of temperature if we walk parallel to the \(y\)-axis in the direction of increasing \(y\text{.}\) We'll start by considering just those two directions. We'll consider other directions (like walking parallel to the line \(y=x\)) later.
Suppose that we are passing through the point \((x,y)=(a,b)\) and that we are walking parallel to the \(x\)-axis (in the positive direction). Then our \(y\)-coordinate will be constant, always taking the value \(y=b\text{.}\) So we can think of the measured temperature as the function of one variable \(B(x) = f(x,b)\) and we will observe the rate of change of temperature
\begin{equation*}
\diff{B}{x}(a) = \lim_{h\rightarrow 0}\frac{B(a+h) - B(a)}{h}
= \lim_{h\rightarrow 0}\frac{f(a+h,b) - f(a,b)}{h}
\end{equation*}
This is called the “partial derivative of \(f\) with respect to \(x\) at \((a,b)\)” and is denoted \(\pdiff{f}{x}(a,b)\text{.}\) Here
the symbol \(\partial\text{,}\) which is read “partial”, indicates that we are dealing with a function of more than one variable, and
the \(x\) in \({\pdiff{f}{x}}\) indicates that we are differentiating with respect to \(x\text{,}\) while \(y\) is being held fixed, i.e. being treated as a constant.
\({\pdiff{f}{x}}\) is read “partial dee \(f\) dee \(x\)”.
Do not write \(\diff{}{x}\) when \(\pdiff{}{x}\) is appropriate. We shall later encounter situations when \(\diff{}{x}f\) and \(\pdiff{}{x}f\) are both defined and have different meanings.
If, instead, we are passing through the point \((x,y)=(a,b)\) and are walking parallel to the \(y\)-axis (in the positive direction), then our \(x\)-coordinate will be constant, always taking the value \(x=a\text{.}\) So we can think of the measured temperature as the function of one variable \(A(y) = f(a,y)\) and we will observe the rate of change of temperature
\begin{equation*}
\diff{A}{y}(b) = \lim_{h\rightarrow 0}\frac{A(b+h) - A(b)}{h}
= \lim_{h\rightarrow 0}\frac{f(a,b+h) - f(a,b)}{h}
\end{equation*}
This is called the “partial derivative of \(f\) with respect to \(y\) at \((a,b)\)” and is denoted \(\pdiff{f}{y}(a,b)\text{.}\)
Just as was the case for the ordinary derivative \(\diff{f}{x}(x)\) (see Definition 2.2.6 in the CLP-1 text), it is common to treat the partial derivatives of \(f(x,y)\) as functions of \((x,y)\) simply by evaluating the partial derivatives at \((x,y)\) rather than at \((a,b)\text{.}\)
Definition 2.2.1. Partial Derivatives.
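The \(x\)- and \(y\)-partial derivatives of the function \(f(x,y)\) are
\begin{align*}
\pdiff{f}{x}(x,y) &= \lim_{h\rightarrow 0}\frac{f(x+h,y) - f(x,y)}{h}\\
\pdiff{f}{y}(x,y) &= \lim_{h\rightarrow 0}\frac{f(x,y+h) - f(x,y)}{h}
\end{align*}
respectively, for all values of \(x\) and \(y\) for which the limits exist.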
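The partial derivative \(\pdiff{f}{x}(x,y)\) of a function \(f(x,y)\) is also denoted \(f_x(x,y)\) or \(D_1 f(x,y)\text{.}\) The subscript \(1\) on \(D_1 f\) indicates that \(f\) is being differentiated with respect to its first variable. The partial derivative \(\pdiff{f}{x}(a,b)\) is also denoted
\begin{equation*}
\pdiff{f}{x}\bigg|_{(a,b)}
\end{equation*}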
We'll now develop a geometric interpretation of the partial derivative \(\pdiff{f}{x}(a,b)\) in terms of the shape of the graph \(z=f(x,y)\) of the function \(f(x,y)\text{.}\) That graph appears in the figure below. It looks like the part of a deformed sphere that is in the first octant.
The definition of \(\pdiff{f}{x}(a,b)\) concerns only points on the graph that have \(y=b\text{,}\) that is, points on the curve of intersection of the surface \(z=f(x,y)\) with the plane \(y=b\text{.}\) That is the red curve in the figure. The two blue vertical line segments in the figure have heights \(f(a,b)\) and \(f(a+h,b)\text{,}\) which are the two numbers in the numerator of \(\frac{f(a+h,b) - f(a,b)}{h}\text{.}\)
A side view of the curve (looking from the left side of the \(y\)-axis) is sketched in the figure below.
Again, the two blue vertical line segments in the figure have heights \(f(a,b)\) and \(f(a+h,b)\text{,}\) which are the two numbers in the numerator of \(\frac{f(a+h,b) - f(a,b)}{h}\text{.}\) So the numerator \(f(a+h,b) - f(a,b)\) and denominator \(h\) are the rise and run, respectively, of the curve \(z=f(x,b)\) from \(x=a\) to \(x=a+h\text{.}\) Thus \(\pdiff{f}{x}(a,b)\) is exactly the slope of (the tangent to) the curve of intersection of the surface \(z=f(x,y)\) and the plane \(y=b\) at the point \(\big(a,b, f(a,b)\big)\text{.}\) In the same way \(\pdiff{f}{y}(a,b)\) is exactly the slope of (the tangent to) the curve of intersection of the surface \(z=f(x,y)\) and the plane \(x=a\) at the point \(\big(a,b, f(a,b)\big)\text{.}\)
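The rise-over-run picture can be checked numerically. The following is a minimal Python sketch, not part of the text's development. The sample surface \(f(x,y)=x^2y+y^3\) and the point \((a,b)=(1,2)\) are illustrative assumptions. It evaluates the quotient \(\frac{f(a+h,b) - f(a,b)}{h}\) for shrinking \(h\text{;}\) for this sample surface the quotient equals \(4+2h\text{,}\) so the values should approach the tangent slope \(4\text{.}\)

```python
def f(x, y):
    # Hypothetical sample surface z = f(x, y); chosen only for illustration.
    return x**2 * y + y**3


def rise_over_run(f, a, b, h):
    # Slope of the secant line to the curve z = f(x, b)
    # between x = a and x = a + h (y is frozen at b).
    return (f(a + h, b) - f(a, b)) / h


a, b = 1.0, 2.0
step_sizes = (0.1, 0.01, 0.001)
secant_slopes = [rise_over_run(f, a, b, h) for h in step_sizes]

for h, slope in zip(step_sizes, secant_slopes):
    print(h, slope)
```

Only the first argument of `f` is varied, which is the numerical counterpart of staying on the plane \(y=b\text{.}\)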
Subsection 2.2.1 Evaluation of Partial Derivatives
From the above discussion, we see that we can readily compute partial derivatives \(\pdiff{}{x}\) by using what we already know about ordinary derivatives \(\diff{}{x}\text{.}\) More precisely,
To evaluate \(\pdiff{f}{x}(x,y)\text{,}\) treat the \(y\) in \(f(x,y)\) as a constant and differentiate the resulting function of \(x\) with respect to \(x\text{.}\)
To evaluate \(\pdiff{f}{y}(x,y)\text{,}\) treat the \(x\) in \(f(x,y)\) as a constant and differentiate the resulting function of \(y\) with respect to \(y\text{.}\)
To evaluate \(\pdiff{f}{x}(a,b)\text{,}\) treat the \(y\) in \(f(x,y)\) as a constant and differentiate the resulting function of \(x\) with respect to \(x\text{.}\) Then evaluate the result at \(x=a\text{,}\) \(y=b\text{.}\)
To evaluate \(\pdiff{f}{y}(a,b)\text{,}\) treat the \(x\) in \(f(x,y)\) as a constant and differentiate the resulting function of \(y\) with respect to \(y\text{.}\) Then evaluate the result at \(x=a\text{,}\) \(y=b\text{.}\)
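As a sanity check on these rules, here is a small Python sketch. The function \(f(x,y)=x^3y^2+\sin y\text{,}\) the point \((1.5,-0.5)\) and the helper names are all hypothetical, chosen only for illustration. It compares the derivatives obtained by treating one variable as a constant with a symmetric difference quotient in which that variable is literally frozen.

```python
import math


def f(x, y):
    # Hypothetical example function, not taken from the text.
    return x**3 * y**2 + math.sin(y)


def fx_by_rule(x, y):
    # Treat y as a constant and differentiate in x: 3 x^2 y^2.
    return 3 * x**2 * y**2


def fy_by_rule(x, y):
    # Treat x as a constant and differentiate in y: 2 x^3 y + cos y.
    return 2 * x**3 * y + math.cos(y)


def central_difference(g, t, h=1e-5):
    # Symmetric difference quotient for a function g of one variable.
    return (g(t + h) - g(t - h)) / (2 * h)


a, b = 1.5, -0.5
fx_numeric = central_difference(lambda x: f(x, b), a)  # y frozen at b
fy_numeric = central_difference(lambda y: f(a, y), b)  # x frozen at a

print(fx_by_rule(a, b), fx_numeric)
print(fy_by_rule(a, b), fy_numeric)
```

The two numbers on each printed line should agree to several decimal places; the symmetric quotient is used here only because it is more accurate numerically than the one-sided quotient of the definition.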
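Now here is a more complicated example, in which our function takes a special value at \((0,0)\text{.}\) To compute derivatives there we revert to the definition. Define
\begin{equation*}
f(x,y)=\begin{cases}
\frac{\cos x-\cos y}{x-y}&\text{if } x\ne y\\
0&\text{if } x=y
\end{cases}
\end{equation*}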
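If \(b\ne a\text{,}\) then for all \((x,y)\) sufficiently close to \((a,b)\text{,}\) \(f(x,y) = \frac{\cos x-\cos y}{x-y}\) and we can compute the partial derivatives of \(f\) at \((a,b)\) using the familiar rules of differentiation. However that is not the case for \((a,b)=(0,0)\text{.}\) To evaluate \(f_x(0,0)\text{,}\) we need to set \(y=0\) and find the derivative of
\begin{equation*}
f(x,0)=\begin{cases}
\frac{\cos x-1}{x}&\text{if } x\ne 0\\
0&\text{if } x=0
\end{cases}
\end{equation*}
with respect to \(x\) at \(x=0\text{.}\) Reverting to the definition and applying L'Hôpital's rule twice,
\begin{align*}
f_x(0,0) &= \lim_{h\rightarrow 0}\frac{f(h,0)-f(0,0)}{h}
= \lim_{h\rightarrow 0}\frac{\frac{\cos h-1}{h}-0}{h}
= \lim_{h\rightarrow 0}\frac{\cos h-1}{h^2}\\
&= \lim_{h\rightarrow 0}\frac{-\sin h}{2h}
= \lim_{h\rightarrow 0}\frac{-\cos h}{2}
= -\frac{1}{2}
\end{align*}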
We'll now compute \(f_y(x,y)\) for all \((x,y)\text{.}\)
The case \(y\ne x\text{:}\) When \(y\ne x\text{,}\)
\begin{align*}
f_y(x,y) & = \pdiff{}{y}\frac{\cos x-\cos y}{x-y}\\
&=\frac{(x-y)\pdiff{}{y}(\cos x-\cos y)
- (\cos x-\cos y)\pdiff{}{y}(x-y) }{(x-y)^2}\\
&\hskip2in\text{(by the quotient rule)}\\
&=\frac{(x-y)\sin y
+ \cos x-\cos y }{(x-y)^2}
\end{align*}
The case \(y= x\text{:}\) When \(y = x\text{,}\)
\begin{align*}
f_y(x,y) &= \lim_{h\rightarrow 0}\frac{f(x,y+h)-f(x,y)}{h}\\
&= \lim_{h\rightarrow 0}\frac{f(x,x+h)-f(x,x)}{h}\\
&= \lim_{h\rightarrow 0}\frac{\frac{\cos x-\cos(x+h)}{x-(x+h)}-0}{h}
&\qquad\text{(Recall that $h\ne 0$ in the limit.)}\\
&= \lim_{h\rightarrow 0}\frac{\cos(x+h)-\cos x}{h^2}
\end{align*}
Now we apply L'Hôpital's rule, remembering that, in this limit, \(x\) is a constant and \(h\) is the variable, so we differentiate with respect to \(h\text{.}\)
\begin{equation*}
f_y(x,y) = \lim_{h\rightarrow 0}\frac{-\sin(x+h)}{2h}
\end{equation*}
Note that if \(x\) is not an integer multiple of \(\pi\text{,}\) then the numerator \(-\sin(x+h)\) does not tend to zero as \(h\) tends to zero, and the limit giving \(f_y(x,y)\) does not exist. On the other hand, if \(x\) is an integer multiple of \(\pi\text{,}\) both the numerator and denominator tend to zero as \(h\) tends to zero, and we can apply L'Hôpital's rule a second time. Then
\begin{equation*}
f_y(x,y) = \lim_{h\rightarrow 0}\frac{-\cos(x+h)}{2} = -\frac{\cos x}{2}
\end{equation*}
We conclude that
\begin{equation*}
f_y(x,y)=\begin{cases}
\frac{(x-y)\sin y
+ \cos x-\cos y }{(x-y)^2}&\text{if } x\ne y\\
-\frac{\cos x}{2}&\text{if } x=y \text{ with } x \text{ an integer multiple of }\pi\\
\text{DNE}&\text{if } x=y \text{ with } x \text{ not an integer multiple of }\pi
\end{cases}
\end{equation*}
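The three cases can be probed numerically. The sketch below is only an illustration, with hypothetical helper names. It evaluates the difference quotient from the definition of \(f_y\) at three points on the line \(y=x\text{:}\) at \((0,0)\) and \((\pi,\pi)\) the quotient should settle near \(-\frac{\cos x}{2}\text{,}\) that is near \(-\frac{1}{2}\) and \(+\frac{1}{2}\) respectively, while at \((1,1)\) it should grow without bound as \(h\) shrinks.

```python
import math


def f(x, y):
    # The function of this example: (cos x - cos y)/(x - y) off the
    # line y = x, and 0 on it.
    if x != y:
        return (math.cos(x) - math.cos(y)) / (x - y)
    return 0.0


def fy_quotient(x, y, h):
    # Difference quotient from the definition of the y-partial derivative.
    return (f(x, y + h) - f(x, y)) / h


for h in (1e-2, 1e-3, 1e-4):
    print(
        h,
        fy_quotient(0.0, 0.0, h),          # x = 0: multiple of pi
        fy_quotient(math.pi, math.pi, h),  # x = pi: multiple of pi
        fy_quotient(1.0, 1.0, h),          # x = 1: not a multiple of pi
    )
```

A numerical experiment like this cannot prove that a limit fails to exist, but the roughly \(\frac{1}{h}\) growth in the last column is consistent with the L'Hôpital computation above.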
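In this example, we will see that the function
\begin{equation*}
f(x,y)=\begin{cases}
\frac{x^2}{x-y}&\text{if } x\ne y\\
0&\text{if } x=y
\end{cases}
\end{equation*}
is not continuous at \((0,0)\) and yet has both partial derivatives \(f_x(0,0)\) and \(f_y(0,0)\) perfectly well defined. We'll also see how that is possible. First let's compute the partial derivatives. By definition,
\begin{align*}
f_x(0,0)&=\lim_{h\rightarrow 0}\frac{f(h,0)-f(0,0)}{h}
=\lim_{h\rightarrow 0}\frac{h-0}{h}
=1\\
f_y(0,0)&=\lim_{h\rightarrow 0}\frac{f(0,h)-f(0,0)}{h}
=\lim_{h\rightarrow 0}\frac{0-0}{h}
=0
\end{align*}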
So the first order partial derivatives \(f_x(0,0)\) and \(f_y(0,0)\) are perfectly well defined.
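To see that, nonetheless, \(f(x,y)\) is not continuous at \((0,0)\text{,}\) we take the limit of \(f(x,y)\) as \((x,y)\) approaches \((0,0)\) along the curve \(y=x-x^3\text{.}\) The limit is
\begin{equation*}
\lim_{x\rightarrow 0} f\big(x,x-x^3\big) = \lim_{x\rightarrow 0}\frac{1}{x}
\end{equation*}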
which does not exist. Indeed as \(x\) approaches \(0\) through positive numbers, \(\frac{1}{x}\) approaches \(+\infty\text{,}\) and as \(x\) approaches \(0\) through negative numbers, \(\frac{1}{x}\) approaches \(-\infty\text{.}\)
So how is this possible? The answer is that \(f_x(0,0)\) only involves values of \(f(x,y)\) with \(y=0\text{.}\) As \(f(x,0)=x\text{,}\) for all values of \(x\text{,}\) we have that \(f(x,0)\) is a continuous, and indeed a differentiable, function. Similarly, \(f_y(0,0)\) only involves values of \(f(x,y)\) with \(x=0\text{.}\) As \(f(0,y)=0\text{,}\) for all values of \(y\text{,}\) we have that \(f(0,y)\) is a continuous, and indeed a differentiable, function. On the other hand, the bad behaviour of \(f(x,y)\) for \((x,y)\) near \((0,0)\) only happens for \(x\) and \(y\) both nonzero.
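The following Python sketch illustrates this explanation numerically. It assumes that the function of this example is \(f(x,y)=\frac{x^2}{x-y}\) for \(x\ne y\) and \(f(x,y)=0\) for \(x=y\text{;}\) that form is an assumption, chosen because it agrees with \(f(x,0)=x\text{,}\) with \(f(0,y)=0\) and with the value \(\frac{1}{x}\) along the curve \(y=x-x^3\text{.}\) The helper names are hypothetical.

```python
def f(x, y):
    # Assumed form of the example's function (see the lead-in):
    # x^2/(x - y) off the line y = x, and 0 on it.
    if x != y:
        return x**2 / (x - y)
    return 0.0


def fx_at_origin(h):
    # Difference quotient for f_x(0,0); only uses values with y = 0.
    return (f(h, 0.0) - f(0.0, 0.0)) / h


def fy_at_origin(h):
    # Difference quotient for f_y(0,0); only uses values with x = 0.
    return (f(0.0, h) - f(0.0, 0.0)) / h


def along_curve(x):
    # Value of f on the curve y = x - x^3, which is 1/x for x != 0.
    return f(x, x - x**3)


for t in (0.1, 0.01, -0.01):
    print(t, fx_at_origin(t), fy_at_origin(t), along_curve(t))
```

The first two columns stay at \(1\) and \(0\) because they only sample the coordinate axes, while the last column blows up with a sign that depends on the side from which the origin is approached.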
for all \(x\) and \(y\text{.}\) We can turn this into an equation for \(\pdiff{z}{x}(0,0)\) by differentiating the whole equation with respect to \(x\text{,}\) giving
The critical observation is that, in taking the limit \(z\rightarrow 0\text{,}\) \(x\) and \(y\) are fixed. They do not change as \(z\) is getting smaller and smaller. Furthermore this limit is exactly of the form of the limits in Definition 2.2.1 of the partial derivative, disguised by some obfuscating changes of notation.
Recalling that \(\pdiff{}{z}\) treats \(x\) and \(y\) as constants, we are evaluating the derivative of a function of the form \(\frac{({\rm const}+z)^3}{\rm const}\text{.}\) So
In this example we are going to see that, in contrast to the ordinary derivative case, \(\pdiff{r}{x}\) is not, in general, the same as \(\big(\pdiff{x}{r}\big)^{-1}\text{.}\)
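Recall that Cartesian and polar coordinates (for \((x,y)\ne (0,0)\) and \(r \gt 0\)) are related by
\begin{equation*}
x=r\cos\theta\qquad
y=r\sin\theta\qquad
r=\sqrt{x^2+y^2}\qquad
\tan\theta=\frac{y}{x}
\end{equation*}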
Here we have just renamed the \(h\) of Definition 2.2.1 to \(\dee{r}\) and to \(\dee{x}\) in the two definitions.
In computing \(\pdiff{x}{r}(r_0,\theta_0)\text{,}\) \(\theta_0\) is held fixed, \(r\) is changed by a small amount \(\dee{r}\) and the resulting \(\dee{x}=x(r_0+\dee{r},\theta_0) - x(r_0,\theta_0)\) is computed. In the figure on the left below, \(\dee{r}\) is the length of the orange line segment and \(\dee{x}\) is the length of the blue line segment.
On the other hand, in computing \(\pdiff{r}{x}\text{,}\) \(y\) is held fixed, \(x\) is changed by a small amount \(\dee{x}\) and the resulting \(\dee{r}=r(x_0+\dee{x},y_0) - r(x_0,y_0)\) is computed. In the figure on the right above, \(\dee{x}\) is the length of the pink line segment and \(\dee{r}\) is the length of the orange line segment.
Here are the two figures combined together. We have arranged that the same \(\dee{r}\) is used in both computations. In order for the \(\dee{r}\)'s to be the same in both computations, the two \(\dee{x}\)'s have to be different (unless \(\theta_0=0,\pi\)). So, in general, \(\pdiff{x}{r}(r_0,\theta_0)\ne \big(\pdiff{r}{x}(x_0,y_0)\big)^{-1}\text{.}\)
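The same conclusion can be seen numerically. The sketch below uses the standard relations \(x=r\cos\theta\) and \(r=\sqrt{x^2+y^2}\text{;}\) the sample point \(r_0=2\text{,}\) \(\theta_0=\frac{\pi}{3}\) and the function names are illustrative assumptions. It estimates \(\pdiff{x}{r}\) with \(\theta\) held fixed and \(\pdiff{r}{x}\) with \(y\) held fixed.

```python
import math


def x_of(r, theta):
    # Polar to Cartesian: x as a function of (r, theta).
    return r * math.cos(theta)


def r_of(x, y):
    # Cartesian to polar: r as a function of (x, y).
    return math.hypot(x, y)


r0, theta0 = 2.0, math.pi / 3
x0, y0 = r0 * math.cos(theta0), r0 * math.sin(theta0)
h = 1e-6

# Partial of x with respect to r: theta is held fixed.
dx_dr = (x_of(r0 + h, theta0) - x_of(r0 - h, theta0)) / (2 * h)

# Partial of r with respect to x: y is held fixed.
dr_dx = (r_of(x0 + h, y0) - r_of(x0 - h, y0)) / (2 * h)

print(dx_dr, dr_dx, 1 / dr_dx)
```

At this sample point both partial derivatives come out close to \(\cos\theta_0=\frac{1}{2}\text{,}\) so \(\pdiff{x}{r}\) is nowhere near the reciprocal of \(\pdiff{r}{x}\text{,}\) which is close to \(2\text{.}\)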
The inverse function theorem, for functions of one variable, says that, if \(y(x)\) and \(x(y)\) are inverse functions, meaning that \(y\big(x(y)\big)=y\) and \(x\big(y(x)\big)=x\text{,}\) and are differentiable with \(\diff{y}{x}\ne 0\text{,}\) then
\begin{equation*}
\diff{x}{y}(y) = \frac{1}{\diff{y}{x}\big(x(y)\big)}
\end{equation*}
To see this, just apply \(\diff{}{y}\) to both sides of \(y\big(x(y)\big)=y\) to get \(\diff{y}{x}\big(x(y)\big)\ \diff{x}{y}(y)=1\text{,}\) by the chain rule (see Theorem 2.9.3 in the CLP-1 text). In the CLP-1 text, we used this to compute the derivatives of the logarithm (see Theorem 2.10.1 in the CLP-1 text) and of the inverse trig functions (see Theorem 2.12.7 in the CLP-1 text).
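A quick numerical illustration of the one variable statement, as a sketch only: the pair \(y(x)=e^x\text{,}\) \(x(y)=\ln y\) and the sample point \(y_0=3\) are chosen here for illustration, and `derivative` is a hypothetical helper.

```python
import math


def derivative(g, t, h=1e-6):
    # Symmetric difference quotient for a function g of one variable.
    return (g(t + h) - g(t - h)) / (2 * h)


y0 = 3.0
x0 = math.log(y0)  # the x with y(x) = e^x = y0

dx_dy = derivative(math.log, y0)  # derivative of x(y) = ln y at y0
dy_dx = derivative(math.exp, x0)  # derivative of y(x) = e^x at x(y0)

print(dx_dy, 1 / dy_dx)
```

Both printed values should be close to \(\frac{1}{3}\text{.}\) Note that \(\diff{y}{x}\) is evaluated at \(x(y_0)\text{,}\) not at \(y_0\text{.}\)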
We have just seen, in Example 2.2.12, that we can't be too naive in extending the single variable inverse function theorem to functions of two (or more) variables. On the other hand, there is such an extension, which we will now illustrate, using Cartesian and polar coordinates. For simplicity, we'll restrict our attention to \(x \gt 0\text{,}\) \(y \gt 0\text{,}\) or equivalently, \(r \gt 0\text{,}\) \(0 \lt \theta \lt \frac{\pi}{2}\text{.}\) The functions which convert between Cartesian and polar coordinates are
\begin{align*}
x(r,\theta)&=r\cos\theta & r(x,y)&=\sqrt{x^2+y^2}\\
y(r,\theta)&=r\sin\theta & \theta(x,y)&=\arctan\left(\frac{y}{x}\right)
\end{align*}
The two functions on the left convert from polar to Cartesian coordinates and the two functions on the right convert from Cartesian to polar coordinates. The inverse function theorem (for functions of two variables) says that,
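if you form the first order partial derivatives of the left hand functions into the matrix
\begin{equation*}
\left[\begin{matrix}
\pdiff{x}{r}(r,\theta) & \pdiff{x}{\theta}(r,\theta)\\
\pdiff{y}{r}(r,\theta) & \pdiff{y}{\theta}(r,\theta)
\end{matrix}\right]
=\left[\begin{matrix}
\cos\theta & -r\sin\theta\\
\sin\theta & r\cos\theta
\end{matrix}\right]
\end{equation*}
and you form the first order partial derivatives of the right hand functions into the matrix
\begin{equation*}
\left[\begin{matrix}
\pdiff{r}{x}(x,y) & \pdiff{r}{y}(x,y)\\
\pdiff{\theta}{x}(x,y) & \pdiff{\theta}{y}(x,y)
\end{matrix}\right]
=\left[\begin{matrix}
\frac{x}{\sqrt{x^2+y^2}} & \frac{y}{\sqrt{x^2+y^2}}\\
\frac{-y}{x^2+y^2} & \frac{x}{x^2+y^2}
\end{matrix}\right]
\end{equation*}
and if you evaluate the second matrix at \(x=r\cos\theta\text{,}\) \(y=r\sin\theta\text{,}\) then the product of the two matrices is the identity matrix:
\begin{equation*}
\left[\begin{matrix}
\cos\theta & -r\sin\theta\\
\sin\theta & r\cos\theta
\end{matrix}\right]
\left[\begin{matrix}
\cos\theta & \sin\theta\\
-\frac{\sin\theta}{r} & \frac{\cos\theta}{r}
\end{matrix}\right]
=\left[\begin{matrix}
1 & 0\\
0 & 1
\end{matrix}\right]
\end{equation*}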
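This two variable version of the inverse function theorem can be derived by applying the derivatives \(\pdiff{}{r}\) and \(\pdiff{}{\theta}\) to the equations
\begin{align*}
r\big(x(r,\theta),y(r,\theta)\big) &= r\\
\theta\big(x(r,\theta),y(r,\theta)\big) &= \theta
\end{align*}
and using the two variable version of the chain rule, which we will see in §2.4.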
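The matrix statement can also be tested numerically. The following sketch assumes that the conclusion of the two variable theorem is that the matrix of partial derivatives of \(\big(x(r,\theta),y(r,\theta)\big)\) times the matrix of partial derivatives of \(\big(r(x,y),\theta(x,y)\big)\text{,}\) evaluated at corresponding points, is the identity matrix. The helper names and the sample point \(r_0=2\text{,}\) \(\theta_0=\frac{\pi}{6}\) are hypothetical, and all partial derivatives are estimated by difference quotients rather than computed exactly.

```python
import math


def partial(g, point, index, h=1e-6):
    # Central difference in coordinate `index`; the other coordinate is frozen.
    lower, upper = list(point), list(point)
    lower[index] -= h
    upper[index] += h
    return (g(*upper) - g(*lower)) / (2 * h)


def jacobian(funcs, point):
    # Row i holds the two first order partial derivatives of funcs[i].
    return [[partial(g, point, j) for j in range(2)] for g in funcs]


def matmul2(A, B):
    # Product of two 2x2 matrices stored as nested lists.
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Valid for x > 0, y > 0, i.e. r > 0 and 0 < theta < pi/2.
polar_to_cartesian = (lambda r, t: r * math.cos(t),
                      lambda r, t: r * math.sin(t))
cartesian_to_polar = (lambda x, y: math.hypot(x, y),
                      lambda x, y: math.atan(y / x))

r0, theta0 = 2.0, math.pi / 6
x0, y0 = r0 * math.cos(theta0), r0 * math.sin(theta0)

product = matmul2(jacobian(polar_to_cartesian, (r0, theta0)),
                  jacobian(cartesian_to_polar, (x0, y0)))
print(product)
```

Up to rounding, the printed matrix should be the \(2\times 2\) identity, even though no single entry of one matrix is the reciprocal of the corresponding entry of the other.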
Exercises 2.2.2 Exercises
Exercises — Stage 1
1.
Let \(f(x,y) = e^x\cos y\text{.}\) The following table gives some values of \(f(x,y)\text{.}\)
\begin{equation*}
\begin{array}{c|ccc}
 & x=0 & x=0.01 & x=0.1\\
\hline
y=-0.1 & 0.99500 & 1.00500 & 1.09965\\
y=-0.01 & 0.99995 & 1.01000 & 1.10512\\
y=0 & 1.0 & 1.01005 & 1.10517
\end{array}
\end{equation*}
Find two different approximate values for \(\pdiff{f}{x}(0,0)\) using the data in the above table.
Find two different approximate values for \(\pdiff{f}{y}(0,0)\) using the data in the above table.
Evaluate \(\pdiff{f}{x}(0,0)\) and \(\pdiff{f}{y}(0,0)\) exactly.
2.
You are traversing an undulating landscape. Take the \(z\)-axis to be straight up towards the sky, the positive \(x\)-axis to be due south, and the positive \(y\)-axis to be due east. Then the landscape near you is described by the equation \(z=f(x,y)\text{,}\) with you at the point \((0,0,f(0,0))\text{.}\) The function \(f(x,y)\) is differentiable.
Suppose \(f_y(0,0) \lt 0\text{.}\) Is it possible that you are at a summit? Explain.
Suppose that \(u = x^2 + yz\text{,}\) \(x = \rho r \cos(\theta)\text{,}\) \(y = \rho r \sin(\theta)\) and \(z = \rho r\text{.}\) Find \(\pdiff{u}{r}\) at the point \((\rho_0 , r_0 , \theta_0) = (2, 3, \pi/2)\text{.}\)
9.
Use the definition of the derivative to evaluate \(f_x(0,0)\) and \(f_y(0,0)\) for
Evaluate, if possible, \(\pdiff{f}{x}(0,0)\) and \(\pdiff{f}{y}(0,0)\text{.}\)
Is \(f(x,y)\) continuous at \((0,0)\text{?}\)
12.
Consider the cylinder whose base is the radius-1 circle in the \(xy\)-plane centred at \((0,0)\text{,}\) and which slopes parallel to the line in the \(yz\)-plane given by \(z=y\text{.}\)
When you stand at the point \((0,-1,0)\text{,}\) what is the slope of the surface if you look in the positive \(y\) direction? The positive \(x\) direction?
There are applications in which there are several variables that cannot be varied independently. For example, the pressure, volume and temperature of an ideal gas are related by the equation of state \(PV= \text{(constant)} T\text{.}\) In those applications, it may not be clear from the context which variables are being held fixed.
It is also possible to evaluate the derivative by using the technique of the optional Section 2.15 in the CLP-1 text.
The only real number \(z\) which obeys \(z^5=-1\) is \(z=-1\text{.}\) However there are four other complex numbers which also obey \(z^5=-1\text{.}\)
You should have already seen this technique, called implicit differentiation, in your first Calculus course. It is covered in Section 2.11 in the CLP-1 text.
If you are not familiar with polar coordinates, don't worry about it. There will be an introduction to them in §3.2.1.
Matrix multiplication is usually covered in courses on linear algebra, which you may or may not have taken. That's why this example is optional.