
Subsection 1.11.4 Three Simple Numerical Integrators — Error Behaviour

Now that we are armed with our three (relatively simple) methods for numerical integration, we should give some thought to how practical they might be in the real world  7 Indeed, even beyond the “real world” of many applications in first year calculus texts, some of the methods we have described are used by actual people (such as ship builders, engineers and surveyors) to estimate areas and volumes of actual objects!. Two obvious considerations when deciding whether or not a given algorithm is of any practical value are

  1. the amount of computational effort required to execute the algorithm and
  2. the accuracy that this computational effort yields.

For algorithms like our simple integrators, the bulk of the computational effort usually goes into evaluating the function \(f(x)\text{.}\) The number of evaluations of \(f(x)\) required for \(n\) steps of the midpoint rule is \(n\text{,}\) while the number required for \(n\) steps of the trapezoidal and Simpson's rules is \(n+1\text{.}\) So all three of our rules require essentially the same amount of effort — one evaluation of \(f(x)\) per step.

To get a first impression of the error behaviour of these methods, we apply them to a problem whose answer we know exactly:

\begin{gather*} \int_0^\pi\sin x\,\dee{x}=-\cos x\big|_0^\pi = 2. \end{gather*}

To be a little more precise, we would like to understand how the errors of the three methods change as we increase the effort we put in (as measured by the number of steps \(n\)). The following table lists the error in the approximate value for this number generated by our three rules applied with three different choices of \(n\text{.}\) It also lists the number of evaluations of \(f\) required to compute the approximation.

| \(n\) | Midpoint error | # evals | Trapezoidal error | # evals | Simpson's error | # evals |
| 10 | \(8.2\times 10^{-3}\) | 10 | \(1.6\times 10^{-2}\) | 11 | \(1.1\times 10^{-4}\) | 11 |
| 100 | \(8.2\times 10^{-5}\) | 100 | \(1.6\times 10^{-4}\) | 101 | \(1.1\times 10^{-8}\) | 101 |
| 1000 | \(8.2\times 10^{-7}\) | 1000 | \(1.6\times 10^{-6}\) | 1001 | \(1.1\times 10^{-12}\) | 1001 |
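These error figures are easy to reproduce. The following sketch (our own Python implementations, not code from the text) applies the three rules to \(\int_0^\pi \sin x\,\dee{x}=2\) and prints the errors for \(n=10,100,1000\text{:}\)

```python
import math

def midpoint(f, a, b, n):
    # n evaluations: one at the centre of each subinterval.
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def trapezoidal(f, a, b, n):
    # n + 1 evaluations: one at each subinterval endpoint.
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

def simpson(f, a, b, n):
    # n + 1 evaluations; n must be even.
    h = (b - a) / n
    return (h / 3) * (f(a) + f(b)
                      + 4 * sum(f(a + i * h) for i in range(1, n, 2))
                      + 2 * sum(f(a + i * h) for i in range(2, n, 2)))

for n in (10, 100, 1000):
    print(n,
          abs(2 - midpoint(math.sin, 0, math.pi, n)),
          abs(2 - trapezoidal(math.sin, 0, math.pi, n)),
          abs(2 - simpson(math.sin, 0, math.pi, n)))
```

Running this reproduces the table above to the two significant figures shown.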

Observe that

  • Using 101 evaluations of \(f\) worth of Simpson's rule gives an error 75 times smaller than 1000 evaluations of \(f\) worth of the midpoint rule.
  • The trapezoidal rule error with \(n\) steps is about twice the midpoint rule error with \(n\) steps.
  • With the midpoint rule, increasing the number of steps by a factor of 10 appears to reduce the error by about a factor of \(100=10^2\text{;}\) that is, the error appears to behave like \(\frac{1}{n^2}\text{.}\)
  • The trapezoidal rule behaves the same way: increasing the number of steps by a factor of 10 appears to reduce the error by about a factor of \(10^2\text{,}\) again like \(\frac{1}{n^2}\text{.}\)
  • With Simpson's rule, increasing the number of steps by a factor of 10 appears to reduce the error by about a factor of \(10^4\text{,}\) so the error appears to behave like \(\frac{1}{n^4}\text{.}\)

So it looks like

\begin{align*} &\hbox{approx value of $\ds\int_a^b f(x)\,\dee{x}$ given by $n$ midpoint steps } \hskip-0.35in&&\approx \int_a^b f(x)\,\dee{x}+K_M\cdot \frac{1}{n^2}\\ &\hbox{approx value of $\ds\int_a^b f(x)\,\dee{x}$ given by $n$ trapezoidal steps } \hskip-0.35in&&\approx \int_a^b f(x)\,\dee{x}+K_T\cdot \frac{1}{n^2}\\ &\hbox{approx value of $\ds\int_a^b f(x)\,\dee{x}$ given by $n$ Simpson's steps } \hskip-0.35in&&\approx \int_a^b f(x)\,\dee{x}+K_S\cdot \frac{1}{n^4} \end{align*}

with some constants \(K_M,\ K_T\) and \(K_S\text{.}\) It also seems that \(K_T\approx 2 K_M\text{.}\)

Figure 1.11.12 A log-log plot of the error in the \(n\) step approximation to \(\ds \int_0^\pi \sin x\,\dee{x}\text{.}\)

To test these conjectures for the behaviour of the errors we apply our three rules with about ten different choices of \(n\) of the form \(n=2^m \) with \(m\) integer. Figure 1.11.12 contains two graphs of the results. The left-hand plot shows the results for the midpoint and trapezoidal rules and the right-hand plot shows the results for Simpson's rule.

For each rule we are expecting (based on our conjectures above) that the error

\begin{align*} e_n &= | \text{exact value } - \text{ approximate value}| \end{align*}

with \(n\) steps is (roughly) of the form

\begin{gather*} e_n=K\frac{1}{n^k} \end{gather*}

for some constants \(K\) and \(k\text{.}\) We would like to test if this is really the case, by graphing \(Y=e_n\) against \(X=n\) and seeing if the graph “looks right”. But it is not easy to tell whether or not a given curve really is \(Y=\frac{K}{X^k}\text{,}\) for some specific \(k\text{,}\) by just looking at it. However, your eye is pretty good at determining whether or not a graph is a straight line. Fortunately, there is a little trick that turns the curve \(Y=\tfrac{K}{X^k}\) into a straight line — no matter what \(k\) is.

Instead of plotting \(Y\) against \(X\text{,}\) we plot \(\log Y\) against \(\log X\text{.}\) This transformation  8 There is a variant of this trick that works even when you don't know the answer to the integral ahead of time. Suppose that you suspect that the approximation satisfies \(M_n=A+K\tfrac{1}{n^k}\) where \(A\) is the exact value of the integral and suppose that you don't know the values of \(A\text{,}\) \(K\) and \(k\text{.}\) Then \(M_{n}-M_{2n} =K\tfrac{1}{n^k}-K\tfrac{1}{(2n)^k} =K\big(1-\tfrac{1}{2^k}\big)\tfrac{1}{n^k}\) so plotting \(y=\log(M_{n}-M_{2n})\) against \(x=\log n\) gives the straight line \(y=\log \big[K\big(1-\frac{1}{2^k}\big)\big] -kx\text{.}\) works because when \(Y=\frac{K}{X^k}\)

\begin{align*} \log Y &= \log K - k \log X \end{align*}

So plotting \(y=\log Y\) against \(x=\log X\) gives the straight line \(y=\log K -kx\text{,}\) which has slope \(-k\) and \(y\)-intercept \(\log K\text{.}\)
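This trick is easy to try in code. The sketch below (our own implementation; the `midpoint` helper is ours, not code from the text) computes the midpoint errors \(e_n\) for \(n=2^3,\dots,2^{10}\) and fits a least-squares line to the points \((\log_2 n,\ \log_2 e_n)\text{;}\) if \(e_n=K\frac{1}{n^k}\) the fitted slope is \(-k\text{:}\)

```python
import math

def midpoint(f, a, b, n):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Collect the points (log2 n, log2 e_n) for n = 2^m.
xs, ys = [], []
for m in range(3, 11):
    n = 2 ** m
    e_n = abs(2 - midpoint(math.sin, 0, math.pi, n))
    xs.append(m)                 # log2(n) = m exactly
    ys.append(math.log2(e_n))

# Least-squares slope: cov(x, y) / var(x).
xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
         / sum((x - xbar) ** 2 for x in xs))
print(slope)   # very close to -2 for the midpoint rule
```

The same loop with the trapezoidal and Simpson's rules gives slopes close to \(-2\) and \(-4\) respectively, matching the fitted lines in Figure 1.11.12.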

The three sets of data points in Figure 1.11.12 plot \(y=\log_2 e_n\) against \(x=\log_2 n\) for our three rules. Note that we have chosen to use logarithms  9 Now is a good time for a quick revision of logarithms — see “Whirlwind review of logarithms” in Section 2.7 of the CLP-1 text. with this “unusual base” because it makes it very clear how much the error is improved if we double the number of steps used. To be more precise — one unit step along the \(x\)-axis represents changing \(n \mapsto 2n\text{.}\) For example, applying Simpson's rule with \(n=2^4\) steps results in an error of \(0.0000166\text{,}\) so the point \((x=\log_2 2^4=4, y=\log_2 0.0000166 = \frac{\log 0.0000166}{\log 2} = -15.8)\) has been included on the graph. Doubling the effort used — that is, doubling the number of steps to \(n=2^5\) — results in an error of \(0.00000103\text{.}\) So, the data point \((x=\log_2 2^5=5\ ,\ y=\log_2 0.00000103 =\frac{\log 0.00000103}{\log 2}=-19.9)\) lies on the graph. Note that the \(x\)-coordinates of these points differ by 1 unit.

For each of the three sets of data points, a straight line has also been plotted “through” the data points. A procedure called linear regression  10 Linear regression is not part of this course as its derivation requires some multivariable calculus. It is a very standard technique in statistics. has been used to decide precisely which straight line to plot. It provides a formula for the slope and \(y\)-intercept of the straight line which “best fits” any given set of data points. From the three lines, it sure looks like \(k=2\) for the midpoint and trapezoidal rules and \(k=4\) for Simpson's rule. It also looks like the ratio between the value of \(K\) for the trapezoidal rule, namely \(K=2^{0.7253}\text{,}\) and the value of \(K\) for the midpoint rule, namely \(K=2^{-0.2706}\text{,}\) is pretty close to 2: \(2^{0.7253}/2^{-0.2706}=2^{0.9959}\text{.}\)

The intuition about the error behaviour that we have just developed is in fact correct — provided the integrand \(f(x)\) is reasonably smooth. To be more precise, if \(\left|\ddiff{2}{}{x}f(x)\right|\le M\) for all \(a\le x\le b\text{,}\) then the error \(e_n\) of the \(n\) step approximation satisfies

\begin{align*} |e_n| &\le \frac{M}{24}\frac{(b-a)^3}{n^2} &&\text{for the midpoint rule}\\ |e_n| &\le \frac{M}{12}\frac{(b-a)^3}{n^2} &&\text{for the trapezoidal rule} \end{align*}

and if \(\left|\ddiff{4}{}{x}f(x)\right|\le L\) for all \(a\le x\le b\text{,}\) then

\begin{gather*} |e_n| \le \frac{L}{180}\frac{(b-a)^5}{n^4} \qquad\text{for Simpson's rule.} \end{gather*}

The first of these error bounds is proven in the following (optional) section. Here are some examples which illustrate how they are used. First let us check that the above result is consistent with our data in Figure 1.11.12.

  • The integral \(\int_0^\pi \sin x\,\dee{x}\) has \(b-a=\pi\text{.}\)
  • The second derivative of the integrand satisfies
    \begin{align*} \left|\ddiff{2}{}{x}\sin x\right| &= |-\sin x| \leq 1 \end{align*}
    So we take \(M=1\text{.}\)
  • So the error, \(e_n\text{,}\) introduced when \(n\) steps are used is bounded by
    \begin{align*} |e_n|&\le\frac{M}{24}\frac{(b-a)^3}{n^2}\\ &=\frac{\pi^3}{24}\frac{1}{n^2}\\ &\approx 1.29\frac{1}{n^2} \end{align*}
  • The data in the graph in Figure 1.11.12 gives
    \begin{align*} |e_n| &\approx 2^{-0.2706}\frac{1}{n^2}=0.83\frac{1}{n^2} \end{align*}
    which is consistent with the bound \(|e_n|\le \frac{\pi^3}{24}\frac{1}{n^2}\text{.}\)
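This consistency check can also be automated. The short sketch below (again using our own `midpoint` helper) confirms that the actual midpoint error for each \(n\) in our table stays below the guaranteed bound \(\frac{\pi^3}{24}\frac{1}{n^2}\approx 1.29\frac{1}{n^2}\text{:}\)

```python
import math

def midpoint(f, a, b, n):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

for n in (10, 100, 1000):
    err = abs(2 - midpoint(math.sin, 0, math.pi, n))
    bound = (math.pi ** 3 / 24) / n ** 2
    # The observed error is roughly 0.83/n^2, below the guaranteed 1.29/n^2.
    assert err <= bound
    print(n, err, bound)
```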

In a typical application we would be asked to evaluate a given integral to some specified accuracy. For example, if you are a manufacturer and your machinery can only cut materials to an accuracy of \({\tfrac{1}{10}}^{\rm th}\) of a millimeter, there is no point in making design specifications more accurate than \({\tfrac{1}{10}}^{\rm th}\) of a millimeter.

Suppose, for example, that we wish to use the midpoint rule to evaluate  11 This is our favourite running example of an integral that cannot be evaluated algebraically — we need to use numerical methods.

\begin{gather*} \int_0^1 e^{-x^2}\dee{x} \end{gather*}

to within an accuracy of \(10^{-6}\text{.}\)

Solution:

  • The integral has \(a=0\) and \(b=1\text{.}\)
  • The first two derivatives of the integrand are
    \begin{align*} \diff{}{x}e^{-x^2}&=-2xe^{-x^2} \hskip2in \text{and}\\ \ddiff{2}{}{x}e^{-x^2} &=\diff{}{x}\big(-2xe^{-x^2}\big) =-2e^{-x^2}+4x^2e^{-x^2}=2(2x^2-1)e^{-x^2} \end{align*}
  • As \(x\) runs from 0 to 1, \(2x^2-1\) increases from \(-1\) to \(1\text{,}\) so that
    \begin{gather*} 0\le x\le 1\implies |2x^2-1|\le 1,\ e^{-x^2}\le 1\implies \big|2(2x^2-1)e^{-x^2}\big|\le 2 \end{gather*}
    So we take \(M=2\text{.}\)
  • The error introduced by the \(n\) step midpoint rule is at most
    \begin{align*} e_n & \leq \frac{M}{24}\frac{(b-a)^3}{n^2}\\ &\leq \frac{2}{24}\frac{(1-0)^3}{n^2} = \frac{1}{12n^2} \end{align*}
  • We need this error to be smaller than \(10^{-6}\) so
    \begin{align*} e_n & \leq \frac{1}{12n^2} \leq 10^{-6} & \text{and so }\\ 12n^2 &\geq 10^6 & \text{clean up}\\ n^2 &\geq \frac{10^6}{12} = 83333.3 & \text{square root both sides}\\ n &\geq 288.7 \end{align*}
    So \(289\) steps of the midpoint rule will do the job.
  • In fact \(n=289\) results in an error of about \(3.7\times 10^{-7}\text{.}\)
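As a numerical sanity check, here is a sketch (our own code, not from the text) that runs the midpoint rule with \(n=289\) on this integral. The exact value can be expressed through the error function: \(\int_0^1 e^{-x^2}\dee{x}=\frac{\sqrt{\pi}}{2}\operatorname{erf}(1)\text{.}\)

```python
import math

def midpoint(f, a, b, n):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

exact = math.sqrt(math.pi) / 2 * math.erf(1)   # closed form via the error function
approx = midpoint(lambda x: math.exp(-x * x), 0, 1, 289)
err = abs(exact - approx)
print(err)   # about 3.7e-7, comfortably below the 1e-6 target
```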

That seems like far too much work, and the trapezoidal rule will have about twice the error. So we should look at Simpson's rule.

Suppose now that we wish to evaluate \(\int_0^1 e^{-x^2}\,\dee{x}\) to within an accuracy of \(10^{-6}\) — but now using Simpson's rule. How many steps should we use?

Solution:

  • Again we have \(a=0,b=1\text{.}\)
  • We then need to bound \(\ddiff{4}{}{x}e^{-x^2}\) on the domain of integration, \(0\leq x\leq 1\text{.}\)
    \begin{align*} \ddiff{3}{}{x}e^{-x^2} &=\diff{}{x}\big\{2(2x^2-1)e^{-x^2}\big\} =8xe^{-x^2}-4x(2x^2-1)e^{-x^2}\\ &=4(-2x^3+3x)e^{-x^2}\\ \ddiff{4}{}{x}e^{-x^2} &=\diff{}{x}\big\{4(-2x^3+3x)e^{-x^2}\big\}\\ &=4(-6x^2+3)e^{-x^2}\hskip-4pt-8x(-2x^3+3x)e^{-x^2}\\ &=4(4x^4-12x^2+3)e^{-x^2} \end{align*}
  • Now, for any \(x\text{,}\) \(e^{-x^2}\le 1\text{.}\) Also, for \(0\le x\le 1\text{,}\)
    \begin{align*} 0 & \leq x^2, x^4 \leq 1 & \text{ so}\\ 3 & \leq 4x^4+3 \leq 7 & \text{ and }\\ -12 & \leq -12x^2 \leq 0 & \text{adding these together gives}\\ -9 & \leq 4x^4-12x^2 + 3 \leq 7 \end{align*}
    Consequently, \(|4x^4-12x^2+3|\) is bounded by \(9\) and so
    \begin{gather*} \left|\ddiff{4}{}{x}e^{-x^2}\right| \leq 4\times 9=36 \end{gather*}
    So take \(L=36\text{.}\)
  • The error introduced by the \(n\) step Simpson's rule is at most
    \begin{align*} e_n & \leq \frac{L}{180}\frac{(b-a)^5}{n^4}\\ & \leq \frac{36}{180} \frac{(1-0)^5}{n^4} = \frac{1}{5n^4} \end{align*}
  • In order for this error to be no more than \(10^{-6}\) we require \(n\) to satisfy
    \begin{align*} e_n &\leq \frac{1}{5n^4} \leq 10^{-6} & \text{and so}\\ 5n^4 & \geq 10^6\\ n^4 &\geq 200000 & \text{take fourth root}\\ n & \geq 21.15 \end{align*}
    So \(22\) steps of Simpson's rule will do the job.
  • \(n=22\) steps actually results in an error of \(3.5\times 10^{-8}\text{.}\) The reason that we get an error so much smaller than we need is that we have overestimated the number of steps required. This, in turn, occurred because we made quite a rough bound of \(\left|\ddiff{4}{}{x}f(x)\right|\leq 36\text{.}\) If we are more careful then we will get a slightly smaller \(n\text{.}\) It actually turns out  12 The authors tested this empirically. that you only need \(n=10\) to approximate within \(10^{-6}\text{.}\)
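Both claims (the error of roughly \(3.5\times 10^{-8}\) at \(n=22\text{,}\) and that \(n=10\) already achieves the \(10^{-6}\) target) are easy to verify numerically with a sketch like the following, using our own Simpson's rule implementation:

```python
import math

def simpson(f, a, b, n):
    # Simpson's rule with n steps; n must be even.
    h = (b - a) / n
    return (h / 3) * (f(a) + f(b)
                      + 4 * sum(f(a + i * h) for i in range(1, n, 2))
                      + 2 * sum(f(a + i * h) for i in range(2, n, 2)))

exact = math.sqrt(math.pi) / 2 * math.erf(1)   # exact value via the error function
f = lambda x: math.exp(-x * x)
for n in (10, 22):
    print(n, abs(exact - simpson(f, 0, 1, n)))
```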