using CalculusWithJulia
using Plots
plotly()
using QuadGK
using Roots
36 Area under a curve
This section uses these add-on packages:
The question of area has long fascinated human culture. As children, we learn early on the formulas for the areas of some geometric figures: a square is \(b^2\), a rectangle \(b\cdot h\) a triangle \(1/2 \cdot b \cdot h\) and for a circle, \(\pi r^2\). The area of a rectangle is often the intuitive basis for illustrating multiplication. The area of a triangle has been known for ages. Even complicated expressions, such as Heron’s formula which relates the area of a triangle with measurements from its perimeter have been around for 2000 years. The formula for the area of a circle is also quite old. Wikipedia dates it as far back as the Rhind papyrus for 1700 BC, with the approximation of \(256/81\) for \(\pi\).
The modern approach to area begins with a non-negative function \(f(x)\) over an interval \([a,b]\). The goal is to compute the area under the graph. That is, the area between \(f(x)\) and the \(x\)-axis between \(a \leq x \leq b\).
For some functions, this area can be computed by geometry, for example, here we see the area under \(f(x)\) is just \(1\), as it is a triangle with base \(2\) and height \(1\):
f(x) = 1 - abs(x)
plot(f, -1, 1)
plot!(zero)
Similarly, we know this area is also \(1\), it being a square:
f(x) = 1
plot(f, 0, 1)
plot!(zero)
This one, is simply \(\pi/2\), it being half a circle of radius \(1\):
f(x) = sqrt(1 - x^2)
plot(f, -1, 1)
plot!(zero)
And this area can be broken into a sum of the area of square and the area of a triangle, or \(1 + 1/2\):
f(x) = x > 1 ? 2 - x : 1.0
plot(f, 0, 2)
plot!(zero)
But what of more complicated areas? Can these have their area computed?
36.1 Approximating areas
In a previous section, we saw this animation:
This illustrates a method of Archimedes to compute the area contained in a parabola using the method of exhaustion. Archimedes leveraged a fact he discovered relating the areas of triangle inscribed with parabolic segments to create a sum that could be computed.
The pursuit of computing areas persisted. The method of computing area by finding a square with an equivalent area was known as quadrature. Over the years, many figures had their area computed, for example, the area under the graph of the cycloid (…Galileo tried empirically to find this using a tracing on sheet metal and a scale).
However, as areas of geometric objects were replaced by the more general question of area related to graphs of functions, a more general study was called for.
One such approach is illustrated in this figure due to Beeckman from 1618 (from Bressoud)
Beeckman actually did more than find the area. He generalized the relationship of rate \(\times\) time \(=\) distance. The line was interpreting a velocity, the “squares”, then, provided an approximate distance traveled when the velocity is taken as a constant on the small time interval. Then the distance traveled can be approximated by a smaller quantity - just add the area of the rectangles squarely within the desired area (\(6+16+6\)) - and a larger quantity - by including all rectangles that have a portion of their area within the desired area (\(10 + 16 + 10\)). Beeckman argued that the error vanishes as the rectangles get smaller.
Adding up the smaller “squares” can be a bit more efficient if we were to add all those in a row, or column at once. We would then add the areas of a smaller number of rectangles. For this curve, the two approaches are basically identical. For other curves, identifying which squares in a row would be added is much more complicated (though useful), but for a curve generated by a function, identifying which “squares” go in a rectangle is quite easy, in fact we can see the rectangle’s area will be a base given by that of the squares, and height depending on the function.
36.1.1 Adding rectangles
The idea of the Riemann sum then is to approximate the area under the curve by the area of well-chosen rectangles in such a way that as the bases of the rectangles get smaller (hence adding more rectangles) the error in approximation vanishes.
Define a partition of \([a,b]\) to be a selection of points \(a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b\). The norm of the partition is the largest of all the differences \(\lvert x_i - x_{i-1} \rvert\). For a partition, consider an arbitrary selection of points \(c_i\) satisfying \(x_{i-1} \leq c_i \leq x_{i}\), \(1 \leq i \leq n\). Then the following is a Riemann sum:
\[ S_n = f(c_1) \cdot (x_1 - x_0) + f(c_2) \cdot (x_2 - x_1) + \cdots + f(c_n) \cdot (x_n - x_{n-1}). \]
Clearly for a given partition and choice of \(c_i\), the above can be computed. Each term \(f(c_i)\cdot(x_i-x_{i-1})\) can be visualized as the area of a rectangle with base spanning from \(x_{i-1}\) to \(x_i\) and height given by the function value at \(c_i\). The following visualizes left Riemann sums for different values of \(n\) in a way that makes Beekman’s intuition plausible – that as the number of rectangles gets larger, the approximate sum will get closer to the actual area.
To successfully compute a good approximation for the area, we would need to choose \(c_i\) and the partition so that a formula can be found to express the dependence on the size of the partition.
For Archimedes’ problem - finding the area under \(f(x)=x^2\) between \(0\) and \(1\) - if we take as a partition \(x_i = i/n\) and \(c_i = x_i\), then the above sum becomes:
\[ \begin{align*} S_n &= f(c_1) \cdot (x_1 - x_0) + f(c_2) \cdot (x_2 - x_1) + \cdots + f(c_n) \cdot (x_n - x_{n-1})\\ &= (x_1)^2 \cdot \frac{1}{n} + (x_2)^2 \cdot \frac{1}{n} + \cdots + (x_n)^2 \cdot \frac{1}{n}\\ &= 1^2 \cdot \frac{1}{n^3} + 2^2 \cdot \frac{1}{n^3} + \cdots + n^2 \cdot \frac{1}{n^3}\\ &= \frac{1}{n^3} \cdot (1^2 + 2^2 + \cdots + n^2) \\ &= \frac{1}{n^3} \cdot \frac{n\cdot(n-1)\cdot(2n+1)}{6}. \end{align*} \]
The latter uses a well-known formula for the sum of squares of the first \(n\) natural numbers.
With this expression, it is readily seen that as \(n\) gets large this value gets close to \(2/6 = 1/3\).
The above approach, like Archimedes’, ends with a limit being taken. The answer comes from using a limit to add a big number of small values. As with all limit questions, worrying about whether a limit exists is fundamental. For this problem, we will see that for the general statement there is a stretching of the formal concept of a limit.
There is a more compact notation to \(x_1 + x_2 + \cdots + x_n\), this using the summation notation or capital sigma. We have:
\[ \Sigma_{i = 1}^n x_i = x_1 + x_2 + \cdots + x_n \]
The notation includes three pieces of information:
- The \(\Sigma\) is an indication of a sum
- The \({i=1}\) and \(n\) sub- and superscripts indicate the range to sum over.
- The term \(x_i\) is a general term describing the \(i\)th entry, where it is understood that \(i\) is just some arbitrary indexing value.
With this notation, a Riemann sum can be written as \(\Sigma_{i=1}^n f(c_i)(x_i-x_{i-1})\).
36.1.2 Other sums
The choice of the \(c_i\) will give different answers for the approximation, though for an integrable function these differences will vanish in the limit. Some common choices are:
- Using the right hand endpoint of the interval \([x_{i-1}, x_i]\) giving the right-Riemann sum, \(R_n\).
- The choice \(c_i = x_{i-1}\) gives the left-Riemann sum, \(L_n\).
- The choice \(c_i = (x_i + x_{i-1})/2\) is the midpoint rule, \(M_n\).
- If the function is continuous on the closed subinterval \([x_{i-1}, x_i]\), then it will take on its minimum and maximum values. By the extreme value theorem, we could take \(c_i\) to correspond to either the maximum or the minimum. These choices give the “upper Riemann-sums” and “lower Riemann-sums”.
The choice of partition can also give different answers. A common choice is to break the interval into \(n+1\) equal-sized pieces. With \(\Delta = (b-a)/n\), these pieces become the arithmetic sequence \(a = a + 0 \cdot \Delta < a + 1 \cdot \Delta < a + 2 \cdot \Delta < \cdots < a + n \cdot \Delta = b\) with \(x_i = a + i (b-a)/n\). (The range(a, b, length=n+1)
command will compute these.) An alternate choice made below for one problem is to use a geometric progression:
\[ a = a(1+\alpha)^0 < a(1+\alpha)^1 < a (1+\alpha)^2 < \cdots < a (1+\alpha)^n = b. \]
The general statement allows for any partition such that the largest gap goes to \(0\).
Riemann sums weren’t named after Riemann because he was the first to approximate areas using rectangles. Indeed, others had been using even more efficient ways to compute areas for centuries prior to Riemann’s work. Rather, Riemann put the definition of the area under the curve on a firm theoretical footing with the following theorem which gives a concrete notion of what functions are integrable:
The expression \(V = \int_a^b f(x) dx\) is known as the definite integral of \(f\) over \([a,b]\). Much earlier than Riemann, Cauchy had defined the definite integral in terms of a sum of rectangular products beginning with \(S=(x_1 - x_0) f(x_0) + (x_2 - x_1) f(x_1) + \cdots + (x_n - x_{n-1}) f(x_{n-1})\) (the left Riemann sum). He showed the limit was well defined for any continuous function. Riemann’s formulation relaxes the choice of partition and the choice of the \(c_i\) so that integrability can be better understood.
36.1.3 Some immediate consequences
The following formulas are consequences when \(f(x)\) is integrable. These mostly follow through a judicious rearranging of the approximating sums.
The area is \(0\) when there is no width to the interval to integrate over:
\[ \int_a^a f(x) dx = 0. \]
Even our definition of a partition doesn’t really apply, as we assume \(a < b\), but clearly if \(a=x_0=x_n=b\) then our only”approximating” sum could be \(f(a)(b-a) = 0\).
The area under a constant function is found from the area of rectangle, a special case being \(c=0\) yielding \(0\) area:
\[ \int_a^b c dx = c \cdot (b-a). \]
For any partition of \(a < b\), we have \(S_n = c(x_1 - x_0) + c(x_2 -x_1) + \cdots + c(x_n - x_{n-1})\). By factoring out the \(c\), we have a telescoping sum which means the sum simplifies to \(S_n = c(x_n-x_0) = c(b-a)\). Hence any limit must be this constant value.
Scaling the \(y\) axis by a constant can be done before or after computing the area:
\[ \int_a^b cf(x) dx = c \int_a^b f(x) dx. \]
Let \(a=x_0 < x_1 < \cdots < x_n=b\) be any partition. Then we have \(S_n= cf(c_1)(x_1-x_0) + \cdots + cf(c_n)(x_n-x_{n-1})\) \(=\) \(c\cdot\left[ f(c_1)(x_1 - x_0) + \cdots + f(c_n)(x_n - x_{n-1})\right]\). The “limit” of the left side is \(\int_a^b c f(x) dx\). The “limit” of the right side is \(c \cdot \int_a^b f(x)\). We call this a “sketch” as a formal proof would show that for any \(\epsilon\) we could choose a \(\delta\) so that any partition with norm \(\delta\) will yield a sum less than \(\epsilon\). Here, then our “any” partition would be one for which the \(\delta\) on the left hand side applies. The computation shows that the same \(\delta\) would apply for the right hand side when \(\epsilon\) is the same.
The area is invariant under shifts left or right.
\[ \int_a^b f(x - c) dx = \int_{a-c}^{b-c} f(x) dx. \]
Any partition \(a =x_0 < x_1 < \cdots < x_n=b\) is related to a partition of \([a-c, b-c]\) through \(a-c < x_0-c < x_1-c < \cdots < x_n - c = b-c\). Let \(d_i=c_i-c\) denote this partition, then we have:
\[ \begin{align*} f(c_1 -c) \cdot (x_1 - x_0) &+ f(c_2 -c) \cdot (x_2 - x_1) + \cdots\\ &\quad + f(c_n -c) \cdot (x_n - x_{n-1})\\ &= f(d_1) \cdot(x_1-c - (x_0-c)) + f(d_2) \cdot(x_2-c - (x_1-c)) + \cdots\\ &\quad + f(d_n) \cdot(x_n-c - (x_{n-1}-c)). \end{align*} \]
The left side will have a limit of \(\int_a^b f(x-c) dx\) the right would have a “limit” of \(\int_{a-c}^{b-c}f(x)dx\).
Similarly, reflections don’t effect the area under the curve, they just require a new parameterization:
\[ \int_a^b f(x) dx = \int_{-b}^{-a} f(-x) dx \]
The scaling operation \(g(x) = f(cx)\) has the following:
\[ \int_a^b f(c\cdot x) dx = \frac{1}{c} \int_{ca}^{cb}f(x) dx \]
The scaling operation shifts \(a\) to \(ca\) and \(b\) to \(cb\) so the limits of integration make sense. However, the area stretches by \(c\) in the \(x\) direction, so must contract by \(c\) in the \(y\) direction to stay in balance. Hence the factor of \(1/c\).
Combining two operations above, the operation \(g(x) = \frac{1}{h}f(\frac{x-c}{h})\) will leave the area between \(a\) and \(b\) under \(g\) the same as the area under \(f\) between \((a-c)/h\) and \((b-c)/h\).
The area between \(a\) and \(b\) can be broken up into the sum of the area between \(a\) and \(c\) and that between \(c\) and \(b\).
\[ \int_a^b f(x) dx = \int_a^c f(x) dx + \int_c^b f(x) dx. \]
For this, suppose we have a partition for both the integrals on the right hand side for a given \(\epsilon/2\) and \(\delta\). Combining these into a partition of \([a,b]\) will mean \(\delta\) is still the norm. The approximating sum will combine to be no more than \(\epsilon/2 + \epsilon/2\), so for a given \(\epsilon\), this \(\delta\) applies.
This is due to the area on the left and right of \(0\) being equivalent.
The “reversed” area is the same, only accounted for with a minus sign.
\[ \int_a^b f(x) dx = -\int_b^a f(x) dx. \]
A consequence of the last few statements is:
If \(f(x)\) is an even function, then \(\int_{-a}^a f(x) dx = 2 \int_0^a f(x) dx\). If \(f(x)\) is an odd function, then \(\int_{-a}^a f(x) dx = 0\).
If \(g\) bounds \(f\) then the area under \(g\) will bound the area under \(f\), in particular if \(f(x)\) is non negative, so will the area under \(f\) be non negative for any \(a < b\). (This assumes that \(g\) and \(f\) are integrable.)
If \(0 \leq f(x) \leq g(x)\) then \(\int_a^b f(x) dx \leq \int_a^b g(x) dx.\)
For any partition of \([a,b]\) and choice of \(c_i\), we have the term-by-term bound \(f(c_i)(x_i-x_{i-1}) \leq g(c_i)(x_i-x_{i-1})\) So any sequence of partitions that converges to the limits will have this inequality maintained for the sum.
36.1.4 Some known integrals
Using the definition, we can compute a few definite integrals:
\[ \int_a^b c dx = c \cdot (b-a). \]
\[ \int_a^b x dx = \frac{b^2}{2} - \frac{a^2}{2}. \]
This is just the area of a trapezoid with heights \(a\) and \(b\) and side length \(b-a\), or \(1/2 \cdot (b + a) \cdot (b - a)\). The right sum would be:
\[ \begin{align*} S &= x_1 \cdot (x_1 - x_0) + x_2 \cdot (x_2 - x_1) + \cdots x_n \cdot (x_n - x_{n-1}) \\ &= (a + 1\frac{b-a}{n}) \cdot \frac{b-a}{n} + (a + 2\frac{b-a}{n}) \cdot \frac{b-a}{n} + \cdots (a + n\frac{b-a}{n}) \cdot \frac{b-a}{n}\\ &= n \cdot a \cdot (\frac{b-a}{n}) + (1 + 2 + \cdots n) \cdot (\frac{b-a}{n})^2 \\ &= n \cdot a \cdot (\frac{b-a}{n}) + \frac{n(n+1)}{2} \cdot (\frac{b-a}{n})^2 \\ & \rightarrow a \cdot(b-a) + \frac{(b-a)^2}{2} \\ &= \frac{b^2}{2} - \frac{a^2}{2}. \end{align*} \]
\[ \int_a^b x^2 dx = \frac{b^3}{3} - \frac{a^3}{3}. \]
This is similar to the Archimedes case with \(a=0\) and \(b=1\) shown above.
\[ \int_a^b x^k dx = \frac{b^{k+1}}{k+1} - \frac{a^{k+1}}{k+1},\quad k \neq -1 \]
.
Cauchy showed this using a geometric series for the partition, not the arithmetic series \(x_i = a + i (b-a)/n\). The series defined by \(1 + \alpha = (b/a)^{1/n}\), then \(x_i = a \cdot (1 + \alpha)^i\). Here the bases \(x_{i+1} - x_i\) simplify to \(x_i \cdot \alpha\) and \(f(x_i) = (a\cdot(1+\alpha)^i)^k = a^k (1+\alpha)^{ik}\), or \(f(x_i)(x_{i+1}-x_i) = a^{k+1}\alpha[(1+\alpha)^{k+1}]^i\), so, using \(u=(1+\alpha)^{k+1}=(b/a)^{(k+1)/n}\), \(f(x_i) \cdot(x_{i+1} - x_i) = a^{k+1}\alpha u^i\). This gives
\[ \begin{align*} S &= a^{k+1}\alpha u^0 + a^{k+1}\alpha u^1 + \cdots + a^{k+1}\alpha u^{n-1}\\ &= a^{k+1} \cdot \alpha \cdot (u^0 + u^1 + \cdot u^{n-1}) \\ &= a^{k+1} \cdot \alpha \cdot \frac{u^n - 1}{u - 1}\\ &= (b^{k+1} - a^{k+1}) \cdot \frac{\alpha}{(1+\alpha)^{k+1} - 1} \\ &\rightarrow \frac{b^{k+1} - a^{k+1}}{k+1}. \end{align*} \]
\[ \int_a^b x^{-1} dx = \log(b) - \log(a), \quad (0 < a < b). \]
Again, Cauchy showed this using a geometric series. The expression \(f(x_i) \cdot(x_{i+1} - x_i)\) becomes just \(\alpha\). So the approximating sum becomes:
\[ S = f(x_0)(x_1 - x_0) + f(x_1)(x_2 - x_1) + \cdots + f(x_{n-1}) (x_n - x_{n-1}) = \alpha + \alpha + \cdots \alpha = n\alpha. \]
But, letting \(x = 1/n\), the limit above is just the limit of
\[ \lim_{x \rightarrow 0+} \frac{(b/a)^x - 1}{x} = \log(b/a) = \log(b) - \log(a). \]
(Using L’Hopital’s rule to compute the limit.)
Certainly other integrals could be computed with various tricks, but we won’t pursue this. There is another way to evaluate integrals using the forthcoming Fundamental Theorem of Calculus.
36.1.5 Some other consequences
The definition is defined in terms of any partition with its norm bounded by \(\delta\). If you know a function \(f\) is Riemann integrable, then it is enough to consider just a regular partition \(x_i = a + i \cdot (b-a)/n\) when forming the sums, as was done above. It is just that showing a limit for just this particular type of partition would not be sufficient to prove Riemann integrability.
The choice of \(c_i\) is arbitrary to allow for maximum flexibility. The Darboux integrals use the maximum and minimum over the subinterval. It is sufficient to prove integrability to show that the limit exists with just these choices.
Most importantly,
A continuous function on \([a,b]\) is Riemann integrable on \([a,b]\).
The main idea behind this is that the difference between the maximum and minimum values over a partition gets small. That is if \([x_{i-1}, x_i]\) is like \(1/n\) is length, then the difference between the maximum of \(f\) over this interval, \(M\), and the minimum, \(m\) over this interval will go to zero as \(n\) gets big. That \(m\) and \(M\) exists is due to the extreme value theorem, that this difference goes to \(0\) is a consequence of continuity. What is needed is that this value goes to \(0\) at the same rate – no matter what interval is being discussed – is a consequence of a notion of uniform continuity, a concept discussed in advanced calculus, but which holds for continuous functions on closed intervals. Armed with this, the Riemann sum for a general partition can be bounded by this difference times \(b-a\), which will go to zero. So the upper and lower Riemann sums will converge to the same value.
- A “jump”, or discontinuity of the first kind, is a value \(c\) in \([a,b]\) where \(\lim_{x \rightarrow c+} f(x)\) and \(\lim_{x \rightarrow c-}f(x)\) both exist, but are not equal. It is true that a function that is not continuous on \(I=[a,b]\), but only has discontinuities of the first kind on \(I\) will be Riemann integrable on \(I\).
For example, the function \(f(x) = 1\) for \(x\) in \([0,1]\) and \(0\) otherwise will be integrable, as it is continuous at all but two points, \(0\) and \(1\), where it jumps.
- Some functions can have infinitely many points of discontinuity and still be integrable. The example of \(f(x) = 1/q\) when \(x=p/q\) is rational, and \(0\) otherwise is often used as an example.
36.2 Numeric integration
The Riemann sum approach gives a method to approximate the value of a definite integral. We just compute an approximating sum for a large value of \(n\), so large that the limiting value and the approximating sum are close.
To see the mechanics, let’s again return to Archimedes’ problem and compute \(\int_0^1 x^2 dx\).
Let us fix some values:
= 0, 1
a, b f(x) = x^2
f (generic function with 1 method)
Then for a given \(n\) we have some steps to do: create the partition, find the \(c_i\), multiply the pieces and add up. Here is one way to do all this:
= 5
n = a:(b-a)/n:b # also range(a, b, length=n)
xs = diff(xs) # forms x2-x1, x3-x2, ..., xn-xn-1
deltas = xs[1:end-1] # finds left-hand end points. xs[2:end] would be right-hand ones. cs
0.0:0.2:0.8
Now to multiply the values. We want to sum the product f(cs[i]) * deltas[i]
, here is one way to do so:
sum(f(cs[i]) * deltas[i] for i in 1:length(deltas))
0.24000000000000002
Our answer is not so close to the value of \(1/3\), but what did we expect - we only used \(n=5\) intervals. Trying again with \(50,000\) gives us:
= 50_000
n = a:(b-a)/n:b
xs = diff(xs)
deltas = xs[1:end-1]
cs sum(f(cs[i]) * deltas[i] for i in 1:length(deltas))
0.3333233333999998
This value is about \(10^{-5}\) off from the actual answer of \(1/3\).
We should expect that larger values of \(n\) will produce better approximate values, as long as numeric issues don’t get involved.
Before continuing, we define a function to compute Riemann sums for us with an extra argument to specifying one of four methods for computing \(c_i\):
riemann(f, a, b, n; method="right") = riemann(f, range(a,b,n+1); method=method)
function riemann(f, xs; method="right")
= (left = (f,a,b) -> f(a),
Ms = (f,a,b) -> f(b),
right= (f,a,b) -> (f(a) + f(b))/2,
trapezoid = (f,a,b) -> (c = a/2 + b/2; (1/6) * (f(a) + 4*f(c) + f(b))),
simpsons
)_riemann(Ms[Symbol(method)], f, xs)
end
function _riemann(M, f, xs)
= zip(xs[1:end-1], xs[2:end])
xs′ sum(M(f, a, b) * (b-a) for (a,b) ∈ xs′)
end
(This function is defined in CalculusWithJulia
and need not be copied over if that package is loaded.)
With this, we can easily find an approximate answer. We wrote the function to use the familiar template action(function, arguments...)
, so we pass in a function and arguments to describe the problem (a
, b
, and n
and, optionally, the method
):
f(x) = exp(x)
riemann(f, 0, 5, 10) # S_10
187.324835773627
Or with more intervals in the partition
riemann(f, 0, 5, 50_000)
147.42052988337647
(The answer is \(e^5 - e^0 = 147.4131591025766\dots\), which shows that even \(50,000\) partitions is not enough to guarantee many digits of accuracy.)
36.3 “Negative” area
So far, we have had the assumption that \(f(x) \geq 0\), as that allows us to define the concept of area. We can define the signed area between \(f(x)\) and the \(x\) axis through the definite integral:
\[ A = \int_a^b f(x) dx. \]
The right hand side is defined whenever the Riemann limit exists and in that case we call \(f(x)\) Riemann integrable. (The definition does not suppose \(f\) is non-negative.)
Suppose \(f(a) = f(b) = 0\) for \(a < b\) and for all \(a < x < b\) we have \(f(x) < 0\). Then we can see easily from the geometry (or from the Riemann sum approximation) that
\[ \int_a^b f(x) dx = - \int_a^b \lvert f(x) \rvert dx. \]
If we think of the area below the \(x\) axis as “signed” area carrying a minus sign, then the total area can be seen again as a sum, only this time some of the summands may be negative.
Example
Consider a function \(g(x)\) defined through its piecewise linear graph:
- Compute \(\int_{-3}^{-1} g(x) dx\). The area comprised of a square of area \(1\) and a triangle with area \(1/2\), so should be \(3/2\).
- Compute \(\int_{-3}^{0} g(x) dx\). In addition to the above, there is a triangle with area \(1/2\), but since the function is negative, this area is added in as \(-1/2\). In total then we have \(1 + 1/2 - 1/2 = 1\) for the answer.
- Compute \(\int_{-3}^{1} g(x) dx\):
We could add the signed area over \([0,1]\) to the above, but instead see a square of area \(1\), a triangle with area \(1/2\) and a triangle with signed area \(-1\). The total is then \(1/2\).
- Compute \(\int_{-3}^{3} g(x) dx\):
We could add the area, but let’s use a symmetry trick. This is clearly twice our second answer, or \(2\). (This is because \(g(x)\) is an even function, as we can tell from the graph.)
Example
Suppose \(f(x)\) is an odd function, then \(f(x) = - f(-x)\) for any \(x\). So the signed area between \([-a,0]\) is related to the signed area between \([0,a]\) but of different sign. This gives \(\int_{-a}^a f(x) dx = 0\) for odd functions.
An immediate consequence would be \(\int_{-\pi}^\pi \sin(x) = 0\), as would \(\int_{-a}^a x^k dx\) for any odd integer \(k > 0\).
Example
Numerically estimate the definite integral \(\int_0^e x\log(x) dx\). (We redefine the function to be \(0\) at \(0\), so it is continuous.)
We have to be a bit careful with the Riemann sum, as the left Riemann sum will have an issue at \(0=x_0\) (0*log(0)
) returns NaN
which will poison any subsequent arithmetic operations, so the value returned will be NaN
and not an approximate answer. We could define our function with a check:
h(x) = x > 0 ? x * log(x) : 0.0
h (generic function with 1 method)
This is actually inefficient, as the check for the size of x
will slow things down a bit. Since we will call this function 50,000 times, we would like to avoid this, if we can. In this case just using the right sum will work:
h(x) = x * log(x)
riemann(h, 0, 2, 50_000, method="right")
0.38632208884775826
(The default is "right"
, so no method specified would also work.)
Example
Let \(j(x) = \sqrt{1 - x^2}\). The area under the curve between \(-1\) and \(1\) is \(\pi/2\). Using a Riemann sum with 4 equal subintervals and the midpoint, estimate \(\pi\). How close are you?
The partition is \(-1 < -1/2 < 0 < 1/2 < 1\). The midpoints are \(-3/4, -1/4, 1/4, 3/4\). We thus have that \(\pi/2\) is approximately:
= range(-1, 1, length=5)
xs = diff(xs)
deltas = [-3/4, -1/4, 1/4, 3/4]
cs j(x) = sqrt(1 - x^2)
= sum(j(c)*delta for (c,delta) in zip(cs, deltas))
a pi/2 # π ≈ 2a a,
(1.629683664318002, 1.5707963267948966)
(For variety, we used an alternate way to sum over two vectors.)
So \(\pi\) is about 2a
.
Example
We have the well-known triangle inequality which says for an individual sum: \(\lvert a + b \rvert \leq \lvert a \rvert +\lvert b \rvert\). Applying this recursively to a partition with \(a < b\) gives:
\[\begin{multline*} \lvert f(c_1)(x_1-x_0) + f(c_2)(x_2-x_1) + \cdots + f(c_n) (x_n-x_1) \rvert\\ \leq \lvert f(c_1)(x_1-x_0) \rvert + \lvert f(c_2)(x_2-x_1)\rvert + \cdots +\lvert f(c_n) (x_n-x_{n-1}) \rvert \\ = \lvert f(c_1)\rvert (x_1-x_0) + \lvert f(c_2)\rvert (x_2-x_1)+ \cdots +\lvert f(c_n) \rvert(x_n-x_{n-1}). \end{multline*}\]
This suggests that the following inequality holds for integrals:
\(\lvert \int_a^b f(x) dx \rvert \leq \int_a^b \lvert f(x) \rvert dx\).
This can be used to give bounds on the size of an integral. For example, suppose you know that \(f(x)\) is continuous on \([a,b]\) and takes its maximum value of \(M\) and minimum value of \(m\). Letting \(K\) be the larger of \(\lvert M\rvert\) and \(\lvert m \rvert\), gives this bound when \(a < b\):
\[ \lvert\int_a^b f(x) dx \rvert \leq \int_a^b \lvert f(x) \rvert dx \leq \int_a^b K dx = K(b-a). \]
While such bounds are disappointing, often, when looking for specific values, they are very useful when establishing general truths, such as is done with proofs.
36.4 Error estimate
The Riemann sum above is actually extremely inefficient. To see how much, we can derive an estimate for the error in approximating the value using an arithmetic progression as the partition. Let’s assume that our function \(f(x)\) is increasing, so that the right sum gives an upper estimate and the left sum a lower estimate, so the error in the estimate will be between these two values:
\[ \begin{align*} \text{error} &\leq \left[ f(x_1) \cdot (x_{1} - x_0) + f(x_2) \cdot (x_{2} - x_1) + \cdots + f(x_{n-1})(x_{n-1} - x_{n-2}) + f(x_n) \cdot (x_n - x_{n-1})\right]\\ &- \left[f(x_0) \cdot (x_{1} - x_0) + f(x_1) \cdot (x_{2} - x_1) + \cdots + f(x_{n-1})(x_n - x_{n-1})\right] \\ &= \frac{b-a}{n} \cdot (\left[f(x_1) + f(x_2) + \cdots + f(x_n)\right] - \left[f(x_0) + \cdots + f(x_{n-1})\right]) \\ &= \frac{b-a}{n} \cdot (f(b) - f(a)). \end{align*} \]
We see the error goes to \(0\) at a rate of \(1/n\) with the constant depending on \(b-a\) and the function \(f\). In general, a similar bound holds when \(f\) is not monotonic.
There are other ways to approximate the integral that use fewer points in the partition. Simpson’s rule is one, where instead of approximating the area with rectangles that go through some \(c_i\) in \([x_{i-1}, x_i]\) instead the function is approximated by the quadratic polynomial going through \(x_{i-1}\), \((x_i + x_{i-1})/2\), and \(x_i\) and the exact area under that polynomial is used in the approximation. The explicit formula is:
\[ A \approx \frac{b-a}{3n} (f(x_0) + 4 f(x_1) + 2f(x_2) + 4f(x_3) + \cdots + 2f(x_{n-2}) + 4f(x_{n-1}) + f(x_n)). \]
The error in this approximation can be shown to be
\[ \text{error} \leq \frac{(b-a)^5}{180n^4} \text{max}_{\xi \text{ in } [a,b]} \lvert f^{(4)}(\xi) \rvert. \]
That is, the error is like \(1/n^4\) with constants depending on the length of the interval, \((b-a)^5\), and the maximum value of the fourth derivative over \([a,b]\). This is significant, the error in \(10\) steps of Simpson’s rule is on the scale of the error of \(10,000\) steps of the Riemann sum for well-behaved functions.
The Wikipedia article mentions that Kepler used a similar formula \(100\) years prior to Simpson, or about \(200\) years before Riemann published his work. Again, the value in Riemann’s work is not the computation of the answer, but the framework it provides in determining if a function is Riemann integrable or not.
36.5 Gauss quadrature
The formula for Simpson’s rule was the composite formula. If just a single rectangle is approximated over \([a,b]\) by a parabola interpolating the points \(x_1=a\), \(x_2=(a+b)/2\), and \(x_3=b\), the formula is:
\[ \frac{b-a}{6}(f(x_1) + 4f(x_2) + f(x_3)). \]
This formula will actually be exact for any 3rd degree polynomial. In fact an entire family of similar approximations using \(n\) points can be made exact for any polynomial of degree \(n-1\) or lower. But with non-evenly spaced points, even better results can be found.
The formulas for an approximation to the integral \(\int_{-1}^1 f(x) dx\) discussed so far can be written as:
\[ \begin{align*} S &= f(x_1) \Delta_1 + f(x_2) \Delta_2 + \cdots + f(x_n) \Delta_n\\ &= w_1 f(x_1) + w_2 f(x_2) + \cdots + w_n f(x_n). \end{align*} \]
The \(w\)s are “weights” and the \(x\)s are nodes. A Gaussian quadrature rule is a set of weights and nodes for \(i=1, \dots n\) for which the sum is exact for any \(f\) which is a polynomial of degree \(2n-1\) or less. Such choices then also approximate well the integrals of functions which are not polynomials of degree \(2n-1\), provided \(f\) can be well approximated by a polynomial over \([-1,1]\). (Which is the case for the “nice” functions we encounter.) Some examples are given in the questions.
36.5.1 The quadgk function
In Julia
a modification of the Gauss quadrature rule is implemented in the quadgk
function (from the QuadGK
package) to give numeric approximations to integrals. The quadgk
function also has the familiar interface action(function, arguments...)
. Unlike our riemann
function, there is no n
specified, as the number of steps is adaptively determined. (There is more partitioning occurring where the function is changing rapidly.) Instead, the algorithm outputs an estimate on the possible error along with the answer. Instead of \(n\), some trickier problems require a specification of an error threshold.
To use the function, we have:
f(x) = x * log(x)
quadgk(f, 0, 2)
(0.38629436103070175, 4.856575112145755e-9)
As mentioned, there are two values returned: an approximate answer, and an error estimate. In this example we see that the value of \(0.3862943610307017\) is accurate to within \(10^{-9}\). (The actual answer is \(-1 + 2\cdot \log(2)\) and the error is only \(10^{-11}\). The reported error is an upper bound, and may be conservative, as with this problem.) Our previous answer using \(50,000\) right-Riemann sums was \(0.38632208884775737\) and is only accurate to \(10^{-5}\). By contrast, this method uses just \(256\) function evaluations in the above problem.
The method should be exact for polynomial functions:
f(x) = x^5 - x + 1
quadgk(f, -2, 2)
(3.9999999999999973, 1.3322676295501878e-15)
The error term is \(0\), answer is \(4\) up to the last unit of precision (1 ulp), so any error is only in floating point approximations.
For the numeric computation of definite integrals, the quadgk
function should be used over the Riemann sums or even Simpson’s rule.
Here are some sample integrals computed with quadgk
:
\[ \int_0^\pi \sin(x) dx \]
quadgk(sin, 0, pi)
(2.0000000000000004, 1.7901236049056024e-12)
(Again, the actual answer is off only in the last digit, the error estimate is an upper bound.)
\[ \int_0^2 x^x dx \]
u(x) = x^x
quadgk(u, 0, 2)
(2.8338767448900546, 1.948175154531384e-8)
\[ \int_0^5 e^x dx \]
quadgk(exp, 0, 5)
(147.41315910257657, 2.6594506152832764e-8)
When composing the answer with other functions it may be desirable to drop the error in the answer. Two styles can be used for this. The first is to just name the two returned values:
= quadgk(cos, 0, pi/4)
A, err A
0.7071067811865475
The second is to ask for just the first component of the returned value:
= quadgk(tan, 0, pi/4)[1] # or first(quadgk(tan, 0, pi/4)) A
0.3465735902799726
To visualize the choice of nodes by the algorithm, we have for \(f(x)=\sin(x)\) over \([0,\pi]\) relatively few nodes used to get a high-precision estimate:
For a more oscillatory function, more nodes are chosen:
Example
In probability theory, a univariate density is a function, \(f(x)\) such that \(f(x) \geq 0\) and \(\int_a^b f(x) dx = 1\), where \(a\) and \(b\) are the range of the distribution. The Von Mises distribution, takes the form
\[ k(x) = C \cdot \exp(\cos(x)), \quad -\pi \leq x \leq \pi \]
Compute \(C\) (numerically).
The fact that \(1 = \int_{-\pi}^\pi C \cdot \exp(\cos(x)) dx = C \int_{-\pi}^\pi \exp(\cos(x)) dx\) implies that \(C\) is the reciprocal of
k(x) = exp(cos(x))
= quadgk(k, -pi, pi) A,err
(7.954926521012919, 3.9023298370466364e-8)
So
= 1/A
C k₁(x) = C * exp(cos(x))
k₁ (generic function with 1 method)
The cumulative distribution function for \(k(x)\) is \(K(x) = \int_{-\pi}^x k(u) du\), \(-\pi \leq x \leq \pi\). We just showed that \(K(\pi) = 1\) and it is trivial that \(K(-\pi) = 0\). The quantiles of the distribution are the values \(q_1\), \(q_2\), and \(q_3\) for which \(K(q_i) = i/4\). Can we find these?
First we define a function, that computes \(K(x)\):
K(x) = quadgk(k₁, -pi, x)[1]
K (generic function with 1 method)
(The trailing [1]
is so only the answer - and not the error - is returned.)
The question asks us to solve \(K(x) = 0.25\), \(K(x) = 0.5\) and \(K(x) = 0.75\). The Roots
package can be used for such work, in particular find_zero
. We will use a bracketing method, as clearly \(K(x)\) is increasing, as \(k(u)\) is positive, so we can just bracket our answer with \(-\pi\) and \(\pi\). (We solve \(K(x) - p = 0\), so \(K(\pi) - p > 0\) and \(K(-\pi)-p < 0\).). We could do this with [find_zero(x -> K(x) - p, (-pi, pi)) for p in [0.25, 0.5, 0.75]]
, but that is a bit less performant than using the solve
interface for this task:
= ZeroProblem((x,p) -> K(x) - p, (-pi, pi))
Z solve.(Z, (1/4, 1/2, 3/4))
(-0.8097673745015915, 0.0, 0.8097673745016285)
The middle one is clearly \(0\). This distribution is symmetric about \(0\), so half the area is to the right of \(0\) and half to the left, so clearly when \(p=0.5\), \(x\) is \(0\). The other two show that the area to the left of \(-0.809767\) is equal to the area to the right of \(0.809767\) and equal to \(0.25\).
Example: Gauss nodes
The QuadGK.gauss(n)
function returns a pair of \(n\) quadrature points and weights to integrate a function over the interval \((-1,1)\), with an option to use a different interval \((a,b)\). For a given \(n\), these values exactly integrate any polynomial of degree \(2n-1\) or less. The pattern to integrate below can be expressed in other ways, but this is intended to be direct:
= QuadGK.gauss(5) xs, ws
([-0.906179845938664, -0.5384693101056831, 0.0, 0.5384693101056831, 0.906179845938664], [0.2369268850561892, 0.4786286704993663, 0.5688888888888887, 0.4786286704993663, 0.2369268850561892])
f(x) = exp(cos(x))
sum(w * f(x) for (x, w) in zip(xs, ws))
4.68315881456892
The zip
function is used to iterate over the xs
and ws
as pairs of values.
36.6 Questions
Question
Using geometry, compute the definite integral:
\[ \int_{-5}^5 \sqrt{5^2 - x^2} dx. \]
Question
Using geometry, compute the definite integral:
\[ \int_{-2}^2 (2 - \lvert x\rvert) dx \]
Question
Using geometry, compute the definite integral:
\[ \int_0^3 3 dx + \int_3^9 (3 + 3(x-3)) dx \]
Question
Using geometry, compute the definite integral:
\[ \int_0^5 \lfloor x \rfloor dx \]
(The notation \(\lfloor x \rfloor\) is the integer such that \(\lfloor x \rfloor \leq x < \lfloor x \rfloor + 1\).)
Question
Using geometry, compute the definite integral between \(-3\) and \(3\) of this graph comprised of lines and circular arcs:
The value is:
Question
For the function \(f(x) = \sin(\pi x)\), estimate the integral for \(-1\) to \(1\) using a left-Riemann sum with the partition \(-1 < -1/2 < 0 < 1/2 < 1\).
Question
Without doing any real work, find this integral:
\[ \int_{-\pi/4}^{\pi/4} \tan(x) dx. \]
Question
Without doing any real work, find this integral:
\[ \int_3^5 (1 - \lvert x-4 \rvert) dx \]
Question
Suppose you know that for the integrable function \(\int_a^b f(u)du =1\) and \(\int_a^c f(u)du = p\). If \(a < c < b\) what is \(\int_c^b f(u)du\)?
Question
What is \(\int_0^2 x^4 dx\)? Use the rule for integrating \(x^n\).
Question
Solve for a value of \(x\) for which:
\[ \int_1^x \frac{1}{u}du = 1. \]
Question
Solve for a value of \(n\) for which
\[ \int_0^1 x^n dx = \frac{1}{12}. \]
Question
Suppose \(f(x) > 0\) and \(a < c < b\). Define \(F(x) = \int_a^x f(u) du\). What can be said about \(F(b)\) and \(F(c)\)?
Question
For the right Riemann sum approximating \(\int_0^{10} e^x dx\) with \(n=100\) subintervals, what would be a good estimate for the error?
Question
Use quadgk
to find the following definite integral:
\[ \int_1^4 x^x dx . \]
Question
Use quadgk
to find the following definite integral:
\[ \int_0^3 e^{-x^2} dx . \]
Question
Use quadgk
to find the following definite integral:
\[ \int_0^{9/10} \tan(u \frac{\pi}{2}) du. \]
Question
Use quadgk
to find the following definite integral:
\[ \int_{-1/2}^{1/2} \frac{1}{\sqrt{1 - x^2}} dx \]
Question
The interactive graphic shows the area of a right-Riemann sum for different partitions. The function is
\[ f(x) = \frac{1}{\sqrt{ x^4 + 10x^2 - 60x + 100}} \]
When \(n=5\) what is the area of the Riemann sum?
When \(n=50\) what is the area of the Riemann sum?
Using quadgk
what is the area under the curve?
Question
Gauss nodes for approximating the integral \(\int_{-1}^1 f(x) dx\) for \(n=4\) are:
= [-0.861136, -0.339981, 0.339981, 0.861136] ns
4-element Vector{Float64}:
-0.861136
-0.339981
0.339981
0.861136
The corresponding weights are
= [0.347855, 0.652145, 0.652145, 0.347855] wts
4-element Vector{Float64}:
0.347855
0.652145
0.652145
0.347855
Use these to estimate the integral \(\int_{-1}^1 \cos(\pi/2 \cdot x)dx\) with \(w_1f(x_1) + w_2 f(x_2) + w_3 f(x_3) + w_4 f(x_4)\).
The actual answer is \(4/\pi\). How far off is the approximation based on 4 points?
Question
Using the Gauss nodes and weights from the previous question, estimate the integral of \(f(x) = e^x\) over \([-1, 1]\). The value is: