using CalculusWithJulia
using Plots
plotly()
using QuadGK
using Roots
36 Area under a curve
This section uses these add-on packages:
The question of area has long fascinated human culture. As children, we learn early on the formulas for the areas of some geometric figures: a square is
The modern approach to area begins with a non-negative function
For some functions, this area can be computed by geometry, for example, here we see the area under
f(x) = 1 - abs(x)
plot(f, -1, 1)
plot!(zero)
Similarly, we know this area is also
f(x) = 1
plot(f, 0, 1)
plot!(zero)
This one, is simply
f(x) = sqrt(1 - x^2)
plot(f, -1, 1)
plot!(zero)
And this area can be broken into a sum of the area of square and the area of a triangle, or
f(x) = x > 1 ? 2 - x : 1.0
plot(f, 0, 2)
plot!(zero)
But what of more complicated areas? Can these have their area computed?
36.1 Approximating areas
In a previous section, we saw this animation:
The first triangle has area
This illustrates a method of Archimedes to compute the area contained in a parabola using the method of exhaustion. Archimedes leveraged a fact he discovered relating the areas of triangle inscribed with parabolic segments to create a sum that could be computed.
The pursuit of computing areas persisted. The method of computing area by finding a square with an equivalent area was known as quadrature. Over the years, many figures had their area computed, for example, the area under the graph of the cycloid (…Galileo tried empirically to find this using a tracing on sheet metal and a scale).
However, as areas of geometric objects were replaced by the more general question of area related to graphs of functions, a more general study was called for.
One such approach is illustrated in this figure due to Beeckman from 1618 (from Bressoud)
Beeckman actually did more than find the area. He generalized the relationship of rate
Adding up the smaller “squares” can be a bit more efficient if we were to add all those in a row, or column at once. We would then add the areas of a smaller number of rectangles. For this curve, the two approaches are basically identical. For other curves, identifying which squares in a row would be added is much more complicated (though useful), but for a curve generated by a function, identifying which “squares” go in a rectangle is quite easy, in fact we can see the rectangle’s area will be a base given by that of the squares, and height depending on the function.
36.1.1 Adding rectangles
The idea of the Riemann sum then is to approximate the area under the curve by the area of well-chosen rectangles in such a way that as the bases of the rectangles get smaller (hence adding more rectangles) the error in approximation vanishes.
Define a partition of
Clearly for a given partition and choice of
Illustration of left Riemann sum for increasing
To successfully compute a good approximation for the area, we would need to choose
For Archimedes’ problem - finding the area under
The latter uses a well-known formula for the sum of squares of the first
With this expression, it is readily seen that as
The above approach, like Archimedes’, ends with a limit being taken. The answer comes from using a limit to add a big number of small values. As with all limit questions, worrying about whether a limit exists is fundamental. For this problem, we will see that for the general statement there is a stretching of the formal concept of a limit.
There is a more compact notation to
The notation includes three pieces of information:
- The
is an indication of a sum - The
and sub- and superscripts indicate the range to sum over. - The term
is a general term describing the th entry, where it is understood that is just some arbitrary indexing value.
With this notation, a Riemann sum can be written as
36.1.2 Other sums
The choice of the
- Using the right hand endpoint of the interval
giving the right-Riemann sum, . - The choice
gives the left-Riemann sum, . - The choice
is the midpoint rule, . - If the function is continuous on the closed subinterval
, then it will take on its minimum and maximum values. By the extreme value theorem, we could take to correspond to either the maximum or the minimum. These choices give the “upper Riemann-sums” and “lower Riemann-sums”.
The choice of partition can also give different answers. A common choice is to break the interval into range(a, b, length=n+1)
command will compute these.) An alternate choice made below for one problem is to use a geometric progression:
The general statement allows for any partition such that the largest gap goes to
Riemann sums weren’t named after Riemann because he was the first to approximate areas using rectangles. Indeed, others had been using even more efficient ways to compute areas for centuries prior to Riemann’s work. Rather, Riemann put the definition of the area under the curve on a firm theoretical footing with the following theorem which gives a concrete notion of what functions are integrable:
The expression
36.1.3 Some immediate consequences
The following formulas are consequences when
The area is
Even our definition of a partition doesn’t really apply, as we assume
The area under a constant function is found from the area of rectangle, a special case being
For any partition of
Scaling the
Let
The area is invariant under shifts left or right.
Any partition
The left side will have a limit of
Similarly, reflections don’t effect the area under the curve, they just require a new parameterization:
The scaling operation
The scaling operation shifts
Combining two operations above, the operation
The area between
For this, suppose we have a partition for both the integrals on the right hand side for a given
This is due to the area on the left and right of
The “reversed” area is the same, only accounted for with a minus sign.
A consequence of the last few statements is:
If
is an even function, then . If is an odd function, then .
If
If
then
For any partition of
36.1.4 Some known integrals
Using the definition, we can compute a few definite integrals:
This is just the area of a trapezoid with heights
This is similar to the Archimedes case with
.
Cauchy showed this using a geometric series for the partition, not the arithmetic series
Again, Cauchy showed this using a geometric series. The expression
But, letting
(Using L’Hopital’s rule to compute the limit.)
Certainly other integrals could be computed with various tricks, but we won’t pursue this. There is another way to evaluate integrals using the forthcoming Fundamental Theorem of Calculus.
36.1.5 Some other consequences
The definition is defined in terms of any partition with its norm bounded by
. If you know a function is Riemann integrable, then it is enough to consider just a regular partition when forming the sums, as was done above. It is just that showing a limit for just this particular type of partition would not be sufficient to prove Riemann integrability.The choice of
is arbitrary to allow for maximum flexibility. The Darboux integrals use the maximum and minimum over the subinterval. It is sufficient to prove integrability to show that the limit exists with just these choices.Most importantly,
A continuous function on
is Riemann integrable on .
The main idea behind this is that the difference between the maximum and minimum values over a partition gets small. That is if
- A “jump”, or discontinuity of the first kind, is a value
in where and both exist, but are not equal. It is true that a function that is not continuous on , but only has discontinuities of the first kind on will be Riemann integrable on .
For example, the function
- Some functions can have infinitely many points of discontinuity and still be integrable. The example of
when is rational, and otherwise is often used as an example.
36.2 Numeric integration
The Riemann sum approach gives a method to approximate the value of a definite integral. We just compute an approximating sum for a large value of
To see the mechanics, let’s again return to Archimedes’ problem and compute
Let us fix some values:
a, b = 0, 1
f(x) = x^2
f (generic function with 1 method)
Then for a given
n = 5
xs = a:(b-a)/n:b # also range(a, b, length=n)
deltas = diff(xs) # forms x2-x1, x3-x2, ..., xn-xn-1
cs = xs[1:end-1] # finds left-hand end points. xs[2:end] would be right-hand ones.
0.0:0.2:0.8
Now to multiply the values. We want to sum the product f(cs[i]) * deltas[i]
, here is one way to do so:
sum(f(cs[i]) * deltas[i] for i in 1:length(deltas))
0.24000000000000002
Our answer is not so close to the value of
n = 50_000
xs = a:(b-a)/n:b
deltas = diff(xs)
cs = xs[1:end-1]
sum(f(cs[i]) * deltas[i] for i in 1:length(deltas))
0.3333233333999998
This value is about
We should expect that larger values of
Before continuing, we define a function to compute Riemann sums for us with an extra argument to specifying one of four methods for computing
riemann(f, a, b, n; method="right") = riemann(f, range(a,b,n+1); method=method)
function riemann(f, xs; method="right")
Ms = (left = (f,a,b) -> f(a),
right= (f,a,b) -> f(b),
trapezoid = (f,a,b) -> (f(a) + f(b))/2,
simpsons = (f,a,b) -> (c = a/2 + b/2; (1/6) * (f(a) + 4*f(c) + f(b))),
)
_riemann(Ms[Symbol(method)], f, xs)
end
function _riemann(M, f, xs)
xs′ = zip(xs[1:end-1], xs[2:end])
sum(M(f, a, b) * (b-a) for (a,b) ∈ xs′)
end
(This function is defined in CalculusWithJulia
and need not be copied over if that package is loaded.)
With this, we can easily find an approximate answer. We wrote the function to use the familiar template action(function, arguments...)
, so we pass in a function and arguments to describe the problem (a
, b
, and n
and, optionally, the method
):
f(x) = exp(x)
riemann(f, 0, 5, 10) # S_10
187.324835773627
Or with more intervals in the partition
riemann(f, 0, 5, 50_000)
147.42052988337647
(The answer is
36.3 “Negative” area
So far, we have had the assumption that
The right hand side is defined whenever the Riemann limit exists and in that case we call
Suppose
If we think of the area below the
Example
Consider a function
- Compute
. The area comprised of a square of area and a triangle with area , so should be . - Compute
. In addition to the above, there is a triangle with area , but since the function is negative, this area is added in as . In total then we have for the answer. - Compute
:
We could add the signed area over
- Compute
:
We could add the area, but let’s use a symmetry trick. This is clearly twice our second answer, or
Example
Suppose
An immediate consequence would be
Example
Numerically estimate the definite integral
We have to be a bit careful with the Riemann sum, as the left Riemann sum will have an issue at 0*log(0)
) returns NaN
which will poison any subsequent arithmetic operations, so the value returned will be NaN
and not an approximate answer. We could define our function with a check:
h(x) = x > 0 ? x * log(x) : 0.0
h (generic function with 1 method)
This is actually inefficient, as the check for the size of x
will slow things down a bit. Since we will call this function 50,000 times, we would like to avoid this, if we can. In this case just using the right sum will work:
h(x) = x * log(x)
riemann(h, 0, 2, 50_000, method="right")
0.38632208884775826
(The default is "right"
, so no method specified would also work.)
Example
Let
The partition is
xs = range(-1, 1, length=5)
deltas = diff(xs)
cs = [-3/4, -1/4, 1/4, 3/4]
j(x) = sqrt(1 - x^2)
a = sum(j(c)*delta for (c,delta) in zip(cs, deltas))
a, pi/2 # π ≈ 2a
(1.629683664318002, 1.5707963267948966)
(For variety, we used an alternate way to sum over two vectors.)
So 2a
.
Example
We have the well-known triangle inequality which says for an individual sum:
This suggests that the following inequality holds for integrals:
.
This can be used to give bounds on the size of an integral. For example, suppose you know that
While such bounds are disappointing, often, when looking for specific values, they are very useful when establishing general truths, such as is done with proofs.
36.4 Error estimate
The Riemann sum above is actually extremely inefficient. To see how much, we can derive an estimate for the error in approximating the value using an arithmetic progression as the partition. Let’s assume that our function
We see the error goes to
There are other ways to approximate the integral that use fewer points in the partition. Simpson’s rule is one, where instead of approximating the area with rectangles that go through some
The error in this approximation can be shown to be
That is, the error is like
The Wikipedia article mentions that Kepler used a similar formula
36.5 Gauss quadrature
The formula for Simpson’s rule was the composite formula. If just a single rectangle is approximated over
This formula will actually be exact for any 3rd degree polynomial. In fact an entire family of similar approximations using
The formulas for an approximation to the integral
The
36.5.1 The quadgk function
In Julia
a modification of the Gauss quadrature rule is implemented in the quadgk
function (from the QuadGK
package) to give numeric approximations to integrals. The quadgk
function also has the familiar interface action(function, arguments...)
. Unlike our riemann
function, there is no n
specified, as the number of steps is adaptively determined. (There is more partitioning occurring where the function is changing rapidly.) Instead, the algorithm outputs an estimate on the possible error along with the answer. Instead of
To use the function, we have:
f(x) = x * log(x)
quadgk(f, 0, 2)
(0.38629436103070175, 4.856575112145755e-9)
As mentioned, there are two values returned: an approximate answer, and an error estimate. In this example we see that the value of
The method should be exact for polynomial functions:
f(x) = x^5 - x + 1
quadgk(f, -2, 2)
(3.9999999999999973, 1.3322676295501878e-15)
The error term is
For the numeric computation of definite integrals, the quadgk
function should be used over the Riemann sums or even Simpson’s rule.
Here are some sample integrals computed with quadgk
:
quadgk(sin, 0, pi)
(2.0000000000000004, 1.7901236049056024e-12)
(Again, the actual answer is off only in the last digit, the error estimate is an upper bound.)
u(x) = x^x
quadgk(u, 0, 2)
(2.8338767448900546, 1.948175154531384e-8)
quadgk(exp, 0, 5)
(147.41315910257657, 2.6594506152832764e-8)
When composing the answer with other functions it may be desirable to drop the error in the answer. Two styles can be used for this. The first is to just name the two returned values:
A, err = quadgk(cos, 0, pi/4)
A
0.7071067811865475
The second is to ask for just the first component of the returned value:
A = quadgk(tan, 0, pi/4)[1] # or first(quadgk(tan, 0, pi/4))
0.3465735902799726
To visualize the choice of nodes by the algorithm, we have for
For a more oscillatory function, more nodes are chosen:
Example
In probability theory, a univariate density is a function,
Compute
The fact that
k(x) = exp(cos(x))
A,err = quadgk(k, -pi, pi)
(7.954926521012919, 3.9023298370466364e-8)
So
C = 1/A
k₁(x) = C * exp(cos(x))
k₁ (generic function with 1 method)
The cumulative distribution function for
First we define a function, that computes
K(x) = quadgk(k₁, -pi, x)[1]
K (generic function with 1 method)
(The trailing [1]
is so only the answer - and not the error - is returned.)
The question asks us to solve Roots
package can be used for such work, in particular find_zero
. We will use a bracketing method, as clearly [find_zero(x -> K(x) - p, (-pi, pi)) for p in [0.25, 0.5, 0.75]]
, but that is a bit less performant than using the solve
interface for this task:
Z = ZeroProblem((x,p) -> K(x) - p, (-pi, pi))
solve.(Z, (1/4, 1/2, 3/4))
(-0.8097673745015915, 0.0, 0.8097673745016285)
The middle one is clearly
Example: Gauss nodes
The QuadGK.gauss(n)
function returns a pair of
xs, ws = QuadGK.gauss(5)
([-0.906179845938664, -0.5384693101056831, 0.0, 0.5384693101056831, 0.906179845938664], [0.2369268850561892, 0.4786286704993663, 0.5688888888888887, 0.4786286704993663, 0.2369268850561892])
f(x) = exp(cos(x))
sum(w * f(x) for (x, w) in zip(xs, ws))
4.68315881456892
The zip
function is used to iterate over the xs
and ws
as pairs of values.
36.6 Questions
Question
Using geometry, compute the definite integral:
Question
Using geometry, compute the definite integral:
Question
Using geometry, compute the definite integral:
Question
Using geometry, compute the definite integral:
(The notation
Question
Using geometry, compute the definite integral between
The value is:
Question
For the function
Question
Without doing any real work, find this integral:
Question
Without doing any real work, find this integral:
Question
Suppose you know that for the integrable function
Question
What is
Question
Solve for a value of
Question
Solve for a value of
Question
Suppose
Question
For the right Riemann sum approximating
Question
Use quadgk
to find the following definite integral:
Question
Use quadgk
to find the following definite integral:
Question
Use quadgk
to find the following definite integral:
Question
Use quadgk
to find the following definite integral:
Question
The interactive graphic shows the area of a right-Riemann sum for different partitions. The function is
When
When
Using quadgk
what is the area under the curve?
Question
Gauss nodes for approximating the integral
ns = [-0.861136, -0.339981, 0.339981, 0.861136]
4-element Vector{Float64}:
-0.861136
-0.339981
0.339981
0.861136
The corresponding weights are
wts = [0.347855, 0.652145, 0.652145, 0.347855]
4-element Vector{Float64}:
0.347855
0.652145
0.652145
0.347855
Use these to estimate the integral
The actual answer is
Question
Using the Gauss nodes and weights from the previous question, estimate the integral of