O Notation via Calculus
Posted on April 21, 2008
Filed Under Computer Science |
Calculus Via O Notation
In the article “Calculus via O Notation,” Alexandre Borovik transcribes a letter written by Donald Knuth where Knuth describes the idea of calculus in terms of
notation. He shows that not only is it possible to use
notation in integration and derivation, but that the notation also gives you a built-in error estimator.
While the notation is unfamiliar at first, it has a few added advantages. For one, the
term of the equation describes the estimated error of your calculation, which is great for programmers.
O Notation via Calculus
I learned the opposite relationship between calculus and
notation: that you can use calculus to derive the
estimate of an algorithm!
The method uses integration, and the method is easier to demonstrate using a simple example than to explain outright:
Contrived Example: Let’s say that we want to find the algorithmic complexity (as measured by multiplications) in the following loop:
int a = 0; for(int i=0; i<n; ++i) for(int j = 0; j<n; ++j) for(int k = 0; k<n; ++k) a*=2;
We can trivially rewrite the sum of the loop operations with sigma notation:
Therefore, the complexity of the loop is
.
We can also, more informally, treat this as the definite integral with respect to n. If we look at the inner loop, since we are performing 1 operation n times, our output will be linear. If we have a linear output on the second loop, the output will be
. If we can generalize the summation for any n, we can calculate the integral from 1 to n and determine the order by which the function grows.
(Note: For simplicity, I am dropping lower-order terms, since they obviously do not factor into the answer.)
Why Does This Work?
We are looking at the order of the growth of the area under the curve. This order will be the same order that the algorithm grows, as this is based on the same argument by which the rectangle rule works — the number of calculations for each n is a rectangle approximation of a continuous function and grows at the same rate for larger n.
It is important to note that this does not give you an exact computation of the number of operations in your algorithm. Instead, this gives you the order of growth of the algorithm with problem size n.
Unfortunately, I can only give an informal argument for why this should work. The professor who introduced this technique to me did it in an offhand remark, so I’m not sure whether this idea is applicable in all situations. If you have any remarks with a solid mathematical basis, I’d love to hear them.
Better Example: Mergesort
We all know that the number of comparisons in Mergesort grows
in the worst case. Let’s use the integration method to double-check this.
Small Example: An array of 16 elements
First, we need to break up the problem in terms of a summation of n. Let’s pick the number of comparisons of arrays of size {1, 2, 4, 8, …} while Mergesorting an array of 16 elements.
First, we have all of the comparisons of arrays of size 1. We will be performing 8 different sets of comparisons, and we will need 1 comparison in each set.
Total comparisons: 8 * 1
Second, all of the comparisons of arrays of size 2. We will be performing 4 different sets of comparisons, and we will need 3 comparisons per set.
Total comparisons: 4 * 3
Third, all of the comparisons of arrays of size 4. We will be performing 2 different sets of comparisons, and we will need 7 comparisons per set.
Total comparisons: 2 * 7
Lastly, all of the comparisons of arrays of size 8. We have only one set of comparisons, and we will need 15 comparisons per set.
Total comparisons: 1 * 15
Second, now that we have the different sets, we need to look for a formula that will give us the right number of comparisons for each stage x.
Stage 1: 8 * 1 = 2^3 * (2^1 - 1)
Stage 2: 4 * 3 = 2^2 * (2^2 - 1)
Stage 3: 2 * 7 = 2^1 * (2^3 - 1)
Stage 4: 1 * 15 = 2^0 * (2^4 - 1)
Our formula for each stage x is therefore:

Third, we integrate our formula from 1 to n.
Our integral will be:
Plugging this quickly into Mathematica, we get the answer back:
which, after working through the Logjam, is found to be
.
When is This Useful?
For some people, never: recurrence formulas and basic combinatorics is all they will ever need to solve recurrences. However, some students (mathematicians especially) may be more comfortable dealing with calculus than recurrence formulas.
It never hurts to have another tool in your kit.
Popularity: 49% [?]
Comments
8 Responses to “O Notation via Calculus”
Leave a Reply












Amazing article!
This is the approach that I use. I actually find it more intuitive for nested loops, for example, to think in terms of nested sums and series expansions rather than recurrence relations.
Very interesting article, but if your integral ends up so complicated that you need to use a CAS system to evaluate it, you may as well have used it to evaluate the original sums, no?
The integral does not really need a CAS. It can be simplified to
n \int_1^{\log n} (1 - 2^{-x}) dx
= n \int_1^{\log n} O(1) dx
= n O(log n - 1)
= O(n log n)
I usually do them by hand, as I’m a big fan of mental math and understanding problems. For the purposes of the blog post I wanted to finish the post quickly without plaguing my readers with silly errors. Mathematica was the quickest way to do that
Your conversion to integrals in the contrived example is wrong. It should be O(int 0..n (int 0..n (int 0..n (1) dk) dj) di). That integral simplifies directly to n^3.
@ Hughes:
You’re right. Also, the integral as written isn’t really in the correct notation (dn dn dn is wrong).
The n^3 / 6 result in the post is correct for this problem–
for i = 1:n
for j = 1:i
for k = 1:j
a *= 2
Unfortunately, it appears as though my calc skills have rusted with misuse. Fortunately for me, my mistakes were largely notational, and the main ideas of the article stand up to the test. Thank you for your corrections, I’ll be sure to make the changes in the upcoming days.