Basic Combinatorics for Programmers
Posted on October 29, 2007
Filed Under Computer Science, Math |
Man only likes to count his troubles, but he does not count his joys.
~Fyodor Dostoevsky
What is Combinatorics?
Combinatorics is the math behind counting. All problems that start with the phrase “How many ways..” are most likely combinatorics problem.
Who is this page for?
- Programmers who have a simple background in Mathematics, or need a quick reference for how to implement combinatorial algorithms.
- Those who did not know what the word “combinatorics” meant.
Factorials
Factorials are used to solve the problem, “How many ways can I arrange n objects?” It makes the following assumptions:
- Each object is distinguishable (Can I tell apart the n apples?)
- You can actually distinguish arrangements of objects (Can I line up n apples in a meaningful way?)
- The elements are non-replaceable (Once I use an apple, I can’t use an apple again).
The factorial function is defined as follows for positive integers:
Factorial(1) = 1
Factorial(n) = n * Factorial(n-1)
or, the naieve C implementation:
// For 32 bit systems, will not return meaningful results // for values larger than 12! // // For 64-bit systems, will not return meaningful results // for values larger than 20! unsigned int factorial(unsigned int n) { unsigned int ret = 1; for(int i=2; i<=n; ++i) ret*=i; return ret; }
The C implementation above is fine for the values it can logically compute. However, when we get into arbitrary precision integer calculations, calculating 1 * 2 * 3 * … * n is inefficient! The number of words produced at the end of each step grows linearly. If we divide-and-conquer, we keep down the number of words-per-integer until the very last few multiplications, which produces significant time savings.
Of course, we would NEVER use this in the real world without extra sanity checking of the input. The importance of sanity checks in the real world is demonstrated every day by celebrities. Don’t commit their same blunders.
mpz_class factorial(mpz_class begin, mpz_class end) { if(end==begin) return end; if(end-begin == 1) return begin*end; int half = (begin+end)/2; return factorial(begin, half) * factorial(half+1, end); }
In fact, running the two against each other to find 100,000! produces the following runtimes on my laptop (I was going to do 1 million and got bored waiting for the iterative solution to finish!):
g++ factorial.cpp -O2 -lgmp -lgmpxx ./a.out Recursive factorial: 0.2795 seconds Iterative factorial: 3.01006 seconds
The most efficient ways to implement factorial algorithms turns out to be from its prime factorization, but the above is far easier than implementing sieving methods to find primes! Maybe another day
Permutations
Permutations are how we solve the problem, “how many different ways can I arrange k objects from a list of n objects? To give an example, assuming the same number could not be used twice in a combination, your “combination” lock from high school was actually a permutation lock. You were given a single permutation that you could use in order to open it. But how many different possible permutations were there, given 45 as a max value and 3 numbers?
- The first number has no restrictions. You can choose from all 45 numbers
- The second number is restricted: it can’t be the same as the first number. You now have 44 to choose from, giving you 45 * 44 different possible 2-permutations of 45
- The third number is also restricted: You can only choose 43 numbers. 45 * 44 * 43 is the number of 3-permutations of 45.
It’s not hard to convince ourselves that the permutation is all of the numbers between and including 45 and 43 multiplied together. Symbolically, we are talking about all of the numbers between and including n and (n - k + 1), inclusive.
The formula we all learned in school (which you can convince yourself is equivalent to the above) is as follows:
// Picking and ordering k objects from n objects mpz_class permutation(mpz_class n, mpz_class k) { return factorial(1, n) / factorial(1, k); }
Never do this! You have to recompute many of the values that you never use, and it is worthless and computationally wasteful to do so. Instead, you can use the following:
mpz_class permutation(mpz_class n, mpz_class k) { if(k == 0) return 1; return factorial(n - k + 1, n); }
It’s much simpler. It computes much less. Use it. It is calling your name. Use it.
Combinations
Permutations are very closely related to combinations. In fact, there is only a one word difference in the definition. Combinations are unordered, and permutations are ordered. We can derive the formula by what this implies.
Let’s take our combination lock example. We needed to get three numbers in a row, in order. But what happens if we only need to select the three numbers without concern to the order? We know from the factorial section that the number of ways to order 3 numbers is 3 factorial. So let’s say we have the permutations { (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1) }. These are all the same combination, because they all pick the same exact elements! So there is clearly 1/6 the number of combinations as permutations. 6 is 3!, which is no coincidence. In general, the formula for combinations is:
mpz_class combination(mpz_class n, mpz_class k) { if(k == 0) return 1; return permutation(n, k)/factorial(1, k); }
*Hits the Easy Button*
Next week: More complicated number sequences such as the Binomial Coefficients and the Sterling Numbers
Popularity: 29% [?]
Comments
3 Responses to “Basic Combinatorics for Programmers”
Leave a Reply

What the heck is “mpz_class”?
Your definition of factorial omits factorial(0)=1
Fantastic! Thankyou. It was a quick and clear description of the subtle differences I was looking for.
I would like to recommend this to improve the blog further:
Q) when to use x^n; (when repeatition is possible?)
example: a coin tossed n times.
Including some text on this would make the blog complete in my opinion! Nevertheless, great blog! Thanks!
Manav