Basic Combinatorics for Programmers

Posted on October 29, 2007
Filed Under Computer Science, Math |

Man only likes to count his troubles, but he does not count his joys.
~Fyodor Dostoevsky

What is Combinatorics?

Combinatorics is the math behind counting. All problems that start with the phrase “How many ways..” are most likely combinatorics problem.

Who is this page for?

Factorials

Factorials are used to solve the problem, “How many ways can I arrange n objects?” It makes the following assumptions:

The factorial function is defined as follows for positive integers:
Factorial(1) = 1
Factorial(n) = n * Factorial(n-1)

or, the naieve C implementation:

  // For 32 bit systems, will not return meaningful results
  // for values larger than 12!
  //
  // For 64-bit systems, will not return meaningful results
  // for values larger than 20!
  unsigned int factorial(unsigned int n)
  {
    unsigned int ret = 1;
 
    for(int i=2; i<=n; ++i)
        ret*=i;
 
    return ret;
  }

The C implementation above is fine for the values it can logically compute. However, when we get into arbitrary precision integer calculations, calculating 1 * 2 * 3 * … * n is inefficient! The number of words produced at the end of each step grows linearly. If we divide-and-conquer, we keep down the number of words-per-integer until the very last few multiplications, which produces significant time savings.

Of course, we would NEVER use this in the real world without extra sanity checking of the input. The importance of sanity checks in the real world is demonstrated every day by celebrities. Don’t commit their same blunders.

  mpz_class factorial(mpz_class begin, mpz_class end)
  {
    if(end==begin)
      return end;
 
    if(end-begin == 1)
      return begin*end;
 
    int half = (begin+end)/2;
    return factorial(begin, half) * factorial(half+1, end);
  }

In fact, running the two against each other to find 100,000! produces the following runtimes on my laptop (I was going to do 1 million and got bored waiting for the iterative solution to finish!):

  g++ factorial.cpp -O2 -lgmp -lgmpxx
  ./a.out
  Recursive factorial: 0.2795 seconds
  Iterative factorial: 3.01006 seconds

The most efficient ways to implement factorial algorithms turns out to be from its prime factorization, but the above is far easier than implementing sieving methods to find primes! Maybe another day :)

Permutations

Permutations are how we solve the problem, “how many different ways can I arrange k objects from a list of n objects? To give an example, assuming the same number could not be used twice in a combination, your “combination” lock from high school was actually a permutation lock. You were given a single permutation that you could use in order to open it. But how many different possible permutations were there, given 45 as a max value and 3 numbers?

It’s not hard to convince ourselves that the permutation is all of the numbers between and including 45 and 43 multiplied together. Symbolically, we are talking about all of the numbers between and including n and (n - k + 1), inclusive.

The formula we all learned in school (which you can convince yourself is equivalent to the above) is as follows:

  // Picking and ordering k objects from n objects
  mpz_class permutation(mpz_class n, mpz_class k)
  {
      return factorial(1, n) / factorial(1, k);
  }

Never do this! You have to recompute many of the values that you never use, and it is worthless and computationally wasteful to do so. Instead, you can use the following:

  mpz_class permutation(mpz_class n, mpz_class k)
  {
      if(k == 0)
          return 1;
 
      return factorial(n - k + 1, n);
  }

It’s much simpler. It computes much less. Use it. It is calling your name. Use it.

Combinations

Permutations are very closely related to combinations. In fact, there is only a one word difference in the definition. Combinations are unordered, and permutations are ordered. We can derive the formula by what this implies.

Let’s take our combination lock example. We needed to get three numbers in a row, in order. But what happens if we only need to select the three numbers without concern to the order? We know from the factorial section that the number of ways to order 3 numbers is 3 factorial. So let’s say we have the permutations { (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1) }. These are all the same combination, because they all pick the same exact elements! So there is clearly 1/6 the number of combinations as permutations. 6 is 3!, which is no coincidence. In general, the formula for combinations is:

  mpz_class combination(mpz_class n, mpz_class k)
  {
      if(k == 0)
          return 1;
 
      return permutation(n, k)/factorial(1, k);
  }

*Hits the Easy Button*

Next week: More complicated number sequences such as the Binomial Coefficients and the Sterling Numbers

Popularity: 29% [?]

Comments

3 Responses to “Basic Combinatorics for Programmers”

  1. Michael Campbell on October 29th, 2007 7:22 am

    What the heck is “mpz_class”?

  2. jcb on November 2nd, 2007 6:49 am

    Your definition of factorial omits factorial(0)=1

  3. Manav Kataria on January 18th, 2008 3:17 pm

    Fantastic! Thankyou. It was a quick and clear description of the subtle differences I was looking for.

    I would like to recommend this to improve the blog further:
    Q) when to use x^n; (when repeatition is possible?)
    example: a coin tossed n times.

    Including some text on this would make the blog complete in my opinion! Nevertheless, great blog! Thanks!
    Manav

Leave a Reply