Number Theory, Hash Tables, and Geometric Progressions

Posted on September 30, 2007
Filed Under Computer Science, Math |

Or, \phi and Loathing in Los Vegas

What will this article focus on?

This particular article looks at geometric sequences (mod n), and how we can use them instead of linear hashes. A geometric sequence is simply a sequence of powers of some number: 1, a, a^2, a^3, … So instead of adding the same number together a bunch of times, we’re multiplying it together a bunch of times. And then you subtract one. More on that below!

First, the math

Euler’s Phi Function

When Euler was attempting to generalize Fermat’s Little Theorem, he defined a function using the Greek symbol \phi (pronounced fee by most people I’ve encountered). It has a simple job: it takes in a natural number, n, and returns the number of positive integers less than n that are relatively prime to n. In this article, we are not concerned with \phi’s calculation for anything but prime numbers.

It is easy to show that \phi(p) = p-1 when p is prime: all numbers less than a prime are relatively prime to the prime in question, otherwise it wouldn’t be prime! Easy proof.

Euler’s phi function is of vital to the RSA encryption algorithm, and is the cornerstone of the generalization of Fermat’s Little Theorem, but it makes cameo appearances in many other areas of mathematics.

Examples:

\phi(5) = 4, because gcd(5, 1) = gcd(5, 2) = gcd(5, 3), = gcd(5, 4) = 1.

\phi(6) = 2, because gcd(6, 1) = gcd(6, 5) = 1, but gcd(6, 2) = 2, gcd(6, 3) = 3, and gcd(6, 4) = 2.

Order of a number (mod n)

The order of a number (mod n), where n is an integer, is the smallest positive value of x such that s^x \equiv 1(mod\ p). If it is never equal to 1, it is considered infinite. 6 (mod 10) is an example that never has an answer. Note that this still has a solution under Euler’s generalization of Fermat’s Little Theorem. The laws of the universe won’t let you off that easy.

Example:

The order of 2 (mod 7) is 3, because [Unparseable or potentially dangerous latex formula. Error 1 ](prime) = prime-1, so \phi(p) = p-1. If order(m) (mod p) is p-1, that means that m is a generator for all numbers (mod p) except p itself! Since this will not generate p, and 0 by extension, (since they are in the same congruence class), we must subtract our result by 1. So our generator is m, and our hash function is a^x - 1(mod\ prime)

It is not true that all numbers have a primitive root, but it WAS proved by Legendre that every prime has at least one generator (mod p). Interestingly, according to my college Number Theory textbook, Euler tried his hand at the proof, but was incorrect. To the uninitiated into the Cult of Euler, this would be akin to a team of Michael Jordan clones failing to score a single point in a basketball game against a team of middle school students.

We need to find one such that the first time this happens is for a power of p-1. Instead of testing every power, we can instead (because of this proof), just test powers where the power divides p-1. If we were looking mod 9, and we knew 3^8 == 1(mod p) (which it has to be because of Fermat’s Little Theorem), then 3^1, 3^2, 3^4, and 3^8 are the only possible powers that can be equal to one. We will call this the generator test. We can check these particular values quickly through successive squaring. If any of the powers of 3 less than 8 are congruent to 1, then we have a failure, and it is not a generator.

If you do not have access to a good way to factor p-1, the following naive method will work well for small numbers. Please note that the preferable way is to factor p-1 and to find all of the divisors of p-1 that way.

// *********************************************************************** 
// Precondition: p is a prime. If it is not, it will return 0 indicating 
// failure 
// 
// This assumes that you are trying to do this for a small p, without being 
// able to factorize p-1 quickly. 
// ************************************************************************ 
unsigned int find_generator(int p) 
{ 
  int phi_p(p-1); 
  std::vector<int> test_powers; 
 
  int i; 
 
  for(i=1; i<phi_p; ++i)    { 
      if(!(phi_p % i)) 
	{ 
	  test_powers.push_back(i); 
	} 
    } 
 
  int test = 1; 
 
  bool found=false; 
 
  while(!found) 
  { 
      test++; 
      found = true; 
 
      for(i=test_powers.size()-1; i<0; --i) 
      { 
        if(powmod(test, test_powers[i], p) == 1) 
	{ 
	      found = false; 
	      break; 
	} 
      } 
 
      if(found) 
      { 
	  return test; 
      } 
   } 
}

So what?

If we have an element a (mod n) who has a^{n-1} = 1, and a^{positive\ integer\ less\ than\ n} is not equal to 1, we have a generator. The generator is for a set of integers of size (p-1), which is even.

Finding generators is nontrivial

A downside to this method is that there is no free lunch when it comes to finding generators. You have to find one, although fortunately for us, most numbers have generators that are less than 10, so you can find them by linearly searching. There are a few strategies of how we can pick primes that will allow us to (relatively) quickly find a generator (mod p). The one I use is:

One strategy is finding a prime, p, such that q = 2*p + 1 is also prime. The only two numbers that you have to check that violate our generator condition are 2 and p, in which case q is a generator. This helps reduce the complexity of the test. How do we know if our numbers are prime? Probabilistic primality testing, of course :). It’s amazing how all of this stuff ties together.

A professor I had for a cryptology course said that the odds of the first generator NOT being less than 10 has been shown to be inordinately small, but I can’t for the life of me find any sort of reference to a figure that states that. As there is no trivial way to find a hash function, it is acceptable to search for the first generator (mod p) linearly, using our generator test, if you are looking for just any generator of p. Likewise, you can also find the largest such generator (mod p) by reverse searching.

This is so complicated. Why would I use this over a linear hash?

Sometime in the future, (not in the next post, though), I will develop benchmarks to see what is better to deal with various different input scenarios. There’s no sense in developing the mathematics if we don’t actually put it all on the line and see if the “better” method works better in the real world. The real world has an amazing way of yelling “surprise!”, but we can limit that surprise through testing, testing, testing.

Popularity: 48% [?]

Comments

3 Responses to “Number Theory, Hash Tables, and Geometric Progressions”

  1. Eric Jablow on October 1st, 2007 8:45 am

    Mersenne primes are primes of the form 2^{p}-1, not 2^{p}+1.

  2. cookbook on October 1st, 2007 2:12 pm

    Is there a cookbook of well known generators? This would solve many problems of mine.

  3. Jake on October 1st, 2007 4:08 pm

    http://books.google.com/books?id=-9pg-4Pa19IC&pg=PA476&lpg=PA476&dq=list+of+smallest+primitive+roots&source=web&ots=Aq1PMtxHm_&sig=KCH55f8ODuQ0UjSQRSWAQP2iE3w#PPA606,M1

    The above link (page 606) has a list of generators less than 1000, but I’m guessing that you need to find primitive roots of larger primes. This is generally just done through testing. Primes are guaranteed to have phi(phi(p)) primitive roots, and they can generally be found quickly if you can factor phi(p), as I mention above, though I wouldn’t recommend using the code in the post, as it doesn’t rely on either primality testing or factoring.

Leave a Reply