In continuation of my previous post on factoring, I continue to explore these methods. From Pollard’s \rho method, we now consider Pollard’s p-1 algorithm.

Before we consider the algorithm proper, let’s consider some base concepts.

Firstly, I restate Euler’s Theorem (of course, Euler was prolific, and so there are altogether too many things called Euler’s Theorem – but so it goes):

If \mathrm {gcd} (a,n) = 1, i.e. if a and n are relatively prime, then a^{\phi (n)} \equiv 1 \mod n, where \phi (n) is the Euler Totient Function or Euler’s Phi Function (perhaps as opposed to someone else’s phi function?). As a corollary, we get Fermat’s Little Theorem, which says that if \mathrm {gcd} (a,p) = 1, with p a prime, then a^{p-1} \equiv 1 \mod p.

The second base concept is called smoothness. A number is called B-smooth is none of its prime factors is greater than B. A very natural question might be: why do we always use the letter B? I have no idea. Another good question might be, what’s an example? Well, the number 24 = 2^3 \cdot 3 is 3-smooth (and 4-smooth, 5-smooth, etc). We call a number B-power smooth if all prime powers p_i ^{n_i} dividing the number are less than or equal to B. So 24 is 8-power smooth, but not 7-power smooth. Note also that in this case, the number is a factor of \mathrm{lcm} (1, 2, 3, ..., B).

Pollard’s (p-1) algorithm is called the “p-1” algorithm because it is a specialty factorization algorithm. Suppose that we are trying to factorize a number N and we choose a positive integer B, and that there is a prime divisor p of N such that p-1 is B-power smooth (we choose B beforehand – it can’t be too big or the algorithm becomes computationally intractable). Now for a positive integer a coprime to p, we get a^{p-1} \equiv 1 \mod p. Since p-1 is B-power smooth, we know that (p-1) | m = \mathrm{lcm} (1, 2, ..., B). Thus a^m \equiv 1 \mod p, or rather p | (a^m - 1).

Thus p| \mathrm{gcd} (a^m - 1, N) > 1. And so one hopes that this factor is nontrivial and proper.

This is the key idea behind the entire algorithm.

Pollard’s (p-1) Algorithm
1. Choose a smoothness bound B (often something like 10^6)
2. Compute m = \mathrm{lcm} (1, 2, ..., B)
3. Set a = 2.
4. Compute x = a^m - 1 \mod N and g = \mathrm{gcd} (x, N)
5. If we’ve found a nontrivial factor, then that’s grand. If not, and if a < 20 (say), then replace a by a+1 and go back to step 4.

So how fast is the algorithm? Well, computing the lcm will likely take something in the order of O(B \; log_2 B) complexity (using the Sieve of Erosthanes, for example). The modular exponentiation will take O( \; (log_2 N)^2) time. Calculating the gcd takes only O( \; (log_2 N)^3), so the overall algorithm takes O(B \cdot lob_2 B \cdot (log_2 N)^2 + (log_2 N)^3) or so. In other words, it’s only efficient is B is small; but when B is small, fewer primes can be found.

I’ve read that Dixon’s Theorem guarantees that there is a probability of about \dfrac{1}{27} that a value of B the size of $N^{1/6}$ will yield a factorization. Unfortunately, this grows untenably large. In practice, this algorithm is an excellent way to eliminate the possibility of small factors (or to successfully divide out small factors). Using B = 10^6 will identify smaller factors and allows different techniques, such as the elliptic curve factorization algorithm, to try to find larger factors.

I’ll expand on more factoring later.