I remember when I first learnt that testing for primality is in P (as shown in the paper PRIMES is in P, which explains the AKS algorithm). Some time later, I was talking with a close friend of mine (who has since received his bachelor's in Computer Science). He found it hard to believe that one could determine whether a number is prime without factoring that number. That's pretty cool. The AKS algorithm doesn't even rely on anything really deep – it's just a clever application of many (mostly) elementary results. Both of us were well aware of the fact that numbers are hard, as an understatement, to factor. My interest in factoring algorithms has suddenly surged again, so I'm going to look through some factoring algorithms (other than my interesting algorithm, which happens to be terribly slow).

The most fundamental of all factoring algorithms is to simply try lots of factors. One immediately sees that one only needs to try prime numbers up to \sqrt{n}. Of course, there is a problem – in order to only trial divide by primes, one needs to know which numbers are prime. So if one were to literally only divide by primes, one would either maintain a Sieve of Eratosthenes or perform something like AKS on each candidate to see if it's prime. Or one could perform Miller-Rabin a few times (with different bases) to trial divide by 'almost only' primes (not in the measurable sense). The prime number theorem says that \pi (n) \sim \dfrac{n}{\log n}, so one would expect on the order of \sqrt{n} bit operations in the worst case – exponential in the number of digits of n. This is why we call this the trivial method.
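As a concrete baseline, here is a minimal sketch of the trivial method in Python. For simplicity it divides by every integer from 2 up to \sqrt{n} rather than only by primes (the composite divisors never succeed, since their prime factors are stripped out first):

```python
def trial_division(n):
    """Factor n by trial division, trying every d with d*d <= n.

    Returns the prime factors of n in nondecreasing order.
    """
    factors = []
    d = 2
    while d * d <= n:
        # Divide out d as many times as possible; since smaller factors
        # are removed first, any d that divides n here is prime.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is a prime factor > sqrt(original n)
    return factors
```

For example, `trial_division(5040)` returns `[2, 2, 2, 2, 3, 3, 5, 7]`.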

The first major improvement didn’t come about until 1975, when John Pollard proposed a new algorithm, commonly referred to as the Pollard-\rho algorithm. Already, the analysis of this algorithm is far more involved than that of the previous one. The main idea of the algorithm is the following observation: if we are trying to factor n and d is a relatively small factor of n, then it is likely that there exist two numbers x_i and x_j such that d|(x_i - x_j) but n \not | (x_i - x_j), so that gcd(x_i - x_j, n) > 1 and a factor has been found. But how can this be implemented if we don’t know any divisor d of n?

That’s a very interesting question, and it has a sort of funny answer. Perhaps the naive way to use this idea would be to pick a random number a and another random number b, and then check whether gcd(a-b,n) > 1. If it isn’t, try a random number c, and check gcd(c-a,n) and gcd(c-b,n), and so on. This is of course very time-consuming, as on the jth step one needs to do j-1 gcd tests.

But we can be (far) wittier. This is one of those great insights that made me say – whoa, now there’s an idea – when I first heard it. Suppose that instead, we have some polynomial f(x) that we will use to pick our ‘random’ numbers, i.e. we choose the next number by the iteration x_i = f(x_{i-1}). Then if we hit a point where x_j \equiv x_k \mod{d}, with k < j, we will also have f(x_j) \equiv f(x_k) \mod{d}, since f is a polynomial with integer coefficients. This is how the method got its name – after a few numbers, the sequence loops back on itself mod d, just like the letter \rho.

Of course, the sequence couldn’t possibly go on for more than d numbers without having some repeat mod d. But the greatest reason why we use such a function is that it allows us to reduce the number of gcd checks we need to perform. Suppose that the ‘length’ of the loop is l: i.e. if x_i \equiv x_j \mod{d}, with j > i, then l is the smallest positive integer such that x_{j+l} \equiv x_j \equiv x_i \mod{d}. Also suppose that the loop starts at the mth number. Then if we are at the kth number, with $k \geq m, l|k$, we are ‘in the loop’ so to speak. And since the indices k and 2k differ by k, a multiple of the loop length l, we get x_{2k} \equiv x_k \mod{d}, and so gcd(x_{2k} - x_k, n) > 1.

Putting this together means that we should check gcd(x_k - x_{k/2}, n) for every even k, and that’s that. Now we do not have to do k-1 gcd calculations on the kth number, but instead one gcd calculation on every other number. We left out the detail about the polynomial f(x), which might seem a bit problematic. But most of the time, we just choose a polynomial of the form $f(x) = x^2 + a$, where a \not \equiv 0, -2 \mod{n}. (This just prevents hitting a degenerate sequence like 1,1,1,1,1…).
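The whole method can now be sketched in a few lines of Python. This version uses the tortoise-and-hare comparison of x_i against x_{2i} described above; the default starting value and the choice a = 1 are just conventional picks, not part of the algorithm itself:

```python
from math import gcd

def pollard_rho(n, x0=2, a=1):
    """Attempt to find a nontrivial factor of composite n.

    Iterates f(x) = x^2 + a mod n, comparing the 'tortoise' x_i against
    the 'hare' x_{2i} and testing gcd(x_{2i} - x_i, n) at each step.
    Returns a nontrivial factor, or None on failure (retry with other x0, a).
    """
    f = lambda x: (x * x + a) % n
    tortoise, hare = x0, x0
    while True:
        tortoise = f(tortoise)        # advances one step: x_i
        hare = f(f(hare))             # advances two steps: x_{2i}
        d = gcd(abs(tortoise - hare), n)
        if d == 1:
            continue
        return d if d < n else None   # d == n means the walk collapsed mod n
```

For instance, `pollard_rho(8051)` finds the factor 97 of 8051 = 83 · 97 after only a handful of iterations.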

Of course, there are a few additional improvements that can be made. This approach, which I have heard called the “Tortoise and the Hare” (named after the slow-moving x_i being compared to the fast-moving x_{2i}), is not the only way of finding cycles. There is a method called Brent’s Variant that finds cycles in a different way, reducing the number of modular multiplications. The key idea in his variant is to have the ‘tortoise’ sit at x_{2^i} and compare it to the ‘hare’, who moves from x_{2^i + 1} up to x_{2^{i+1}}. Then the tortoise sits at the next power of 2. The main source of the savings is that at each step, Brent’s algorithm only needs to evaluate f(x) once, while Pollard’s algorithm requires 3 evaluations (one for the tortoise, two for the hare).
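Brent's cycle-finding can be sketched the same way. This is my own minimal rendering of the idea described above (the variable names and failure handling are mine, and production versions add further refinements): the tortoise parks at x_{2^i} while the hare walks forward one f-evaluation at a time, and the tortoise jumps ahead only when the hare reaches x_{2^{i+1}}.

```python
from math import gcd

def brent_rho(n, x0=2, a=1):
    """Attempt to find a nontrivial factor of composite n, Brent-style.

    The tortoise sits at x_{2^i}; the hare advances from x_{2^i + 1}
    to x_{2^{i+1}}, costing one evaluation of f per step instead of three.
    Returns a nontrivial factor, or None on failure.
    """
    f = lambda x: (x * x + a) % n
    tortoise = x0
    hare = f(x0)
    power = 1          # the current 2^i
    steps = 1          # how far the hare has walked past the tortoise
    while True:
        d = gcd(abs(hare - tortoise), n)
        if 1 < d < n:
            return d
        if d == n:
            return None            # walk collapsed mod n; retry parameters
        if steps == power:         # hare reached x_{2^{i+1}}:
            tortoise = hare        # park the tortoise there
            power *= 2
            steps = 0
        hare = f(hare)             # the single f-evaluation per step
        steps += 1
```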

In addition, one might not want to perform Euclid’s Algorithm after each step. Instead, one might do 100 steps at a time, accumulating the product of the differences mod n, and then perform Euclid’s algorithm once on the product (effectively replacing 99 gcd computations with 99 modular multiplications, which saves time since a multiplication is much cheaper than a gcd).
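Here is one way that batching trick might look, again as a sketch rather than a tuned implementation (the checkpoint-and-replay fallback for when the batched gcd returns n is an assumption of mine about how to recover the factor, though it is a standard device):

```python
from math import gcd

def pollard_rho_batched(n, x0=2, a=1, batch=100):
    """Pollard rho, taking one gcd per `batch` steps.

    Accumulates the product of |x_{2i} - x_i| mod n across a batch, then
    does a single gcd. If that gcd is n itself (the batch overshot and
    picked up every factor), replays the batch one gcd at a time.
    """
    f = lambda x: (x * x + a) % n
    tortoise, hare = x0, x0
    while True:
        prod = 1
        t_save, h_save = tortoise, hare      # checkpoint for the replay
        for _ in range(batch):
            tortoise = f(tortoise)
            hare = f(f(hare))
            prod = prod * abs(tortoise - hare) % n
        d = gcd(prod, n)
        if d == 1:
            continue                          # no factor in this batch
        if d < n:
            return d
        # d == n: replay the batch step by step to isolate a factor
        tortoise, hare = t_save, h_save
        for _ in range(batch):
            tortoise = f(tortoise)
            hare = f(f(hare))
            d = gcd(abs(tortoise - hare), n)
            if d > 1:
                return d if d < n else None
```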

There is also undoubtedly an art to choosing the polynomial well, but I don’t know it. Fortunately, this sort of algorithm can easily be run in parallel with different polynomials. Unfortunately, although it picks up small factors quickly, its worst-case running time is poor. Heuristically one expects to find a factor d after about \sqrt{d} \leq n^{1/4} steps (a birthday-paradox argument), but as far as I know, this running time hasn’t been rigorously proven. What a sudden jump in factoring! More to come –