This is a continuation of my last factoring post. I thought that I might have been getting ahead of myself, as I left out a relatively simple idea that led to great gains in factorization. The key of this post is Fermat’s Factorization, and a continuation by R. Lehman.

The main idea here is very simple. When a number N can be written as the difference of two squares, i.e. N = a^2 - b^2, then we can factor N as N = (a+b)(a-b). And as long as neither are 1, it yields a proper factorization.


Now I mention a brief aside, not relevant to Fermat Factorization. There is another, lesser-known identity called Sophie Germain’s Identity:
a^4 + 4 b^4 = ( (a+b)^2 + b^2)((a - b)^2 +b^2) =
(a^2 + 2ab + 2b^2)(a^2 - 2ab + 2b^2)
By the way, a Sophie Germain prime is a prime p such that 2p + 1 is also a prime. It is unknown whether or not there are infinitely many Sophie Germain primes.

Now, the idea of this factorization method is to try different values of a in the hope that a^2 - N = c is a square, i.e. c = b^2 for some integer b. So the algorithm proper might be like:

1. Set a = \lceil \sqrt N \rceil
2. c = a^2 - N
3. Test if c is a square. If it is, look at a - \sqrt c and a + \sqrt c. If not, go to step 4.
4. Increase a by 1 and set c = a^2 - N . Return to step 3.

Of course, one should end the algorithm sometime if no factorization is found. So perhaps one should use a counter. But that’s besides the point.

Clearly, this algorithm works best when the factors are about the size of \sqrt N. Done naively, Fermat’s Algorithm works slower than trial diversion in the worst case. But there is a quick way to combine Fermat’s Algorithm and trial division that is faster than doing either separately. This is only the first of a few possible optimizations.

Let’s look at an example. Suppose we want to factor N = 123456789123 (astoundingly enough, 1234567891 is prime, and was going to be my first example). One sees that \lceil N \rceil = 351365. So with trial division, we would need to test numbers up to 351365. But let’s do a couple of iterations of Fermat’s Method:

1: 351365^2 - N = 574102; and \sqrt{574102} = 757.7
2: 351366^2 - N = 1276833; and \sqrt{1276833} = 1129.97
3: 351367^2 - N = 1979566 and \sqrt{1979566} = 1406.97
4: 351368^2 - N = 2682301 and \sqrt{2682301} = 1637.77

So no result yet, it might seem. But Fermat’s Method does not miss factors in the range that it sweeps through. Thus from these 4 iterations, we have looked for factors as low as 351368 - 1637.77 = 349730. So in just 4 iterations, we narrowed the range to be tested by trial division from every between 1 and 351365 to everything between 1 and 349730. 4 iterations reduces the 1000 most complicated trial division tests (there are 210 primes in that range, so at least that many).

In general, if one chooses a bound d > \sqrt N and uses Fermat to find factors between \sqrt N and d, then the resulting necessary bound for trial division is d - \sqrt{d^2 - N}.

There is another easy modification to speed up this test, involving a sieve. Let’s make a couple of observations.

First Note

Note that all squares end in 0, 1, 4, 5, 9,\; \mathrm{or} \; 16 \mod 20. Suppose we have calculated a particular value of a^2 - N \mod 20. The last two digits are cyclic too – they repeat with each increase by 10. So calculating a couple of values of a^2 - N allows us to decide which a to continue to calculate. This leads to calculating only a fraction of the a values and a fraction of the subsequent square root calculations.

There is nothing special about 20, really. Any modulus will do. But 10, 20, or (for larger values) a prime power. Or one can combine a few with the Chinese Remainder Theorem. Unfortunately, this only really changes the overall time complexity by a constant factor.

Second Note

While it is clear that Fermat’s Method works best for factors near \sqrt N, that doesn’t happen too often. But if one can somehow divine (or guess) an approximate ratio of the two factors, suppose something like \frac{p}{q}, then one can choose a rational number \frac{l}{m} near that value and then perform Fermat’s Method on Nlm = pm \cdot du, which should pull the factors cv \; \mathrm{and} \; du first.

This is turning out to be longer than I was expecting, so I’m splitting this post into 2 (that way I can finish the other half tomorrow). In the next, we will continue to improve upon Fermat’s Factorization, firstly by splitting up the interval wittily. Secondly, by coming up with a method that heuristically splits a composite number n in $O(n^{1/4 + \epsilon})$ steps by using a slight generalization. Finally, I have heard of an improvement of Grenier that allows polynomial time factorization if the primes are relatively close to each other. More on that in the next few days.