Specifically, I have a question about the RSA method. I read The Code Book a while back, which gave a pretty good explanation of how RSA and Diffie-Hellman-Merkle key exchange work, but there was one detail that seemed noticeably missing to me.
In a nutshell, a person’s public key is the product of two very large prime numbers. The two prime factors are then used as the private key. The private key can be derived from the public key, but only by factoring the public key, which is so computationally expensive that for a very large number, it’s practically impossible (we hope). Likewise, the encryption process is a one-way function that is very nearly impossible to reverse without knowing the two prime numbers that were used to create the public key. Ok, that’s all great.
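To keep the moving parts straight in my head, here’s a toy version with comically small primes that I picked for illustration (real keys use primes hundreds of digits long, and strictly speaking the public key is the product n plus a public exponent e, but the “product of two primes” part is the heart of it):

```python
# Toy RSA, purely illustrative -- real primes are hundreds of digits long.
p, q = 61, 53
n = p * q                      # 3233: the public modulus, the "product of two primes"
phi = (p - 1) * (q - 1)        # 3120: computing this requires knowing p and q
e = 17                         # public exponent, chosen coprime to phi
d = pow(e, -1, phi)            # private exponent, derived from p and q (Python 3.8+)

message = 65
ciphertext = pow(message, e, n)          # anyone can encrypt with (n, e)
assert pow(ciphertext, d, n) == message  # only the holder of d can decrypt
```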
First question:
All of the descriptions I’ve read say something like “Alice simply picks two giant prime numbers and multiplies them”. But how do you “simply pick two giant prime numbers”? I’m not a mathematician, so maybe I’m missing something obvious, but it seems like the process for checking for “primeness” would be virtually identical to the process for prime factorization. How can you quickly generate a very large prime number without also being able to quickly factor very large numbers?
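To make the question concrete, here’s a sketch of what I gather that “simply pick” step looks like in practice, assuming the usual approach of drawing random candidates and running a probabilistic primality test on each (Miller-Rabin here; the function names are just mine). My confusion is really about why a check like is_probably_prime can be fast when factoring can’t be:

```python
import random

def is_probably_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    # Write n - 1 as 2^r * d with d odd.
    r, d = 0, n - 1
    while d % 2 == 0:
        r += 1
        d //= 2
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False   # a is a witness: n is definitely composite
    return True            # no witness found: n is almost certainly prime

def random_prime(bits=1024):
    """Draw random odd numbers of the requested size until one passes the test."""
    while True:
        candidate = random.getrandbits(bits) | (1 << (bits - 1)) | 1  # force top bit, force odd
        if is_probably_prime(candidate):
            return candidate
```

Notice that the test can declare a number composite without ever exhibiting a factor of it - which is exactly the part I don’t have an intuition for.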
Second (unrelated) question:
The distribution of primes is, on a large scale, predictable. That is, we can’t predict exactly where the next prime will occur with our present understanding, but the density of primes near a number N is roughly 1/ln(N). In short, there are an infinite number of primes, but they get more spread out the higher we go. In other words, primes are more frequent between 1 and 100 than between 200,000,000,000 and 200,000,000,100.
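As a quick sanity check on that 1/ln(N) claim, a few lines of brute force (plain trial division, which is fine for a 100-number window, though the second range takes a few seconds):

```python
import math

def is_prime(n):
    """Plain trial division -- slow in general, but fine for a small window."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for divisor in range(3, math.isqrt(n) + 1, 2):
        if n % divisor == 0:
            return False
    return True

def count_primes(lo, hi):
    return sum(1 for n in range(lo, hi + 1) if is_prime(n))

# A window of width W near N should hold roughly W / ln(N) primes.
for lo, hi in [(1, 100), (200_000_000_000, 200_000_000_100)]:
    width = hi - lo + 1
    print(f"[{lo:,}..{hi:,}]: {count_primes(lo, hi)} primes, "
          f"~{width / math.log(hi):.1f} predicted by 1/ln(N)")
```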
This made me think about the future of public key cryptography. Computers are getting faster, but it’s also taking longer and longer to find the next prime. At the same time, we keep increasing the key length (i.e. using larger pairs of primes) to stay out of reach of the faster computers’ ability to factor large numbers. But is it an equilibrium? Will there come a time at which we can’t generate new primes fast enough to keep our heads above water?
I’m trying to imagine how a very large organization would go about trying to crack RSA. You could go about it directly by trying to factor the public key and derive the private key for decryption. But what if you precompiled multiplication tables for all known primes? Then you wouldn’t have to do factorization at all - you could just look up the public key in a database and see what its factors were.
It would be a monumental undertaking - I doubt it’s even possible, if only because of the storage space required. Every new prime would have to be multiplied against every prime already in the table, so the database would balloon with each addition. And searching that much data, even if it were indexed, might take longer than just doing the factoring. But traditionally, precompiling data tables like that has been used to drastically speed up cracking times. I haven’t even tried to calculate how many possible key pairs there would be, so I may be way off base here. Maybe the size of the problem is still so large that even an organization like the NSA can’t do what I’m talking about, but I’m curious whether this will ever be a problem. Will we reach a point at which we can compute all the new key pairs a newly discovered prime makes possible before the next prime is found?
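Just to get a sense of scale, here’s the one piece of napkin math I did try: counting roughly how many primes such a table would have to cover, using the same 1/ln(N) density from above (assuming 1024-bit primes, as in a 2048-bit key):

```python
import math

# Roughly how many primes are there below 2^1024 (the size of each factor in a
# 2048-bit key)?  The prime counting function is about N / ln(N); work in log2
# space so nothing overflows a float.
bits = 1024
log2_count = bits - math.log2(bits * math.log(2))   # log2( 2^1024 / ln(2^1024) )
log10_count = log2_count * math.log10(2)
print(f"roughly 2^{log2_count:.0f} (about 10^{log10_count:.0f}) primes below 2^{bits}")
# That comes out to something like 10^305 primes -- versus only ~10^80 atoms in
# the observable universe -- before you even start on their pairwise products.
```

So maybe my storage worry answers itself, but I’d still like to know whether the rest of the math works out the way I think it does.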