How do encryption algorithms pick prime numbers?

If you are picking p and q for RSA keys, there are some details to worry about, like exactly how big each one should be and of what form (to avoid weak keys), but the final algorithm still requires you to be able to find random primes of a predetermined size.

If you want to see a real-life public key, you can look at the certificate for the straightdope site. How you do it will vary by OS and browser. For Firefox on Windows, click the padlock in the address bar, then “Connection secure”, then “More information”, then “View certificate”.

For a real deep dive, install Wireshark on your machine, start it capturing traffic on port 443, then open the dope. You’ll be able to watch the TLS handshake happen.

One other trick for finding a prime number is the jump size.

For example, you know that every other number is divisible by 2, so primes can only land on the odd numbers:

XOXOXOXOXOXOX

And then you also know that every other other number is divisible by 3:

XOXXXOXOXXXOX

Code that I’ve seen would find a number not divisible by 2 or 3 and then move by six, but, technically, it would be better to alternate jumps of 2 and 4 (sketched below). If you were to also exclude multiples of 5, there would be some more convoluted pattern that would let you skip past known duds even more quickly.
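Here’s a minimal sketch of that alternating 2-4 jump in Python (my own illustration, not taken from any particular library):

import itertools

def candidates_coprime_to_6(start):
    # Yield numbers >= start that are divisible by neither 2 nor 3,
    # by alternating jumps of 2 and 4 along the 6k-1 / 6k+1 pattern.
    n = start
    while n % 2 == 0 or n % 3 == 0:
        n += 1
    # From a 6k+1 number the next candidate is 4 away; from 6k-1 it is 2 away.
    step = 4 if n % 6 == 1 else 2
    while True:
        yield n
        n += step
        step = 6 - step  # alternate 2, 4, 2, 4, ...

# First few candidates at or above 100 (candidates, not necessarily primes):
print(list(itertools.islice(candidates_coprime_to_6(100), 8)))
# [101, 103, 107, 109, 113, 115, 119, 121]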

The primes are what remain after removing patterns from the number line. That also means that there’s regular patterns to take advantage of to more quickly find candidates.

At some point you’re just adding a Sieve of Eratosthenes-like step to the algorithm, which I suppose is relatively efficient for what it is, but Fermat primality tests are also relatively efficient. I’d expect that after casting out at most a dozen small primes it becomes more efficient to just do more primality tests.

That’s, presumably, why the code just incremented by 6.
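For reference, a Fermat test is just one modular exponentiation per base, which is why it stays competitive with heavier sieving. A rough Python sketch of that increment-by-6 search (toy code; real libraries use Miller-Rabin with carefully chosen round counts):

import random

def fermat_probably_prime(n, rounds=20):
    # Fermat test: if a**(n-1) mod n != 1 for some base a, n is composite.
    # Passing every round only means "probably prime" (Carmichael numbers
    # can fool it, which is one reason real code prefers Miller-Rabin).
    if n < 4:
        return n in (2, 3)
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        if pow(a, n - 1, n) != 1:
            return False
    return True

def find_probable_prime_above(start):
    # Scan upward in steps of 6 along the 6k+1 track, mirroring the
    # "just increment by 6" approach.
    n = start
    while n % 6 != 1:
        n += 1
    while not fermat_probably_prime(n):
        n += 6
    return n

print(find_probable_prime_above(10**6))  # 1000003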

The “key” for an encryption scheme may be arbitrarily complicated; for example, the public key for ElGamal encryption is four numbers. Theoretically one could encode any list of numbers in a single number using some simple packing scheme, but there’s normally no reason to do this.
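In case the packing idea isn’t obvious, something as dumb as giving each number a fixed-width slot would do it (purely illustrative Python, tiny numbers standing in for the real key values):

def pack(numbers, width):
    # Concatenate the numbers into one integer, 'width' bits per slot.
    out = 0
    for i, n in enumerate(numbers):
        assert 0 <= n < (1 << width)
        out |= n << (i * width)
    return out

def unpack(packed, count, width):
    mask = (1 << width) - 1
    return [(packed >> (i * width)) & mask for i in range(count)]

key_parts = [23, 5, 11, 17]               # stand-ins for four key values
blob = pack(key_parts, width=8)
assert unpack(blob, 4, width=8) == key_parts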

@Chronos is right to point out that any kind of simple sieving or searching will upset the uniform distribution: think about what will happen in the case of twin primes. The library function I examined starts at a random point and scans upward, so it is more likely to find the first of the pair than the second.
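A quick simulation shows the effect: pick a uniformly random starting point, take the first prime at or above it, and tally how often each prime gets chosen. Each prime’s hit rate ends up proportional to the gap below it, so primes that follow big gaps are over-represented (toy range, my own illustration in Python):

import random
from collections import Counter

def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def first_prime_at_or_above(n):
    while not is_prime(n):
        n += 1
    return n

counts = Counter()
trials = 100_000
for _ in range(trials):
    start = random.randrange(100, 200)
    counts[first_prime_at_or_above(start)] += 1

for p in sorted(counts):
    print(p, counts[p] / trials)
# 127, which sits above the gap from 113, gets picked about 14% of the time;
# 109, whose predecessor 107 is only 2 below it, gets picked about 2%.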

That is my understanding. What public keys encrypt is the key used to encrypt the message. The algorithms used to encrypt the actual message are much faster and much stronger than the algorithms used to hide the key.

I forget if the example I saw was in a library or more just reference code (e.g. by a crypto student). I have no personal dog in the fight.

But in the case of a twin prime, if you’re starting from a random spot that’s not divisible by 2 or 3 and jumping by increments of six, you’ll hit the bottom twin half of the time and the top twin half of the time. That doesn’t affect anything, and there’s no way to jump over the pair entirely, so twin primes are neither preferred nor selected against.

If there’s a way to exploit that, I’m not seeing it?

Ah, that I’m slightly more likely to include twin primes, because I can’t skip them, whereas I might skip others. D’oh.

So, yeah, the 2-4 loop would be preferable.
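To make that concrete: a +6 scan only ever walks one of the two tracks 6k-1 or 6k+1, so it tests exactly one member of every twin pair, while the 2-4 alternation walks both tracks (toy Python, small numbers just for illustration):

def plus_six_track(start, count):
    # Candidates visited when stepping by 6 from start.
    return [start + 6 * i for i in range(count)]

def two_four_wheel(start, count):
    # Candidates visited when alternating +2/+4 from a 6k-1 start.
    out, n, step = [], start, 2
    for _ in range(count):
        out.append(n)
        n += step
        step = 6 - step
    return out

print(plus_six_track(5, 8))   # [5, 11, 17, 23, 29, 35, 41, 47]  (6k-1 track)
print(plus_six_track(7, 8))   # [7, 13, 19, 25, 31, 37, 43, 49]  (6k+1 track)
print(two_four_wheel(5, 8))   # [5, 7, 11, 13, 17, 19, 23, 25]   (both tracks)
# Twin pairs like (11, 13) and (17, 19) put one member on each track,
# so a +6 scan can never jump past a twin pair entirely.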

And the point is that yes, you could factor the product of the primes… but if we’re talking, say, 1024 bits: since 2^10 ≈ 10^3, 2^1024 is on the order of 16×10^306. If the first bit has to be 1 and the last bit has to be 1, that’s still about 4×10^306 numbers to check. Minus, of course, multiples of 3, 5, 7, etc.
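If anyone wants to check that back-of-the-envelope figure, Python’s big integers make it painless (the 2^10 ≈ 10^3 shortcut undershoots a little; the exact values are about 1.8×10^308 and 4.5×10^307, same ballpark):

print(len(str(2**1024)))  # 309 digits: 2**1024 is about 1.8 * 10**308
print(len(str(2**1022)))  # 308 digits: forcing the first and last bits to 1
                          # leaves 2**1022 candidates, about 4.5 * 10**307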

The problem is to be sure you always randomly pick a number somewhere in that range, different every time; otherwise, that’s another vulnerability in the process of finding the primes. Quite often, the problem lies not in finding the primes but in an accidental predictability of the result.

I recall one encryption discussion I saw where the problem boiled down to the flaw that one step resulted in there being only 2^16 (65,536) possible numbers to test.
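A sketch of how that kind of flaw plays out (hypothetical code, not any specific library): if everything downstream of key generation is determined by a 16-bit seed, an attacker just re-runs the generation for all 65,536 seeds and compares:

import random

def weak_keygen(seed_16bit):
    # Hypothetical flawed keygen: the only entropy is a 16-bit seed,
    # so only 65,536 distinct "keys" can ever come out of it.
    rng = random.Random(seed_16bit)                 # deterministic per seed
    return rng.getrandbits(1024) | (1 << 1023) | 1  # "random" 1024-bit odd number

victim_key = weak_keygen(0x2A5C)  # the victim's seed is unknown...

# ...but trying all 2**16 possibilities takes essentially no time.
recovered = next(s for s in range(2**16) if weak_keygen(s) == victim_key)
print(hex(recovered))  # 0x2a5c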

More general versions of this are already covered upthread, starting with the 4th post.

What programs still do this? I don’t think I’ve seen that for about thirty years. There are enough good sources of entropy in a modern OS without needing to explicitly ask for manual intervention by the user.

At least in the Linux world, /dev/random is meant to provide good random bytes. IIRC the OS is ambiently feeding data from keyboard presses/mouse clicks into its entropy pool all the time.
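If you want to poke at that pool from a program, you normally go through the OS interface rather than open /dev/random yourself; in Python, for example, os.urandom and the secrets module sit on top of the kernel’s randomness:

import os
import secrets

print(os.urandom(16).hex())      # 16 bytes straight from the OS CSPRNG
print(secrets.randbits(128))     # same source, delivered as an integer
print(secrets.randbelow(10**6))  # uniform in [0, 10**6)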

Even after you discard the evens, it’s still every third number that’s divisible by 3, not every other one.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Discard the evens:
1 X 3 X 5 X 7 X 9 X 11 X 13 X 15 X

Discard the threevens:

1 X X X 5 X 7 X X X 11 X 13 X X X

Faster, maybe, but no encryption can ever be stronger than how well hidden the key is. The absolute strongest (in fact, perfectly strong) encryption method is a one-time pad, but in that case there’s no benefit in encrypting the key and then using the key to encrypt the actual message, because a one-time pad needs a key at least as long as the message.
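For completeness, a one-time pad is nothing more than XOR with a key as long as the message, which is exactly why layering it the way public-key schemes layer a session key buys you nothing (toy Python sketch):

import secrets

def otp_encrypt(message, pad):
    # XOR each message byte with the corresponding pad byte. The pad must
    # be truly random, at least as long as the message, and never reused.
    assert len(pad) >= len(message)
    return bytes(m ^ k for m, k in zip(message, pad))

otp_decrypt = otp_encrypt  # XOR is its own inverse

msg = b"attack at dawn"
pad = secrets.token_bytes(len(msg))
ct = otp_encrypt(msg, pad)
assert otp_decrypt(ct, pad) == msg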

In practice, you usually accept using a technique that’s less strong than end-to-end RSA (but still very strong), in order to save computation time. But when it really counts, folks can and do use full end-to-end RSA (or other strong public/private key algorithms).

That’s one source of entropy. And yes, personal computers usually have plenty of entropy available, from various sources.

I just tested GPG, without any exotic options, and it warns you that

We need to generate a lot of random bytes. It is a good idea to perform
some other action (type on the keyboard, move the mouse, utilize the
disks) during the prime generation; this gives the random number
generator a better chance to gain enough entropy.

AIUI the library does also make sure (through interaction with the operating system, if it is reading from the system random device) that there is in fact enough entropy, or at least is supposed to.

I think that the idea there is that, while the encryption program is generating the primes, you open up that term paper you’re working on, or play a game, or something like that, not that you just mash the keyboard purely for the sake of entropy. “Random” keyboard-mashing probably isn’t even a very good entropy source, anyway.

Most of the entropy is generated by micro-timing events in the kernel. How many clocks between keyboard, mouse, network events, etc.

I don’t think it’s so much about the strength of the encryption, but about key management. Use an expensive asymmetric encryption, such as RSA, to share a key for a cheaper symmetric cipher, such as AES.

Unless something causes the shared key to be exposed or there is an implementation bug, I don’t think anyone is going to crack AES-128 anytime soon.
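That split looks something like this in practice; a minimal hybrid-encryption sketch using the Python “cryptography” package (parameter choices here are just illustrative):

import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Recipient's RSA key pair: the expensive, asymmetric part.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

# Fresh symmetric session key; the bulk data goes through AES-GCM.
session_key = AESGCM.generate_key(bit_length=128)
nonce = os.urandom(12)
ciphertext = AESGCM(session_key).encrypt(nonce, b"the actual message", None)

# Only the small session key gets wrapped with RSA-OAEP.
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped_key = public_key.encrypt(session_key, oaep)

# Recipient unwraps the session key, then decrypts the bulk data.
recovered_key = private_key.decrypt(wrapped_key, oaep)
assert AESGCM(recovered_key).decrypt(nonce, ciphertext, None) == b"the actual message"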

That matches what I wrote. I think you misread something or were assuming that I was starting from 1.