Question about 'random number generators' and their seeds (with respect to real world cryptography)

KarlGauss · December 11, 2013, 11:40pm

I found this article very interesting. As I understand it, the thrust of the piece is that the NSA has been in cahoots with at least two chip makers (Intel and Via) with the goal of making the output of those companies’ random number generator seeds somewhat ‘less than random’ i.e. “lessening their entropy”. The idea being, of course, that the folks at the NSA being aware of the nature of the seeds’ relatively low entropy would thereby be able to exploit it - essentially a back door to the RNGs.

My question is not about the NSA or the legality (or morality) of its operations.

Rather, I am a curious to learn why anyone who is serious about their privacy, and wishes to encrypt their secrets such that they are, effectively, unbreakable using current (and foreseeable future) software/hardware would choose to use any source other than their own “locally-generated” data to seed and operate their RNG.

Phrased differently, given that the output of RNGs depends critically on their seeds, the absolutely pivotal, essential nature of the seed’s randomness seems obvious. So, why would you ever use someone else’s seeds for your RNG-generating algorithms? Why not use seeds generated in and by your own local environment (e.g. some function of the heat produced from your computer itself, or even better, seeds derived from the radioactive decay of some locally-held radioisotope)?

Is seed production as alluded to in the preceding paragraph not easy to do? Expensive? Difficult to maintain? What is it that prevents companies, organizations and the like from generating their own seeds given the huge risks inherent in using those produced by others (as the linked article neatly reveals)?

Thanks!

sailor · December 12, 2013, 1:10am

I use PGP which asks you to move the mouse around. In fact I just created an encrypted virtual disk and I had to do that.

Trinopus · December 12, 2013, 1:24am

Unless you use radioactive decay, there will never be any truly random seed (or random number.)

In addition, a lot of people have no clear idea about this kind of thing at all. “password” is still an incredibly commonplace password. (A radio show I was listening to mentioned that “letmein” was also used an awful lot.)

I used to base my pseudo-random-number seeds on my PC’s millisecond clock, until I was advised this isn’t as good as it might seem. How was I to know? The science is advancing right under our feet!

RaftPeople · December 12, 2013, 1:27am

Is this correct?

I believe what makes or breaks an RNG is the algorithm, the seed merely tells you where to start in the sequence.

I don’t need to be “random” when I seed it, I just need to be different.

Dr.Strangelove · December 12, 2013, 1:44am

That is in fact how HW RNGs work–they use some circuit which experiences random variation from thermal fluctuations. There are several varieties but here is Intel’s approach.

The problem is that because the design is closed, there’s no way to verify that results haven’t been messed with somehow. RNGs require digital conditioning to remove bias, among other things. It’s here that there is a conceivable backdoor.

Sure, someone could build their own RNG. It’s actually incredibly easy. But most companies just don’t have the inclination or expertise. So like anything else they pay for someone else to provide the service. And again like anything else there’s the risk of a violation in trust. After the leaks, no one trusts HW RNGs any more, so the software makers are adapting.

rbroome · December 12, 2013, 1:59am

KarlGauss:

I found this article very interesting. As I understand it, the thrust of the piece is that the NSA has been in cahoots with at least two chip makers (Intel and Via) with the goal of making the output of those companies’ random number generator seeds somewhat ‘less than random’ i.e. “lessening their entropy”. The idea being, of course, that the folks at the NSA being aware of the nature of the seeds’ relatively low entropy would thereby be able to exploit it - essentially a back door to the RNGs.

My question is not about the NSA or the legality (or morality) of its operations.

Rather, I am a curious to learn why anyone who is serious about their privacy, and wishes to encrypt their secrets such that they are, effectively, unbreakable using current (and foreseeable future) software/hardware would choose to use any source other than their own “locally-generated” data to seed and operate their RNG.

Phrased differently, given that the output of RNGs depends critically on their seeds, the absolutely pivotal, essential nature of the seed’s randomness seems obvious. So, why would you ever use someone else’s seeds for your RNG-generating algorithms? Why not use seeds generated in and by your own local environment (e.g. some function of the heat produced from your computer itself, or even better, seeds derived from the radioactive decay of some locally-held radioisotope)?

Is seed production as alluded to in the preceding paragraph not easy to do? Expensive? Difficult to maintain? What is it that prevents companies, organizations and the like from generating their own seeds given the huge risks inherent in using those produced by others (as the linked article neatly reveals)?

Thanks!

As I understand it, developing a true random number generator is very difficult. Proving that it is truly random is tough as well. Showing a given set of numbers isn’t random is pretty easy though. The point is that hardware is the best way to generate a random number, and doing that efficiently and provably takes a lot of effort. If Intel is going to do all the work, why not use it? Now that NSA may have gotten into the act, people aren’t willing to trust it. But just doing it by yourself is tough.

jovan · December 12, 2013, 2:01am

But if you know where the sequence starts, the numbers are completely predictable, hence not random. That’s why they’re pseudo-random number generators. Bad seeds cost the Montreal casino over $600,000 in a hard-earned lesson about randomness.

KarlGauss · December 12, 2013, 2:53am

Thanks, I appreciate your answers.

There’s been some disagreement in the response about whether RNGs are “easy” to build or not, and also about whether the seed or the algorithm is key. I will appeal to authority and quote the Intel engineers from the article linked above (and again below in this post) who say it’s the seed that counts: “Researchers have managed to devise pseudorandom-number generators that are considered cryptographically secure. But you must still start them off using a special seed value; otherwise, they’ll always generate the same list of numbers. And for that seed, you really want something that’s impossible to predict.”

One of the comments on that article, the comment already two months old, was interesting: “NSA have been colluding with Intel to develop this RNG. So don’t use any output from it or they’ll break your encryption. Never trust a closed design, closed source black box for your random numbers.”

I can’t say understood the techniques described in the article but, even from my position of ignorance, ISTM they are doing a lot of ad hoc ‘correcting’ of the randomly generated signal. That seems like an invitation for either introducing some non-random element or, what actually did happen, allowing someone to deliberately screw with the process.

Chronos · December 12, 2013, 3:18am

The seed is more important than the algorithm. With an ideal seed, it wouldn’t matter what algorithm you used, as long as it wasn’t idiotic or deliberately sabotaged. With an ideal algorithm, though, a predictable seed will still lead to predictable results.

The catch is that improving your algorithm is usually a lot cheaper and easier than improving your seed. Better algorithms are just software, and can be run fairly quickly, but better seeds usually means custom hardware, which is generally pretty slow.

Shagnasty · December 12, 2013, 3:33am

You make both a valid point and an invalid point. True random number generators were once hard to imagine but anyone can buy them or even make one these days. You can buy radioactive decay detectors that simply plug into any computer via a USB port for a few hundred to a few thousand dollars if you want your own equipment or you can use a web service that does it for you. Given no hijinks or NSA involvement, those solutions will give truly random seeds.

The practical problem is that most people don’t care enough about true random numbers to use those solutions. Pseudo-random numbers were almost always good enough for most practical purposes until the U.S. government and NSA decided they could dedicate massive resources to cracking whatever key they want to. Most security solutions still depend on pseudo-random seeds and those are crackable especially if they are intentionally predictable through a conspiracy like is being described (that is an example of a true conspiracy BTW; a few of them really are true and the NSA is guilty as they come).

When I was growing in the 1980’s, people used to tell stories about the lengths the Soviet Union KGB would go to to spy on their own people. Now, we have it to courtesy of the NSA. I haven’t heard from Russian comedian Yakov Smirnoff in a while but he isn’t going to have to make new jokes. “In Soviet Russia, TV watches You!” is now “In free America, computer watches You!”.

Still, anyone that is tech savvy and wants to go through the effort can beat any of the NSA’s cracking efforts. The strong defenses are unbeatable through known technology as long as you have a true random number generator and a trusted algorithm.

KarlGauss · December 12, 2013, 3:36am

That is exactly the type of thing I was wondering about. I had no idea it existed (in such a simple format). Thank you.

Dr.Strangelove · December 12, 2013, 4:10am

One significant factor is the throughput you need. A gigabit/second RNG is fairly difficult. A few hundred bits/sec are easy, though, and good enough to use as a seed. A Zener diode, resistor, and ADC accomplishes this easily.

The postprocessing is often called “whitening” and can eat into your throughput. Suppose your RNG is biased and produces 1 bits 2/3 of the time, and 0 bits 1/3 of the time. Obviously that’s not acceptable, but if you instead take a bunch of bits and XOR them together, you can asymptotically approach an even distribution. Of course that reduces your throughput, but if all you need is a 256-bit seed on program startup, you can afford to waste some bits.

Intel vaguely describes their whitening algorithm–they say that they split each 512-bit chunk into two 256-bit chunks, then apply some algorithm to combine the two. I dunno if they describe the algorithm in any more detail somewhere else, but even if they did, the design is still closed and hence impossible to verify.

RaftPeople · December 12, 2013, 4:32am

They used the SAME seed every morning.

My point stands: a SEED does not need to be RANDOM, just different. As long as you can guarantee you aren’t re-using seeds (ever or often) then you won’t need to worry about how random the seed is, the PRNG algorithm provides the randomness.

septimus · December 12, 2013, 4:40am

Unless I am mistaken, most of the posts in this thread misunderstand the nature of the “backdoor.”

A random sequence f[sub]t[/sub] = f[sub]t[/sub](seed, PRNG) depends on a pair (seed, PRNG) with the PRNG published and the seed a one-time random throwaway. If the seed is exposed to eavesdropping, the entire sequence is exposed, but that’s not how the exploit in question works.

Instead the exploit observes the random sequence f[sub]t[/sub], knows what PRNG is, and exploits the computational weaknesses of PRNG to deduce the seed. This means that knowing some past values of f[sub]t[/sub] (t < T), allows one to know its future values as well (t > T).

I think.

What the NSA did was to encourage use of a PRNG for which they knew how to do the reverse computations needed to recover the seed. I am certainly not an expert on cryptology, but IIUC a simple way to make such reverse computation more difficult is to simply run two different PRNG’s with different seeds and XOR their results together. Lack of this and other safeguards led experts to deduce that Dual EC_DRBG was deliberately crippled from the beginning; the fact that NSA can decrypt these streams is probably “old news.”

Francis_Vaughan · December 12, 2013, 5:06am

This is sort of true, but easily misunderstood as well. There are all sorts of evil subtleties.

In cryptography you always assume that the eavesdropper knows all the algorithms in use. Thus you assume that the various government agencies know what PRNG you are using, and what encryption.

If you know what the PRNG is, you can guarantee to always get exactly the same sequence of numbers if you know the seed. Thus the PRNG adds exactly zero additional security. The use of a PRNG comes about in communication because of mundane issues. However the number that the PRNG is seeded with is the cryptographic key. The entire security of the communication depends upon the seed being secure. If you can guess the seed, you break the encryption. If the search space of the seed is artificially limited it becomes viable to break the encryption.

The constraints on the seed are clear - never ever reuse them. There are attacks that can recover the seed if two separate messages use the same seed. The seeds must be impossible to guess. Yes they must be different, but there are only two choices in choosing a new seed. A process that is deterministic to some extent, or a process that no determinism, and is truly random.

That gets us back to the first assumption. We assume that the eavesdropper knows all our algorithms. So, and way of choosing a new seed that is in any way deterministic is by definition algorithmic (there is some form of expression that captures the deterministic component). The eavsdropper knows any such expression, and thus is able to use this knowledge to limit the search space of seeds. The only way to prevent such an attack it to choose a truley random seed. This is expressed as the amount of entropy in the seed. A perfect seed choice has as much entropy as the maximum information content of the seed. That is really hard. So we at least set limits on the minimum amount of entropy we will accept in a seed.

So getting back to the accusation. If the process implemented in the chips artificially limited the space of random numbers generated, you have a clear backdoor. Such a backdoor could be very hard to detect.

The problem with random numbers and psuedo-random numbers is that in order to prove them non-random you need to find an autocorrelation mechanism. Simply put, given the history of previous numbers can you make a prediction about the next number to be generated that is better than pure chance. For psuedo-random numbers generators the answer is always yes - but actually finding the algorithm is arbitrarily hard. Once you find it, there is no argument. But finding it can be as hard as breaking the encryption. What passes for a true random number source could be perverted in a manner that is essentially impossible to detect. The sun would go cold before you managed to generate enough numbers and search all the possible algorithms. But, if you knew the algorithm, you could prove it was perverted instantly. Reverse engineering the chip might get you there, but that isn’t easy. A stochastic perversion, one that simply provides a few billion possible answers that the eavesdropper knew to search across would be as good as a deterministic algorithm in the chip as a backdoor, but even harder to reverse out and prove the existence of.

Chronos · December 12, 2013, 2:33pm

That depends on how often you re-seed your PRNG. If you’re really concerned about security, then you don’t produce any more bits of output than you input in your seed. In this case, it doesn’t matter if someone can use a known output to determine the seed used to produce that output, because you’re not going to be using the same seed again. It’s only when you get greedy and run the PRNG repeatedly, using its previous output as the seed for the next run, that this becomes an issue. Of course, true randomness is expensive, and you often don’t need nearly that much security, so sometimes this is justified… Just make sure you know the risks, and can accept them.

RaftPeople · December 12, 2013, 6:17pm

But, isn’t it also true that changing the seed introduces the possibility of reducing the randomness of the generated sequence?

The more frequently you change the seed, the more you are shifting your reliance on pseudo-randomness from the PRNG algorithm to your seed algorithm.

At the extreme, if you changed the seed for every new PRNG number needed, you’ve just eliminated the PRNG algorithm and created a new PRNG algorithm that is probably not nearly as random.

Half_Man_Half_Wit · December 12, 2013, 7:55pm

There are also commercially available solutions that use single-photons randomly being transmitted or reflected by a beam splitter, such as ID Quantique’s QUANTIS.

KarlGauss · December 12, 2013, 9:35pm

How can that be the case, though, if the seed was ‘truly’ random (e.g. derived from radioisotope decay or photon beam splitting?

RaftPeople · December 12, 2013, 9:58pm

If your seed algorithm is truly random then there is no need for a PRNG, you already have a real RNG.

So the assumption is that the use of a PRNG is because you don’t have a truly random sequence of numbers.

If that is the case, then the PRNG is mathematically constructed to be substantially close to (but not actually) random.

Because there was a lot of thought and testing that went into the PRNG to confirm randomness, if you start monkeying with the seed frequently and non-randomly, the result could be a reduction in the randomness achieved by the PRNG.

At the extreme, someone non-randomly “seeding” their PRNG for every new number generated has possibly just eliminated most of the value of the PRNG.

Topic		Replies	Views
How do encryption algorithms pick prime numbers? Factual Questions	69	714	September 26, 2024
What's the difference between a random number and a pseudo-random number? Factual Questions	62	21365	December 27, 2011
Public Key Encryption process Factual Questions	16	1482	July 26, 2003
Please explain encryption to me. Factual Questions	27	1877	August 16, 2008
Technically speaking, are there any real random number generators? (In electronics) Factual Questions	31	2500	September 22, 2009

Question about 'random number generators' and their seeds (with respect to real world cryptography)

Related topics