True Randomness

For actual scientific simulations, you want to know exactly what seeds you use, so that you could precisely duplicate any particular simulation. That’s typically some variant of the system time plus a salt for each “random” stream used, recorded in the output so you can rerun it if needed.

For generating a random cryptographic key, you really really need something uncorrelated with everything.

As with any system design, you need to figure out your requirements before you do anything else.

The problem with even seeding a PRNG with a true random seed is that the output stream implicitly contains the state of the RNG.
A sophisticated RNG may contain quite a bit if state. But, once the RNG has output as many bits of data as there is state, it will be close to having emitted enough information to derive its internal state. If all the bits in all the outputs were fully independent this would happen when exactly the same number of bits had been output. We might reasonably assume that it isn’t that good, but we can’t get away from the problem that it cannot create additional randomness, and the amount of entropy available is the amount that set the initial state.

Reversing out the internal state from a set of output values is assumed to be intrinsically hard. One might note that Bitcoin mining is engaging in a very similar task. Intrinsically hard does not mean impossible.

Worse, flaws, either accidental or intentional in the algorithm may lead to unexpected easing of difficulty in finding that state. It isn’t as if there isn’t a lot of interest in this either.

And we find ourselves in the same situation as with much preceding encryption. If the mechanism becomes viable to attack in the future, all existing recorded traffic becomes vulnerable to attack.

In all, we are stuck with the core problem of all cryptography. The only known perfect cypher is a one time pad. The key exchange problem never goes away. Everything else is just moving the deck chairs around. Where things can go wrong is in mistaking how much you are insulated from attack. Pushing your attack surface into a PRNG and trusting in how the seed is generated may be naive. The PRNG must be subject to exactly the same amount of rigour as the rest of the cryptography chain. A stochastic initialisation step affords the system little to no additional protection.

Indeed the PRNG is little different to any key amplification mechanism. But with a potentially easier attack surface as it also provides what is effectively an oracle to test against.

The question is: do one-way functions exist? This is unknown. If it’s true, then you can have secure PRNGs. Even though you could still in principle reverse out the internal state from the output stream, this would be a hard problem by definition. Keep 1024 bits of internal state and you couldn’t work out what it is with all the matter in the universe dedicated toward computronium. And you can trivially get secure encryption out of it by just XORing with the bitstream.

There’s also the question of quantum computers. Even if a one-way function exists, it may not be resistant to quantum attacks (i.e., fast factoring). Then again, practical quantum computers haven’t been proven to exist either.

Define “one-way function”? Because I can think of plenty of trivial examples of things that sure look to me like they’d be “one-way functions”, but they wouldn’t be useful for (pseudo-)random number generation.

Wikipedia has a reasonable definition here:

But basically, it’s whether, given the output of a function, can you find the inputs within a polynomial factor of brute force?

Integer multiplication is thought to be one-way for classical computers. Easy to compute in the forward direction, hard in reverse. Difficulty goes up exponentially with input length. At least that’s the thought. No one has proved this.

Hashing is similar. Given an output, is it possible to compute the input? The whole point is that this should be impossible, but no one has proved it. Some hashes have been broken and aren’t one way at all. Others aren’t broken… yet.

If there is such a thing as a provably secure hash, then you can come up with some initial random state, and then hash that with an index to give you a PRNG. It’s computationally infeasible to come up with the initial state no matter how many bits you’re given.

And possibly addressing this specifically: perhaps you’re thinking of something like f(x) = 0. I.e., non-injective. That’s not invertible, but usually the goal is that you can’t find any input given the output. For hashing, you can’t ever get the original text back (due to the pigeonhole principle). But mostly you don’t care about that; to break the security you just need to find some input that hashes to the same output. Or, better yet, being able to make a small alteration to a specific input so that the output matches.

First of all, you need to state the problem more clearly. E.g., I can easily present a reverse multiplication for any number no matter how large: 1 and the number. Note that any correct reverse example is acceptable.

Such silliness aside, to be a useful one way function, reversing in general has to be generally hard. So even if you rule out the above example and state the numbers are two or larger, you can still trivially reverse and multiple of 2, 3, 5 and such. So a very large percentage of products can easily be reversed.

What makes the factorization used in RSA hard to do is that the numbers are specified to be primes of about the same size, but not too close. Even then there are tricks to factor a few of those products. So after you choose your large primes, some testing should be done. And even that doesn’t guarantee that the NSA, for example, doesn’t have some tricks to factor some such products here and there that they haven’t told anybody. But at least the ratio of hard to factor numbers seems pretty decent.

I mean, we can’t rule out the possibility that the NSA has a classical linear-time algorithm that can factor any number. We don’t think that’s possible, but nobody’s ever proven it.

My take about all those kinds of questions is:

They’re good. They’re not that good.

IME that’s a decent rule of thumb for every adversary.

It would be nice to not have to depend on rules of thumb, though. If a one-way hash could be proven to exist, you don’t have to pin your security on what you think is most likely. You’re secure even in the worst-case scenario.

Even better if your one-way function is hardened against quantum computers. We don’t know if they’ll ever be practical, but it’s probably best to assume they will be, and harden things now against that future possibility.