Technically speaking, are there any real random number generators? (In electronics)

As long as we’re doing pedantry here, the information content of a random string is equal to the length of the string. If you want minimal information content, you’d need a string like 11111…

I have noted to myself that humanity does not have, and will never have, a computer with computational ability equivalent to that of Turing machines, since that would require infinite memory. Isn’t it sad that there is an entire class of “computable” functions that are not, in fact, computable?

Digits of pi can be calculated starting from an arbitrary location without calculating all the intervening digits, so you don’t need a table. Since the digits of pi don’t repeat, neither will the resulting series of PRNs.
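The digit-extraction trick works on the hexadecimal (or binary) digits of pi, via the Bailey-Borwein-Plouffe formula. A rough sketch in Python (the function name is just for illustration, and floating-point precision only carries it to modest positions):

```python
def pi_hex_digit(n):
    """Return the n-th hexadecimal digit of pi after the point (n >= 1),
    using the Bailey-Borwein-Plouffe digit-extraction formula."""
    def partial(j):
        # Fractional part of the sum over k of 16**(n-1-k) / (8k + j).
        s = 0.0
        for k in range(n):                      # "head" terms: modular exponentiation
            s = (s + pow(16, n - 1 - k, 8 * k + j) / (8 * k + j)) % 1.0
        for k in range(n, n + 20):              # a few rapidly shrinking "tail" terms
            s += 16.0 ** (n - 1 - k) / (8 * k + j)
        return s % 1.0

    x = (4 * partial(1) - 2 * partial(4) - partial(5) - partial(6)) % 1.0
    return "%x" % int(16 * x)

print("".join(pi_hex_digit(i) for i in range(1, 9)))   # pi = 3.243f6a88... in hex
```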

My bad.

The “xor pi” method is actually an example of Sofis’s methods where the memory requirements grow without bound, though it might not be obvious at first blush. The key is that you have to keep track of which digit of pi you’re currently using: If you’re using the 27135424037th digit of pi, then you need enough memory to store the number 27135424037. As you go on, that number will get larger and larger, and (eventually) take up more and more memory space to store.

This also shows, incidentally, that the fact that the memory requirements grow without bound is not actually a problem, practically speaking, since nobody’s ever going to need so many random numbers that the memory requirements outgrow a standard computer.

To expand on this, if you write your infinite PRNG correctly, you should be able to make it generate numbers continuously from now until the matter making up the CPU is destroyed by proton decay, without ever allocating a kilobyte of memory.
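As a back-of-the-envelope check (the rates below are made-up assumptions, just to put a scale on it): even a generator spitting out a billion digits per second for 10^40 years only needs a couple of hundred bits to hold its position counter.

```python
import math

# Assumed, deliberately generous figures: 1e9 digits per second until proton
# decay, taken here as roughly 1e40 years.
digits_per_second = 1e9
seconds = 1e40 * 365.25 * 24 * 3600            # about 3.2e47 seconds

largest_index = digits_per_second * seconds    # about 3.2e56
bits_needed = math.ceil(math.log2(largest_index))

print(f"{bits_needed} bits, i.e. about {math.ceil(bits_needed / 8)} bytes")   # ~188 bits, ~24 bytes
```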

That’s as good a definition as any. But we can measure the mean and variance and whatnot and make some informed predictions about what will come next. With a suitably large number of random events, we can make accurate macro-level predictions, which is all statistical physics is.

I just get sick of hearing things like “All the numbers are between 0 and 5 – that’s not random” and other misinformed statements. I’m not a computer scientist, but it seems to me that the definition of “random”, besides something vague like “unpredictable”, is only as good as your application.

I guess I just want to know what the OP means by a real random number generator. What makes throwing dice or flipping coins or sampling temperature and noise not really ‘real’? Why the qualifier? What is “really” random versus just “pseudo” random?

Pseudorandom number generators are "pseudo" for very good reason. They are deterministic. If you run a program with a PRNG and use the same seed, you get exactly the same number stream each time. This is a very important property, and one that is prized by the simulation community. It is also unavoidable.
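For example, with Python's standard library generator (a Mersenne Twister under the hood), the same seed reproduces the same stream every time:

```python
import random

def stream(seed, n=5):
    rng = random.Random(seed)              # deterministic PRNG, seeded explicitly
    return [rng.randint(0, 99) for _ in range(n)]

print(stream(42) == stream(42))   # True: same seed, identical number stream
print(stream(42) == stream(43))   # False: different seed, different stream
```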

If you are a cryptographer, you want the exact opposite. You convolve the text to be encrypted with some private algorithm and data, and the recipient deconvolves it. What you want is to maximise the entropy of the ciphertext. If the entropy is low, that means there is some traction available to a cracker to break the cypher. The actual algorithm used is not considered part of the secret; it is subject to discovery by more mundane means and generally should be considered publicly known. The secret is the coefficients used.

The only known fully secure encryption is a one time pad, which has the same length as the text. Both sender and recipient have a copy. However, the pad itself must be random in the sense that there should be no mechanism by which a cracker could manage to predict it. Even very subtle issues might provide enough traction over time. If you used a well known PRNG to generate the one time pads you would be in trouble: they have very low entropy. They might pass many useful statistical tests and be useful for simulations and the like, but if there is any traction possible for an analyst to predict any trend in the number stream used for encryption, it isn't useful.
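A minimal sketch of the one time pad idea, assuming XOR as the convolution and using the operating system's entropy pool for the pad:

```python
import os

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

message = b"ATTACK AT DAWN"
pad = os.urandom(len(message))            # pad must be truly random and as long as the text

ciphertext = xor_bytes(message, pad)      # encrypt: text XOR pad
recovered = xor_bytes(ciphertext, pad)    # decrypt: (text XOR pad) XOR pad gives text back

print(recovered == message)               # True
```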

From a simple information theory point of view, there is no deterministic algorithm that can create a one time pad with any more entropy than the initial coefficients used to generate it. However, the computational complexity needed to find those coefficients may be very, very large, and thus the search infeasible, so long as no-one discovers a mathematical gimmick that allows them to work the coefficients out easily. The classic problem is finding the prime factors of very large numbers that are the product of two large primes. Work out how to do that easily and you break a lot of public key encryption, because you have reduced the PRNG that generates the one time pads to something easily predictable, and thus found the gimmick that lets you exploit the low entropy in the stream. Ultra secure one time pads are painstakingly generated from stochastic processes, and are, so far as anyone knows, utterly unbreakable. (Unless, by mistake, you use the pad twice, in which case they become trivially breakable. A mistake that has cost people dearly in the past.)
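To make the factoring point concrete, here is naive trial division. It is instant on toy numbers, but real moduli are hundreds of digits long, and nobody has published a way to factor those quickly:

```python
def trial_division(n):
    """Naive factoring: try every candidate divisor up to sqrt(n)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return n, 1        # n itself is prime

# Toy "RSA-style" modulus: the product of two small primes.
print(trial_division(61 * 53))      # (53, 61), found instantly at this size
# A 2048-bit modulus has ~617 decimal digits; trial division on it would not
# finish in the lifetime of the universe.
```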

The goodness measure of a random number stream is the entropy. PRNG streams have no more entropy than is embodied in their initial conditions, usually a 32 or 64 bit number. Not much. Physical properties can have much more entropy, but as I wrote above, you have to be careful.
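To put a number on that: a generator seeded with k bits can only ever emit 2^k distinct streams, so the stream carries at most k bits of entropy however long it runs. A toy illustration with an 8-bit seed:

```python
import random

streams = set()
for seed in range(256):                     # every possible 8-bit seed
    rng = random.Random(seed)
    # 1000 simulated dice rolls per stream; the streams could be as long as you like
    streams.add(tuple(rng.randrange(1, 7) for _ in range(1000)))

# At most 256 distinct streams are possible, so the output carries at most
# 8 bits of entropy no matter how many statistical tests it passes.
print(len(streams))
```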

Is it actually trivially easy in practice to break a twice-used pad? Or do you just mean that it’s trivially easy to show that it is in principle breakable?

By “goodness measure” you mean “goodness measure for cryptography”, right? The simulation community could care less about entropy. Or am I wrong?

(The sim community would presumably care about serial independence, fit with a given distribution (probably uniform, which is transformed into others) etc.)
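Two of the simplest checks of that kind are a lag-1 serial correlation and a chi-square fit against a uniform distribution. A rough sketch (not a substitute for proper test batteries):

```python
import random

rng = random.Random(2024)
xs = [rng.random() for _ in range(100_000)]      # samples that should look Uniform(0, 1)
n = len(xs)

# Lag-1 serial correlation: should be near zero for a serially independent stream.
mean = sum(xs) / n
var = sum((x - mean) ** 2 for x in xs) / n
lag1 = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1)) / (n * var)
print(f"lag-1 correlation: {lag1:+.4f}")

# Chi-square goodness of fit: 10 equal bins, expected count n/10 in each.
bins = [0] * 10
for x in xs:
    bins[min(int(x * 10), 9)] += 1
expected = n / 10
chi2 = sum((b - expected) ** 2 / expected for b in bins)
print(f"chi-square with 9 degrees of freedom: {chi2:.1f}")   # averages about 9 if uniform
```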

If you use the pad twice, the encrypted code becomes subject to simple statistical attack. Think of the problem like this: the text and the pad are symmetric. Encryption is the convolution of the two (usually a simple XOR). We usually think of protecting the text with a high entropy pad, but we can equally think in terms of having encoded the pad with the text (from the symmetry). One time is safe. But with two uses, we now have the pad encrypted with a very poor quality pad: the low entropy text. Plus more text encoded with the pad. Thus one piece of plain text is effectively only protected by the entropy in the other piece of plain text. It becomes possible to discover the pad and the text.

If the convolution is XOR, then XORing the two pieces of cyphertext together actually eliminates the pad altogether. You end up with one piece of text encrypted by the other. This is now no harder to break than the old spy novel trick of using a page of text from a book as the encryption key. Trivial? Well, not really; it will take time and skill. But it is possible, whereas before it was quite impossible to break. That is a huge change.
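In the XOR case it really is that direct (the messages below are made up for illustration): XORing the two ciphertexts cancels the pad and leaves one plaintext XORed with the other.

```python
import os

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

pad = os.urandom(16)                          # one pad...
c1 = xor_bytes(b"MEET AT MIDNIGHT", pad)      # ...used twice: the fatal mistake
c2 = xor_bytes(b"RETREAT AT DAWN!", pad)

# (text1 XOR pad) XOR (text2 XOR pad) == text1 XOR text2: the pad is gone,
# and what remains can be attacked with frequency analysis and crib-dragging.
print(xor_bytes(c1, c2) == xor_bytes(b"MEET AT MIDNIGHT", b"RETREAT AT DAWN!"))   # True
```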

If you were a cryptanalyst, SOP would be to apply this attack to every stream of communication you encountered. People make simple mistakes, and history has shown that it works. The Fish codes (German high command) broken with the Colossus machines in WWII were reverse engineered thanks to very similar silly operational mistakes.

That is probably a good summary. About the only thing to add is that if you have high entropy you will have all the other goodness measures, except for reproducibility, which is pretty important.

Parallel random number generation is worth a mention. If you have a parallel computation, split across many concurrent nodes, you can get into all sorts of strife if the PRNGs used on each node of the simulation have common patterns. You can end up with auto-correlated artefacts in the simulation that wreck it. A mate of mine spent a goodly amount of his life worrying about this one.
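The classic pitfall and a simple (sketch-level) fix look something like this; serious parallel codes go further and use generators designed for stream splitting or jump-ahead:

```python
import random

N_NODES = 4
SAMPLES = 5

# Pitfall: every node seeds its PRNG the same way (a hard-coded constant, or the
# same wall-clock second), so the "independent" replicas all draw identical numbers.
bad = []
for rank in range(N_NODES):
    rng = random.Random(12345)
    bad.append([rng.random() for _ in range(SAMPLES)])
print(all(stream == bad[0] for stream in bad))        # True: perfectly correlated nodes

# Better: derive a distinct seed per node from one master seed plus the node's rank.
master = 987654321
good = []
for rank in range(N_NODES):
    rng = random.Random(f"{master}-node-{rank}")      # distinct string seed per node
    good.append([rng.random() for _ in range(SAMPLES)])
print(all(stream != good[0] for stream in good[1:]))  # True: per-node streams differ
```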