So I know for all practical purposes that there are RNGs out there that work just fine.

What I want to know is if these RNGs were left alone long enough, wouldn’t they begin to repeat themselves in patterns?

Yep. That’s why they’re really called pseudorandom number generators. For most purposes, a good pseudorandom number generator is good enough, and we know a few good algorithms that are fast.

To get truly random data (for a particular value of “truly random”) you need something from nature. Hardware RNGs built for cryptographic use often rely on a physical process such as radioactive decay – the time between blips on a Geiger counter cannot be predicted.

Pseudo-random number generating algorithms will repeat.

True random number generating hardware will not. There is such a thing.

I know that pseudorandom generators are not truly random, but wonder whether the form their nonrandomness takes is always necessarily a very long cycle that is repeated. Can there be a form of pseudorandom generator that does not repeat (but has some other weakness)?

For example, are there generators that draw from chaos theory, in which numbers have fantastically obscure relationships to one another, but which were at some point deterministic, and which do not repeat?

I think that truly random number generators might have to rely on quantum mechanics at some point, because everything else in nature would be deterministic.

Wikipedia has a basic article on hardware random number generators. The one I’ve looked at the most is based on reverse-biased Zenner diodes. Given the right amount of voltage, a Zenner diode will just barely switch and allow current to flow backwards from the usual direction. The exact breakdown point fluctuates randomly. (Thermodynamics plays a role, which is where the randomness comes from.) No radioactive substances needed.

But … the breakdown values are a bit skewed and differ for each diode, so you have to treat it like an unfair coin (not at all a big deal), AND the breakdown point drifts with time, which is much harder to correct for.

No. Any deterministic pseudo-random number generator will repeat. Chaos theory really has nothing to add to the mathematics of random number generators, in fact it was the other way around. The trick to notice is that the random number generator has a finite amount of state information inside it. Typically the last number generated, some other coefficients, and the algorithm. The next number is deterministically created from this information. Eventually the random number generator will exhaust the space of numbers it is able to represent. Either then, or usually sometime earlier it will create a number it created before. In the general case, it will exhaust the representation of all of its possible internal states. Then it will cycle.
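
The finite-state argument is easy to see with a toy linear congruential generator. The constants below are made-up toy values (chosen to satisfy the Hull–Dobell full-period conditions), not a real-world PRNG:

```python
def lcg(seed, a=5, c=3, m=16):
    """Toy linear congruential generator: x -> (a*x + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

# Walk the generator until a state repeats. With only m possible
# internal states, it must cycle within m steps.
gen = lcg(seed=7)
seen = []
for value in gen:
    if value in seen:
        break
    seen.append(value)

print(len(seen))  # 16: this toy generator has the full period m
```

A real generator has far more state (so a far longer cycle), but the pigeonhole argument is identical.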

The original question is much more subtle. In electronics, in general, as opposed to digital electronics, there most certainly are true random number generators. At least as random as our current understanding of physical processes allows them to be. Any thermal noise generator is a good start. A simple resistor has a noise signal present across it. Zener diodes have been a favourite source of noise. In principle all you need to do is apply one of these noise generators to a comparator and sample the noise. The comparator compares the noise signal to some fixed voltage reference. You tweak the circuit so that the voltage reference is equal to the average voltage level of the noise source. Then you start sampling. Each time you sample there is a 50% chance the noise signal will have a value above the reference. Instant random bits, and you have a true random number generator. If you need a 128 bit random number, you simply sample 128 times.
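
You can’t put a resistor in a code box, but the sampling scheme itself is easy to sketch. Here the thermal noise source is simulated with zero-mean Gaussian samples, so this is an illustration of the idea, not a real entropy source:

```python
import random

random.seed(1)  # simulation only; a real device samples hardware noise

V_REF = 0.0  # reference tuned to the noise source's average level

def sample_bit():
    # Stand-in for thermal noise across a resistor: zero-mean Gaussian.
    noise = random.gauss(0.0, 1.0)
    return 1 if noise > V_REF else 0

def random_word(n_bits=128):
    # A 128-bit random number is just 128 comparator samples in a row.
    value = 0
    for _ in range(n_bits):
        value = (value << 1) | sample_bit()
    return value

bits = [sample_bit() for _ in range(10000)]
print(sum(bits) / len(bits))  # close to 0.5 when V_REF matches the mean
```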

But there are significant subtleties. A random number generator needs to be properly random. The issue with pseudo-random number generators is twofold. One, they cycle. Two, if you know the internal coefficients and the algorithm, you can predict the next number they will generate – something that makes them bad for cryptographic needs. A random number generator based upon a noise source would appear to avoid both these issues. But in reality it has a significant flaw: it assumes that the implementation is perfect. In particular it assumes that there is no self-correlated noise in the system. Self-correlated noise is any signal for which the next value is in some way based upon what has come before. In ordinary electronics (and well known in audio) this might be as simple as a mains-frequency hum (50 or 60 Hz, depending upon where you live.) If your random number generator’s reference voltage turned out to have a little bit of mains hum from the power supply on it, that would slightly skew the results. If you were to observe the random number stream over a long enough time period you would be able to measure exactly the level of the hum (or whatever signal it was.) And this would allow you to gain some knowledge about the likely distribution of numbers that the system creates.

The simplest flaw is an incorrect reference voltage. Here you would see a slight bias in the distribution of 1s and 0s. This would probably show up quite quickly. But periodic self-correlated noise will also show up. The typical mechanism to find it is a Fourier transform, the standard version of which will pick up our mains hum trivially. Variants on the transform can pick up other forms of self-correlated noise.
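
The idea can be sketched numerically: simulate a bit stream whose bias wobbles at a 50 Hz hum frequency, then evaluate the discrete Fourier transform at the hum frequency and at an arbitrary control frequency. All the numbers here (sample rate, hum amplitude, control bin) are made up for illustration:

```python
import cmath
import math
import random

random.seed(0)

FS = 1000    # samples per second
N = 4000     # 4 seconds of samples
HUM_HZ = 50  # mains frequency leaking into the reference voltage

# Biased sampler: probability of a 1 wobbles around 0.5 at the hum frequency.
bits = [1 if random.random() < 0.5 + 0.2 * math.sin(2 * math.pi * HUM_HZ * n / FS)
        else 0
        for n in range(N)]

def dft_magnitude(x, k):
    """Magnitude of the DFT of x at frequency bin k."""
    return abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / len(x))
                   for n in range(len(x))))

hum_bin = HUM_HZ * N // FS          # bin 200 corresponds to 50 Hz
mag_hum = dft_magnitude(bits, hum_bin)
mag_ctrl = dft_magnitude(bits, 77)  # arbitrary bin with no injected signal

print(mag_hum > 3 * mag_ctrl)  # the hum stands out far above the noise floor
```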

Once you have identified the self correlated noise you might have enough to gain an advantage over an otherwise strong cryptographic system (where the generator was used to create one time pads.)

The short answer is - there is no such thing as a free lunch.

How are we defining ‘random’ here? I don’t think randomness necessarily precludes repetition.

Only if one assumes that a PRNG must of necessity have a constant memory usage. If one allows the memory used to grow without bound - even at an extremely slow rate - the PRNG could have an infinite number of possible states. The algorithm would be more complex of course, to deal with variable size states, but I don’t think there’s anything inherently impossible about the concept.

Another possibility: as far as I know, the digits of pi do not repeat. If we take an ordinary, repeating PRNG and XOR the result with pi, it won’t repeat. It would probably be slow and not an improvement as far as predictability is concerned, but it won’t repeat.
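
A sketch of that idea, using Gibbons’ unbounded spigot algorithm for the digits of pi and mod-10 addition in place of XOR (since we have decimal digits rather than bits). The LCG constants are the classic C-library values, purely for illustration. Note the spigot’s internal integers grow without bound, which is exactly the unbounded-memory loophole mentioned above:

```python
import itertools

def pi_digits():
    """Unbounded spigot for the decimal digits of pi (Gibbons' algorithm)."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            q, r, t, n, k, l = (q * k, (2 * q + r) * l, t * l,
                                (q * (7 * k + 2) + r * l) // (t * l),
                                k + 1, l + 2)

def lcg_digits(seed=1):
    """An ordinary repeating PRNG, reduced to decimal digits."""
    x = seed
    while True:
        x = (1103515245 * x + 12345) % (2 ** 31)
        yield x % 10

def combined():
    # Mod-10 addition stands in for the XOR in the post.
    for p, d in zip(pi_digits(), lcg_digits()):
        yield (p + d) % 10

first = list(itertools.islice(pi_digits(), 10))
print(first)  # [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
```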

One common approach is to take every possible source of “randomness” in a computer (measured temperature of the processor, time between keypresses or mouse clicks, compressed input from a camera or microphone, etc.), and use them to continually fill up an “entropy pool”, a file containing a whole bunch of “random” bits. Then, you do a Fourier-like transform on your set of bits, so that all of the patterns in the bits themselves are converted into patterns in the correlations between bits. Then you take the first n bits out of your total pool of m bits, and discard the rest (where the value of n depends on how much you trust your initial random sources).

Of course, bits from that entropy pool are a valuable resource, so unless you’re doing high-security cryptography or something of the sort, you generally can’t afford to get *all* of your bits from there. More common is to get a single number from there, and then use it as the seed of your pseudo-random number generator.
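
A toy version of that pattern, with timing jitter standing in for the many device sources a real system would use, and SHA-256 standing in for the whitening transform (hashing is what real pools do; the “Fourier-like transform” above is an informal description of the same whitening step):

```python
import hashlib
import random
import time

def gather_entropy(n_samples=64):
    """Toy entropy pool: collect timing jitter and mix it with SHA-256.

    Timing jitter alone is a weak, illustrative stand-in for the
    keypress/mouse/sensor sources a real pool would draw from.
    """
    pool = hashlib.sha256()
    for _ in range(n_samples):
        pool.update(time.perf_counter_ns().to_bytes(8, "little"))
    return pool.digest()

# Spend the pool once, to seed an ordinary PRNG for the bulk of the numbers.
seed = int.from_bytes(gather_entropy(), "big")
rng = random.Random(seed)
print(rng.random())  # cheap pseudorandom numbers from one expensive seed
```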

Of course it does. If knowing the last number a generator produced lets me predict what the next one will be, then the generator isn’t random.

But an RNG does not need to repeat the entire history it has been through just because it repeats a number. To take a trivial example, suppose we have a 10-digit generator, and we discard the first two digits and call it an 8-digit generator. When it repeats those 8 digits, the discarded digits are probably not repeated, and you’d expect the next 8-digit number to be different from the one that followed the first instance of those 8 digits.
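
That part of the point stands: a generator with hidden state can repeat an output without repeating its history. A toy illustration, truncating a 31-bit LCG state down to 8 output bits (the constants are the classic C-library ones, purely illustrative):

```python
def truncated_lcg(seed, count):
    """Emit 8-bit outputs from a 31-bit hidden state."""
    state = seed
    outputs = []
    for _ in range(count):
        state = (1103515245 * state + 12345) % (2 ** 31)
        outputs.append((state >> 16) & 0xFF)  # discard most of the state
    return outputs

out = truncated_lcg(seed=42, count=10000)

# Find two places with the same output whose successors differ:
# the output repeated, but the hidden state (and hence the future) did not.
seen = {}
example = None
for i, v in enumerate(out[:-1]):
    if v in seen and out[seen[v] + 1] != out[i + 1]:
        example = (seen[v], i)
        break
    seen.setdefault(v, i)
print(example)  # a pair of indices demonstrating the effect
```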

Wow - I just contradicted my entire point, didn’t I?

I volunteer to be a random text generator, if anybody needs that…

There are a few different definitions, but the most relevant here are the statistician’s and computer scientist’s.

A statistician would say that a sequence of bits is random if they’re sufficiently hard to distinguish from the output of a true random process. Every reasonable PRNG has to meet this standard, but that doesn’t preclude eventual repetition.

A computer scientist would say that a sequence of bits is random if it’s incompressible: that is, there’s no program that will print the sequence S that’s significantly shorter than “print S”. Clearly this precludes any repeating sequence. No PRNG meets this standard, but unless you’re doing cryptography, you don’t generally care.
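
The incompressibility test is easy to try with a general-purpose compressor. zlib is only a rough stand-in for the theoretical notion (which is uncomputable in general), but the contrast is stark:

```python
import os
import zlib

# A highly patterned sequence compresses to almost nothing...
patterned = b"01" * 5000
# ...while bytes from the OS entropy source are essentially incompressible.
unpatterned = os.urandom(10000)

print(len(zlib.compress(patterned)))    # a few dozen bytes
print(len(zlib.compress(unpatterned)))  # roughly 10000 bytes: no saving
```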

That’s an interesting definition (which I agree with and like), but it seems unnecessarily complicated, especially for someone who isn’t a computer scientist. I’d think the more standard definition is that a sequence is random when you can’t predict the next value from what has come before it. Since PRNGs are by definition deterministic, they’re non-random.

I believe there are web sites that emit truly random digits. Truly? Well, they seem to depend on quantum processes that are random as far as we know. At least one I saw was based on the output of a lava lamp.

I once programmed a pseudo-random generator that was claimed to cycle only after 2^55 runs. That’s greater than 10^16. I tried it on bin filling and it was as random as I could determine. I no longer have the code, though.

Of course, it’s easy to write a number generator that only cycles after some extremely long time, but there are a bunch of other ways an RNG can be no good even with a long cycle.

Yes, RNGs can have very long cycles. The Mersenne Twister has a period of 2^19937 − 1, or approximately 4.3 × 10^6001. The number of particles in the visible universe is approximately 10^87. Mersenne Twister - Wikipedia
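
CPython’s `random` module happens to use the Mersenne Twister, which makes the determinism point easy to demonstrate: the period is astronomical, but the same seed always replays the same sequence.

```python
import random

rng1 = random.Random(19937)  # CPython's random.Random is MT19937
rng2 = random.Random(19937)

a = [rng1.random() for _ in range(5)]
b = [rng2.random() for _ in range(5)]

print(a == b)  # True: identical seed, identical "random" sequence
```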

But how do you know what the 1,346th digit of pi is? If you’re looking it up in a table, then eventually you’ll get to the end of the table and have to start over again, thereby (eventually) repeating.

If you’re calculating digits of pi using some formula, then you’re not doing anything fundamentally different from a computed PRNG; you’ve just added another little extra computation at the end of the original. So you’re still doing a PRNG, just a slightly more complex one, and are subject to the same laws that any PRNG is, and it will eventually repeat.

Clearly true, but impossible for any physically realisable implementation, since by definition you need an unbounded amount of matter, and after you used the universe’s entire mass the generator would cycle. If you are allowed unbounded state, you can simply fill that state with an unbounded sequence of true random numbers as the coefficients, and the algorithm is to simply select the next number in the sequence. Pedantic in the extreme. But that is what these forums are for.

Touched on above is the notion of Algorithmic Information Content, where the information content of a finite stream of data is defined not by the stream’s length, but by the smallest program that can emit the stream. The interesting result is the two extremes: one where the program is tiny and the stream’s entropy is near zero, the other where the smallest program is the same size as the stream and the entropy is maximal. This links to the idea that a true random number sequence is incompressible (if it were compressible, a program smaller than the stream could generate it, and the entropy would not be maximal.) It is also the same thing as saying you can’t predict the next number from what has gone before, even if you used every single number generated so far. A true random stream has no structure to exploit.

A very sophisticated compression algorithm could contain knowledge of most of the common PRNG algorithms, and it could compare the stream with each of the algorithms, across the range of recommended coefficients. You could imagine a scenario where the compressor did find a sequence match. At this point it could conceivably perform perfect prediction, and close to infinite compression. Nobody bothers with this for compression, because it is useless there, but when it comes to decryption this is very close in concept to brute-forcing the key. The only key that is known to always be safe is a one-time pad, and a one-time pad must be truly random. Generating a sequence of pads from libmath:random would probably see you cracked after a while. Hence the reason why very high entropy (very low redundancy) random number streams are valued. Traditional PRNGs have an entropy which is very low – less than the number of bits needed to represent the executable code and the state.

Randomness certainly doesn’t preclude some repetition. What it precludes is predictability. A truly random sequence will, in fact, have repeated strings. The longer the sequence, the longer the maximum repeated string.

So, is “5” a random number?

“Zener” diode.