What's the difference between a random number and a pseudo-random number?

I’ve heard about “seeds,” but I still don’t get it.

A sequence of pseudo-random numbers

  1. will repeat eventually.
  2. can always be predicted based on the current ‘state’ (i.e. seed).

True randomness cannot be predicted.

Given the same seed, a PRNG will always produce the same sequence. That’s why seeds are important: sometimes you want things to occur in an apparently random fashion from the point of view of the system, but exactly the way they did before, so you can compare two systems’ responses.
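To make that concrete, here’s a minimal sketch in Python (the same idea holds in any language): two generators given the same seed march through the identical sequence.

```python
import random

# Two independent generator objects, seeded identically.
a = random.Random(42)
b = random.Random(42)

seq_a = [a.randint(1, 100) for _ in range(5)]
seq_b = [b.randint(1, 100) for _ in range(5)]

print(seq_a == seq_b)  # True: same seed, same sequence
```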

A pseudorandom number sequence is difficult to predict, and has some statistical properties associated with randomness. A true random number sequence is impossible to predict, not just difficult.

And for a practical example of where seeding of a PRNG is useful, consider the FreeCell game on your computer. When you run it, you’ll get a puzzle number listed at the top of your screen, and anyone playing that exact puzzle number will always get that exact same deal. The folks at Microsoft didn’t just load up a list of tens of thousands or billions or whatever of deck orders; the game number is instead used as the seed for the PRNG that’s used to shuffle the cards. Same seed, same shuffle, and people can compare how well they do on a particular deal without communicating the entire deck.
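A sketch of that idea in Python (not Microsoft’s actual shuffling code, just the same principle):

```python
import random

def deal(game_number):
    """Shuffle a 52-card deck using the game number as the PRNG seed."""
    deck = list(range(52))  # 0..51 stand in for the cards
    random.Random(game_number).shuffle(deck)
    return deck

# The same game number always produces the same deal:
print(deal(11982) == deal(11982))  # True
print(deal(11982) == deal(11983))  # False: different seed, different deal
```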

The above answers are good, but they also bring up the question: why would I want a PRNG instead of a “real” random number generator, then?

The answer is that a deterministic machine like a computer can’t be programmed to come up with true random numbers, since a program always runs in a predictable way.

However, there are ways to fake it. For example, high-end cryptography devices will use a little box with a small amount of radioactive material in it, and use the timings from radioactive decay events to seed the computer’s PRNG. Radioactive decay is a truly random process; you can never predict when the next particle will go flying off. This is known as an entropy source.

Modern operating systems have somewhat less fancy entropy sources. They’ll use things like mouse movements, timings between keypresses, the size of arbitrary network packets, and anything else that comes from outside the machine to build up an entropy pool. Several security vulnerabilities have relied on naive or buggy entropy pools that affect random key generation for cryptography.
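From a program’s point of view, that entropy pool is just another API. A quick Python example (the function names are real standard-library ones; the exact kernel sources behind them vary by OS):

```python
import os
import secrets

# os.urandom() and the secrets module draw from the OS entropy pool
# (e.g. /dev/urandom on Unix); they're meant for keys and tokens,
# not for repeatable simulations.
key = os.urandom(16)           # 16 raw bytes
token = secrets.token_hex(16)  # 32 hex characters

print(len(key), len(token))  # 16 32
```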

Is it always predictable, or at least observable, when a run of the seeded generator is betraying its hand?

Couldn’t a new seed be thrown in before that time?

Sorry. Just re-read about entropy box, which does what I asked above.

So I should modify the question to exclude physical jitters.

If the reseeding is done deterministically, then yes, the output is still predictable. Depending on the scheme, that might make predictions harder, or much easier.

ETA: Also it’s important to note that randomness/pseudorandomness are properties of sequences, not numbers. There’s no real sense in which you can call a single number random.

Interesting point. I, like so many other people, say “pick a random number from 1-12,” e.g., all the time.

ETA: When you bet on a certain number on a roulette wheel, what are you doing? I know this is high-school statistics…

It would be more strictly accurate to say “Randomly pick a number from 1-12.”

This is an excellent point to emphasize and is, for the most part, true. However, there is an interesting field of mathematics which defines a form of randomness that applies to individual numbers, though only very large numbers. This was developed by Gregory Chaitin and is related to work by Kolmogorov on so-called Kolmogorov-Chaitin Complexity.

In Chaitin’s theory, individual numbers can be random, but paradoxically, it is not possible to prove that a number is random.

See, for example:
http://www.cs.auckland.ac.nz/~chaitin/sciamer.html

I think an important thing that people often get confused about is that there is a difference between ‘random’ and ‘fair’. I can’t count the times people claim that a PRNG is obviously unfair BECAUSE it is pseudo-random. That’s simply untrue. The fairness of a PRNG algorithm depends entirely on how well the algorithm is constructed, just as the fairness of dice depends entirely on how the dice are constructed.
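One way to see the distinction: seed a PRNG (making it perfectly predictable) and check that a simulated die still lands evenly. This is just an informal frequency check in Python, not a rigorous statistical test:

```python
import random
from collections import Counter

rng = random.Random(1)  # fixed seed: fully predictable, yet still fair
counts = Counter(rng.randint(1, 6) for _ in range(60_000))

for face in range(1, 7):
    print(face, counts[face])  # each count lands close to 10,000
```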

Back in the stone age I remember some BASIC programs for my TRS-80 would use the random function *inside* the random function for its seed. Something like:

10 X=RND(RND(1))

To me this still seemed totally redundant.

Actually, because RND only returned a PRN between 0 and 1 you had to do something like:

10 X=RND(INT(100*RND(1)))

to get whole PRN numbers between 1 and 100.

There is another critical need for pseudo-random number generators: simulations. Many simulations need to simulate random input, or random behaviour, often with various probability distributions (see the comment about fairness above as to why this is still random). You may want to test things with a range of inputs, simulate truly stochastic processes, all sorts. What is always needed in these situations is a deterministic random number generator, because the runs must be repeatable. You can’t track down a bug in a system, or perform useful experimental computation, if the results cannot be reproduced on demand.

So there is an entire corner of numerical theory devoted not just to good-quality random number generators (which is harder than most people realise) but also to the creation and maintenance of portable (as in portable between operating system, compiler, language, and underlying machine architecture) PRNGs that always generate exactly the same sequence, so long as the initial conditions are identical. Experimental scientists who do such simulation work can get very upset if there are problems here.
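The classic way to get that portability is to use only exact integer arithmetic. A minimal sketch using the well-known linear congruential constants from Numerical Recipes (real production generators are considerably more sophisticated):

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Linear congruential generator: exact integer arithmetic only,
    so the sequence is identical on any machine, compiler, or language."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state

g1 = lcg(2024)
run1 = [next(g1) for _ in range(3)]
g2 = lcg(2024)
run2 = [next(g2) for _ in range(3)]
print(run1 == run2)  # True: same initial conditions, same sequence
```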

Also worth mentioning are parallel random number generators. In big high-performance computations you usually distribute your work across many processors (and nowadays many can mean tens of thousands). If each of these components of the simulation needs a random number stream, you can get into all sorts of trouble. The difficulty is that you must avoid correlation between the streams. There is a very real chance that even though the PRNG algorithm works well on its own, once there are many copies of it spitting billions of numbers across thousands of machines, some pairs of streams may turn out to have some correlation (at worst, one stream may simply repeat another) and the simulation may become significantly perturbed. That is a whole other bit of fun to worry about.

There are quantum mechanical random number generators that plug into a USB port and cost about $1000 or so.

I actually rely on pseudo-randomness in CG animation:

Let’s say I have a CG model of an egg, and I want to animate it so it appears to vibrate as the chick inside is going to hatch. I could do this by hand, manually setting key frames on a timeline to fake a jittery movement of the hatching egg.

Or, I can save myself a few hours and apply an operator that will jitter its position and/or rotation “randomly” by simply giving it a range of movement (x, y, z) and/or a range of rotation in degrees.

The movement it generates appears completely random and natural. Perfect. However, if I apply this identical operation to another egg next to it that is hatching at the same time, their random movements will be in sync, which of course destroys the illusion. So all I need to do is give one of the egg’s random operators a different seed number. Ta-da!

Another reason this is good in CGI is that if I work on this scene on another computer, or send it to a renderfarm, the sequence of its movements will remain consistent, because the seed values are the same.
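A toy version of that jitter operator in Python (hypothetical names; real animation packages expose the same seed knob through their own UI):

```python
import random

def jitter_track(seed, frames=4, amplitude=0.5):
    """Per-frame random offsets driven entirely by the seed, so any
    machine that renders this scene reproduces the exact same motion."""
    rng = random.Random(seed)
    return [round(rng.uniform(-amplitude, amplitude), 3)
            for _ in range(frames)]

egg_a = jitter_track(seed=1)
egg_b = jitter_track(seed=2)          # different seed: no sync problem
print(egg_a != egg_b)                 # True
print(jitter_track(seed=1) == egg_a)  # True: consistent on any computer
```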

Another angle on ‘randomness’… At my old company, we were evaluating a security piece of one of our processors. It had a security key burned into a series of fuses. The vendor was claiming the key was actually random…and unique.

A truly random process can’t guarantee uniqueness*. In security, the knowledge that there are no duplicates can make cracking efforts easier.

*Unless they screened out the duplicates, which would leave the process random, but the results not.
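The effect is easy to demonstrate: draw independent “keys” from a small space and duplicates show up long before the space runs out (the birthday effect). A toy Python illustration with deliberately tiny 8-bit keys:

```python
import random

rng = random.Random(0)
keys = [rng.getrandbits(8) for _ in range(300)]  # 300 draws, 256 values

# With 300 draws from 256 values duplicates are guaranteed here, but
# the birthday effect makes them likely after only ~20 draws.
print(len(keys), len(set(keys)))  # fewer unique values than draws
```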
-D/a

This is all true, but it has little bearing on the colloquial use of the word random.

But it’s a cool thought/reply. :wink:

There’s an XKCD for that. If we were going to say that there was such a thing as a single random number, we’d have to say that the number returned by that function is one of them. But of course that function is useless, since if we ran it repeatedly, we’d always get the same result out of it (which is of course easily predictable). The sequence of numbers isn’t random, and that’s what’s important.
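For reference, the function in that strip boils down to this (xkcd #221, transcribed into Python):

```python
def get_random_number():
    """Chosen by fair dice roll. Guaranteed to be random."""
    return 4

# The single value 4 did come from a random process, but the output
# sequence 4, 4, 4, ... is anything but random.
print([get_random_number() for _ in range(3)])  # [4, 4, 4]
```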

Oh, and a couple of other entropy sources that consumer computers (without a dedicated RNG module) use are the temperature of the processor (which they monitor anyway to prevent overheating) and the “quiet” input from a microphone, if one is attached.

Huge question then: how do NSA-types generate encryptions with entropy encoders that can be decrypted?