# Random Numbers for Dummies...(me, please)

I am a math moron. Really, my brain just can’t handle it.

Knowing this, could someone explain something to me…

I play online poker, which uses “random number generators” to deal the cards.

I have heard it said that computers are not truly capable of generating random numbers.

Why would this be true, or someonethink it true? How does one define random numbers and determine whether they are truly random or not?

Can anyone explain this to me like I’m about 7?

While it’s true that computers can’t generate truly random numbers, you can create the illusion of random numbers, which is almost just as good. The computer has built within it (somewhere, I’m not exactly sure where) a list of “random numbers”, which is called by the program. Then you have to tell the program where in the list of the random numbers you want to start. The best advice I’ve seen for telling where to start is by using the milliseconds of the computer’s time. Again, not exactly sure how this is done, but the logic is that since you’re likely not going to be running on the same exact millisecond every time it’s called, it’s mostly “random.”

Computer generate random numbers which are called psuedo-random numbers. A computer cannot generate pure random numbers as they need instructions as to what numbers to generate, like a pair of dice, because computers are expected to be a consistent system. You input this set of value, you get another set of value.

Hence, to generate random numbers you give the computer a formula which produces the random number. The catch is, if the formula is the same, you will keep getting the same set of numbers over and over again.

What most game programmers do is to ‘seed’ the random number generator. It’s like changing a part of the random number formula so you get another sequence. This is also why it is possible for FreeCells or Soltiare to generate a same game over and over again as long as you know the ‘seed’. Most games and applications which need random numbers usually do it automatically, using the current date and time as the ‘seed’, so you get vastly different results.

For example, on the PS2, some enterprising players have found way to keep duplicating item drops and good game events by changing the console’s date and time to a certain value.

Oh, here’s something else which may be more helpful. It’s from the book Game Programming for Teens published by Premier Press.

The only truly random part of any computer program (if we ignore computers that do have random sources, and that excludes your PC) is the user. The user takes a random amount of time to respond to things and they run the program at random times. So the computer can use these elements as a ‘seed’ to generate numbers that appear random. But the fact is that if the user was to start the process from scratch at the same time, with the same response time, they would get exactly the same numbers. So they’re not really random.

Now if we were to delve deeper, you could argue that the user isn’t random either. They are responding to events and stimuli that dictate what they do, and if we were to repeat these exactly they would result in the same parameters generating the ‘seed’, which would end in the same numbers. Following this argument; nothing that we ever do is truly random and computers are no exception, they just make it a whole lot more obvious.

But, of course, the number of variables that influence the user are in the millions (if not billions), and the very act of attempting to repeat them would change them, so repeating them is impossible. But that’s more of a Great Debate

Modern computer systems DO have a source of pure random numbers embedded in them. The Via C3 family of CPU’s have a true RNG embedded into the CPU and Intel based chipsets since the i810 have had a RNG onboard. However, not many developers are aware of such features and the results are not cross-platform compatible so uptake has been very minimal.

Oops, that should be: some modern computer systems…

On some if not all computers, what’s built in isn’t the list itself but a formula that will generate such a list. Give the formula one number (either the last number it spit out, or, if it hasn’t been used yet, the number of milliseconds the computer’s been on or something unpredictable like that), and it will use it to generate the next number on the “list.”

The question is really what it means for a sequence of numbers to be random. According to Bruce Schneier’s Applied Cryptography, there are three notions of randomness when it comes to computer output:

1. Pseudorandomness. Sequences are said to have the property if all possible outputs show up with very nearly the same frequency in a sufficiently long sequence.

2. Cryptographically-secure pseudorandomness. These sequences are hard to predict. With the limitations of real systems that others have described very well, it’s possible to predict the next outputs of such a sequence if you have enough previous outputs, but there are techniques for making it so you need a lot of output before you can make reasonable predictions.

3. True randomness. A true random sequence generator would produce two different outputs if given the same input twice. This is what can’t be done by a computer without specialized hardware.

In general, when you hear about a random number generator, it’s meant in sense 1.

Imagine a die (one of a pair of dice). It has 6 six sides and each side is labeled with a number from 1 to 6. You roll the die and write down the number. You do this over and over and you develop a very long list on numbers. You certainly are generating numbers. And the numbers are always in the range of 1 though 6. If the die is a fair die, you will also be generating random numbers. I hope this gives you an intuitive feel for what a random number generator is.

Let’s say on your first roll, you get a 4. You don’t have enough information yet. You can’t hold 4 up to the light and say it is random or not random. Ramdomness is a property that a list of numbers has. It is not a property that can belong to single number. But this doesn’t mean that you roll the die a second time and have enough numbers. You need a lot. And the more the better. After you roll the die, say, 600 times, you have enough numbers that you could run some statistical tests. One thing you could do is tabulate the numbers up and see how many one’s you got and how many two’s, etc. If you got 400 six’s, that die has probably been tampered with. On the other hand, if you got exactly 100 of each number, that would also be odd. You could take this table and run a test called a Chi Square test to see how reasonable your distribution is. This is the easiest test for a random number generator to pass. So simply passing this test is not enough. But failing it is a disaster, a clear indication on non-randomness. And it’s not even a pass/fail kind of thing. Once I was testing a RNG that I wrote. I ran the test a fe times on 10,000 numbers and passed, but not like I expected. I then ran it a few times with 100.000 numbers and got a worse score, but still passing. At that point, I suspected a bug, and sure enough, I found it. I was generating numbers in the range of 1 to 100, but I was favoring 1 to 10 slightly and the longer I ran the generator, the more obvious the favoring was. (I don’t actually recall the range or which numbers I was favoring, but this is a real experience.)

There are several suites of test for randomness. George Marsaglia wrote a test suite called DieHard which many authorities claim to be the hardest one to pass. There are several other test suites that are easier to pass.

It is not easy for a computer to generate random numbers. Most RNG’s will take a number called a “seed” to start it up. Then it will produce a list of random numbers. At some point though, it will repeat. It is said to have a “period”. So in effect, it is a table of random numbers. When you reach the end of the table, you go back to the beginning. For a long time this was pretty much all we had. And the periods were short. And the list of random numbers was not super random. The algorithms have improved over the years. The periods get longer and longer. And the list passes more and more tests. Mostly because of the periodic nature of these generators, they are called Pseudo Random Number Generators. But the best of them are very good, passing every known test and exhibiting period that take years to exhaust. There are some extremely good algorithms, but they take a very long time to generate a random number. So you want to consider your needs and not just go for the best.

Another approach is a hardware random number generator. Some hardware random number generators are actually very poor, failing many tests for randomness. Despite this, they are often called True Random Number Generators. That Intel chip is very good and it passes DieHard. I knew it was in the chipset, I didn’t know it had been embedded in the cpu, but I guess it was just a matter of time. There is another hardware RNG somewhere that passes DieHard. But most True Random Number Generator do badly when tested… I would not use one without testing it.

You know, I would have bet that a roulette wheel is damn fine random number generator. But I saw a show where some guy watched roulette wheels and saw that they favored some numbers. He won a small fortune before the casinos threw him out. So I guess that I was wrong about roulette wheels. Well I will still bet that dice are damn fine random number generators, but I won’t bet more than a dollar. So maybe my example in the first paragraph isn’t that valid. I don’t really know. Generating random numbers is not as easy as it seems.

Not to beat a dead horse, but here is a quote from George Marsaglia himself. It explains one of his advanced tests for randomness. The math gets a bit deep in places for a 7 year old, but I think that everyone can grasp what the test is showing and how a weakness of a RNG that seemingly looks very good is exposed by the test.

From Computers & Applications, 9, 1-10, 1993:

If I’m not mistaken, things like this are truly random.

“The Heisenberg Uncertainty Principle suggests that you cannot measure, with perfection, any chaotic source.”

FWIW, chaos is deterministic, not random.

I think the Heisenberg Uncertainty Principle would say you can’t perfectly measure any system, whether it’s chaotic, random or just sitting there doing nothing. That is, the uncertainty principle isn’t limited to applying to certain types of systems.

Online poker sites claim to use entropy given by using input (mouse movements and such) and thermal-noise based random number generators. They’re really expensive devices that basically detect miniscule amounts of heat change that naturally occurs more or less randomly.

Or they’re marketing hype dreamed up to get people to play possibly rigged games.

Just because we have cryptographically-secure PRNGs doesn’t mean anyone is obligated to actually use them, especially when they stand to gain if they don’t.

Why would you suspect that they aren’t? PartyGaming Inc recently floated for something like \$10bn USD so clearly the online gambling industry isn’t starved for cash. IIRC, decent quality, commodity true RNG’s go for around \$10,000 - \$100,000 which should be well enough for even the most demanding applications. Seems like it would be a no-brainer for online casinos to use true RNG’s and stop the possibility of hacking.

Unless they didn’t have the money, weren’t legitimate, and were seeking quick ways to raise cash. But I guess that’s what Derleth said.

On re-read, I just realized that you were quoting from the linked page. They’re admiting their data source is chaotic (i.e. deterministic, not random) but the fact that they are unable to measure it precisely introduces enough error to make it a better pseudo-random than if you could measure precisely. Very clever.