Is a random number generator still random if it is streaky?

Whack-a-Mole · February 22, 2008, 6:58pm

I participate in an online game which makes use of a computer run random number generator to determine the chances of various things happening in the game world.

While the following is somewhat anecdotal I can say I rarely find anyone in-game who has not experienced it. Some few have actually logged numbers and done the math and here is what they found:

Overall, with a statistically large enough sample size, the random number generator does produce results that trend very close to the stated percentage chance advertised (so a 50/50 chance when number crunched with real examples comes out to 49.7/50.3 after 1000 attempts). All well and good and pretty much what we would expect from a random number generator.

However, there are some shockingly unlikely streaks in the datasets collected so far. Certainly one might expect some unlikely runs (say 20 heads in a row on a coin flip), but in 1000 attempts one would not expect to see those 20 in a row appear several times. Here is some of the data from one person (others have done this too and their results roughly bear this out):

—Data—
Marketing missions executed: 1543 (95% courier/5% kill missions)
Number of Kill missions: 79
Number of courier missions: 1464
Percentage of kill missions: 5.12%
Percentage of courier missions: 94.88%

Most anomalous streaks:

8 kill missions in a row
Chance: 0.00000000390625%
Frequency: 2x
5 kill missions in a row
Chance: 0.00003125%
Frequency: 5x
96 courier missions in a row
Chance: 0.73%
Frequency: 1x

AND

Attempts: 900 (50/50 chance of success)
Successes: 467
Failures: 433
Percentage of successes: 51.9%
Percentage of failures: 48.1%

Most anomalous streaks:

16 failures in a row
Chance: 0.0000439%
Frequency: 3x
8 failures in a row
Chance: 0.287%
Frequency: 5x
6 successes in a row
Chance: 1.95%
Frequency: 8x
—End Data—

So, is the above simply to be expected and normal? I might expect to see something like that occasionally but the odds seem distinctly against such streaks occurring repeatedly to the same person (and many others see this too). Is the random number generator glitchy even if overall it provides the percentages it claims to?

Triskadecamus · February 22, 2008, 7:03pm

Truly random cannot be truly average.

That doesn’t mean your generator is perfectly random, though. If none of the “anomalies” had occurred, it would definitely not be random. If “too many” occurred it might be non random, but then again, it might not. There is no Goldilocks answer. Just right would be just wrong.

Tris

Indistinguishable · February 22, 2008, 7:08pm

I’m a little confused by the chance numbers in your second part (with the 50/50 success/failure probability). How did you calculate them?

Whack-a-Mole · February 22, 2008, 7:17pm

I honestly do not know since they are not my numbers. I copied the data from a thread on the forums for this game (linked below as source in the quote). To save you from reading that if you do not want to I’ll post a bit of his thinking (the numbers I posted came from post #33 on the second page of that thread). Also note that while this thread is several months old there have been several other threads on this topic and one that currently looks to be epic in length where math is being tossed about. I’ll link those too if anyone really cares but by and large they are along the lines of this.

"That can be easily proven mathematically, but it is quite intuitive. When you have 50% of chance of succeeding in something, if you try enough times you will succeed in almost exactly half of them. In a similar way if for a big enough sample you succeded half of the time the base chance for a single attempt is 50%.

With that in mind the chances for having the same result repeated n times in a row is:
p^n

Where p is the probability of that event happening. That comes from the binomial distribution that rules sets of discrete independent events like these.

So for a binary example like ours (success/failure) the chance of failure of 8 attempts in a row, supposing success ratio is 50% is (0.5)^8 = 0.39%, which albeit possible is very unlikely. The chances of 16 attempts failing is: 0.0015%

The chances of a marketing agent (95% courier / 5% kill) to offer you 10 kill missions in a row is in the order of 10^-13."

SOURCE: EVE Online | The #1 Free Space MMORPG | Play here now!

Indistinguishable · February 22, 2008, 7:30pm

Right, the numbers given in that quote are what I would’ve expected (8 failures in a row as about 0.39%, 16 failures as about 0.0015%), rather than what’s in the OP.
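
For anyone who wants to check the figures, here is a minimal Python sketch of the p^n rule from the quote, applied to each streak listed in the OP. The function name `streak_chance` and the list of cases are only illustrative, and the one assumption is that every attempt is independent. The first dataset's percentages match p^n; for the second dataset p^n gives about 0.0015%, 0.39% and 1.56%, not the 0.0000439%, 0.287% and 1.95% listed.

```python
def streak_chance(p: float, n: int) -> float:
    """Chance that n specific consecutive independent trials
    all give the outcome that has single-trial probability p."""
    return p ** n


# (label, single-trial probability, streak length), taken from the OP's data
cases = [
    ("8 kill missions (p=0.05)", 0.05, 8),
    ("5 kill missions (p=0.05)", 0.05, 5),
    ("96 courier missions (p=0.95)", 0.95, 96),
    ("16 failures (p=0.5)", 0.5, 16),
    ("8 failures (p=0.5)", 0.5, 8),
    ("6 successes (p=0.5)", 0.5, 6),
]

for label, p, n in cases:
    # Multiply by 100 to print a percentage, as the OP does
    print(f"{label}: {100 * streak_chance(p, n):.10g}%")
```

Keep in mind this is the chance for one specific stretch of n attempts, not the chance of seeing such a streak somewhere in a long session.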

Blaster_Master · February 22, 2008, 7:41pm

Ah, one of my favorite pet topics. Streaks SHOULD be expected in a random number generator, assuming these are independent events. We have the intuition that streaks shouldn’t happen in random sets of numbers, but they’re bound to happen. If you flip a fair coin and it’s come up heads 10x in a row, the chance it will come up heads the 11th time is still 50%.

Here’s an example I think really helps. Take a piece of paper and a pencil and try to make a random distribution of dots. Chances are you’ll want (at some level) to see the dots reasonably spread out, but that is not random. A truly random set will very likely have some areas with a high concentration of dots and other areas with only a few, and the locations of those areas are themselves random. Another experiment: write down a random sequence of numbers. Chances are, especially if you’re pulling them from your head, that you’ll have very few clusters, but if you pull a random sequence from random.org, you’ll see noticeably more clusters.

Now, streaks are also expected in a random number generator that ISN’T truly random, and the only way to tell the difference is to actually run statistical analyses of how it behaves over various intervals. Bottom line, the VAST majority of random number generators are not truly random, as true randomness is extraordinarily difficult to achieve, but most simulate it well enough that it doesn’t matter for most applications once combined with the other variables.

And, as for statistical anomalies… well, those are expected as well. For instance, if you flip a coin a million times, the chance that it comes up heads every time is enormously small, but it IS possible even with a fair coin. Now say you flip it a very large number of times: the probability of a million-head sequence appearing somewhere approaches 1 as the number of flips approaches infinity. IOW, statistical anomalies, even ones of exceedingly low probability, WILL eventually happen over a large enough sample.

Of course, some of those streaks seem a little improbable to occur multiple times in such a small set. Is it possible there’s other dependencies that are not accounted for?

Whack-a-Mole · February 22, 2008, 7:43pm

Just a WAG here on my part but perhaps he is calculating not the chances of one occurrence but of that same thing occurring three times (or whatever it was) in that sample set.

Either way I guess that would be another way for me to pose the question. That streaks happen in a random series I get. But what are the chances of the streak re-occurring after a given number of tries? And how often can we say repeating streaks is fine and the random number generator can be considered to be truly random (as far as any computer generated number can be said to be random…no need for that discussion here)?

For instance, if we ran a series of 100 tries with a 50/50 chance of a 1 or a 0 coming up I doubt we’d call the generator random if it produces fifty 1’s then fifty 0’s even though it averaged to a 50% chance overall (I know it could but very unlikely and particularly if we run the test many times and see it re-occur with some regularity).
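
One standard way to put a number on that example is a runs test (Wald-Wolfowitz), which compares the number of runs in a sequence with what independent draws would give. The sketch below is only an illustration: the function name `runs_test` is made up here, and the z-score relies on a normal approximation that is reasonable for samples of this size.

```python
import math


def runs_test(bits):
    """Wald-Wolfowitz runs test on a sequence of 0s and 1s.

    Returns (observed_runs, expected_runs, z_score). A z-score far from 0
    means too few runs (clumpy) or too many runs (too regular).
    """
    n1 = sum(1 for b in bits if b == 1)
    n0 = len(bits) - n1
    if n0 == 0 or n1 == 0:
        raise ValueError("the sequence must contain both 0s and 1s")

    # A new run starts at every position where the symbol changes
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)

    n = n0 + n1
    expected = 2 * n0 * n1 / n + 1
    variance = 2 * n0 * n1 * (2 * n0 * n1 - n) / (n * n * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    return runs, expected, z


# Fifty 1's then fifty 0's: averages 50% but has only two runs
blocky = [1] * 50 + [0] * 50
print(runs_test(blocky))
```

For fifty 1's followed by fifty 0's there are only 2 runs where about 51 would be expected, and the z-score is far below -9, so the 50% average alone does not rescue it. A perfectly alternating sequence fails in the other direction.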

ultrafilter · February 22, 2008, 7:45pm

In a random (i.e., incompressible) binary string of length n, you should expect to see runs of length log_2(n). log_2(1543) = 10.6, so as long as you’re splitting data into two categories, you should expect to see 11 of the same category in a row at least once.
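
A small Python sketch of that rule of thumb. The generalisation to unequal odds (log base 1/p instead of log base 2) is an assumption added here, not something stated in the post, and both versions are order-of-magnitude guides, not exact results. The name `typical_longest_run` is illustrative.

```python
import math


def typical_longest_run(n: int, p: float = 0.5) -> float:
    """Rough length of the longest run of an outcome with probability p
    in n independent trials: log base (1/p) of n.

    For p = 0.5 this is the log_2(n) rule; other values of p are an
    assumed extension of the same heuristic.
    """
    return math.log(n) / math.log(1 / p)


print(round(typical_longest_run(1543), 1))        # 50/50 split
print(round(typical_longest_run(1543, 0.05), 1))  # 5% outcome, e.g. kill missions
```

With n = 1543 this gives about 10.6 for a 50/50 split and roughly 2.5 for a 5% outcome.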

Bobotheoptimist · February 22, 2008, 7:45pm

Didn’t von Neumann say, “Anyone who uses arithmetic methods to produce random numbers is in a state of sin”?

Whack-a-Mole · February 22, 2008, 7:51pm

Cool…interesting to know how that is calculated.

However, note that in the sample set with 1543 tries the chances were not 50/50 but rather 95/5, so in that light the streaks with the 5% chance are more improbable.

Whack-a-Mole · February 22, 2008, 7:54pm

So when assessing a computerized random number generator how is it judged “good enough” understanding it will never be totally random? I agree for most purposes very nearly random is fine but still there should be some criteria to determine when it suffices for the task at hand.

Jragon · February 22, 2008, 8:00pm

Well, really no random number generator is “random”, at least not one run by a computer (that I’ve seen). Generally how it works is it will take a number from the microprocessor’s clock, put it into an algorithm, shave off a few bits and spit out your number.

http://www.thedryeraseboard.com/compsci/algorithms/randomnumbers/

It’s not a long page so I’ll just quote it:

Ever wonder how your computer generates random numbers? Does it flip 16 coins deep deep inside the processor to give you your 16-bit int? No, not exactly. The actual process is quite simple. First start with some 32 bit number. Then each time you need a random number, multiply the number you already have by 214013 and add 2531011. Then return the first 16 non-sign bits of that. Take that new number and repeat that process each time you need a new random number.

This of course gives you the same random numbers each time you run a program, though. Therefore you need to “seed” the randomizer. When you seed the randomizer, you are giving it a different starting number. This number is taken from the microprocessor’s clock.

This random number generating algorithm is used in most modern computer systems. Now you have something to talk about at parties.
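
The recipe in that quote is a linear congruential generator, and it can be sketched in a few lines of Python. The class name `QuotedLCG` is made up for illustration. The multiplier and increment come straight from the quote; the bit extraction is an assumption, since "the first 16 non-sign bits" is ambiguous, and this sketch returns bits 16 through 30 of the state (15 bits). Whether the game in question uses anything like this is not known.

```python
import time


class QuotedLCG:
    """Linear congruential generator using the constants from the quoted page."""

    MULTIPLIER = 214013
    INCREMENT = 2531011
    MODULUS = 2 ** 32  # keep the state in 32 bits

    def __init__(self, seed: int) -> None:
        self.state = seed % self.MODULUS

    def next(self) -> int:
        # Advance the state: multiply, add, wrap around at 32 bits
        self.state = (self.state * self.MULTIPLIER + self.INCREMENT) % self.MODULUS
        # Assumed extraction: drop the low 16 bits and the top bit
        return (self.state >> 16) & 0x7FFF


# The same seed always gives the same "random" numbers
fixed = QuotedLCG(1)
print([fixed.next() for _ in range(5)])

# Seeding from the clock, as the quote describes, changes the sequence per run
clocked = QuotedLCG(int(time.time()))
print([clocked.next() for _ in range(5)])
```

Everything after the seed is fully determined, which is why such generators are called pseudo-random.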

So this is the expanded version of the above.

However, in spirit of your OP and assuming we have a mythical random number generator, streaks (or what appear to be) are to be expected occasionally in a truly random system (where potentially millions of people are asking for something random every second or minute).

Let me actually rearticulate that last point: there are tons of people making random number requests in a row. What you think is you asking the server 20 times for some number is really you making requests 1, 7, 12, 27, 93, etc., so you’re not getting a complete picture of the results.

Chessic_Sense · February 22, 2008, 8:01pm

If you have 1000 numbers in a row, where the choices are 1 or 0, there is a .5^10 chance that any particular stretch of 10 comes up all 1s. But the part that you’re missing is that there are about 990 places in the sequence where such a streak could start.

ZenBeam · February 22, 2008, 9:28pm

The OP is right, those streaks are not what would be expected from a truly random process. The probabilities quoted are correct for getting a streak of length N in exactly N tries. For example, for 8 kill missions in a row, if you run 8 missions, the probability of all 8 being kill missions is 0.00000000390625%. For 1543 missions, the probability is higher, but certainly not more than 1543 times higher, which would be 6.0273E-06 percent. Since a run that length happened not once, but twice, the RNG is almost certainly flawed.
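
The "not more than 1543 times higher" ceiling can be compared with an exact figure. The sketch below is one way to compute the chance of at least one run of k or more in n independent trials by tracking the length of the current run; the function name `prob_run_at_least` is illustrative, and the whole thing assumes independent trials with a fixed p.

```python
def prob_run_at_least(n: int, p: float, k: int) -> float:
    """Probability that n independent trials (success chance p each)
    contain at least one run of k or more consecutive successes."""
    # state[j] = probability that no run of k has occurred yet and the
    # current trailing run of successes has length j (0 <= j < k)
    state = [0.0] * k
    state[0] = 1.0
    hit = 0.0  # probability that a run of k has already occurred

    for _ in range(n):
        new = [0.0] * k
        new[0] = sum(state) * (1 - p)   # a failure resets the run
        for j in range(k - 1):
            new[j + 1] = state[j] * p   # a success extends the run
        hit += state[k - 1] * p         # run reaches length k
        state = new
    return hit


n, p, k = 1543, 0.05, 8
exact = prob_run_at_least(n, p, k)
ceiling = n * p ** k  # the "1543 times higher" bound from the post
print(f"exact: {100 * exact:.4e}%  ceiling: {100 * ceiling:.4e}%")
```

The exact value comes out just under the ceiling. The same function handles the earlier coin example, for instance `prob_run_at_least(1000, 0.5, 10)` for ten 1s in a row somewhere in 1000 tries.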

Jragon · February 22, 2008, 9:44pm

Rereading it, you and he are right. All I know is in many games, certain quests will be given more “priority” than others.

I.e., given a “random” number between 1 and 100:
1-60 will be “kill quests”
61-80 will be delivery quests
81-100 will be something else (if applicable)

It may be because the developers designate one as more fun than the other. Even so, I wouldn’t say they’re not “random”. Take one of those wheels with a spinner in the middle (think Twister): there’s simply a larger “area” for one type than for another. It’s random, it’s just not a perfect 1 in x chance.

To use an example from D&D, if you have an armor class of 16, you still roll a d20; assuming no bonuses, you simply have a window of 5 numbers in which you’ll hit. No one would argue these dice rolls aren’t random. It’s just that a larger window is allowed for one outcome, which lets long streaks (when reduced to “did” or “did not” hit) occur more frequently.

Indistinguishable · February 22, 2008, 9:46pm

There is an interesting thing which happens with judging a random number generator to be flawed or not. I mean, if the OP told us the entire sequence of successes and failures he observed in part 2, we could say “Oh, that was an event of probability 1/2^900. Incredibly unlikely!”. But we could say that no matter what the sequence was…

As it happens, we find some kinds of events more significant than others, as indicated by our willingness to make certain inductive inferences (but not others); that is to say, in our mind, our prior probability distribution, so to speak, for the sequence of numbers output by the generator is such as that observing a long streak of successes does increase our confidence in a following success, or other such things (when we observe the sorts of patterns that strike us as significant, we begin to assume that the generator has been engineered in such a way as that it will follow those patterns, rather than engineered in such a way as that it should behave ‘randomly’). This allows us to say “Oh, yes, given the observations so far, I am no longer willing to model this generator’s activity as ‘random’ (i.e., given by independent draws from a Bernoulli distribution).” But it’s not quite as simple as saying “Oh, something happened that, had this been a random distribution, was extraordinarily unlikely”, because, well, no matter what happens over many trials, the result is one with extraordinarily low probability of being given by a random distribution.

Or, as Feynman put it: “I had the most remarkable experience this evening. While coming in here, I saw license plate ANZ 912. (Calculate for me, please, the odds that of all the license plates in the state of Washington I should happen to see ANZ 912.)”

ZenBeam · February 22, 2008, 10:11pm

I tried coding this up in MatLab, and ran some simulations with P(kill) = 0.05 and 1543 events to see how long a streak I would get. In a hundred runs, the maximum streak was 2 in 85 runs, 3 in 14 runs, and 1 in one run. I never got a streak of 4.

Doing a sanity check, that’s roughly what you’d expect. Out of 1543 missions, you’d expect about 77 kill missions. Given 77 kill missions, you’d expect the next mission to be a kill mission about 4 times. Given 4 kill missions, you’d expect about a 20 percent chance that one of them would be followed by another kill mission. This isn’t the right way to estimate the probability, but rather is an overestimate, so getting a streak of 3 in 14 runs out of a hundred is believable.
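
The experiment is easy to repeat. What follows is a Python sketch of the same idea, not the MatLab code from the post: it uses Python's built-in `random` module, and the function names, the seed and the tallying are all choices made here for illustration. Exact counts will differ from the 85/14/1 split with a different generator or seed.

```python
import random
from collections import Counter


def longest_kill_streak(n_missions: int, p_kill: float, rng: random.Random) -> int:
    """Simulate n_missions independent missions and return the longest
    run of consecutive kill missions."""
    longest = current = 0
    for _ in range(n_missions):
        if rng.random() < p_kill:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a courier mission breaks the streak
    return longest


def tally(runs: int = 100, n_missions: int = 1543,
          p_kill: float = 0.05, seed: int = 2008) -> Counter:
    """Count how often each maximum streak length occurs over many runs."""
    rng = random.Random(seed)  # fixed seed so the tally is repeatable
    return Counter(longest_kill_streak(n_missions, p_kill, rng) for _ in range(runs))


# Prints (maximum streak, number of runs) pairs
print(sorted(tally().items()))
```

Raising `runs` well beyond 100 gives a better feel for how rare a maximum streak of 4 or more is under these assumptions.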

Indistinguishable, the problem you’re describing isn’t really relevant here, any more than if I said I flipped a coin and got heads 100 times out of 100, and was asking if you thought my coin was fair. Sometimes RNGs really are broken.

Indistinguishable · February 22, 2008, 10:22pm

Of course they sometimes are, and I even outlined on what basis we can take a generator’s output and use it to deduce “This probably isn’t random”, using our usual inductive logic. I’m just pointing out the curiosity that it’s not enough to simply say “Oh, that output which I observed is one which, had this been random, would have low probability of occurring”, in and of itself.

Autolycus · February 23, 2008, 2:09am

Since he saw it, wouldn’t it be 1? Am I missing the joke here? Stats makes my head hurt, but I try to force myself to read about it.

Indistinguishable · February 23, 2008, 2:36am

I’d say you’re kind of ultra-getting it, actually. Feynman makes the comment in the middle of a discussion of the meaninglessness of “calculating probabilities after the fact”. Well, I wouldn’t quite say it’s meaningless, but there is a danger in misunderstanding or overstating the usefulness of doing so.

Conditioned on the knowledge that Feynman would see ANZ 912, then, of course, the probability that Feynman would see ANZ 912 is 1. However, without conditioning on that knowledge (i.e., looking at the “prior” probability distribution), the probability of seeing ANZ 912 is ridiculously low. But the fact that he did end up seeing ANZ 912 doesn’t, in itself, cause us to say “Oh, that can’t be a random license plate” or any such thing. For whatever reason, it’s not the particular kind of low-probability event which causes belief revision; it doesn’t strike us as significant.
