Lottery and Ping Pong Balls -- Truly Random?

ERNIE, the machine used to select the UK premium bonds winning numbers, uses radioactive decay, I think, for the random factor.

This is getting into the realm of metaphysics rather than physics, but my understanding is that Mathochist is physically incorrect.

There is no way - even in principle - to precisely predict the outcome of a chaotic situation, i.e. one in which small changes in the original parameters lead to large changes in the results.

And the lottery example is a chaotic situation to an even greater extent than his pendulum example. No measurement can be made to sufficient precision to run a deterministic model of the events. And I don’t see how QM can possibly be ignored. The precision needed to forecast the outcome would necessarily be such that QM uncertainty would confound the numbers.

I can’t see how what Mathochist is saying is different from: if you take out everything that makes it indeterminable, it becomes determinable.

Can you try your explanation again?

Well, you’re right that there’s an amount of philosophy to it. I’ll try to restate what I mean.

Let’s say for the moment that classical mechanics is correct: Newton’s deterministic models and all.

If one could know the exact position and momentum of every particle in the lottery box (or the double-pendulum), one could exactly predict the future behavior. The fact is, though, that one cannot get this precise information. How bad is this?

Consider two pendula with slightly different starting conditions. Maybe both are started by dropping the bob from slightly different angles. For most choices of initial position and momentum for one pendulum, there is some (possibly very small) collection of “nearby” initial conditions whose future behavior stays close. That is, even if you don’t know exactly the initial conditions, your close measurement will stay close enough as time goes by.

Let’s go back to the double pendulum: now the situation is radically different. Given any initial condition A and any “distance” (in an appropriate sense), there is another initial condition B within that distance that behaves radically differently. Thus if you took a measurement with a margin of error greater than the distance between A and B you wouldn’t be able to tell which state the system started in, nor to predict how it would behave in the long run.

This is the point that keeps getting lost in discussions of chaos: the systems themselves don’t have to be that complicated or random. In principle, the lottery box is a perfectly well-behaved deterministic system which has sensitive dependence on initial conditions. To a perfect observer it’s merely computationally difficult (but in principle possible) to predict the future behavior from the initial conditions. To us imperfect observers it’s impossible to predict because any error in measurement leads to huge differences in future behavior.
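For anyone who wants to see that sensitivity in action, here is a minimal sketch. Note that it does not simulate a double pendulum or a lottery box (that would need a proper ODE integrator); it swaps in the logistic map, a standard one-line chaotic system, as a stand-in. The starting value 0.2 and the 1e-10 "measurement error" are arbitrary choices for illustration.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r*x*(1-x) and return every value visited, including x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two starting points that differ by one part in ten billion.
a = logistic_orbit(0.2, 60)
b = logistic_orbit(0.2 + 1e-10, 60)

# Print how far apart the two orbits are at a few points in time.
for n in (0, 10, 20, 30, 40, 50, 60):
    print(n, abs(a[n] - b[n]))
```

The rule is fully deterministic, yet the gap between the two orbits grows until they have nothing to do with each other, which is the "imperfect observer" problem in miniature.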

To carry further what ftg said: There’s a big difference between being truly random, and being unpredictable. If you took any set of lotto balls, and ran them through the machine a zillion times, it’s quite possible that a meaningful pattern would emerge. Minute differences in ball size, shape, weight, and surface roughness could bias the results in very subtle ways.

But what is important in Lotto is that the outcome is unpredictable, even if the ball set will not produce truly random results. To this end, the balls are only used once (or maybe a handful of times) before being retired. They are also carefully inspected after each drawing to make sure that they are within accepted tolerances for shape, size, etc. There are numerous sets of balls, and they are picked randomly. Therefore, the public never gets to see the ball set used enough times to draw any statistical conclusions of bias.

In slot machines, roulette wheels, and other mechanical devices that need to produce random results, a number of techniques are used. The first is to use a high-quality RNG with a period large enough that it is almost impossible to statistically determine patterns that can be exploited. The second is to (sometimes) introduce a truly random seed through some physical process - the time delay between pulls of a handle, noise on the AC line that powers the machine, etc.

A couple of fun stories about screwups in generating random numbers:

The first involves an online poker site called “Planet Poker”. Early versions of the software for that site used a stock random number generator from, I believe, Borland’s C compiler. This particular RNG only had a period of 2^32, or about 4 billion. But the developers also initialized it with a call to Randomize(), which uses the number of milliseconds from midnight. That means there were only about 86 million possible decks (when the real number of possible decks is 52!, or about 2^226).

86 million possible decks is not that many. So some enterprising cheats wrote a program to pre-generate all 86 million possible decks and store them in a database (actually, due to some other flaws in the way the deck was shuffled, the true number of possibilities was only a few hundred thousand). Then, as the game starts and cards are shown, the program can start filtering out decks that don’t match the pattern of cards seen. After seeing about five cards (the cheater’s own two plus the three on the flop), the cheat program can almost always narrow the possible decks down to one, at which point it knows which cards everyone else holds, and which cards will come up on the turn and river. Profit$$.
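A rough sketch of the filtering idea, under loud assumptions: Python's `random.Random` stands in for whatever RNG the site really used, the seed space is shrunk to a hypothetical 20,000 so it runs in a few seconds, and the "seen" cards are simply taken from fixed deck positions rather than modeling a real deal. None of this is the actual exploit code.

```python
import random

SEED_SPACE = 20_000   # hypothetical; the flaw described above allowed ~86 million

def deck_for_seed(seed):
    """Return the shuffled deck (cards numbered 0-51) that a given seed produces."""
    deck = list(range(52))
    random.Random(seed).shuffle(deck)
    return deck

def candidate_seeds(seen):
    """Return every seed whose deck agrees with the observed cards.

    `seen` maps a deck position to the card observed at that position.
    """
    matches = []
    for seed in range(SEED_SPACE):
        deck = deck_for_seed(seed)
        if all(deck[pos] == card for pos, card in seen.items()):
            matches.append(seed)
    return matches

# Pretend the server used this seed, and we saw five cards from its deck.
secret_seed = 12_345
true_deck = deck_for_seed(secret_seed)
seen = {pos: true_deck[pos] for pos in (0, 1, 10, 11, 12)}

found = candidate_seeds(seen)
print(found)
```

Five known cards can be arranged in over 300 million ways, so with a small seed space the list of surviving seeds is almost always down to the real one, and from there the whole deck is known.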

Another interesting flaw occurred at a Canadian casino. The casino owners purchased an electronic keno game from a reputable American supplier of such games to the casino industry. Unfortunately, this particular machine reset its seed every time it was powered up. This was never an issue in American casinos which are open 24 hours a day. But in Canada, the casinos at the time closed every night. So each day, the machine was powered on again, and started with the same seed. An enterprising math student who happened to be there at opening time happened to notice that the first game of the day always had the same result. He bet on it and won a fortune. Then he got greedy, and did it again. The odds on the same guy winning a keno jackpot were so astronomical that the game was shut down and an investigation ensued. The player was eventually charged with cheating and was told to give all the money back - however, he won in court. The court ruled that he had done nothing wrong. He had simply observed the game, and bet based on his observations. No manipulation or trickery was involved, and it was the casino’s job to make sure their games were fair. Needless to say, the game was taken offline until the problem could be fixed.
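The failure mode is easy to reproduce. This is only an illustration: the seed value is made up, Python's `random` module stands in for the machine's RNG, and the 20-numbers-from-80 draw is the usual keno format rather than anything confirmed about that particular machine.

```python
import random

POWER_ON_SEED = 12345   # hypothetical fixed value the machine resets to at boot

def first_game_after_power_on():
    """Simulate the first keno draw after the machine is switched on."""
    rng = random.Random(POWER_ON_SEED)            # same seed at every boot
    return sorted(rng.sample(range(1, 81), 20))   # 20 distinct numbers from 1-80

monday = first_game_after_power_on()
tuesday = first_game_after_power_on()
print(monday == tuesday)
```

Same seed in, same "random" numbers out, every morning.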

There are all kinds of statistical tests that can be performed to measure how random a set of random numbers is. For example, if there are 50 balls, you would expect each ball to come up first about 2% of the time. If, over a thousand drawings, the number “1” comes up first 50 times, there’s an indication that there might be a problem.
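To put a number on that example, here is a small sketch of one such test, an exact binomial tail probability. It assumes each of the 50 balls is equally likely to come up first and that drawings are independent; the function name is just for illustration.

```python
from math import comb

def binom_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n, p = 1000, 1 / 50          # 1000 drawings, 50 equally likely balls
expected = n * p             # about 20 first-place appearances per ball
tail = binom_tail(n, 50, p)  # chance a fair ball comes up first 50+ times

print(f"expected count: {expected:.0f}")
print(f"P(count >= 50): {tail:.2e}")
```

A fair ball showing up first 50 times when 20 is expected is far out in the tail, so it would indeed be a red flag.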

I suspect that most big lotteries employ statisticians to run tests on the winning numbers to make sure that they are random.

I agree to the extent that a physical process does seem to be more reassuring on that point. If done properly though, a physical process is satisfactory IMHO.

Although I remember my Statistics in Manufacturing professor in college pointing out that the United States’ draft lottery during the Vietnam era has been subjected to statistical analysis and has been determined to have been biased. Apparently the balls or chips or whatever they used were not sufficiently mixed.

Oh, I don’t know. The population of lottery players probably overlaps the population of slot and keno players, and they don’t seem to mind the computer-generated nature of the wins. For that matter, video lottery terminals are a huge addiction problem in some places.

If you drew the lotto by having some jumbotron spin the numbers up around animated rockets or something equally spectacular, I’ll bet the public wouldn’t care one bit that it was computer generated.

The Clockwork Universe was very popular for a couple of centuries after Newton.

But the uncertainty principle killed it absolutely dead 75 years ago. I’m still unclear why you would bring it up. There are no perfect observers, and there can be no perfect observers even in theory.

In practice, the computational impossibility is enough to provide statistical randomness without calculating QM effects. Your initial conditions are always and forever unknown to a sufficiently precise degree, so even a single outcome cannot in practice be calculable, let alone a succession of them. The underlying QM nature of the universe compounds this and makes the impossible more impossible.

That’s exactly why I brought up the deterministic limit: because unpredictability can be shown even within that system without bringing up quantum theory at all.

You know, the great thing about the SDMB is that I can read a comment, say, “I wonder if that’s true?”, do some research, and come up with something completely new. To wit: the Poincaré-Bendixson Theorem states that chaotic behaviour can’t occur in a system if the dimension of its phase space is less than three. Since the phase space includes both position and momentum degrees of freedom, though, all this is saying is that a one-dimensional system can’t exhibit chaos. And even that goes out the window if you add time-dependence of the equations of motion (e.g. a pendulum with a periodic driving force is chaotic.)

Note that having a phase space dimension of three or greater doesn’t guarantee chaos, though; you also need some non-linearity in the system, and even then it’s not certain.

A damped driven pendulum can be chaotic, depending on the parameters. I used up tons of my undergraduate university’s computer time making movies of period-doubling cascades for Jim Yorke back when I thought I might make my name in chaos.

Anyhow, it’s a great example of a perfectly simple classical mechanical system that still is completely unpredictable for certain parameter values.

Another reason they still use ping-pong balls is because it’s more fun than using computer-generated numbers. It’s just a little more entertaining to see the balls bounce around, giving you short glimpses of “your” number, before the final number pops up into a hole. Nobody’d be amused by a camera zooming in to a computer screen as a hand reaches to the keyboard and presses “enter”.

And, part of the draw for the lottery is entertainment. That’s why most of the games, particularly the scratch-off games, have wild and wacky names and themes. They could just as easily print gray tickets that say “Win” or “Lose” on them. But, that wouldn’t be as much fun.

I once made $800.00/mo. for 3 months using the non-randomness of a fantasy 5 daily lottery. I won’t mention the state. Nothing is truly random. The balls aren’t exactly the same precise weight, numbers with 2 numerals may have more ink weight than others and be heavier. Or, if the numbering is done with a laser, some numerals may be burned a little deeper than others and be lighter. Anyway, a sure way to tell is to take a page out of any college undergrad’s statistics book and do a histogram on the winning numbers of about 200 drawings. In this particular lottery 4 numbers occurred 3 times as often as the others, 4 other numbers occurred 2 times as often, and 3 or 4 occurred half as much as the norm. This went on for a while before it was fixed. Either somebody connected with the lottery was playing games with the balls, or just possibly sloppy management. Anyway, it got fixed and there went my easy money!

Just for the record:

http://en.wikipedia.org/wiki/1980_Pennsylvania_Lottery_scandal

I’m kind of surprised “don’t bother to turn the machine off at night” didn’t come up as solution #1.

Undead nature of the thread aside, some very simple googling will reveal measures that various lotteries use to guarantee randomness, including changing out the balls themselves and using different machines. For example, here’s a page showing previous results from the Texas Lotto, which includes a code designating which particular machine and which set of balls was used for each drawing. There is also a set of pre-tests to ensure the balls are good and actually randomized.

As for the other point, with as few samples as 200 drawings, some numbers will naturally crop up more often than others. If there are 49 numbers, that’s only an average of 4 appearances per number.

It would not be surprising at all to see some numbers show up 3 or 4 times as often as others. For example, if a number only showed up twice (half the average) and another showed up 8 times (twice average), you already have the situation that one number shows up 4 times as often as the other.

If, after 200 drawings, each number showed up pretty close to the average number of times, I would actually suspect a cheat. It’s statistically unlikely. That would not be the case after 2000 drawings, though, when you get enough of a law of large numbers effect to really push things to the average.
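A quick simulation makes the point. This is a sketch under the same simplifying assumptions as the arithmetic above: 49 equally likely numbers and one number counted per drawing, 200 drawings in all. The seed 2024 is arbitrary and only there so the run is repeatable.

```python
import random
from collections import Counter

rng = random.Random(2024)   # arbitrary fixed seed so the run is repeatable

# 200 drawings, one fair pick from 49 numbers each time.
counts = Counter(rng.randint(1, 49) for _ in range(200))
per_ball = [counts.get(ball, 0) for ball in range(1, 50)]

print("average appearances:", sum(per_ball) / 49)
print("fewest appearances: ", min(per_ball))
print("most appearances:   ", max(per_ball))
```

Try a few different seeds: a perfectly fair draw routinely produces "hot" and "cold" numbers over a sample this small.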

Winning $800 a month for 3 months? Also not surprising it happens to somebody through sheer luck. Confirmation bias is a dangerous bias.

Think about flipping a coin 200 times. There are a couple things I can almost guarantee (vernacular “guarantee” rather than mathematical “guarantee”).

  1. You will NEVER see exactly 100 heads and 100 tails. That is, some numbers showing more or less often than average is perfectly reasonable and to be expected. And, in fact, with fewer drawings, more variation from the average is to be expected. For example, with 8 flips, seeing 6 heads/ 2 tails (i.e. 3 to 1 ratio) is not very surprising. But with 80 flips, seeing 60 heads/20 tails is surprising.

  2. You will see a run of at least 6 in a row (either heads or tails) at some point in those 200 flips*. That is short term variation in long term behavior that seems non-random is actually a pretty good indicator of randomness. If you don’t see such behavior (i.e. you see, at most 2 or 3 heads/tails in a row), it’s almost sure to be a cheat.
    *Aside: this is a pretty neat way of physically demonstrating the seemingly paradoxical behavior of truly random events, in the same spirit as Benford’s Law (which, strictly speaking, is about leading digits rather than runs). Here’s a recounting of that trick used to demonstrate that point to some college students. The same principle can be used to judge whether polling results have been faked, as Nate Silver has often demonstrated at 538.
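The claim about runs can be checked exactly rather than taken on faith. Here is a minimal sketch, assuming a fair coin and independent flips, that tracks the length of the current run and computes the chance of ever reaching a run of a given length. The function name is mine, and it expects a run length of at least 2.

```python
def prob_run_at_least(n_flips, run_len):
    """Exact P(some run of >= run_len identical results in n_flips fair flips).

    Requires n_flips >= 1 and run_len >= 2.
    """
    # state[r] = probability that the current run has length r
    # (1 <= r < run_len) and no run of run_len has occurred so far.
    state = [0.0] * run_len
    state[1] = 1.0                    # after the first flip the run length is 1
    for _ in range(n_flips - 1):
        nxt = [0.0] * run_len
        for r in range(1, run_len):
            nxt[1] += 0.5 * state[r]              # next flip differs: run restarts
            if r + 1 < run_len:
                nxt[r + 1] += 0.5 * state[r]      # next flip matches: run grows
            # if r + 1 == run_len the long run has happened; that mass drops out
        state = nxt
    return 1.0 - sum(state)

print(prob_run_at_least(200, 6))
```

For 200 flips and a run of 6 the result comes out well above 90%, which is why a hand-faked sequence with no long runs stands out.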

Threads coming back to life is a random event.

Assuming, let’s say, we are flipping a coin 100 times and on each flip there truly is a 50-50 chance of heads or tails… Mathematically, there are 2^100 possible outcomes (2^10 ~ 10^3, so about 10^30). Only one of these is all heads, and only one of these is all tails.

The number of ways to get 50 heads and 50 tails, if we don’t care about order, is 100!/(50! 50!)
The number of ways to get 49 heads and 51 tails (or vice versa) is 100!/(49! 51!)
The ratio of these is: (49! 51!) / (50! 50!) = 51/50 - not too bad.
But if we compare a 46-54 split against 50-50, the ratio is (46! 54!) / (50! 50!), which gives:
(51x52x53x54)/(47x48x49x50) = 1.373, or about 7:5

the odds of 45-55 (versus the odds of 50-50) are about 1.64
the odds of 44-56 are about 2:1

you can see it falls off quickly away from the mean.
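For anyone who wants to double-check those ratios without grinding through factorials, a short sketch using exact integer arithmetic (the helper name is just for illustration):

```python
from fractions import Fraction
from math import comb

def ratio_to_even_split(heads, flips=100):
    """How many times more likely an exact even split is than `heads` heads."""
    return Fraction(comb(flips, flips // 2), comb(flips, heads))

for heads in (49, 46, 45, 44):
    r = ratio_to_even_split(heads)
    print(f"50-50 vs {heads}-{100 - heads}: {float(r):.3f}")
```

The 49-51 case reduces to exactly 51/50, and the others agree with the hand calculation to the digits shown.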

(Assuming I’m right. Math was never my strong suit, and combinatorics was especially confusing).
And so on…

Your analysis is pretty much spot on.

Of course, with a coin flip, 50 heads/50 tails is still the most likely outcome. But getting EXACTLY the average of 50/50 is more rare.

The probability is, as you noted, C(100,50)/2^100 = 7.96%.

For 10 flips, getting exactly 5 heads is 24.6%

In the example I gave (200 flips), getting EXACTLY 100/100 has a probability of 5.6%. That’s not really impossible or even all that unlikely (about 1 in 18 or so) but it’s not very likely either.

On the other hand, getting in a certain range (say 40-60 heads) is very likely.

This is leading to a law of large numbers discussion.

If you have 10 flips, the probability of getting 4-6 heads (i.e. 40-60% heads) is 65.6%.

If you have 100 flips, the probability of getting 40-60 heads (i.e. still 40-60% heads) is even higher at 96.5%.

So, with increasing flips, the probability of getting EXACTLY the average outcome (i.e. half heads and half tails) actually gets smaller. But the probability you fall within a small range about the mean (say between 45 and 55% heads) gets larger with more flips.
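These figures are all straight binomial sums, so they are easy to reproduce. A minimal sketch, assuming a fair coin (the function name is mine):

```python
from math import comb

def prob_heads_between(n, lo, hi):
    """P(lo <= heads <= hi) in n fair coin flips, computed exactly."""
    return sum(comb(n, k) for k in range(lo, hi + 1)) / 2**n

# Chance of hitting the average exactly shrinks as the flips go up...
for n in (10, 100, 200):
    print(f"{n} flips, exactly half heads: {prob_heads_between(n, n // 2, n // 2):.4f}")

# ...while the chance of landing within 40-60% heads grows.
for n in (10, 100):
    lo, hi = (4 * n) // 10, (6 * n) // 10
    print(f"{n} flips, 40-60% heads: {prob_heads_between(n, lo, hi):.4f}")
```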

But also notice that in that run of 200 flips, you are also still virtually guaranteed to see a run of 6 in a row. It seems paradoxical, but that’s the math.

Where does the law of large numbers come into play? Well, let’s say you flip a coin 1000 times and it comes up heads 600 times. Now, let’s say you flip the coin 999,000 more times (to give us a round million) but these flips come up 10,000 more heads than tails.

For the 1000 flips, we have 200 more heads than tails (600 vs 400) and a heads/tails ratio of 60/40.

For the 1 million flips, we have 10,200 more heads than tails (505,100 vs 494,900) and a heads/tails ratio of 50.51/49.49.

So, even though there is now an even wider variation between the number of heads to tails, by percentage, things are much closer to the expected 50/50. That’s basically the gist of the law of large numbers.
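Working through the arithmetic of that example (600 heads in the first 1,000 flips, then 10,000 more heads than tails over the next 999,000) as a quick sketch; the variable names are just for illustration:

```python
first_flips, first_heads = 1_000, 600
later_flips, later_excess = 999_000, 10_000   # excess = heads minus tails

# Split the later flips so that heads - tails equals the stated excess.
later_heads = (later_flips + later_excess) // 2
later_tails = later_flips - later_heads

heads = first_heads + later_heads
tails = (first_flips - first_heads) + later_tails
total = heads + tails

print("gap after 1,000 flips:    ", first_heads - (first_flips - first_heads))
print("gap after 1,000,000 flips:", heads - tails)
print(f"heads share after 1,000:     {first_heads / first_flips:.2%}")
print(f"heads share after 1,000,000: {heads / total:.2%}")
```

The absolute gap between heads and tails gets bigger while the percentage drifts toward 50%, which is the whole point.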

The same is true for the lottery ball distribution. With only 3 months of drawings, it’s not hugely surprising to see individual balls occasionally fall outside an expected distribution, even though the overall behavior remains consistent. But with several thousand more drawings, even if ball outcome variations increase, I would expect the overall percentage to come closer to the expected range.

There are a ton of people who claim to have secret “systems” which use those discrepancies in actual versus expected behavior to “predict” that some balls will show up in the near future with higher probability. They’re pretty much grifters taking advantage of the fact that most people don’t understand the law of large numbers.

The 3 month “artifact” mentioned above certainly has the feel of this.

The reality is that if there WAS a pattern to ball selection, someone would have already found it and would use it to get rich. Has this ever happened?

Well, sort of. The lottery scandal linked above shows a group of people colluding to rig the odds and pick numbers accordingly.

And there’s always the very basic argument against why anybody would choose to come forward if they did find such a method - if they get rich based on some clever trick, they’d be a fool to announce the method and should do everything possible to avoid being caught. It would be better if they posed as just another lucky random winner.

I doubt such a method exists, but the absence of evidence is not the same as the evidence of absence.

They would be smart to do this while they can, but if they continue winning, eventually someone running the lottery will catch on, and will find and fix the problem. How long could they keep it up before being found out?

Once the problem is fixed, there’s no need to keep quiet any more. Heck, they could get a good book deal. More profit!