Probability Question

OK, you number crunchers should be able to do this one with your temporal lobe anesthetized:

A container contains 52 balls, numbered, not surprisingly, from 1 to 52, thoroughly mixed. Five balls are removed from this pool without replacement, at random. What is the probability that any two of the five balls will have adjacent numbers? Ditto for any three of the balls? Ditto four and five balls?

The reason I’m asking relates to lottery games and the true “randomness” of numbers “picked by the machine”. Though I don’t play these games often, I have been astounded (and therefore slightly suspicious) at how often a “machine picked” ticket I have bought contains two adjacent numbers. Although I haven’t been keeping records (so far), my “recollection” is that about 7 tickets out of every 10 contain an adjacent duo and about 2 in 10 contain a triplet. I have gotten a quad and a quint each once in about 100 buys. Unless the above calculation (for two) approximates 0.7, I am forced to conclude that something is wrong in their random picking algorithm, and if so, the deck is being stacked against the random players. Who keeps these games honest anyway? Can Joe Citizen demand to see audit records?

I’m not particularly confident of my calculation, but for two balls out of five I got odds of around 35%. It gets more complicated for the odds of three balls, so rather than working harder I gave up.

I tried again, and rather than doing it the right way, I simplified it a little. 35% should be accurate for two balls, and I was able to get estimates that are less accurate for the others:

2 balls: 33%
3 balls: 3.8%
4 balls: 0.3%
5 balls: 0.01%

I did these without taking into account the fact that the same ball can’t be drawn twice, so they are slight underestimates of the real values. I also may have ignored a few other things, but these values should be close.

I was bored, so I kind of decided to stomp on this problem a little bit…anyway, suppose we let CN(n,r) be the number of ways to pick r elements from a set of n, without choosing any two adjacent elements. There are two cases: either the first element is chosen (in which case there are CN(n-2,r-1) ways to choose the remaining elements) or else the first element is not chosen (in which case there are CN(n-1,r) ways to choose the r elements from the rest of the set).

So CN(n,r) = CN(n-2,r-1) + CN(n-1,r). This lets me compute a generating function for CN, blah blah blah, anyway with a little help from Maple we get CN(52,5)=1712304. Since there are 2598960 possible choices in total, the chance of no adjacent numbers is 1712304/2598960, or about 65.8842%, which leaves a 34.1158…% chance of getting at least two adjacent numbers. Not quite the same result as quelquechose but in the same ballpark, and he’s estimating anyway…

(And if anyone’s really interested, in general CN(n,r) is the coefficient of a[sup]n[/sup]b[sup]r[/sup] in the Taylor expansion of (-ab-1)/(ba[sup]2[/sup]+a-1)…)
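For anyone without Maple handy, here is a minimal Python sketch of the same recurrence. The function name `cn` and the base cases (one way to choose nothing, no way to choose from an empty set) are my own assumptions; the post only gives the recurrence itself.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def cn(n, r):
    """Ways to choose r of n consecutive numbers with no two adjacent."""
    if r == 0:
        return 1  # assumed base case: one way to choose nothing
    if n <= 0:
        return 0  # assumed base case: nothing left to choose from
    # Either the first element is chosen (its neighbor is then off limits)
    # or it is not chosen.
    return cn(n - 2, r - 1) + cn(n - 1, r)


no_adjacent = cn(52, 5)
total = comb(52, 5)
print(no_adjacent, total)       # 1712304 2598960
print(1 - no_adjacent / total)  # about 0.341158
```

As a cross-check that is not part of the post above, the same count also equals `comb(48, 5)`, the standard closed form C(n-r+1, r) for this kind of selection.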

So, if I consistently get “twins” on 7 out of ten tickets (as I believe I have been), is somebody yanking my crank? To whom do I complain? Could I possibly expect a “Freedom of Information” request to give me accurate statistical information on past “games” or will they just claim “All information except that related to winning tickets is purged after each game.”

I came up with about 78% probability of getting at least one adjacent pair. I oversimplified a bit, though. I made the simplifying assumption that 52 is adjacent to 1. I also made the assumption that, if you already picked two non-adjacent numbers, there are 4 distinct numbers you could pick that would be adjacent to one of them. That’s not really true, because if the first two numbers were, for example, 5 and 7, there are only 3 numbers that would be adjacent to one of them. But, given those assumptions, the probability of drawing NO adjacent numbers would be:



49   46   43   35
-- * -- * -- * --
51   50   49   48


which is about 22%. So the probability of at least one adjacent pair would be 100% - 22% = 78%
BTW, why would you think they would be giving you a non-random ticket? What’s in it for them?

Not necessarily. It depends on the number of tickets you’ve observed. For example, if there’s a probability of 34.11% or so that one ticket will contain two or more adjacent numbers, and if you buy 10 tickets, then the probability is about 2.25% that at least 7 of those tickets will have two or more adjacent numbers. Which is not outside the realm of possibility; you may have simply gotten, um, “lucky”.

If you’ve observed this phenomenon on more tickets, of course, the probability of so many adjacent numbers drops.
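The 2.25% figure is a binomial tail, and it can be reproduced with a short sketch. The helper name `tail_probability` is made up for illustration, and the 100-ticket case is only a hypothetical to show how quickly the probability falls.

```python
from math import comb


def tail_probability(n, k, p):
    """Probability of at least k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p = 0.341158  # chance that one ticket has two or more adjacent numbers

print(tail_probability(10, 7, p))    # about 0.0225
print(tail_probability(100, 70, p))  # far smaller (hypothetical 70 of 100)
```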

Someone familiar with the laws of your state will have to answer that one. But unless there is specific advertising by the lottery to the effect that the machines operate completely randomly, I suspect you have no grounds for a complaint. Let the buyer beware, and so forth.

Why is that last term 35/48 instead of 40/48?

And even if it is 35/48, I get about 0.56 for the product of those four fractions, not 0.22. (If you use 40/48, the product is about 0.64, which is much closer to my exact answer.)
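A quick way to check the arithmetic on those four fractions, sketched in Python with exact fractions. The list names are mine; the numbers are the ones quoted above.

```python
from fractions import Fraction
from math import prod

# The four fractions as originally posted, and with 40/48 as the last term.
posted = [Fraction(49, 51), Fraction(46, 50), Fraction(43, 49), Fraction(35, 48)]
amended = posted[:3] + [Fraction(40, 48)]

print(f"{float(prod(posted)):.4f}")   # 0.5656
print(f"{float(prod(amended)):.4f}")  # 0.6464
```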

A simple take:

Say that after 4 balls are drawn that none of them are adjacent numerically to each other. There are now 48 balls left, of which 6, 7, or 8 of them are adjacent to one of the first four (4 balls, 2 neighbors each, minus 1 each for #1 and #52). That gives a probability of 12.5% to 16.7% that there are two balls adjacent numerically in a lot of five. Not most probable, but also not that unlikely.

For the chance that all 5 are in series: The possible series would be [1 2 3 4 5] [2 3 4 5 6] … [48 49 50 51 52]. Forty-eight series divided into 2,598,960 (the total number of lotto combinations) gives 1 in 54,145, or 0.0018469%.

There are actually quite a few people who play [1 2 3 4 5] (or similar combos) in the various lotteries because they know that one combo is just as likely to come up as another. But because there are multiple pickers of those combinations, if ever they won, they would have to split it so many ways that it wouldn’t even be a $1,000,000 jackpot for anyone.

So stick to your children’s birthdays. :slight_smile:

That estimate’s too low. Let’s simplify a bit: suppose that we’re only drawing three balls, not five. By your reasoning, the probability of two adjacent numbers is no greater than 4/50 or 0.08. But we can explicitly count the ways to get exactly two adjacent numbers: there are 51 possible positions for the pair, and either 49 positions for the singleton (if the pair is at the beginning or end) or 48 positions for the singleton (if the pair is in the middle). That adds up to 49+48*49+49=2,450 possibilities out of 22,100, which is 0.1108 or so.
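The three-ball count is small enough to check by brute force. This is just a sketch; the helper name `exactly_one_pair` is mine.

```python
from itertools import combinations


def exactly_one_pair(hand):
    """True if a sorted 3-ball hand has one adjacent pair but is not a triple."""
    adjacent = sum(1 for a, b in zip(hand, hand[1:]) if b == a + 1)
    return adjacent == 1


# combinations() on a sorted range yields each hand in ascending order.
hands = list(combinations(range(1, 53), 3))
pairs = sum(exactly_one_pair(h) for h in hands)

print(pairs, len(hands))   # 2450 22100
print(pairs / len(hands))  # about 0.1109
```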

The number of ways to get exactly two adjacent numbers is 778,320. I get that in the following way: instead of choosing 5 numbers out of 52 with two adjacent elements, I choose 4 elements out of 51 with no adjacent elements. (Imagine the pair merging into a single element.) There are 194,580 ways to choose 4 elements out of 51 with no adjacent elements, and for each of those possibilities there are 4 corresponding ways to choose 5 out of 52 with exactly one pair (depending on which of the 4 elements “expands” back into a pair). Finally, 4 × 194,580 = 778,320.

Anyway, out of the 2,598,960 ways to choose 5 elements, there are:
1,712,304 possibilities with no adjacent elements,
778,320 with one pair,
51,888 with two pairs and a singleton,
51,888 with one triple and two singletons,
2,256 with a triple and a pair,
2,256 with one quadruple,
and 48 with five adjacent elements.
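Here is one way to reproduce that table, as a hedged sketch rather than the method used above. It assumes a gap-counting shortcut that is my own generalization of the "merge the pair" trick: the unchosen balls leave a row of gaps, each run of adjacent numbers drops into its own gap, and the runs can appear in any distinct left-to-right order. The function names are made up, and the formula is checked by brute force on a small 12-ball pool before being applied to 52.

```python
from collections import Counter
from itertools import combinations, permutations
from math import comb


def run_pattern(hand):
    """Sorted run lengths of a sorted hand, e.g. (1, 2, 4, 7, 8) -> (1, 2, 2)."""
    runs, length = [], 1
    for prev, cur in zip(hand, hand[1:]):
        if cur == prev + 1:
            length += 1
        else:
            runs.append(length)
            length = 1
    runs.append(length)
    return tuple(sorted(runs))


def count_pattern(n, pattern):
    """Hands from 1..n whose run lengths match `pattern` (assumed formula)."""
    gaps = n - sum(pattern) + 1                    # slots between unchosen balls
    orderings = len(set(permutations(pattern)))    # distinct left-to-right orders
    return comb(gaps, len(pattern)) * orderings


patterns = [(1, 1, 1, 1, 1), (1, 1, 1, 2), (1, 2, 2), (1, 1, 3), (2, 3), (1, 4), (5,)]

# Sanity check of the formula against brute force on a 12-ball pool.
brute = Counter(run_pattern(h) for h in combinations(range(1, 13), 5))
assert all(brute[p] == count_pattern(12, p) for p in patterns)

counts = {p: count_pattern(52, p) for p in patterns}
for pattern, count in counts.items():
    print(pattern, count)
print(sum(counts.values()))  # 2598960
```

The patterns are listed in the same order as the table, so `(1, 1, 1, 2)` is "one pair" and `(1, 4)` is "one quadruple".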

Relative to AWB’s analysis after four non-adjacent balls, the number of neighbors ranges from 4 to 8. The figure of four assumes 1 and 52 aren’t adjacent.

Four neighbors if the first four balls were 1-3-5-7.

That would be a screwup :smack:

… and that would be another screwup :smack: in one post.

Y’know, I was a little troubled that our answers were so different. Not enough to make me check my work, though :slight_smile:

I still don’t think the lottery company is trying to cheat him, though.

Hmmmmm, I guess I thought this was going to be a more straightforward analysis, but taking the above numbers as “correct”, the answer to my OP seems to be 778,320/2,598,960 or approximately 0.3. So if I’m observing actual draws with approximately twice that probability (consistently), where is the breakdown? I should add that I have observed this phenomenon in several similar but independent games which may have differing pool sizes, i.e., Power Ball, Mega-Millions, etc., some with a 6th independent draw from a replenished pool of the same or different size, some that do not.
I’m not really trying to assign a sinister motivation for this to the lottery agencies, it could simply be a defective picking algorithm, but look at it this way: The probability that no adjacent numbers are drawn is 1,712,304/2,598,960 or about 0.66. Now I will postulate, without any proof at all, other than my hunch, that most “casual” lottery players will simply have the machine pick their numbers rather than going to the trouble of marking the play slip with their own “lucky” numbers (which would “probably” tend to be non-adjacent). If the agency “forces” large numbers of “random” number players to accept draws having only a 0.3 probability of being picked, then they increase the probability of there NOT being a (jackpot) winner. Since the jackpots “roll-over”, becoming larger and larger, attracting more and more attention, resulting in more people playing, buying greater numbers of tickets, the process becomes synergistic, decidedly to the advantage of the agencies. But how to go about proving all of this, I have no idea. While IANAL (despite my username), is there an actionable concept here? And what is the probability of achieving a payout? :smiley:

RedDawg

The last part of your post is incorrect. If the machines are picking adjacent numbers, it doesn’t affect the odds of winning.

Whatever numbers are picked or whatever the method used, the odds of winning are the same for every set of numbers.

This has been an interesting discussion, but the premise is flawed. You are assuming that the method used to pick your numbers somehow affects your probability of winning. In fact any given combination of numbers is as likely to win as any other combination. A combination with two adjacent numbers may have a small probability of winning when examined in isolation, but then again so do other combinations of numbers.

Let’s look at an artificially small example. This lottery has two numbers from 1-5, you have to pick them in the right order. You pick two numbers. If order counts, there are 20 ways to pick them. There are 8 ways to pick adjacent numbers. So 40% of the picks have adjacent numbers. That does not mean that if you get a pair of adjacent numbers, your chances of winning are 40%!! It means that the chance of the winning number containing two adjacent numbers is 40%. It also means that your chance of winning is always 1/20, no matter what numbers you pick, adjacent or not.
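The toy lottery is small enough to enumerate completely. A minimal sketch, with variable names of my own choosing:

```python
from itertools import permutations

# Every ordered pick of two different numbers from 1-5.
picks = list(permutations(range(1, 6), 2))
adjacent = [p for p in picks if abs(p[0] - p[1]) == 1]

print(len(picks), len(adjacent))   # 20 8
print(len(adjacent) / len(picks))  # 0.4

# Each ticket matches exactly one of the 20 equally likely winning draws,
# whether or not its two numbers are adjacent.
matches = {ticket: sum(1 for draw in picks if draw == ticket) for ticket in picks}
print(set(matches.values()))       # {1}
```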

A nod to aahala for declaring this principle while I previewed!

One last detail.

Are you certain about your recollection? Most of the time, I don’t notice insignificant events and just toss them out of the old memory. But if I see something like the NY pick 3 numbers on September 11, 2002 being 9-1-1, then my memory just files it away as “interesting”. Perhaps you just notice runs of 2 or more more often than they occur.

OK, I will accept the analyses of aahala and CookingWithGas, but have one final question: If the picking algorithm is perfectly fair and random, what explains the (personal) observation that adjacent number pairs are occurring at approximately twice the expected frequency of 0.3?

BTW, I am now keeping all my worthless tickets and will occasionally report my long-term experiences. Of three tickets purchased for last night’s (6/17/03) Mega-Millions Game, two contained an adjacent number pair, so we’re off at 0.66.
Others are welcome to do the same.

I took it you had noticed the adjacent numbers on the tickets purchased, not the WINNING numbers.

I don’t have a URL but imagine the winning numbers for many weeks are posted somewhere on the web. It might be interesting to look at such a listing and see how frequently adjacent numbers appear. Are they in the 30-35% range as others have calculated?

Can you post the numbers? If you really are getting 2/3rds of the tickets, we may be able to figure out why.

Here’s one possibility: If I randomly choose the first number, then randomly choose the second number to be larger than the first, the third larger than the second, and so on, I get about 63 percent of the valid sets chosen having adjacent numbers, and about 16 percent having three in a row. These are pretty close to your recollection. (I also have to repick the numbers over half the time because ball 52 gets picked in the first four picks, and I run out of balls.) These figures are based on running this case in MATLAB.
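For anyone who wants to try that scheme without MATLAB, here is a rough Python sketch of how I read it. Everything in it is an assumption on my part: each pick is uniform over the balls above the previous one, a set is discarded when no balls remain, and "adjacent" and "three in a row" are counted as "longest run of at least 2" and "at least 3". The original script may have tallied things differently, so the rates need not match the quoted figures exactly.

```python
import random


def increasing_pick(rng, pool=52, count=5):
    """Pick each ball uniformly above the previous one; None if balls run out."""
    hand, low = [], 1
    for _ in range(count):
        if low > pool:
            return None  # a previous pick was ball 52, nothing left above it
        ball = rng.randint(low, pool)
        hand.append(ball)
        low = ball + 1
    return hand


def longest_run(hand):
    """Length of the longest run of consecutive numbers in a sorted hand."""
    best = run = 1
    for prev, cur in zip(hand, hand[1:]):
        run = run + 1 if cur == prev + 1 else 1
        best = max(best, run)
    return best


def simulate(trials, seed=0):
    """Return (rate of runs >= 2, rate of runs >= 3, share of discarded sets)."""
    rng = random.Random(seed)
    pairs = triples = retries = valid = 0
    while valid < trials:
        hand = increasing_pick(rng)
        if hand is None:
            retries += 1
            continue
        valid += 1
        run = longest_run(hand)
        pairs += run >= 2
        triples += run >= 3
    return pairs / trials, triples / trials, retries / (retries + trials)


pair_rate, triple_rate, retry_rate = simulate(20_000)
print(pair_rate, triple_rate, retry_rate)
```

The point of the sketch is only that a picker which forces ascending order crowds the numbers toward the top of the pool, which pushes the adjacency rate well above the 34% expected from a uniform draw.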

Hrm. Well, I started looking at this this morning before I went out, so darned if I’m not gonna post anyhow. :smiley:

There’s every method of calculation here - except empirical! Thus I reveal to you the results of ten million simulated trials:
P is: 2 = 31.999630, 3 = 2.080970, 4 = 0.086780, 5 = 0.002110.
The above are percentages and are exclusive, i.e., 3 together doesn’t count as two. As an additional bonus though, the probability of at least two together any old way is 34.085660%.
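A simulation along these lines can be sketched in Python. This is not the code behind the figures above; it assumes each ticket is classified by its longest run of consecutive numbers (so two separate pairs count under "2"), uses far fewer trials, and all the names are mine.

```python
import random
from collections import Counter


def longest_run(hand):
    """Length of the longest run of consecutive numbers in a hand."""
    hand = sorted(hand)
    best = run = 1
    for prev, cur in zip(hand, hand[1:]):
        run = run + 1 if cur == prev + 1 else 1
        best = max(best, run)
    return best


def simulate(trials, seed=0):
    """Percentage of uniform 5-of-52 draws by longest run length (1 to 5)."""
    rng = random.Random(seed)
    balls = range(1, 53)
    tally = Counter(longest_run(rng.sample(balls, 5)) for _ in range(trials))
    return {k: 100 * tally[k] / trials for k in range(1, 6)}


results = simulate(200_000)
at_least_two = sum(results[k] for k in range(2, 6))
print(results)
print(at_least_two)  # should land near the exact 34.1158
```

With only 200,000 trials the last digit or two will wobble from run to run, so expect agreement with the exact counts to roughly a tenth of a percentage point.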

I’d be really surprised if the “random pick” numbers actually weren’t… I wonder if there’d be any reason at all for that.