I heard some people who regularly buy lottery tickets saying that it’s interesting how often two (or possibly more) of the numbers selected are consecutive. What are the odds that, in selecting numbers from 1-53 with no duplications, at least two of them will be consecutive? (Not necessarily selected consecutively, though.)
That’s a tough question.
I’m not sure how to approach the problem from an analytical standpoint, but C(53,5) is a fairly small number, so brute force is fine and dandy here. There are 2,869,685 possible outcomes, and 962,801 of them have at least two consecutive numbers (not considering the order in which the numbers were picked). That works out to a probability of a little over .33.
For those who might want to check for errors, here’s my code:
#include <cstdio>  // printf
// Code taken from http://doodleproject.sourceforge.net/algorithms/
bool enumerateCombinations(int n, int k, int j[])
/*-------------------------------------------------------------------
Description:
Enumerates all possible combinations of choosing k objects from n
distinct objects. Initialize the enumeration by setting j[0] to a
negative value. Then, each call to enumerateCombinations will
generate the next combination and place it in j[0..k-1]. A
return value of false indicates there are no more combinations to
generate. j needs to be allocated with a size for at least k
elements.
Language:
C++
Usage:
int comb[10] = {-1};
while(enumerateCombinations(10, 5, comb)){
// do something with comb[0..4]
}
Reference:
Tucker, Allen. Applied Combinatorics. 3rd Ed. 1994.
-------------------------------------------------------------------*/
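{
    int i;
    if(j[0] < 0){
        // First call: start with the lowest combination 0, 1, ..., k-1.
        for(i = 0; i < k; ++i){
            j[i] = i;
        }
        return true;
    } else {
        // Find the rightmost position that can still be incremented.
        for(i = k - 1; i >= 0 && j[i] >= n - k + i; --i){}
        if(i >= 0){
            j[i]++;
            int m;
            for(m = i + 1; m < k; ++m){
                j[m] = j[m-1] + 1;
            }
            return true;
        } else {
            return false;
        }
    }
}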
void insertionSort(int array[], int lo, int hi)
/*-------------------------------------------------------------------
Description:
Sorts the elements of array[lo..hi] into ascending numerical order
using an insertion sort.
Language:
C
Usage:
insertionSort(array, 3, 6); Sort elements 3 through 6 of array.
-------------------------------------------------------------------*/
{
    int i;
    int j;
    int m;
    for(i = lo + 1; i <= hi; ++i){
        m = array[i];
        j = i - 1;
        // Shift larger elements one slot right to make room for m.
        while(j >= lo && array[j] > m){
            array[j + 1] = array[j];
            --j;
        }
        array[j + 1] = m;
    }
}
// My code
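void check( int j[], int n, unsigned long& x )
{
    // j[0..n-1] must be sorted ascending. Count the combination once if
    // any two neighbouring entries differ by exactly 1.
    bool b = false;
    for ( int k = 0; k < n - 1; ++k )
        if ( (j[k + 1] - j[k]) == 1 )
            b = true;
    if ( b )
        x++;
}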
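int main()
{
    unsigned long x = 0;
    int j[5] = { -1 };  // a negative j[0] tells enumerateCombinations to start from the beginning
    while ( enumerateCombinations( 53, 5, j ) )
    {
        insertionSort( j, 0, 4 );
        check( j, 5, x );  // n is the number of entries, so all four gaps get tested
    }
    printf( " X: %lu\n", x );
    return 0;
}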
First of all, events with a low probability that “look cool” (like the clock saying 12:34 or 11:11) have a much higher perceived probability, because you notice them and remember them. No one says “Wow! The average difference between any two pairs of winning lottery numbers for this week was 6.75, just like it was two weeks ago!” because your mind doesn’t automatically notice things like that.
Now, without using a computer (because I don’t know how), let’s say the first number you pick is n. Now, of the 52 possible numbers you could choose for your second number, (n-1) or (n+1) will form a consecutive pair. (Let’s say that 1-1=53 and 53+1=1) Already we have a probability of 2/52 = 0.0385. But assume you chose one of the other 50 numbers (m) instead. There are now three or four numbers that will form a pair, (n+1), (n-1), (m+1), and (m-1). In the case that |n-m| = 2 (which happens with probability 2/51 = 0.0392), there are only three. See where I’m going with this? If someone hasn’t already, I’ll finish up this line of thinking in a day or two.
Yes indeedy. The thing is that you need to take into account the case where some numbers are distributed close to (i.e., only one number away from) other numbers. I think I can give an upper and lower bound, though (and ultrafilter can check me if I’m wrong).
Assuming all numbers wind up distributed with exactly one number in between (1-3-5-7-9, for example), I get the chance of NOT getting consecutive numbers as (51*49*47*45*43)/(52*51*50*49*48), and the chance of getting consecutive numbers as about 27.1%. This should be a lower bound.
Assuming all the numbers get distributed with more than one number in between, I get the chance of NOT getting consecutive numbers as (50*47*44*41*38)/(52*51*50*49*48), and the chance of getting consecutive numbers as 48.35%. This should be an upper bound.
I suspect the real answer should lie closer to the upper bound than the lower bound, as numbers typically tend to not all be clustered. I’ll take a SWAG at 44%.
I’m curious what the actual answer is, though.
I believe the answer is about 33%.
I thought this exact question was asked and correctly answered sometime last year, but I can’t find it with search.
This person says that the answer for a 45 number lottery is 40/45C6
http://www.aims.ac.za/~discus/discus/messages/8/178.html?1064862906
http://silver.sdsmt.edu/~rwjohnso/module5.htm
Apparently, betting consecutive numbers is a good way to avoid sharing one’s winnings.
http://news.bbc.co.uk/1/hi/sci/tech/240734.stm
But according to this source, no winning ticket has ever contained six consecutive numbers:
Well, if you’d like another answer…
I like to do these things statistically since it requires no clever thinking, which means it’s harder to mess up. Anyway, I just ran 10 million lotteries and asked how often I got at least 2 consecutive numbers. My answer:
prob = 0.46560 +/- 0.00016
where the error is due to finite statistics.
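A simulation along these lines might look like the C++ sketch below. This is not the code that produced the number above; the function names, the seed parameter, and the partial-shuffle approach are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// True if, once sorted, the draw contains two numbers that differ by exactly 1.
bool hasAdjacentPair(std::vector<int> draw)
{
    std::sort(draw.begin(), draw.end());
    return std::adjacent_find(draw.begin(), draw.end(),
                              [](int a, int b) { return b - a == 1; }) != draw.end();
}

// Estimates the chance of at least one adjacent pair when `picks` distinct
// numbers are drawn from 1..poolSize, using `trials` simulated lotteries.
double estimateAdjacentPairChance(int poolSize, int picks, int trials, unsigned seed)
{
    std::mt19937 rng(seed);
    std::vector<int> pool(poolSize);
    std::iota(pool.begin(), pool.end(), 1);  // fill with 1, 2, ..., poolSize
    int hits = 0;
    for (int t = 0; t < trials; ++t) {
        // Partial Fisher-Yates shuffle: after this loop the first `picks`
        // slots hold a uniformly random draw without replacement.
        for (int i = 0; i < picks; ++i) {
            std::uniform_int_distribution<int> pickIndex(i, poolSize - 1);
            std::swap(pool[i], pool[pickIndex(rng)]);
        }
        if (hasAdjacentPair(std::vector<int>(pool.begin(), pool.begin() + picks)))
            ++hits;
    }
    return static_cast<double>(hits) / trials;
}
```

Calling estimateAdjacentPairChance(53, 6, 10000000, 1) would mimic the 10 million lotteries mentioned above; the statistical error shrinks roughly as one over the square root of the number of trials.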
It is just as likely that two selected numbers will be consecutive as it is that they will be any other two numbers that are selected in advance.
By the way, I assumed you meant a 6-ball lottery…
Anyway, here’s what I get for the probabilities of getting at least N consecutive numbers, for all six values of N. (100 million lotteries run. Again, errors shown are due to finite statistics.)
P(1) = 1.0 +/- 0.0
P(2) = 0.465583 +/- 0.000068
P(3) = 0.040926 +/- 0.000020
P(4) = 0.0024483 +/- 0.0000049
P(5) = 0.00009873 +/- 0.00000099
P(6) = 0.00000202 +/- 0.00000013
It’s easy enough to verify the lower numbers in Pasta’s table analytically. (I also assumed a six-ball game.)
There are 48 possible outcomes with six consecutive numbers, running from 1-6 through 48-53. So:
P(6) = 48 / 22,957,480 = 0.00000209
A string of 5 can begin with 1 through 49, with the sixth ball being any of 48 choices. But this double-counts the strings of 6. So:
P(5) = 49*48 / 22,957,480 - P(6) = 0.00010036. This is about one and one-half standard deviations greater than Pasta’s number.
A string of 4 can begin with 1 through 50, with the fifth and sixth balls being anything. This double-counts the strings of 5 and triple-counts the strings of 6. So:
P(4) = (50*49*48/2) / 22,957,480 - P(5) - P(6) = 0.002459
By extension,
P(3) = (51*50*49*48/6) / 22,957,480 - P(4) - P(5) - P(6) = 0.04098
However, this is a slight error, even though it agrees with the simulation. It double-counts the small epsilon of outcomes consisting of two discontinuous strings of 3. There are only about 1,000 of these, so it doesn’t much skew the answer.
But for P(2), there are loads of two-pair outcomes, and my methodology breaks down. Perhaps someone else can finish it.
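One way around the double-counting, for P(2) and every other run length, is to count the complement with a small dynamic program: walk through the numbers 1..53 in order and track how many balls have been chosen and how long the current run is. The C++ sketch below is a swapped-in technique rather than an extension of the method above, and its function names are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Number of ways to choose `picks` numbers from 1..poolSize so that the
// longest run of consecutive numbers is SHORTER than runLength (runLength >= 1).
std::uint64_t countWithoutRun(int poolSize, int picks, int runLength)
{
    using Table = std::vector<std::vector<std::uint64_t>>;
    // ways[c][r]: selections among the numbers seen so far with c numbers
    // chosen and a trailing run of exactly r chosen numbers (r < runLength).
    Table ways(picks + 1, std::vector<std::uint64_t>(runLength, 0));
    ways[0][0] = 1;
    for (int number = 1; number <= poolSize; ++number) {
        Table next(picks + 1, std::vector<std::uint64_t>(runLength, 0));
        for (int c = 0; c <= picks; ++c) {
            for (int r = 0; r < runLength; ++r) {
                const std::uint64_t w = ways[c][r];
                if (w == 0) continue;
                next[c][0] += w;                  // skip this number: the run resets
                if (c < picks && r + 1 < runLength)
                    next[c + 1][r + 1] += w;      // take this number: the run grows
            }
        }
        ways.swap(next);
    }
    std::uint64_t total = 0;
    for (int r = 0; r < runLength; ++r) total += ways[picks][r];
    return total;
}

// Number of draws containing a run of at least runLength consecutive numbers.
std::uint64_t countWithRun(int poolSize, int picks, int runLength)
{
    // A run longer than `picks` is impossible, so this first term is C(poolSize, picks).
    const std::uint64_t all = countWithoutRun(poolSize, picks, picks + 1);
    return all - countWithoutRun(poolSize, picks, runLength);
}
```

Dividing countWithRun(53, 6, N) by 22,957,480 gives P(N). For N = 3 the count is 1,128 lower than the 940,800 implied by the shortcut formula, which is exactly the number of outcomes made of two separate strings of 3.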
I pondered this question a few years ago myself. I was thinking about the Illinois lottery; I believe at the time there were 50 balls to choose from and you had to match all 6 balls chosen. I’m not sure MY inquiry was exactly the same as the OP’s. My question was:
After the 6 lottery balls are chosen, what are the odds there will be AT LEAST two numbers that are consecutive?
So, this includes, all possible 6 consecutive number draws, all possible 5 consecutive number draws … all possible 2 consecutive number draws.
Note that a draw may have TWO pairs of consecutive numbers, e.g. 10 11 17 23 24 48, so you need to take into account ALL of the permutations of this type of draw. Likewise, a draw may contain 3 consecutive numbers and 2 consecutive numbers, e.g. 12 13 19 20 21 43. So, you need to think about all the possible 2, 3, 4, 5, and 6 consecutive number permutations.
I got tired of trying to figure it out theoretically, so I wrote a program that generated all the possible combinations for the drawing. I believe my program calculated that somewhere around 52 percent of the time the lottery drawing will contain AT LEAST 2 consecutive numbers. Perhaps the converse is easier to think about, that is, what is the probability NO two numbers will be consecutive in the drawing.
Here’s a Basic4GL program that figures the percentage we are looking for. I know, it’s not a very elegant solution, but it works.
BTW, the answer for a “draw 6” lottery is:
46.5% for a 53 number pool
47.3% for a 52 number pool
48.0% for a 51 number pool
48.7% for a 50 number pool (<-- my recollection was wrong about this in my previous post)
'-----------------------------------------------------------------------------------------------------
const MAX_NUM = 50
dim a
dim b
dim c
dim d
dim e
dim f
dim con_count
dim total_count
dim percent#
a = 1
b = 2
c = 3
d = 4
e = 5
f = 6
con_count = 0
total_count = 0
percent# = 0
while (a <= MAX_NUM - 5)
    total_count = total_count + 1
    'printr
    'print "N" + total_count + "a" + a + "b" + b + "c" + c + "d" + d + "e" + e + "f" + f
    if (a = (b-1)) or (b = (c-1)) or (c = (d-1)) or (d = (e-1)) or (e = (f-1)) then
        con_count = con_count + 1
    endif
    f = f + 1
    if (f > MAX_NUM) then
        e = e + 1
        f = e + 1
    endif
    if (e > MAX_NUM - 1) then
        d = d + 1
        e = d + 1
        f = e + 1
    endif
    if (d > MAX_NUM - 2) then
        c = c + 1
        d = c + 1
        e = d + 1
        f = e + 1
    endif
    if (c > MAX_NUM - 3) then
        b = b + 1
        c = b + 1
        d = c + 1
        e = d + 1
        f = e + 1
    endif
    if (b > MAX_NUM - 4) then
        a = a + 1
        b = a + 1
        c = b + 1
        d = c + 1
        e = d + 1
        f = e + 1
        printr
        print "ROLL"
    endif
wend
percent# = con_count
percent# = percent# / total_count
printr
print "cc " + con_count + " tc " + total_count + " P " + percent#
I just realized something–the OP doesn’t specify how many numbers are drawn. I was working off 5 earlier, but with 6, here’s what I get:
P(6) = 10685968/22957480
P(5) = 939672/22957480
P(4) = 56448/22957480
P(3) = 2304/22957480
P(2) = 48/22957480
P(n) is the probability of getting at least n consecutive numbers in a sorted combination.
Here are the relevant changes in my code:
unsigned long x = 0;
int j[6] = { -1 } ;
while ( enumerateCombinations( 53, 6, j ) )
{
    insertionSort( j, 0, 5 );
    check( j, 6, x );
}
printf( " X: %lu\n", x );
My guess was based upon 5 balls also. I had roughed out a mathematical WAG. Then, since Powerball used 53 balls, I looked at last year’s results; that’s how I got my guess of .33, which was the approximate frequency for last year.
I think you got a transposition there, pardner. How about:
P(2) = 10685968/22957480
P(3) = 939672/22957480
P(4) = 56448/22957480
P(5) = 2304/22957480
P(6) = 48/22957480
Thanks.
The quick solution (without resorting to writing code) is
1-C(48,6)/C(53,6) = .46546781 if you’re drawing 6 numbers from 53, or
1-C(49,5)/C(53,5) = .33550756 if you’re drawing 5 numbers from 53.
It’s easier to work with the complementary event.
My notation “P(2nd cons 1st)” means the probability of the 2nd ball being consecutive to the 1st.
If we let 49 & 1 be ‘consecutive’, then we can say that the probability of the 2nd ball being consecutive to the first is 2/48, as there are going to be two of the remaining 48 balls that are consecutive to the first one picked.
P(2nd cons 1st) = 2/48
= 0.041666666
The probability of the 3rd being consecutive to the 2nd is going to be 2/47 MINUS the probability that one of these two balls was removed on the first ball (2/47 x 1/49)
P(3rd cons 2nd) = (2/47) - (2/(47 x 49))
= 96/2303
= 0.041684759
The probability of the 4th being consecutive to the 3rd is going to be 2/46 MINUS the probability that either of these two balls was the 1st OR 2nd ball:
P(4th cons 3rd) = (2/46) - [(2/46 x 1/49) + (2/46 x 1/48)]
= 2255/54096
= 0.041685152
and so on…
P(5th cons 4th) = (2/45) - [(2/45 x 1/49) + (2/45 x 1/48) + (2/45 x 1/47)]
= 0.041665862
P(6th cons 5th) = (2/44) - [(2/44 x 1/49) + (2/44 x 1/48) + (2/44 x 1/47) + (2/44 x 1/46)]
= 0.041624671
Probabilities of being consecutive to the preceding number:
2nd = 0.041666666
3rd = 0.041684759
4th = 0.041685152
5th = 0.041665862
6th = 0.041624671
total = 0.20832711
So roughly every 1 in 5 draws of the UK National Lottery will have at least one set of consecutive numbers (including 1 & 49).
Now that number seems about right to me, but I’m sure I can’t have got the calculations right. Someone will be along to correct me soon!
I only got an ‘E’ on my first attempt at Statistics 1 (A-Level) and a ‘C’ on my retake. I’m good enough at Pure Maths, but Stats is tedious (as proven above)!
Oh, and usually I wouldn’t work with decimals, hate the things! But my calculator wouldn’t work in fractions after the first couple.
Ok, now I realise that you wanted a lottery system with 53 balls. But the principle is the same, just adjust some of the numbers. Not doing it again. If it’s right, that is!
Cheers,
Harry