Demonstrating the Law of Large Numbers

My students are doing an experiment showing the LLN, and I want to make sure I have it set up correctly. Let’s say the students roll a die and count the number of times a 6 comes up. Since p(6) = 1/6, if I have them record the results, then [(# of 6s) ÷ (Total rolls)] - 1/6 should tend to 0 as the number of rolls increases, right? Am I correct that this activity would demonstrate the LLN?

If this does show the LLN, then I presume an equivalent statement would be [(# of 6s) - E(# of 6s)] ÷ (Total rolls) → 0 as the number of rolls increases.

Both of your statements are exactly mathematically equivalent, and both follow from the Law of Large Numbers.
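
To spell it out (writing N for the total rolls and X for the number of 6s, so E[X] = N/6):

\frac{X - E[X]}{N} \;=\; \frac{X - N/6}{N} \;=\; \frac{X}{N} - \frac{1}{6} \;\longrightarrow\; 0 \quad \text{as } N \to \infty.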

Thanks.

I’m wondering whether this exercise would work better with a 50-50 probability. Large numbers can be, well, large. I simulated 1 million die rolls on my computer and tracked the running proportion of sixes (p = 1/6 ≈ .1667). Here are the first 20 rolls:

. list runningAve in 1/20, clean

       running~e  
  1.           0  
  2.           0  
  3.           0  
  4.           0  
  5.          .2  
  6.   .16666667  
  7.   .14285714  
  8.         .25  
  9.   .22222222  
 10.          .2  
 11.   .18181818  
 12.   .16666667  
 13.   .15384615  
 14.   .21428571  
 15.          .2  
 16.       .1875  
 17.   .17647059  
 18.   .22222222  
 19.   .21052632  
 20.          .2  
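
For anyone who wants to replicate this, a short do-file along these lines reproduces the setup (runiformint() requires Stata 14 or newer; the seed and variable names are just my choices):

clear
set seed 12345                         // any seed, for reproducibility
set obs 1000000                        // one million rolls
generate roll = runiformint(1, 6)      // fair die: uniform integers 1..6
generate six = (roll == 6)             // 1 if the roll is a six, 0 otherwise
generate runningAve = sum(six) / _n    // running proportion of sixes after each roll
list runningAve in 1/20, clean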

We get a 1/6th share twice, but the numbers at roll 9 don’t seem too different from the numbers at roll 20. What about after 100 or 200 rolls?

100.         .2  
...
200.       .215  

Not too different. Let’s jump to 1,000 and 10,000:

1000.       .185  
...
10000.      .1663  

Ok, now we’re getting somewhere. But that’s a lot of dice rolls. 100,000 and 1,000,000:

100000.     .16779  
...
1000000.    .166953  

More convincing.

I charted the first 200 rolls, and while I can see the law at work, it doesn’t really converge until later. Around roll 100 it seems to center on .20, then it awkwardly jumps higher, to .22. This run didn’t drop back below .17 until after roll 2,400.
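
If you want to recreate that chart from the simulated data, something like this works (it picks up the runningAve variable from the sketch above; the reference line at 1/6 is optional):

generate n = _n                        // roll number, for the x-axis
twoway line runningAve n in 1/200, yline(.1667)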

Anyway, if you have a few groups, many of them will get odd results, so they shouldn’t get discouraged.

Roughly speaking, you can expect the error in your result to be within about \frac{1}{\sqrt{N}} of the expected value. So after 100 rolls you’ll be about 10% off, after 10,000 rolls about 1% off, and after 1,000,000 rolls about 0.1% off. And within a small integer factor, that’s consistent with your results.
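
In numbers:

\frac{1}{\sqrt{100}} = 0.10, \qquad \frac{1}{\sqrt{10{,}000}} = 0.01, \qquad \frac{1}{\sqrt{1{,}000{,}000}} = 0.001 .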

To reiterate for emphasis: while it’s nice to show results converging, it’s even better to show that we understand how they converge.

Nice point, Dr. Strangelove. I’m going to walk through a formula, but I think your rough approximation is solid.

Estimate of the standard error of a proportion:

\frac{\sqrt{p\,(1-p)}}{\sqrt{N}}

The 95% confidence interval covers roughly +/- 2 standard errors.

So: for p = 1/2 and a sample size of 100, the 95% confidence interval is indeed +/- .10. For a 1/6 probability, as in a die roll, it is +/- .0745, which is… smaller (the p(1-p) term is largest at p = 1/2). Looks like my intuition failed me upthread. The denominator is indeed sqrt(N).

(The normal approximation above works reasonably well when N*p and N*(1-p) are both greater than about 5; smaller samples call for a different formula.)
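
A quick check in Stata (do-file style; display just evaluates the expressions):

display 2*sqrt(.5*.5/100)          // p = 1/2, N = 100  ->  .1
display 2*sqrt((1/6)*(5/6)/100)    // p = 1/6, N = 100  ->  .0745356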

You could also teach them the frivolous law of numbers: any random number is likely to be very, very, very large. :wink: