I don’t know why I didn’t think to ask you earlier, but supery00n’s question reminded me there’s no place like the Dope for questions like this.
The details arise from a surprising experience a buddy had in one of his games of HeroClix* but I’ll frame the details generically. It involves a bunch of dice rolls - each roll a single six-sided die.
I want to calculate the odds of 13 out of 15 “tests” passing.
A “test” is this: you get two chances to roll a 5-6. If roll 1 is 5-6, test over. If not, roll again. 5-6 on the second roll? pass. IOW: to fail a test, each of two rolls is 1-4.
Am I correct that each test has a 2/3 chance of passing (1/3 + 1/3)? Beyond that, my brain goes swimmy - 13/15 * 2/3 - exponent something… arrgghhh.
What are the odds that out of 15 attempts, all but two pass? I intuit the odds are crazy unlikely. Would this result be enough to raise suspicion of loaded dice?
(I created a simple simulator in Excel and ran it 100 times. Got lots of 11s, couple of 12s, no 13s)
Side question: in the spirit of “teach a guy to fish” is it reasonable in the scope of this question to explain tutorial-wise how you arrived at the answer?
*Buddy was up against a dude with both Shape Change and Super Senses. He got off 15 attacks and dude dodged all but two.
There’s a 2/3 * 2/3 probability of failing any single test because the die rolls are independent, so there’s a 5/9 chance of passing it. All of the tests are independent as well, so the number of passed tests follows a binomial distribution with n = 15 and p = 5/9. Per Wolfram Alpha, the probability of passing 13 tests is about 1%.
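For anyone who’d rather check that figure without Wolfram Alpha, here’s a minimal sketch in Python. It assumes Python 3.8 or later (for math.comb), and the function name binomial_pmf is just mine for illustration. Fractions are used so the 5/9 stays exact until the final conversion.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# A test fails only if both rolls come up 1-4, so it passes with 1 - (2/3)^2.
p_pass = 1 - Fraction(2, 3) ** 2
prob = binomial_pmf(13, 15, p_pass)

print(p_pass)       # 5/9
print(float(prob))  # a little under 0.01, i.e. about 1%
```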
It’s easier to look at it from the other way since if you roll a 5 or 6 the first time you don’t need to roll again. Using this part here:
Your chance of rolling a 1-4 on each roll is 2/3, and to do that twice in a row you multiply the odds, so the odds of failure are 4/9 and odds of success are 5/9 for a single test. Check out probability trees, they are very helpful for visualizing this type of problem.
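If drawing the tree by hand is a pain, a brute-force count does the same job. This is only a sketch, assuming a fair six-sided die: it walks all 36 equally likely (first roll, second roll) pairs and counts the ones that pass. Pretending the second roll always happens doesn’t change the answer, since it is simply ignored when the first roll is already a 5-6.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # assumption: a fair six-sided die

# Each (first, second) pair has probability 1/36.
# A test passes if either roll is a 5 or 6.
passes = sum(1 for first, second in product(faces, faces)
             if first >= 5 or second >= 5)

p_pass = Fraction(passes, 36)
print(p_pass)      # 5/9
print(1 - p_pass)  # 4/9
```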
Do you want the odds of exactly 13 tests passing, or at least 13 tests passing?
Assuming you mean a 6-sided die, then the probability of rolling a 1 to 4 is 2/3. The probability of doing this twice in a row is (2/3)*(2/3) = 4/9. This is the probability of failing the test. Therefore the probability of passing is 5/9.
Now read about the binomial distribution on Wikipedia.
You should try to work it out from here (learn to fish sort of thing); if you get stuck I will help.
Interestingly, this question can’t be answered unless you know the probability of a given pair of dice being loaded. This is counter-intuitive, because you probably carry in your mind some “common sense” idea of how prevalent loaded dice are. You may say “the chances of that outcome (with honest dice) are only 1%, so I’m thinking there’s a good chance the dice are loaded”. But think how that would change if you knew that out of the millions of dice in the world, only three were loaded.
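To put rough numbers on that, here is a sketch using Bayes’ rule. Everything except the “about 1%” figure is made up for illustration: the priors, and the guess that loaded dice would produce this kind of result half the time. The function name posterior_loaded is hypothetical too.

```python
def posterior_loaded(prior_loaded, p_result_if_loaded, p_result_if_fair):
    """Bayes' rule: probability the dice are loaded, given the observed result."""
    loaded = prior_loaded * p_result_if_loaded
    fair = (1 - prior_loaded) * p_result_if_fair
    return loaded / (loaded + fair)

P_FAIR = 0.01    # chance of the result with honest dice (the ~1% from upthread)
P_LOADED = 0.5   # ASSUMPTION: made-up chance of the result with loaded dice

# ASSUMPTION: three made-up priors, from "loaded dice are common" to "vanishingly rare"
for prior in (0.1, 0.001, 0.000001):
    print(prior, posterior_loaded(prior, P_LOADED, P_FAIR))
```

With these invented inputs the same 1% result goes from “probably loaded” to “almost certainly just luck” as the prior shrinks, which is the whole point.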
The Excel simulator is too manual (have to hit F9 100 times and count on one’s fingers).
Here’s an improved simulator using Python:
from random import randint
attacksPerGame = 15
surprisingThreshold = 13
numGames = 0
numOccurences = 0
print("Hit Ctrl+C to quit")
try:
    while True:
        numGames += 1
        numDodged = 0
        for i in range(1, attacksPerGame + 1):
            isDodged = randint(1, 6) >= 5  # Shape Change
            if not isDodged:
                isDodged = randint(1, 6) >= 5  # Super Senses
            if isDodged:
                numDodged += 1
        # Count games where at least surprisingThreshold attacks were dodged
        if numDodged >= surprisingThreshold:
            numOccurences += 1
        # Running tally; \r rewrites the same line on each pass
        print("{0} games, {1} occurrences ({2}%)\r".format(numGames, numOccurences, (float(numOccurences) / numGames) * 100.0), end="")
except KeyboardInterrupt:
    print()
I let this run for 1 million iterations (took just over a minute). I’ve tried it a number of times and it seems to settle at about 1.18%. This is higher than the calculated value of 0.996%. This likely reveals a “flaw” in the pseudo-random generator, but it’s pretty close.
I think it’s funny (and also a classic example of how testing turns up the unexpected) that your simulator, written in part to test if dice were loaded, reveals that the computer pseudo-random number generator is, itself, “loaded.”
Even funnier, the actual game in question was played online. There were sufficient calls of “unfair dice” to prompt the developers of the game to post the actual code of the dice roll function.
Thanks for this - gave me a good pause for thought.
If the question is “what are the odds the dice are loaded” then you are correct.
But the question is “is it enough to raise suspicion?” More of a gut-check kind of thing. Irrespective of the number of loaded dice that exist, if the calculated odds were 0.000001% I would say “something is up here”. Given the odds are ~1% it’s definitely “lucky”, but in the gut-realm of picking 3 numbers in a 6/49 - it’s unlikely but it happens.
What you’re feeling your way to is actually what most statistical inference is about. One more formal way of doing your gut-check is that you do a ‘sample’ number of ‘tests’. In this case, since the underlying distribution is known to us, we can calculate the expected value of the proportion of successes, as well as its standard deviation. These calculated values would be the ‘population’ mean and standard deviation. If the proportion of successes in the sample tests that you conducted deviates from the population mean by a given amount, you can say with some reasonable certainty that the dice are loaded. How much that amount is depends on how much certainty you want in your statement, and how many tests you conducted for your sample. So for instance, the deviation between your simulation and the calculated value can be used to check the likelihood of the simulator being ‘loaded’.
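As a rough sketch of that check in Python: the expected proportion is p, its standard deviation over n trials is sqrt(p(1-p)/n), and the deviation is measured in those units. The function name proportion_z_score is mine, and the 11800 is just 1.18% of the 1 million simulated games reported upthread. Treating the result like a normal z-score is only an approximation, and a crude one for 15 tests.

```python
from math import sqrt

def proportion_z_score(successes, trials, p):
    """Standard deviations between an observed proportion and its expected value p."""
    observed = successes / trials
    std_dev = sqrt(p * (1 - p) / trials)
    return (observed - p) / std_dev

# The dice: 13 passes in 15 tests, each passing with probability 5/9.
print(proportion_z_score(13, 15, 5 / 9))

# The simulator: about 1.18% of 1,000,000 games, compared against 0.996%.
print(proportion_z_score(11800, 1000000, 0.00996))
```

The first comes out around 2.4 standard deviations (unusual, not outrageous). The second is around 18, which is very unlikely to be sampling noise, so either the generator or the comparison itself is off.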
Your simulator considers, in effect, 13 or more passes (out of 15) to be a success, while the 0.996% figure cited upthread was for exactly 13 passes out of 15. For the 13 or more criterion, 1.1887% is expected.
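Both figures can be reproduced with a few lines of Python. This is a sketch assuming Python 3.8+ for math.comb, with binomial_pmf being my own helper name: “exactly 13” is a single binomial term, while “13 or more” adds the terms for 13, 14 and 15.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 15, 5 / 9

exactly_13 = binomial_pmf(13, n, p)
at_least_13 = sum(binomial_pmf(k, n, p) for k in range(13, n + 1))

print("exactly 13:  {:.4%}".format(exactly_13))
print("at least 13: {:.4%}".format(at_least_13))
```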
Crying “Flaw” on the pseudo-random generator should be a last resort, not first.
Ah! I certainly was interested in 13 or more. And I haven’t yet read the wiki article on binomial distribution so I missed that detail on what ultrafilter actually calculated.
After adjusting the script, I ran another 1 million iterations and got 0.9903%. That’s more like it.
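For reference, here’s a sketch of what an adjusted version could look like, reworked as a function so it runs a fixed number of games instead of waiting for Ctrl+C. The names simulate, exact and seed are my own additions, not part of the original script; exact=True counts games with exactly 13 dodges, exact=False counts 13 or more.

```python
from random import Random

def simulate(num_games, threshold=13, exact=False, attacks_per_game=15, seed=None):
    """Return the fraction of games with `threshold` dodges (exactly, or at least)."""
    rng = Random(seed)  # pass a seed for repeatable runs
    occurrences = 0
    for _ in range(num_games):
        dodged = 0
        for _ in range(attacks_per_game):
            # Two chances at a 5-6; `or` skips the second roll if the first succeeds.
            if rng.randint(1, 6) >= 5 or rng.randint(1, 6) >= 5:
                dodged += 1
        hit = (dodged == threshold) if exact else (dodged >= threshold)
        if hit:
            occurrences += 1
    return occurrences / num_games

print(simulate(100000, exact=True))   # should land near 0.00996
print(simulate(100000, exact=False))  # should land near 0.0119
```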
For some reason, I was expecting some complicated code, but that’s the most basic randomization possible, and I’m not sure I even recognize the language. It literally just says “Pick a number between 0 and 5, add 1, and then write it down.”
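To be clear, this is not the developers’ code (which isn’t reproduced here, and was in some other language). It’s just what that description would look like as a Python sketch, with roll_die being a name I made up:

```python
from random import randrange

def roll_die():
    """Pick a number between 0 and 5, then add 1, giving 1 through 6."""
    return randrange(6) + 1

print(roll_die())
```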