To do this properly, you need to use Bayesian methods, which means that you need to already have some sort of prior. If I pull a coin out of my pocket that I got as change from the cafeteria and flip, say, 7 heads in a row, I’m going to assume that I got lucky. If, on the other hand, I found the coin at a magician’s convention, then I’m going to assume after 7 heads that it’s a two-headed coin (assuming I didn’t just check directly).
What you’re probably thinking of is the usual statistical calculation of “How likely was the null hypothesis to have produced this result?”, but that’s not as useful as it’s usually made out to be. If I flip that coin in my pocket 7 times and get all heads, I know that something unlikely happened, since a fair coin will only produce that result of order 1% of the time. But the problem is that finding a trick coin in my pocket change is also an unlikely event, and I don’t actually know which unlikely event happened.
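To make the point concrete, here’s a toy version of the Bayesian calculation. The prior probabilities (one in a million for pocket change, one in ten at a magician’s convention) are made-up numbers purely for illustration:

```python
# Toy Bayesian update for the two-headed-coin question.
# The priors below are illustrative guesses, not measured values.

def posterior_trick(prior_trick, heads_in_a_row):
    """P(two-headed coin | we saw all heads), via Bayes' theorem."""
    like_trick = 1.0                      # a two-headed coin always lands heads
    like_fair = 0.5 ** heads_in_a_row     # a fair coin does so with prob (1/2)^n
    num = prior_trick * like_trick
    return num / (num + (1 - prior_trick) * like_fair)

print(posterior_trick(1e-6, 7))  # pocket change: still almost certainly a fair coin
print(posterior_trick(0.1, 7))   # magician's convention: very probably a trick coin
```

The same 7-heads evidence moves the two priors to wildly different posteriors, which is exactly why the prior matters and why the p-value alone can’t tell you which unlikely event happened.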
I don’t recall off the top of my head what they are, but my Quantitative Analysis textbook from Chemistry 25 years ago had various statistical techniques for detecting systematic bias in your experiments. I’d pull out the book and start with those. Like I said, too much work for such a small effect on the probability, though.
Don’t try to explain. Instead, offer him a game using four coins. You win whenever you flip 2 heads and 2 tails (3 chances in 8), he wins whenever you flip 3 heads and 1 tail (1 chance in 4), and there’s no bet for any other combination. Let him see how quickly he loses.
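If he still doesn’t believe it, a simulation settles it faster than algebra. A minimal sketch, assuming even stakes of one unit per decided bet:

```python
# Simulate the four-coin game: he wins on 3 heads + 1 tail (4/16 = 1/4),
# you win on 2 heads + 2 tails (6/16 = 3/8), everything else is a push.
import random

def play(rounds=100_000):
    his_bank = 0
    for _ in range(rounds):
        heads = sum(random.randint(0, 1) for _ in range(4))
        if heads == 2:       # 2 heads, 2 tails: you win
            his_bank -= 1
        elif heads == 3:     # 3 heads, 1 tail: he wins
            his_bank += 1
        # any other count: no bet
    return his_bank

print(play())  # strongly negative: he loses about 1/8 of a unit per round on average
```

His expected value per decided round is 1/4 − 3/8 = −1/8, so over a hundred thousand rounds he ends up deep in the hole.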
Probability can be tricky, as the Monty Hall problem demonstrates. Brian Hayes (who writes a math/CS column for American Scientist) had to do a computer simulation before he was convinced that the answer was correct. He told this story on himself in a column.
Needless to say, 100 flips is unlikely to result in exactly 50-50 (while the odds of 1 head and 99 tails are less than 10^{-28}, which is close enough to impossibility for me to call it impossible). But 100 flips will, I think, most likely give you a number between 45 and 55, so exactly 50 can’t be that unlikely.
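The exact binomial distribution backs these guesses up; a quick check in Python:

```python
# Exact binomial probabilities for 100 fair-coin flips.
from math import comb

def p_heads(k, n=100):
    """Probability of exactly k heads in n fair flips."""
    return comb(n, k) / 2 ** n

print(p_heads(50))                             # exactly 50 heads: about 8%
print(sum(p_heads(k) for k in range(45, 56)))  # 45 through 55 heads: roughly 73%
print(p_heads(1))                              # 1 head, 99 tails: on the order of 10^-28
```

So exactly 50 is indeed the single most likely outcome, even though you’d still miss it better than 90% of the time.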
It’s almost certain that you were taught traditional hypothesis testing, and that you’re misremembering exactly what the p-value means. It is most emphatically not the probability that coin is biased given the results you’ve seen. There is a way to get that, but it’s not taught in basic statistics classes for scientists.
What is the significance of the lack of capitalization? I don’t work in this area as you do, so I am prepared to defer to your knowledge of the conventions, but just a quick Google out of curiosity reveals only instances of “Bayesian” with the capitalization retained (even for “Bayesian inference” and “Bayesian approach”).
I read something that led me to believe that it was usually uncapitalized, but I can’t find it, and I’ve seen enough authors who do capitalize it that I may be mistaken.
You are very likely correct. I studied it 25 years ago, and worked in an entirely different field which didn’t use it, so misremembering is a very strong probability. Statistically speaking.
In addition to the “how many times does 1 head turn up in N flips compared to 50% heads in N flips” comparison, go through the fact that to have all heads, there is only one possible sequence of events. To have 99 heads and 1 tail, there are 100 (as you mentioned). To have 98 heads and 2 tails, there are 4,950 possible sequences. For 3 tails it is 161,700. And so on.
Here’s a handy online calculator to use. Plug in “ch(100, X)”, where X is however many tails you want to turn up in 100 flips.
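If you’d rather not use the online calculator, these counts are just binomial coefficients, and Python computes them directly:

```python
# Number of distinct sequences of 100 flips containing a given number of tails.
from math import comb

for tails in range(4):
    print(tails, "tails:", comb(100, tails), "sequences")
# 0 tails: 1, 1 tail: 100, 2 tails: 4950, 3 tails: 161700
```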
Bayesian methods have been catching on among scientists in the past few years. I don’t know how many classes are teaching them, but it’s something you’re likely to learn once you start doing research.
I’m not sure that it’s taught well in those contexts, though: Too many folks don’t understand the importance of choosing a good prior. I’ve heard folks who really ought to know better say things like “Oh, it doesn’t really matter much what prior you choose; the results will come out about the same anyway”, or “Oh, we just went with a uniform prior, since we don’t have any prior information”.
True, though it should be standard practice (and it is in the physics fields I am familiar with) to check how sensitive your result is to different priors. It is relatively straightforward to propagate the uncertainty in your prior to an error bar on your result. The difficult part is the argument over how uncertain your prior is.
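A prior-sensitivity check along those lines can be as simple as rerunning the same update under a range of priors and seeing how much the conclusion moves. A minimal sketch, reusing the 7-heads example (the priors are again illustrative, not measured):

```python
# Prior-sensitivity check: how does the posterior probability of a trick
# coin (given 7 heads in a row) vary across a spread of assumed priors?

def posterior_trick(prior, n_heads=7):
    """P(two-headed coin | n_heads heads in a row), by Bayes' theorem."""
    fair = (1 - prior) * 0.5 ** n_heads
    return prior / (prior + fair)

for prior in (1e-6, 1e-3, 0.01, 0.1, 0.5):
    print(f"prior {prior:g} -> posterior {posterior_trick(prior):.4f}")
```

If the posterior swings from near 0 to near 1 across plausible priors, as it does here, the data alone haven’t settled the question, and the argument really is over the prior.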