Coin flips, biased coins, and confidence

I’m pretty bad at statistics, so I could use some help. My question is about coins, but it actually relates to a real-life question. Also, it’s definitely not homework, since it’s summer, school’s out, and I’m way too old for school (but, wait! what if I’m in Australia and this is actually during the school year?).

Anyway, I’m not even sure if I’m asking the right questions, but I’m looking for bias in the coin.

Say I flip a coin 100 times and it turns up heads 30 times. What are the odds of that happening (or maybe the odds of it being 30 or fewer heads)? To what confidence level could I say that it’s a biased coin? Or, maybe the question is, to what confidence level could I say it’s a fair coin? I always get the null hypothesis backwards. I guess the null hypothesis is that it’s a fair coin?

Similarly, if I flip it 100 times and it turns up heads 5 times – odds and confidence level?

I guess I’m really looking for the formulas that I could use to calculate those things given the number of heads out of 100.

I don’t really want to get into details about the real life situation. I think a coin flip is analogous, and looking for a biased coin is also analogous to the situation.

If you Google this, you’ll find various sites that offer to do relevant calculations.

By this method I discovered that in 100 flips of a fair coin the probability of 30 or fewer heads is around 1 in 25,000.

In round numbers: very high.

Yes! I found something similar – my formula is nCr/2^n for each number (exactly 1 head, exactly 2 heads, etc). Then, I can sum up 0-30 heads and the inverse of that is 25477, which is about what you got for the probability.
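For anyone who wants to reproduce that sum, here is a minimal Python sketch of the nCr/2^n calculation described above. The function name `prob_at_most` is made up for illustration, and the whole thing assumes a fair coin with independent flips.

```python
from math import comb

def prob_at_most(k, n=100):
    """Probability of k or fewer heads in n flips of a fair coin (hypothetical helper)."""
    # Each exact count j has probability C(n, j) / 2**n; add them up for j = 0..k.
    return sum(comb(n, j) for j in range(k + 1)) / 2**n

p30 = prob_at_most(30)
p5 = prob_at_most(5)
print(f"30 or fewer heads: {p30:.3g}, about 1 in {1 / p30:,.0f}")
print(f" 5 or fewer heads: {p5:.3g}")
```

Because the fair-coin distribution is symmetric, doubling the result gives the two-sided figure (30 or fewer heads, or 70 or more).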

BTW, we have #twinning join dates :cool:. Thanks, my SDMB twin.

To actually answer the question “is this a fair coin?”, or “what is the probability that this coin is biased?”, you need to have some prior assumption about just how common unfair coins are, among the general population of coins. Getting that few heads on a fair coin is certainly a rare event, but then, finding a biased coin is also a rare event. Something rare has certainly happened, but which rare event is it?

That’s a Bayes’ theorem thing, right? If biased coins are, say, less than 1% of coins and we flip a random coin 100 times and get only 30 heads, was this an unusual (1 in 25,000) event, or was it just a biased coin (1 in 100)? Something like that?

Yes, exactly, though just knowing that biased coins are less than 1% of coins isn’t enough information in this case. How much less?
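To see why “how much less?” matters, here is a rough Python sketch of the Bayes calculation. It rests on two assumptions that are not from this thread: every biased coin lands heads exactly 30% of the time, and the prior fractions tried are arbitrary. The function name `posterior_biased` is also made up for illustration.

```python
from math import comb

def posterior_biased(heads, flips, prior_biased, biased_p):
    """Posterior probability that the coin is the biased kind, via Bayes' theorem."""
    ways = comb(flips, heads)
    # Likelihood of seeing exactly this many heads under each hypothesis.
    like_fair = ways * 0.5**flips
    like_biased = ways * biased_p**heads * (1 - biased_p)**(flips - heads)
    numerator = prior_biased * like_biased
    return numerator / (numerator + (1 - prior_biased) * like_fair)

# ASSUMPTION: a biased coin comes up heads 30% of the time (illustrative only).
for prior in (1e-2, 1e-4, 1e-6):
    post = posterior_biased(30, 100, prior, biased_p=0.3)
    print(f"prior {prior:g} -> posterior probability of bias {post:.4f}")
```

Under these assumptions the answer swings from “almost certainly biased” at a 1% prior to “probably just a fluke” at one in a million, which is why the prior has to be pinned down.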

Well, as I mentioned, this isn’t really about coins. I’m not even positive that coins are the proper analogy. It’s about layoffs. If some company has a layoff and, say, 25% of the people laid off are “old” as defined by various statutes, and 5% of the people are not “old”, can you show bias? What is the probability that such an outcome could happen by chance?

There are so many factors involved, it’s probably impossible to mathematically show that it’s really unlikely that 25% vs 5% is just happenstance.

Thinking about that situation led me to my coin questions to see if that would lead to fruitful results. I was really surprised that there’s only a 1 in 25000 chance that you’d get 30 or fewer heads. I thought it would be much higher.

In that case, it’s far trickier, because even if you can establish that there’s a bias, you still have to show that it’s the sort of bias that’s addressed by law. For instance, the company might choose to keep people who are up to date in their skills and training: There’s no reason an old employee can’t stay up to date, but it’s going to be easier for the recent college grads, for whom the up-to-date methods are what they just learned.

And, older employees tend to be paid more. If the layoff is intended to save money, a layoff of highly-paid employees will likely also be a layoff of older employees.

Layoffs are not random decisions … there should be a precise method for determining which employee is sent home … and the smart money is on having this method documented. Otherwise it’ll look like discrimination against a protected class of employees.

In the UK, it is often the younger, more productive staff that get laid off first. I worked for a company that went through three waves of layoffs before folding. Decisions are skewed by the costs of making someone redundant: one week’s pay per year of service under 50 years of age, and two weeks after that. Plus any notice due.

1st wave was easy - cut the dead wood and the cheapest employees. So the old guy who kept the grounds went, along with some labourers and a couple of my drivers. Also a few older guys who negotiated early retirement.

2nd wave was harder. We had to evaluate everyone in the department and how they spent their day. This included some more expensive staff, like my assistant (it was him or me…) and one whole department went. No one in middle or higher management was included.

3rd wave was desperation - this time, I was included and walked off the site with £30k tax free compensation, three months’ salary (taxed) and some enhancements to my company final salary pension which was, of course, frozen.

Many companies these days, especially those with a fluctuating workload, use a high proportion of agency staff, as they can be taken out, with no reason or compensation, at an hour’s notice.

Thanks, everyone! I would rather keep this on the subject of statistics, not layoffs, if possible.

You stated above that this is probably too complicated to figure out, but in reality you can do it in Excel without really understanding the math. I do this all the time when I’m not sure of the math.

What you seem to be asking is what are the chances this happened by chance alone. This is exactly what statistics are supposed to answer.

You need to know:

  1. the percentage of old “coins” vs the entire set of “coins”

Let’s say it’s 25%

  2. you need to know the total number of coins that were in the group that were thrown out.

Let us say it’s 100

Then
A) Open Excel
B) in cell A1 type:

=IF(RAND()>.75,1,0)

C) Use your mouse in the lower right corner and drag it down 100 rows (or however many coins we are talking about)
D) Then drag it right 100 columns
E) use the Autosum feature to total each column in a new row at the bottom

Then there are several ways you can do it. I’d do a separate column with COUNTIF, but if you aren’t that familiar with it you can just copy that last row and use “paste” > “transpose”.

And then sort it - the 5th row and 95th row down will give you an idea of the 5th and 95th percentiles, i.e. the range that roughly 90% of purely random outcomes fall into.
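The same spreadsheet experiment can be sketched in a few lines of Python, for anyone who would rather script it. The 25% share and the group of 100 are the example figures from above; the function name `simulate_counts`, the fixed seed, and the choice of 100 repeat trials are assumptions for illustration.

```python
import random

def simulate_counts(p_old=0.25, group_size=100, trials=100, seed=0):
    """Repeat the random draw many times and return the sorted 'old' counts."""
    rng = random.Random(seed)  # fixed seed so reruns give the same numbers
    counts = []
    for _trial in range(trials):
        # Same idea as =IF(RAND()>.75,1,0): each draw is "old" with probability p_old.
        old = sum(1 for _draw in range(group_size) if rng.random() < p_old)
        counts.append(old)
    return sorted(counts)

counts = simulate_counts()
low, high = counts[4], counts[94]  # 5th and 95th values in the sorted list
print(f"roughly 90% of purely random groups had between {low} and {high} old 'coins'")
```

If the real count of old “coins” in the group lands well outside that range, pure chance starts to look like a poor explanation. More trials give steadier cut-offs.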

Yes that isn’t the “correct” way to do it, and technically you might want to use different methods if you are using the overall population (the entire universe of coins vs the set of coins at that company).

You can hit shift F9 to redo the random draws to get more than one roll, but I do stuff like this all the time and it helps me to visualize what’s going on. I usually do much more than 100, but you get the idea.

Of course this ONLY answers the question of what the likelihood is that this was entirely random - as others have pointed out - and you seem to be aware - that probably isn’t the case. Unless they literally claimed “we are picking names out of a hat”.

But it is very easy to get an idea of whether it falls into a reasonable range purely by chance.

With an infinite number of flips you have a really great confidence as to the bias of a coin. With one flip you have none. Somewhere between those two end points confidence can be very high or very small. Most regular folk get overconfident about something with a small number of data points. It’s something inbuilt into the brain. But statistically oriented folk would want a lot of data points to establish limited confidence in something.
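One way to put numbers on that is the usual normal-approximation margin of error, z * sqrt(p(1-p)/n). That formula is brought in here for illustration and is not something from the posts above. The sketch assumes the coin keeps showing 30% heads at every sample size, and `half_width` is a made-up name. The approximation is poor for very small samples, so treat the 10-flip line as rough.

```python
from math import sqrt

def half_width(heads, flips, z=1.96):
    """Approximate 95% margin of error on the observed fraction of heads."""
    p_hat = heads / flips
    return z * sqrt(p_hat * (1 - p_hat) / flips)

# ASSUMPTION: the observed fraction of heads stays at 30% as the sample grows.
for flips in (10, 100, 1000, 10000):
    heads = 3 * flips // 10
    print(f"{flips:>6} flips: 0.30 +/- {half_width(heads, flips):.3f}")
```

The margin shrinks with the square root of the number of flips, so it takes 100 times as many flips to pin the bias down 10 times more tightly.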

Strangely, the results established by rigorous statistics are less likely to be believed by regular folk. The opposite effect of the above.