The probability that I am not lucky.

(1) The simplified question.

Let’s say that I have a coin which is unbalanced, but I don’t know which way, only that one side will appear 49.75% of the time, and the other will appear 50.25% of the time. How many times will I need to flip the coin and what will the result have to be so that I can state with certainty (95%+) that I know which coin I have?

Now, I asked this on another message board, and the calculation there found the number of flips at which a majority of even a single flip would tell me which coin I had (e.g. do 8,000 flips, and if you have >4,000 heads then you’re 95% likely to have the heads-favoring coin).

However, this answer is unsatisfactory. Let’s say I make 1,000 flips, and I have 600 heads. As unlikely as this scenario is, if it were to occur, I could probably state with 95% confidence that I have the heads coin. So it seems to me that the answer of how many flips are required is a function of what the result is. What is that function?

I would guess that for a plot of probability of result vs. result, there are two bell-shaped curves for P[sub]heads[/sub] and P[sub]tails[/sub] offset on the x-axis, and that for a given n flips, you can calculate x=x[sub]95%[/sub] where P[sub]heads[/sub]/P[sub]tails[/sub] = 20. That would give a point where, if you had more than x heads out of n flips, you could be 95% certain you had the heads coin.

But I don’t know how to do that mathematically. :frowning:
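Here’s a minimal sketch of that calculation in Python, assuming equal priors on the two coins (with equal priors, a likelihood ratio of 19, not 20, corresponds to exactly a 95% posterior):

```python
# For a given number of flips n, find the smallest head count k at which
# P(k heads | heads-favoring coin) / P(k heads | tails-favoring coin) >= 19.
# The C(n, k) factor cancels, so the ratio is just (p/q)^(2k - n).
from math import ceil, log

p, q = 0.5025, 0.4975

def heads_needed(n):
    excess = log(19) / log(p / q)   # required heads-minus-tails margin
    return ceil((n + excess) / 2)

for n in (1000, 4000, 10000):
    k = heads_needed(n)
    print(f"n = {n}: {k} heads (a margin of {2 * k - n}) gives 95%+ confidence")
```

Because the binomial coefficient cancels, the ratio depends only on the heads-minus-tails margin, so the required margin (about 295 flips’ worth) is the same no matter how long you keep flipping.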

(2) The more complex, but more realistic, question.

Backstory: over the years, I’ve been a net winner at blackjack. However, I don’t think (based on intuition) that the amount I’ve won has exceeded the expected variation for the number of hands I’ve played. Basically, I might be playing with a disadvantage and extremely lucky, playing with a small advantage and moderately lucky, or playing with a large advantage and not lucky at all.

In a sentence, I would like to know how lucky I am.

The problem is that I don’t even know how to begin calculating standard deviations for things this complex. In the simplest form that I can come up with, normal basic strategy play (a) would consist of 43% wins, 49% losses, and 8% ties. However, the wins count as 1.125x a normal bet (because of doubling and splitting) and the overall EV is -0.00625. If I’m playing with a small advantage (b), I’d be winning 44% of the time and losing 48% of the time, but wins count 1.32x and losses 1.2x (EV = +0.0048). If I’m playing with a large advantage (c), I’d be winning 45% of the time and losing 47% of the time, but wins count 1.32x and losses 1.2x (EV = +0.03).
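For what it’s worth, here’s a rough sketch of that variance calculation, under my own simplification that each hand is an independent draw paying the multipliers above on a win or loss and 0 on a tie:

```python
# Per-hand EV and SD for the three scenarios above, and the resulting
# spread after 4,000 hands. Each hand pays +mw (win), -ml (loss), or 0 (tie).
from math import sqrt

scenarios = {
    "(a) basic strategy": (0.43, 1.125, 0.49, 1.0),
    "(b) small edge":     (0.44, 1.32,  0.48, 1.2),
    "(c) large edge":     (0.45, 1.32,  0.47, 1.2),
}

hands, result = 4000, 120
for name, (pw, mw, pl, ml) in scenarios.items():
    ev = pw * mw - pl * ml                 # expected bets won per hand
    var = pw * mw**2 + pl * ml**2 - ev**2  # per-hand variance
    mean, sd = hands * ev, sqrt(hands * var)
    z = (result - mean) / sd               # how far out is being up 120 bets?
    print(f"{name}: EV = {ev:+.5f}/hand; after {hands} hands "
          f"mean = {mean:+.1f} bets, SD = {sd:.1f}, +120 is {z:+.2f} SDs out")
```

On these numbers, being up 120 bets is about 2.3 SDs out under (a), 1.3 under (b), and dead on the mean under (c), so the three scenarios really do overlap badly at 4,000 hands.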

I’ve played maybe 4,000 hands, and I’m up about 120 bets. (No, I am not a high roller - I’ve played a mix of $3 and $5 blackjack, and I’m up a few hundred.) This seems to suggest an EV of +0.03 (which is astronomical as far as I’m concerned), but my intuition tells me that the distributions for all three scenarios are going to overlap with only 4,000 hands.

Is there any way, mathematically, to assign the probabilities that I am (a) a bad player but very lucky, (b) a good player but still lucky, or (c) a blackjack god :smiley: and not lucky at all? What if I admitted I am not a blackjack god and just limited the choices to (a) and (b)?

  1. In this case, the number of heads after n trials (call it H) is a binomial random variable with parameters (n, p). If H = k, you can compute P(H <= k) or P(H >= k), depending on whether k is below or above the mean, for each value of p. Then you can choose the coin that gives the observed result the higher probability. (A short sketch of this follows below.)

  2. No actual answer, but blackjack is a game involving significant luck, so there’s no chance that you haven’t been lucky. Given that the expected winnings per play are negative, if you really are up after 4,000 plays, you’re probably extremely lucky.
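A minimal sketch of the calculation in 1., assuming scipy is available, using the OP’s 600-of-1,000 example:

```python
# Lower/upper tail probabilities of the observed head count under each
# candidate bias; binom.sf(k - 1, n, p) is P(H >= k).
from scipy.stats import binom

n, k = 1000, 600
for p in (0.4975, 0.5025):
    print(f"p = {p}: P(H <= {k}) = {binom.cdf(k, n, p):.3g}, "
          f"P(H >= {k}) = {binom.sf(k - 1, n, p):.3g}")
```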

Thanks!

Is there a way I can do this in Excel 2003? It has problems with large numbers (e.g. C(4000,100)).
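One workaround is to do the arithmetic in logarithms, so nothing ever overflows. A sketch in Python (Excel’s GAMMALN can play the same role as lgamma here):

```python
# Binomial coefficients and probabilities in log space; C(4000, 100) has
# roughly 200 digits, but its logarithm is a perfectly ordinary number.
from math import exp, lgamma, log

def log_choose(n, k):
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def binom_pmf(k, n, p):
    return exp(log_choose(n, k) + k * log(p) + (n - k) * log(1 - p))

print(f"log10 C(4000, 100) = {log_choose(4000, 100) / log(10):.1f}")
print(f"P(2060 heads in 4000 at p = 0.5025) = {binom_pmf(2060, 4000, 0.5025):.4f}")
```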

Let’s just assume there’s a non-zero probability that EV is positive (card counting). Is there any way to quantify how lucky I am? For example, some sort of formula that will allow me to say, “if I am a normal blackjack player, I am 99th percentile in winnings, but if I am an advantage player, I am 75th percentile in winnings”?

Probably not. Excel is utter crap as statistical software goes.

The problem there is that there aren’t just a few discrete categories. You might, for instance, be slightly better than average (due to counting, say), but still not good enough to overcome the house edge, so that your expectation is still slightly negative. The world is not made up of just “normal players” and “advantage players,” but a range everywhere in between.

No mathematician, much less statistician, I, but I don’t share your intuition here. It seems to me if out of 1000 flips I got 600 heads, I would have very little, if any, basis for concluding I had a 50.25% weighted coin, because 600/1000 is far more than should be expected for such a coin. It seems like what I should think is “Whatever coin this is, I’ve certainly had an odd run of heads. No idea if it’s really strange or else just slightly less really strange, though.”

-FrL-

I simplified to two discrete categories in hopes of simplifying the problem. Of course, if that’s unnecessary, I’d take a more general solution.

But that’s the thing. If you can quantify “really strange” and “slightly less really strange”, wouldn’t that be a measure of probability?

Let’s take a more extreme example: let’s say there’s a coin which is weighted to favor one side 90% of the time. With a single flip, you can have 90% confidence of guessing which side is favored. Can’t you do the same with a less-weighted coin and more flips?

This second paragraph doesn’t sound right. The range of possible results is “baked” into the calculations. The “if it were to occur” is already part of the test. Not all results are equally likely, after all, and how likely they are depends on which coin you have.

That’s kinda the whole point of statistics. Some results are more likely than others. You can determine, for any result, what the interpretation of that result is, and you can figure it out before doing any flipping. You can then “sum” across all those different results to get the statistical rule that you are looking for.

Well, to address this question fully, we’ll need to know exactly how certain you need to be that you are lucky or unlucky.

The name of the number that quantifies how “lucky” you are, in a sense, is the p-value: the probability that you would observe a difference at least as extreme if your null hypothesis were true (your null hypothesis being that you really aren’t lucky).

I don’t know exactly how to calculate p-values for every statistical test (and there are a lot of statistical tests out there, and knowing how and when to use the appropriate ones is a very important skill), but if you want to noodle around with some examples, then unlike Ultrafilter, I’d say Excel is a great place to start.

To address your coin-flipping question, here are some empirical results from Excel. The help section of Excel and online resources can help you find the exact means of calculating these values.

I used a chi-square analysis for:
Flipping 10 coins and getting 6 heads: p-value = 0.527089257
Flipping 100 coins and getting 60 heads: p-value = 0.04550027
Flipping 1000 coins and getting 600 heads: p-value = 2.53963E-10
Oftentimes in science, we arbitrarily claim that a p-value less than 0.05 is significant. So in this case, flipping 60 or more heads in 100 flips is “significantly lucky,” but again, that threshold is arbitrary, and it means that 1 in 20 of your truly non-significant associations will appear “significant.”
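Those values are easy to double-check outside Excel; here’s a sketch using scipy’s chi-square goodness-of-fit test:

```python
# Reproducing the chi-square p-values above (expected counts are 50/50).
from scipy.stats import chisquare

for flips, heads in ((10, 6), (100, 60), (1000, 600)):
    stat, p = chisquare([heads, flips - heads], [flips / 2, flips / 2])
    print(f"{heads}/{flips} heads: chi-square = {stat:.2f}, p-value = {p:.6g}")
```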

I’ll try to think more about the Blackjack, but someone with more confidence in choosing the appropriate test will probably be here first.

The OP specified 95% certainty, which I take to mean the standard 0.05 p-value.

I think Ultrafilter sent you down the right road. Take a binomial variable. Probability of a win = p. What is p? I got it from your numbers of 43% wins and 49% losses: ignoring ties, that’s a 46.7% true win percentage. A much better way is to look up your expected win percentage using best strategy, which is a known number.

Flip that coin 4,000 times. What are the odds of getting 2,060 or more wins? (2,060 wins compared to 1,940 losses would get you the 120 bets you are currently ahead.) That result is what you’re looking for, a measure of how unlikely your outcome is.

I used this calculator, and got a result of 0.0000, which suggests this outcome is so unlikely as to be off their chart.

That’s using the terrible .467 number, though; your true odds at blackjack are better. I’d guess it’s more like .49, which gets you a result of 0.0008. This seems more reasonable: you’re roughly a 1-in-1,000 player.
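Those calculator numbers check out; here’s a quick sketch with scipy’s binomial survival function (2,059 because sf is a strict inequality, so this gives the probability of 2,060 or more wins):

```python
# P(2060 or more wins in 4000 hands) under each assumed win probability.
from scipy.stats import binom

for p in (0.467, 0.49):
    print(f"p(win) = {p}: P(wins >= 2060) = {binom.sf(2059, 4000, p):.2g}")
```

The first case is the one the calculator rounded to 0.0000.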

This site will calculate the house edge at blackjack, depending on the casino rules. By combining these two sites you can get your answer.

Yes, but that’s different. More analogous to the OP would be: suppose you know you either have a normal coin or a coin weighted 90% towards heads. Suppose you flip it, and it comes up heads. How certain should you be that it’s the weighted coin?

I think the answer is: not very certain at all. But I’m by no means confident about this. For if it came up tails, it seems like you can be fairly certain it’s not the weighted coin. Is it possible that the tails outcome is informative while the heads outcome is not? I don’t know. It feels like that’s the way it is, but we all know how badly “feels” can mess you up in probability.

Anyway, I’ve just worked it out on paper and convinced myself that in the example I just gave, if you get heads, you can be slightly confident that it’s the weighted coin, while if you get tails, you can be very confident it’s not. What I need to work out is whether the probability that it’s the weighted coin goes up with each consecutive heads flip, whether it goes up at the same rate for each flip, and so on.

-FrL-

I’d use Bayes’ theorem with my prior set to 1/2 here.

I’ve never dealt with Bayes’ theorem before, so I tried it. But I’m having a problem: I get answers greater than 1. Here’s an example.

Let A be “I have the 9/10 weighted coin,” and B be “The result of 3 tosses is a string of 3 heads.”

I set the prior probability of A to be 1/2. I take the probability of B given A to be (9/10)^3 – the chance of getting three heads in a row where each head has a chance of 9/10, right? And I take the prior probability of B to be (7/10)^3, for I take the probability of a heads flip on a single toss (given we don’t know which coin we have) to be 7/10. For it’s equally likely that we have the weighted coin and that we have the non-weighted coin. In the non-weighted case, out of ten flips, we should expect 5 H and 5 T, and in the weighted case, 9 H and 1 T. That means 9 + 5 = 14 out of 20 tosses should be Heads. 14/20 = 7/10. So three tosses resulting in three heads should be (7/10)^3.

But I must have made a mistake up there somewhere, because Bayes’ theorem, with those numbers, gives me 1.063. That’s greater than 1, which I am pretty sure is a problem.

Where was my mistake?

-FrL-

aptronym, your first question is a very specific and well-stated stat question:
(1) The simplified question.
Let’s say that I have a coin which is unbalanced, but I don’t know which way, only that one side will appear 49.75% of the time, and the other will appear 50.25% of the time. How many times will I need to flip the coin and what will the result have to be so that I can state with certainty (95%+) that I know which coin I have?

And your intuition is correct when you said “it seems to me that the answer of how many flips are required is a function of what the result is”.

In the previous responses, you got some good answers. Here’s my take. First of all, this has nothing to do with the chi-square distribution.

Now, what does “95% certainty” mean in your coin toss experiment? It means that you are willing to tolerate no more than a 5% probability that your experimental results could have occurred by chance. In statistics parlance, your significance level is 5%.

Next, how do you compute the p-value in your experiment? As already stated, the underlying distribution at any stopping point (after a set number of tosses) in your experiment is the binomial distribution. Under the assumption that the coin in question has a probability p of coming up, say, heads in each toss, it is a simple matter to calculate the probability that you had <= x or >= n-x heads in the n tosses so far (look it up in any basic stat book or table, and yes, Excel can be used here). So after n tosses, if this probability (the p-value) is less than your stipulated significance level, then you can be at least 95% sure that the coin does not have a probability p of coming up heads. Your question can also be framed as a simple test of hypothesis about p, with a significance level of 5%.
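A minimal sketch of that test (one-sided here, taking the tails-favoring coin as the null hypothesis; the observed counts are made up for illustration):

```python
# One-sided test of the null "this is the tails-favoring coin (p = 0.4975)".
# A p-value below 0.05 lets you claim the heads-favoring coin at 95%+.
from scipy.stats import binom

def p_value(n, heads, p_null=0.4975):
    return binom.sf(heads - 1, n, p_null)   # P(H >= heads) under the null

for n, heads in ((1000, 520), (4000, 2060), (8000, 4100)):
    pv = p_value(n, heads)
    verdict = "heads-favoring coin (95%+)" if pv < 0.05 else "inconclusive"
    print(f"{heads}/{n} heads: p-value = {pv:.4f} -> {verdict}")
```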

The probability of seeing three heads isn’t (7/10)^3. It’s (9/10)^3 * 1/2 + (1/2)^3 * 1/2. If I plug this in, the posterior of A is 729/854.

That said, I’d update after every flip. As you denoted, A is the event that I have the biased coin. H is the event that a given flip results in heads. Bayes’ theorem says that P(A|H) = P(H|A)P(A)/P(H). I start with P(A) = 1/2. P(H|A) = 9/10, so P(H|A)P(A) = 9/20. P(H) = 9/10 * 1/2 + 1/2 * 1/2 = 7/10. P(A|H) = 9/20 * 10/7 = 9/14.

For the second flip, P(H|A)P(A) = 81/140 and P(H) = 53/70, so P(A|H) = 81/106. For the third flip, P(H|A)P(A) = 729/1060 and P(H) = 427/530, so P(A|H) = 729/854.

Notice, too, that those two numbers agree. That’s no coincidence.
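A quick check of both routes in exact arithmetic (the fractions match the numbers worked out above):

```python
# Flip-by-flip Bayesian updating vs. conditioning on all three heads at once.
from fractions import Fraction

prior = Fraction(1, 2)                 # P(A): biased coin, before any flips
for flip in (1, 2, 3):
    p_heads = Fraction(9, 10) * prior + Fraction(1, 2) * (1 - prior)
    prior = Fraction(9, 10) * prior / p_heads          # Bayes: P(A | heads)
    print(f"after flip {flip}: P(A|H) = {prior}")

p_hhh = Fraction(9, 10)**3 * Fraction(1, 2) + Fraction(1, 2)**3 * Fraction(1, 2)
print("all at once:  P(A|HHH) =", Fraction(9, 10)**3 * Fraction(1, 2) / p_hhh)
```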

Got it. Because all three heads have to come up on the same coin, the flips aren’t independent once you average over the two coins, so I can’t just cube the 7/10.

Do you mean changing the value of some prior or something with each round of flipping?

Which two numbers?

-FrL-

I felt the same way; this question is more analogous.

With one flip:
Heads: 9/14 of the time it’s the biased coin, 5/14 of the time it’s the fair coin. Not very overwhelming evidence: 64%.
Tails: it’s the fair coin 5/6 of the time and the biased coin 1/6 of the time. 5:1 against, or 83%.

So getting a tails is indeed more informative.

The final posterior derived from updating your posteriors after every flip, and the one from waiting until all three flips are done. They agree exactly (both are 729/854), since sequential Bayesian updating is equivalent to conditioning on all the data at once.

This seems close to what I am looking for in part (1) here.

Okay, so let’s say A is flipping 120 more heads than tails out of 4,000 flips, and B is having the heads-favoring coin.

P(A|B) is C(4000,2060)*0.5025[sup]2060[/sup]*0.4975[sup]1940[/sup].

P(B|A) should be 0.95, because we want to be 95% sure of B given A.

P(A) should be P(B)*P(A|B)+(1-P(B))*C(4000,1940)*0.5025[sup]1940[/sup]*0.4975[sup]2060[/sup].

P(B) could be solved from those three values.

Is everything I did kosher?
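For what it’s worth, the binomial coefficients cancel out of Bayes’ theorem here (C(4000,1940) = C(4000,2060)), which also dodges the Excel overflow problem. A sketch that solves for the prior P(B) needed to make the posterior come out to 0.95:

```python
# Solve P(B|A) = 0.95 for the prior P(B), where A is exactly 2,060 heads
# in 4,000 flips and B is the heads-favoring coin. C(4000, 2060) cancels.
from math import exp, log

n, k, p = 4000, 2060, 0.5025
ratio = exp((2 * k - n) * log(p / (1 - p)))    # P(A|B) / P(A|not B)

posterior = 0.95
prior_odds = posterior / (1 - posterior) / ratio    # P(B) / (1 - P(B))
print(f"likelihood ratio = {ratio:.2f}, "
      f"required prior P(B) = {prior_odds / (1 + prior_odds):.3f}")
```

On these numbers, the likelihood ratio is only about 3.3, so hitting a 95% posterior from this data alone takes a prior of roughly 0.85 on B.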