Fascinating probability question

Wasted all morning solving this. I know the answer, but I’d love a really satisfying and intuitively understandable explanation of why the answer is the answer.

There’s a basketball player practicing free throws. He has the odd characteristic that once he’s made at least one free throw and missed at least one free throw, his probability of making any further free throw is precisely equal to the fraction that he has made so far this session. So if he’s taken 8 shots and has made 5 of them, he’s 5/8 likely to make the next one, 3/8 likely to miss it.

His coach watches him, and sees him make the first, and miss the second. The coach then leaves for a while, and comes back. When he comes back, the player has taken 98 throws. The coach then watches the player take throw #99, and make it. From the coach’s perspective, what is the probability that the player will make throw #100?

(Puzzle originated at 538.com)

Ok, explain why it isn’t 5/8ths likely to hit and 3/8ths likely to miss.

Note that, for any particular sequence of hits and misses, with H hits and M misses (not counting the always presumed initial trivial hit followed by a trivial miss), the probability of achieving this particular sequence is (H! * M!) / (H + M + 1)!

(This is because the first hit has probability 1/(number of previous throws), the second hit has probability 2/(number of previous throws), the third hit has probability 3/(number of previous throws), etc. And similarly for the misses. So our numerator gets a contribution of H! from the hits and a contribution of M! from the misses, and our denominator gets a contribution of 2 * 3 * 4 * … * (H + M + 1), from the first nontrivial throw [with 2 previous throws] through the last throw [with H + M + 1 previous throws].)
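For anyone who wants to check the formula numerically, here is a minimal sketch using exact fractions. The function names `sequence_probability` and `closed_form` are made up for illustration; the only assumption is the rule from the puzzle, starting from 1 hit in 2 throws.

```python
from fractions import Fraction
from math import factorial

def sequence_probability(seq):
    """Exact probability of a string of 'H'/'M' outcomes that follows
    the fixed opening hit and miss."""
    hits, total = 1, 2  # state after the opening hit and miss
    prob = Fraction(1)
    for outcome in seq:
        p_hit = Fraction(hits, total)
        if outcome == "H":
            prob *= p_hit
            hits += 1
        else:
            prob *= 1 - p_hit
        total += 1
    return prob

def closed_form(h, m):
    """H! * M! / (H + M + 1)! for h nontrivial hits and m nontrivial misses."""
    return Fraction(factorial(h) * factorial(m), factorial(h + m + 1))

# Step-by-step product and closed form agree for 4 hits and 2 misses.
print(sequence_probability("HHHHMM"), closed_form(4, 2))  # 1/105 1/105
```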

In particular, the specific ordering of hits and misses within the sequence doesn’t matter at all for its probability.

Thus, the question is as good as if the coach saw the player make throw #3 and was interested in the probability the player would then make throw #4. This would be 2/3 by stipulation (2 previous hits out of 3 previous attempts).

Tl;dr: The answer is 2/3.
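If the symmetry argument feels too slick, the actual puzzle can also be ground out by brute force. The sketch below is one way to do it, not the only one: it tracks the exact distribution of the hit count after 98 throws, then conditions on throw #99 being a hit. The names `hit_count_distribution`, `p_99` and `p_99_and_100` are just illustrative.

```python
from fractions import Fraction

def hit_count_distribution(n_throws):
    """Map {hits: probability} after n_throws, starting from 1 hit in 2 throws."""
    dist = {1: Fraction(1)}
    for total in range(2, n_throws):  # total = throws taken so far
        nxt = {}
        for hits, p in dist.items():
            p_hit = Fraction(hits, total)
            nxt[hits + 1] = nxt.get(hits + 1, 0) + p * p_hit
            nxt[hits] = nxt.get(hits, 0) + p * (1 - p_hit)
        dist = nxt
    return dist

dist98 = hit_count_distribution(98)

# P(throw 99 is a hit), averaging over everything the coach did not see.
p_99 = sum(p * Fraction(k, 98) for k, p in dist98.items())

# P(throws 99 and 100 are both hits).
p_99_and_100 = sum(
    p * Fraction(k, 98) * Fraction(k + 1, 99) for k, p in dist98.items()
)

answer = p_99_and_100 / p_99
print(answer)  # 2/3
```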

Well, for one thing, he just made shot 99, so the coach has seen 6 makes and 3 misses, which would give a 2/3 chance of making it just based on what the coach has seen.

But you need to assess the relative likelihood of his having made the 99th shot given the previous information. That might all cancel, leaving 2/3 as the right answer.

I suspect the answer to this uses a martingale property. As the probability that the player makes the nth shot is equal to the fraction of previous successful shots, the success probability is a martingale. So I think the probability of making the 99th shot based on the known information is 5/8 and the correct answer is 6/9 = 2/3; however, I’d want to work it out with Bayes’ law to be sure.

Put another way: At any moment in the game, the probability of a hit followed by a miss is the same as the probability of a miss followed by a hit. (If there were H hits and M misses previously out of T total shots, then the probability of a hit followed by a miss is H/T * M/(T + 1), and of the other way around is M/T * H/(T + 1), which is equal). What’s more, either order results in the same probabilities going forward (since there will be the same number of hits and misses total going forward).

Thus, in any sequence of hits and misses, swapping the order of adjacent hits and misses doesn’t change the sequence’s probability. And as a consequence, order entirely doesn’t matter. Applying any particular permutation to relabel the throw numbers (apart from the scene-setting fixed hit and miss of throws #1 and #2) leaves the probabilities unchanged.
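The adjacent-swap identity in this post is easy to check mechanically. A rough sketch (the helpers `step_probability` and `two_step` are hypothetical names) that compares hit-then-miss against miss-then-hit from every state with up to 29 throws taken:

```python
from fractions import Fraction

def step_probability(hits, total, outcome):
    """Probability of one outcome ('H' or 'M') given hits out of total so far."""
    p_hit = Fraction(hits, total)
    return p_hit if outcome == "H" else 1 - p_hit

def two_step(hits, total, first, second):
    """Probability of `first` then `second` from a state of hits out of total."""
    p1 = step_probability(hits, total, first)
    hits_after = hits + (first == "H")
    p2 = step_probability(hits_after, total + 1, second)
    return p1 * p2

# Every reachable state has at least one hit and at least one miss.
for total in range(2, 30):
    for hits in range(1, total):
        assert two_step(hits, total, "H", "M") == two_step(hits, total, "M", "H")

print(two_step(5, 8, "H", "M"), two_step(5, 8, "M", "H"))  # 5/24 5/24
```

Since any reordering can be built out of adjacent swaps, this is all the permutation argument needs.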

Huh? Surely the coach has seen 2 makes (throws #1 and #99) and 1 miss (throw #2) at this point, not 6 makes and 3 misses. (Not that this changes your response significantly)

Regardless of any streaks of hits and misses, the player establishes a probability over an adequately long sample of shots. If the player, in all shots that have been counted (if sufficient to comprise a representative random sample), makes 5/8 of shots, the probability that anyone walking into the gym will see a hit is 5/8.

Caveat: Players are human, and never perform with consistent success. Tim Duncan, who has played 18 seasons, has recorded years in which his FT% ranged from .599 to .817, with no meaningful career trend overall upward or downward. So the probability that he would make one depends on what year you are watching him. As I recall, in one game last year, he missed ten or eleven in a row.

To clarify: the use of 5/8 in the OP was just as an example, there is no 5 and no 8 in the actual puzzle. In the actual puzzle, he’s 1-1, and then the coach leaves, and when the coach comes back, he’s taken 98 shots, but all we know for sure is that he made at least one and missed at least one.

Also, jtur88, your caveat is noted but obviously irrelevant for purposes of this puzzle.

Another interesting consequence of this is that, if we were just to focus on the number of hits and misses overall in a fixed number of throws (anywhere from zero hits all misses, to one hit and the rest misses, all the way up to all hits no misses), each possibility is equally likely (for each has the same probability 1/(# of throws + 1) overall, from the (H + M)!/(H! * M!) many sequences achieving it each of probability (H! * M!) / (H + M + 1)!).
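Here is a quick sanity check of the flat distribution, as a sketch: multiply the number of sequences with a given hit count by the per-sequence probability from the formula above. `hit_count_probability` is a made-up name.

```python
from fractions import Fraction
from math import comb, factorial

def hit_count_probability(h, m):
    """Probability of exactly h hits and m misses among h + m nontrivial throws."""
    per_sequence = Fraction(factorial(h) * factorial(m), factorial(h + m + 1))
    return comb(h + m, h) * per_sequence

n = 6  # nontrivial throws after the opening hit and miss
probs = [hit_count_probability(h, n - h) for h in range(n + 1)]
print([str(p) for p in probs])  # seven entries, each '1/7'
```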

Your answer is correct, 2/3.

I’m baffled by precisely how you got there, however. It’s an interesting observation that (for instance) HHHHMM is just as likely as MHMHHH, or any of the other sequences of 4 Hs and 2 Ms. And you skipped over entirely the even more interesting observation that the combined probability of all sequences of 4 Hs and 2 Ms is equal to the combined probability of all sequences of 3 Hs and 3 Ms, and similarly for any fixed number of Hs and Ms adding up to 6 (which, to me, is a REALLY surprising result… it’s a Pascal’s Triangle with a completely flat distribution).

But I don’t see how you get from there to “Thus, the question is as good as if the coach saw the player make throw #3 and was interested in the probability the player would then make throw #4. This would be 2/3 by stipulation (2 previous hits out of 3 previous attempts).”

[spoiler]
Aha, but I didn’t skip over that “even more interesting observation”; see my post made just in the nick of time above! :slight_smile:

As for how I got there: because sequences which are the same up to permutation have the same probability, we can conclude that applying any fixed permutation to re-label the throw #s leaves probabilities unchanged. Thus, “Given only that throw #99 is a hit, how likely is throw #100 to be a hit?” has the same answer as “Given only that throw #3 is a hit, how likely is throw #4 to be a hit?”, the latter question resulting from the former under the permutation which swaps outcomes for throws #99 and #100 with those for throws #3 and #4, respectively. And the answer to “Given only that throw #3 is a hit, how likely is throw #4 to be a hit?” is clearly 2/3, directly by stipulation.[/spoiler]
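The relabelling claim can be brute-forced for a short session. This is only a sketch for small cases (it enumerates every sequence, so it will not scale to 100 throws), and `conditional_hit_probability` is an illustrative name:

```python
from fractions import Fraction
from itertools import product

def sequence_probability(seq):
    """Exact probability of a tuple of booleans (True = hit) after the opening hit and miss."""
    hits, total, prob = 1, 2, Fraction(1)
    for made in seq:
        p_hit = Fraction(hits, total)
        prob *= p_hit if made else 1 - p_hit
        hits += made
        total += 1
    return prob

def conditional_hit_probability(n_throws, seen, asked):
    """P(throw `asked` is a hit | throw `seen` is a hit); throws are numbered from 1."""
    joint = given = Fraction(0)
    for seq in product((True, False), repeat=n_throws - 2):
        if seq[seen - 3]:  # index 0 of seq is throw #3
            p = sequence_probability(seq)
            given += p
            if seq[asked - 3]:
                joint += p
    return joint / given

print(conditional_hit_probability(10, 9, 10))  # 2/3
print(conditional_hit_probability(10, 3, 4))   # 2/3
print(conditional_hit_probability(10, 5, 8))   # 2/3
```

Any pair of distinct throws after the first two gives the same value, which is exactly what the permutation argument predicts.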

Sorry, I was looking at the example and thought he’d seen the first 8 shots.

Are we to assess the probability the coach has got this correct, and the error in his assessment?

Well, to make it doable, I guess I have to assume the coach KNOWS the rule.

The probability rule is a form of memory … the throws are not independent events. That means order is most definitely important. Consider MHMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM … then shot 99 to be a H was totally like winning the $2 Billion jackpot on your own…
By shot 99, the law of large numbers applies… so the outcome of shot 99 does not have much influence on the odds, and for the time being it can be ignored. (This was the mistake.)… We know that the random process will see small excursions from the true odds, but the results of 99 shots are extremely likely to be very close to the average.
(You know, p-values, chi-squared, 90th percentile and other ways to describe how narrow the bell curve is…)
Now, shot 3… the coach has no information, and so the fact that shots 3 to 98 were taken cannot tell the coach anything. But the coach does know the law of large numbers has applied, so shot 99 can be taken to be far, far too insignificant to change the result by 1/6th.

Now the results of 3 to 98 are unknown to the coach, but given that sways one way should cancel sways the other, the coach can ignore them… if he were to think the third shot might have sent the probabilities toward hits, he would also have to think it might have sent the probabilities toward missing… the unknown can’t influence the known… So the coach expects the odds of shot 100 to be very, very close to 50/50… because shot 99 could only vary the memory, the probability, by one part in 2^50 or something tiny.

From the coach’s perspective, there is insufficient data. Three is far from being a valid sample size. If the coach was watching him flip coins, he could arrive at only two possible probabilities after three tosses: 66% or 33%, both of which would be proven wrong fairly quickly after a valid sample size of only a dozen or so. The only thing the coach knows that is correct is that it is not 100% and it is not 0%, unless rounded after more than 200 shots.

Ah dear… well, there is a way JTUR is correct… We don’t know that the coach knows anything about the law of large numbers or EVEN the basic probability a typical 14 year old girl can do. From the coach’s perspective, he might have said “c’est la vie, whatever will be will be!” But for the purpose of giving a meaningful answer… correct the question: “The coach is also an Emeritus Professor in Mathematics, specializing in probability and information (discrete maths).”


I can model this at work on my programmable calculator tomorrow (and will have the time to do so); I suspect, over very large samples, that early chance fluctuations will push the early odds either high or low, and thus affect the subsequent odds, causing them to skew either towards 100% or towards 0%. Thus, if the coach saw him make that late one, he should assume that the sequence skewed towards 100% (tho, of course, it isn’t going to be 100%).
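For anyone who would rather simulate than calculate, here is a rough Monte Carlo sketch of the same experiment in Python. The function names, the seed and the number of sessions are arbitrary choices for illustration, and the estimate is only approximate.

```python
import random

def simulate_session(n_throws, rng):
    """Return the outcomes (True = hit) for throws 1..n_throws."""
    outcomes = [True, False]  # the fixed opening hit and miss
    hits = 1
    for total in range(2, n_throws):  # total = throws taken so far
        made = rng.random() < hits / total
        outcomes.append(made)
        hits += made
    return outcomes

def estimate(n_sessions=20_000, seed=0):
    """Estimate P(throw 100 is a hit | throw 99 is a hit) by simulation."""
    rng = random.Random(seed)
    made_99 = made_both = 0
    for _ in range(n_sessions):
        outcomes = simulate_session(100, rng)
        if outcomes[98]:  # index 98 is throw #99
            made_99 += 1
            made_both += outcomes[99]
    return made_both / made_99

print(round(estimate(), 2))
```

With enough sessions the estimate should settle near 2/3; individual sessions do drift high or low early on and then stay there, which is what drives the result.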

No, the coach can definitely come up with an accurate and relevant mathematical probability, as is being discussed by Indistinguishable and me in the spoiler boxes.

So if what you’re saying is true, the problem could also have been stated as “the coach left after two throws, came back and saw throw #57 only, which was made, then left again, then came back to see throw #125,234,437, what is the probability that it was made”, and the answer would STILL be 2/3?

If so, mind = blown.

Hmm… everyone credible seems to like the symmetry argument, that shot 99 is just like shot 2…

BUT… the law of large numbers suggests that the probability of a hit in the next shot tracks as 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5… from shot 3 to 98.

So when the Emeritus Professor and Coach sees shot 99, he can know it’s not significant; he knows he can’t say “wow, you must really have the odds skewed toward hits!”, which is the thought that people use to get to 66%.

The symmetry argument didn’t work with MEMORY… that’s the one-directional property of the order… you can’t just use arguments that work with ORDERED results; the results are affected by MEMORY too.