True as far as it goes, but you do have reason to think some particular true percentages are more likely than others. If someone makes the one and only free throw he’s attempted so far, his true percentage is a lot more likely to be the league average than it is to be 100%.
What are you going on about?
Well, is there any reason not to think the “true” league average shooting percentage is close to the average shooting percentage from my sample of all players?
The sample size for an individual player might be too small to trust that their measured FT % reflects their true FT %, but for the league as a whole the sample is much larger.
I think the last two (on topic) posts might have missed what Quercus is saying. I’m pretty sure what he said agrees with what you guys are saying (and what I was saying). That is, one can use the full sample of players to learn about the underlying true distribution of percentages. This distribution, and a little integration, is all that is needed to rank players.
Quercus's first paragraph was not a statement of what one should do but rather a devil's-advocate setup for his second paragraph.
Since there seems to be some confusion, why don’t you just explain what these sentences mean:
“If you want to know who is likely to sink more free throws, you’ve just got to watch them try to sink free throws. No math formula is going to take its place.”
Are you suggesting that it is feasible to “watch them”? Are you suggesting that you watch all 1230 games each and every year and are capable of determining who, given any two players amongst the entire NBA rosterdom, is more likely to hit a FT than the other?
I am saying that if you want to rank players, you’ve got to have data on their performance. And that means having a robust number of data points. If you don’t have that, you cannot just meditate on the idea of professional basketball and come up with a math formula that will allow you to rank the players in the absence of a sufficient number of data points.
What are you saying? That you have some kind of basketball ESP that allows you to determine who the better player is without having data about their past performance?
Kimmy, no one is talking about not using datapoints. Extrapolating from data is exactly what I’m trying to do.
What you seem to be saying in this post, near as I can tell, is “Unless you have so much data on every player’s FT shooting that you can be confident his true FT % is virtually identical to his measured FT %, it’s impossible to say anything about how good a FT shooter he’s likely to be.” This is untrue, as Pasta’s excellent suggestion above demonstrates. Obviously more data is always better, but that doesn’t mean we can’t say anything without it.
Extreme example: Let’s say I’ve measured the FT % of every player in the league. Let’s say the league average is 70%, and the best player in the league is at 90%. Some new player joins the league. I have zero datapoints on the new player. I can still say with quite a bit of confidence that he’s probably worse than the guy with the 90% FT%, just based on what I know about the league-wide distribution of FT%.
But what you cannot do, and what you hope to do, is come up with an average for a player for whom you have a few, but not enough data points.
Let’s say the league average is 1/2 and you have a player for whom you have three data points, two of which were successes. Your best estimate of the proportion of free throws he will sink is 2/3. Naturally, this statistic will have larger confidence intervals at all confidence levels than those calculated for players on whom you have more data.
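Just to put numbers on how wide that interval is, here's a quick sketch (using scipy's beta distribution for an exact Clopper-Pearson interval; purely an illustration, not anyone's official method):

[code]
# Exact Clopper-Pearson 95% interval for 2 makes in 3 attempts (sketch only).
from scipy.stats import beta

made, attempts, alpha = 2, 3, 0.05
lower = beta.ppf(alpha / 2, made, attempts - made + 1)       # ~0.09
upper = beta.ppf(1 - alpha / 2, made + 1, attempts - made)   # ~0.99
print(f"point estimate {made/attempts:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
[/code]

The 95% interval runs from roughly 9% to roughly 99%, i.e., it tells you almost nothing.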
And this is where you must rest, because you cannot do better than this. But it seems you think that if only you could craft some model, perhaps some weighting of the league average and the observed proportion, you could happen upon the “real” proportion. There is no mathematical justification for this misguided faith.
If you have only three datapoints, guessing their true FT% is close to the league average would make a lot more sense than guessing their true FT % is close to their measured FT%.
If we watch a random NBA player sink three shots, and I say “I bet his FT% is 70%” (or whatever the league average is) and you say “I bet his FT% is 100%”, I’ll be closer to the truth more often than you. So clearly I can do better than just taking his measured FT% at face value.
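If it helps, here's a toy simulation of that bet (a pure sketch; the 70% league average and the spread of true percentages are made-up numbers, just to illustrate the idea):

[code]
# Toy simulation: draw "true" FT% from an invented league-wide spread,
# keep only the players who happen to go 3-for-3, then see which guess
# (league average vs. 100%) lands closer to the truth.
import numpy as np

rng = np.random.default_rng(0)
league_avg = 0.70
true_pct = np.clip(rng.normal(league_avg, 0.10, 1_000_000), 0.30, 0.95)
made_all_three = rng.random((3, true_pct.size)) < true_pct
observed = true_pct[made_all_three.all(axis=0)]      # players we saw go 3/3

avg_guess_wins = np.abs(observed - league_avg) < np.abs(observed - 1.0)
print(f"'league average' guess is closer {avg_guess_wins.mean():.0%} of the time")
[/code]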
Obviously, you can have more confidence that your guesses are good if you have more data, and no one is disputing this.
Everything drowns in the problem of the priors here, as with most statistics, so far as I can see. And it’s not like you’ll find One True Prior Probability Distribution to work from. As I see it, priors are choices we make about how to analyze a situation, not data we discover empirically.
Shouldn’t this come out to 0.5 P(obs | r[sub]1[/sub] > r[sub]2[/sub]) / P(obs)? What happened to the denominator? And, of course, P(obs) is the tricky term; we can expand it out to the sum/integral of P(obs | r[sub]1[/sub] = …, r[sub]2[/sub] = …) * P(r[sub]1[/sub] = …) * P(r[sub]2[/sub] = …), but we’re left worrying about the prior distributions of r[sub]1[/sub] and r[sub]2[/sub] (presumably identical).
You may not be able to find the “real” proportion, but you can combine information about the league averages with information about a player’s record to get a reasonable estimate. See T. Herzog’s “Introduction to Credibility Theory” for a very detailed treatment.
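The flavor of it, very roughly, is a Bühlmann-style weighted average of the player's record and the league average. A quick sketch (the league number and the k value below are invented for illustration, not taken from real data or from Herzog):

[code]
# Rough Buhlmann-style credibility estimate (sketch only; numbers invented).
def credibility_estimate(made, attempts, league_avg, k):
    """Blend a player's observed rate with the league average.
    k ~ (expected within-player variance) / (variance of true rates across
    players); a larger k means we lean harder on the league average."""
    z = attempts / (attempts + k)          # credibility weight in [0, 1)
    return z * (made / attempts) + (1 - z) * league_avg

# Player who is 2-for-3, league average 0.75, k = 30 (made up):
print(credibility_estimate(2, 3, 0.75, 30))   # ~0.742, barely moved off 0.75
[/code]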
Ah, yes, of course. Apologies for my haste. Yes, the missing denominator P(obs) can be calculated with an integral nearly identical to the one I included.
It does come down to determining a prior distribution, but I think the large collection of data at hand (all players) can reasonably be used to inform our prior.
In the limit of infinite players, we could use only those who shot the most often to construct the underlying distribution. The resulting distribution would still be good for ranking everyone, including those who have only shot once or twice.
In reality, assessing and propagating uncertainties on D(r) due to the (finite) data we use to inform it allows us to estimate uncertainties on the rankings. Or better, we can incorporate the assessed uncertainties on D(r) into our definition of the prior and integrate those out as part of the P(obs) calculation.
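Just to make that integral concrete, here is a toy version of the P(r[sub]1[/sub] > r[sub]2[/sub] | obs) calculation (the discretized prior below is a made-up Beta(25, 9), standing in for a D(r) built from the real sample, and the two players' lines are invented too):

[code]
# Sketch of P(r1 > r2 | obs) with a discretized prior D(r).
import numpy as np
from scipy.stats import beta, binom

r = np.linspace(0.005, 0.995, 199)
prior = beta.pdf(r, 25, 9)        # made-up stand-in for the empirical D(r)
prior /= prior.sum()              # discretized prior

def posterior(made, attempts):
    """P(r | obs) over the grid via Bayes with a binomial likelihood."""
    like = binom.pmf(made, attempts, r)
    post = like * prior
    return post / post.sum()      # the denominator here is P(obs)

# Player 1: 2-for-3.  Player 2: 150-for-200.
p1, p2 = posterior(2, 3), posterior(150, 200)
prob_1_better = np.sum(np.outer(p1, p2) * (r[:, None] > r[None, :]))
print(f"P(r1 > r2 | obs) ~ {prob_1_better:.2f}")
[/code]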
For the heck of it, here is the distribution of (FT made)/(FT attempted) for all player-years from 1990 to 2007 for which the player attempted at least 75 free throws:
A quick estimate of D(r).
ETA: I just slapped sqrt(N)-derived errors from the bin contents on the curve as a quick-n-dirty example of how one can also assess uncertainties on D(r) which can be integrated over when calculating P(obs|anything).
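In case anyone wants to cook up something similar, this is roughly the procedure (a sketch only; the CSV file and its column names are hypothetical):

[code]
# Build a D(r) histogram with sqrt(N) bin errors (sketch; 'ft_made' and
# 'ft_att' are hypothetical column names in a hypothetical file).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("player_year_ft.csv")
df = df[df["ft_att"] >= 75]                       # at least 75 attempts
pct = df["ft_made"] / df["ft_att"]

counts, edges = np.histogram(pct, bins=30, range=(0.3, 1.0))
centers = 0.5 * (edges[:-1] + edges[1:])
plt.errorbar(centers, counts, yerr=np.sqrt(counts), fmt="o")  # sqrt(N) errors
plt.xlabel("FT made / FT attempted (player-year)")
plt.ylabel("count")
plt.show()
[/code]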
You run into a big problem with a scheme like this. Any given confidence interval contains the true value of the quantity being estimated with probability 1 - a. Therefore, if you fit n confidence intervals independently, the probability that they all contain the true value is (1 - a)[sup]n[/sup]. For sufficiently large n, this is a small number, and it’s very likely that you’re dealing with bad numbers somewhere.
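For example, with 95% intervals (a = 0.05) fit independently for, say, 400 player-seasons, the chance that every interval covers its true value is only 0.95[sup]400[/sup] ≈ 1.2 × 10[sup]-9[/sup], so some of the intervals are essentially guaranteed to miss.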
This doesn’t have anything to do with the OP given the assumptions that are implicit in the thread, but while I’m playing with data… Here’s how the mean FT success rate varies with how often a player is at line, showing an expected trend (since attempts and success rate will both be positively correlated with how good the player is).
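Roughly the binning behind that, if anyone wants to replicate it (same hypothetical table and column names as the sketch above):

[code]
# Mean FT success rate grouped by number of attempts (sketch only).
import numpy as np
import pandas as pd

df = pd.read_csv("player_year_ft.csv")
df["pct"] = df["ft_made"] / df["ft_att"]
df["att_bin"] = pd.cut(df["ft_att"], bins=np.arange(0, 701, 50))
print(df.groupby("att_bin", observed=True)["pct"].agg(["mean", "count"]))
[/code]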