Probability of one team beating another based only on winning percentage

In a lot of situations like this, consider the extreme cases first. E.g., team A is 5-0 and team B is 0-5. While we can hope that team A has a better than 50/50 chance of beating team B, we certainly would never rationally believe that their chances were 100%. And I can’t conceive of any rationally derived formula that could pin the number down anywhere within that range. There is just not enough information.

OTOH, if you had the entire DB of MLB outcomes, you could come up with an empirical formula for such situations, but that would be data-driven rather than math-driven. In particular, the corresponding formula for the NFL would be different.

Let me add some information that distinguishes my situation from typical pro sports. I am trying to figure out, purely as an exercise, whether I can assign probabilities to playoff outcomes based on regular season play. The league has 15 teams, and each team plays every other exactly once during the season, which is different from any college or professional playoff scenario. Therefore it makes (a little) more sense to compare records, which you can’t really do to predict the Rose Bowl or the World Series.

I acknowledge that in post #6, which is why I know that my model can’t be correct, not even from a purely probabilistic standpoint.

The problem you’re going to run into if you look at common opponents is that superiority is not transitive: A beating B and B beating C does not imply that A will beat C. But let’s assume that you merely want a probability, so there’s some gray area. How does score enter into it? A beats B by 1 and B beats C by 1. Will that give the same probability of A beating C as when A beats B by 40 and B beats C by 50?

Another problem will be if there are no common opponents. How many times do we see a 28-2 mid-major get stomped by an 18-10 large-conference team in the NCAA tournament?

Have you tried to get that information? I agree with you about models being flawed, but when you don’t have all the information, the thing to do is not to just give up and build a weak model.

Quoth Pasta:

To put this another way, which team wins or loses, in general, depends both on some factors inherent to the teams (which we can call skill), and on some factors not inherent to the teams (which we can call luck). In a game of pure skill, the better team will always, 100% of the time, win. In a game of pure luck, it’s always a 50-50 tossup, and any differences in the prior records are just a fluke. And different sports have different mixes of the two. Now, with a large enough base of detailed enough statistics, you probably could come up with a quantification of skill vs. luck for any given sport, but we don’t have that. And even coming up with such a figure for baseball in general probably wouldn’t be much use here, since I imagine that little league baseball works differently than professional.

It’s just an intellectual exercise with no real payoff so I’m not willing to jump through hoops to collect data.

I agree, though skill itself is not consistent. Some days a pitcher just can’t throw a strike to save his life, hitters go into slumps, etc. These are not elements of luck but rather the human side of skill, and you can’t predict that statistically, AFAIK.

An understatement. :smiley:

Do you mind if I attempt abstractifying the hell out of your question?

Suppose there is a set of n teams, and for each pair of teams A and B, there is a set probability P(A, B) that a game between A and B will result in a win for A. (So P(B, A) = 1 - P(A, B); there are no ties.) You could even imagine a “game” between A and B as generating a random number from 0 to 1, and A is declared the winner if that number turns out to be < P(A, B).

Now, for a given pair of teams A and B, suppose we don’t know what P(A, B) is. But we do know what percentage of games each team has actually won over the course of a season, in which each team has played every other team the same number of times.

Under these conditions, is it possible to determine, or make a reasonable guess of, or say anything about, what P(A, B) is?

Is that a fair statement of what the OP is asking?

Precisely.
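To make that setup concrete, here is a minimal simulation of it (the code and numbers are just my illustration):

```python
import random

# A minimal sketch of the setup above: fix a hidden matrix P[a][b] giving the
# probability that team a beats team b, play one single-round-robin season,
# and print the resulting standings.

random.seed(1)
n = 15  # 15 teams, as in the league described earlier

# Hidden "true" win probabilities, with P[b][a] = 1 - P[a][b] and no ties.
P = [[0.5] * n for _ in range(n)]
for a in range(n):
    for b in range(a + 1, n):
        p = random.uniform(0.1, 0.9)
        P[a][b] = p
        P[b][a] = 1.0 - p

wins = [0] * n
for a in range(n):
    for b in range(a + 1, n):
        # "A game generates a random number from 0 to 1, and A is declared
        # the winner if that number turns out to be < P(A, B)."
        if random.random() < P[a][b]:
            wins[a] += 1
        else:
            wins[b] += 1

for t in sorted(range(n), key=lambda t: -wins[t]):
    print(f"team {t:2d}: {wins[t]:2d}-{n - 1 - wins[t]:2d}")
```

The question in the thread is then: seeing only the final standings, what can you recover about the hidden P?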

One way to do this would be to assume that the number of points scored has some distribution for each team. To be abstract, suppose there is only offense and no defense (like the NBA :-). Then we could use your P(A,B) to determine the probability distribution of the number of points scored for each team and compute the probability that one team would then beat another. You would also need to know the overall average of points scored as well as a normalization.

I believe this paper by Kaplan and Garstka is about this technique (warning: PDF): https://netfiles.uiuc.edu/shj/www/JK_NCAAMM.pdf
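For what it’s worth, here is a sketch of that offense-only idea. The post above doesn’t commit to a particular distribution, so Poisson-distributed scores are my assumption:

```python
from math import exp, lgamma, log

# Offense-only model: each team has a scoring rate, scores are independent
# Poisson draws, and P(A beats B) = P(A's score > B's score), ignoring ties.

def poisson_pmf(k, lam):
    # Computed in log space so large scores don't overflow.
    return exp(k * log(lam) - lam - lgamma(k + 1))

def p_beats(lam_a, lam_b, max_pts=300):
    """P(A outscores B) for independent Poisson scores with rates lam_a, lam_b."""
    total = 0.0
    cdf_b = 0.0  # P(B scores fewer than a_pts points), built up as we go
    for a_pts in range(max_pts):
        total += poisson_pmf(a_pts, lam_a) * cdf_b
        cdf_b += poisson_pmf(a_pts, lam_b)
    return total

print(p_beats(100.0, 95.0))  # NBA-ish scoring rates, for instance
```

Fitting the rates to observed results is the part that needs the league-wide scoring average and normalization mentioned above.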

Well, the answer to the question in Thudlow Boink’s setup, as given with no further assumptions, is that you can’t conclude anything about P(A, B) without further information. There’s nothing forcing P(A, B) to be related to P(A, other teams) and P(B, other teams) in any way, in that setup, so you aren’t given any information about P(A, B).

[For that matter, A and B’s track record against other teams so far could diverge arbitrarily far in win frequency from the expected values P(A, other teams) and P(B, other teams), so you can’t make any hard conclusions about those values either, but let’s leave that to the side…]

[Now, if you model this with a meta-(probability distribution), so that you can ask “What is the probability that P(A, B) = whatever?”, then you can make probabilistic claims about the values of P(A, B) accordingly, once you’ve picked out a particular meta-probability distribution of interest…]
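For what it’s worth, here is one concrete choice of such a meta-distribution, with made-up data (the model and numbers are entirely my illustration):

```python
import random

# Meta-distribution: each team gets a latent strength s, uniform on (0, 1),
# with Bradley-Terry win probabilities P(A, B) = s_A / (s_A + s_B). Sampled
# strengths are weighted by the likelihood of the observed results, giving an
# importance-sampling estimate of the posterior mean of P(A, B).

random.seed(0)

# Observed results as (winner, loser) pairs: A beat B, A beat C, B beat C.
results = [(0, 1), (0, 2), (1, 2)]

num = den = 0.0
for _ in range(200_000):
    s = [random.random() for _ in range(3)]  # strengths drawn from the prior
    like = 1.0
    for w, l in results:
        like *= s[w] / (s[w] + s[l])         # likelihood of each observed win
    num += like * s[0] / (s[0] + s[1])       # weight P(A beats B) by likelihood
    den += like

print(f"posterior mean of P(A beats B): {num / den:.3f}")
```

A different prior would give a different answer, which is exactly the point: the data alone don’t pick one out.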

I’m joining the chorus of those who say that more information is needed if you want a meaningful answer. You need to know the luck-to-skill nature of the game.

Assuming a large number of games have already been played, the difference in percentages shows that skill is involved. Knowing only the percentages, the best model you can choose would be “100% skill”. That is, the team with the better record can be assumed to always win.

If you knew not just the percentages but the actual lists of results against each other team, then you could make an estimate of the luck-to-skill ratio of your sport. If team A beat every team that team B beat, and team B lost to every team that team A lost to, then you could use the 100% skill model. If there were “inversions”, you could use those to estimate the percentage of luck involved, as in the sketch below.
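Here is a rough sketch of that inversion count (my own simple reading of the idea):

```python
# Rank teams by season win count, then look at how often the worse-ranked
# team won the head-to-head game. Zero inversions is consistent with the
# "100% skill" model; an inversion rate near 50% looks like pure luck.

def inversion_rate(wins, games):
    """wins: {team: season win count}; games: list of (winner, loser) pairs."""
    inversions = counted = 0
    for w, l in games:
        if wins[w] != wins[l]:     # teams tied in the standings give no ordering
            counted += 1
            if wins[w] < wins[l]:  # the worse-ranked team won: an inversion
                inversions += 1
    return inversions / counted if counted else 0.0

# A beat B and C, B beat C: no inversions, so "100% skill" fits these results.
print(inversion_rate({"A": 2, "B": 1, "C": 0},
                     [("A", "B"), ("A", "C"), ("B", "C")]))
```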

Well, yes. But I’m not asking how to get a great answer, I’m asking what kind of answer we can get with the data at hand.

I disagree with this. Even in contests involving no luck, like chess (correct me if I’m wrong, but there are no bad bounces, no bad umpiring calls, no injuries, no weather impact), there are match-ups where a given player doesn’t beat the other player every single time.

And you could do even better if you also had the outcomes of the games played by all the other teams. But the volume of data is now starting to get pretty large.

There is luck in chess. At any given moment, there are some moves that are better than others. Ultimately, even a very skilled player can’t always tell which moves are better, and must just guess which one will be best. Sometimes, you guess right.

Besides, there’s definitely some factor that leads to matches between two players not always going the same way, and we might as well call that factor “luck”.

Is it possible to make some sort of Bayesian estimate based on the records of the two teams?

It doesn’t matter that one could make a better probability estimate if more information were available about the teams. The OP is asking what’s the best estimate we can make knowing only their records. My guess is something like this:

Est_Prob_A_Beats_B = (Num_Games_A_Won + Num_Games_B_Lost) / (Num_Games_A_Played + Num_Games_B_Played)

This formula takes into account that we have more information about a team that has played more games, and no information about a team that has played none; if one team has played no games at all, the formula reduces to the other team’s win percentage.
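Written out as code, that’s just (a direct transcription of the formula; the names are mine):

```python
def est_prob_a_beats_b(a_wins, a_played, b_wins, b_played):
    # Estimate = (A's wins + B's losses) / (A's games + B's games).
    b_losses = b_played - b_wins
    return (a_wins + b_losses) / (a_played + b_played)

# A 5-0 team against a 2-3 team:
print(est_prob_a_beats_b(5, 5, 2, 5))  # (5 + 3) / (5 + 5) = 0.8

# B has played no games: the estimate reduces to A's win percentage.
print(est_prob_a_beats_b(4, 5, 0, 0))  # 4 / 5 = 0.8
```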

True, but if A beat C in its first and only game of the season and B lost to D, this predicts a 100% chance that A beats B. To be Bayesian you really need some prior. This isn’t Bayesian (though it possibly could be, if we figured out what prior distribution implies it); it’s a heuristic.
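One standard patch for that 1-0 vs. 0-1 problem, in the spirit of the prior being asked for (this is my suggestion, not something proposed in the thread), is Laplace-style smoothing: start the formula off with one phantom win and one phantom loss, roughly what a uniform prior would buy you:

```python
def est_prob_smoothed(a_wins, a_played, b_wins, b_played):
    # Same heuristic as above, plus one phantom win and one phantom loss.
    b_losses = b_played - b_wins
    return (a_wins + b_losses + 1) / (a_played + b_played + 2)

# A is 1-0 and B is 0-1: a strong favorite now, but no longer a certainty.
print(est_prob_smoothed(1, 1, 0, 1))  # (1 + 1 + 1) / (1 + 1 + 2) = 0.75
```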

There’s a related question that may be possible to figure out: the case where the A vs. B game has already happened. You’re asking: given A’s and B’s winning percentages so far, what’s the probability of a future result? That’s not easily quantifiable. But if you instead ask about a past result, a result that the winning percentages already include, then you can answer the question.

For example, if you know that the ’72 Dolphins won every game of their season, then you know that P(Dolphins > Opponent) is 1. But if you only knew that they were undefeated going into the last game of the season, and they’re playing a .420 team, then the prediction is harder.

Also, don’t assume that each team plays against a league that’s .500. If one team is taking in a lot of wins, then there are fewer to go around for the other teams, so the rest of the league might be only .335, for example.
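To put my own numbers on that, using the 15-team single round robin from earlier in the thread: the season has 15 × 14 / 2 = 105 games. If one team goes 14-0, the other 14 teams split the remaining 91 wins among themselves, averaging 6.5 wins in 14 games each, i.e. about .464 rather than .500.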

Since this is in GQ rather than GD, let’s try to make things concrete. Suppose I said “Well, I think the correct probability is so-and-so.” What hard criteria might lead you to say “No, that’s incorrect” or “Yes, that is indeed correct”?

My formula is not strictly Bayesian, but it does make use of prior information. In your example, a 1-0 team plays a 0-1 team. I think estimating A has a 100% chance of beating B is not unreasonable. But we might want something that doesn’t assume the records are perfect predictors.

That is the real problem. I’m not sure there is an easy way to test the accuracy of probability estimates.

We can look at the mathematical properties of our formulas. For example, the probability of A winning plus the probability of B winning must be unity. Both Cooking’s formula and mine meet that criterion. Mine has the additional property that it gives less weight to the records of teams with fewer games. I’m sure there are other properties that we might want.
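(The unity check on my formula is just algebra: the two estimates share the denominator Num_Games_A_Played + Num_Games_B_Played, and their numerators add up to A’s wins plus A’s losses plus B’s wins plus B’s losses, which is that same total, so the two probabilities sum to exactly 1.)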

It’s not that it would be incorrect, it’s that there’s no way to say that it’s correct. What I could say is, “That MAY be right, but I have just as much basis for thinking the probability is actually thus-and-such. See, look, here’s a model that describes each team’s chance of winning any given match-up, and it’s entirely consistent with the results so far, and predicts a thus-and-such chance of A winning. But I also have this other model, that’s also completely consistent with the results so far, that predicts a whatsits chance of A winning. There’s no way to tell, just from the results so far, whether either model is correct.”

Take a really simple example with three teams: A beat B and C, and B beat C.

Now, we could assume that the best team always wins, so predict A will beat B 100% of the time.

BUT, we could also assume the winner is a completely random coinflip. Note that the results we have are exactly as likely as any other outcome under random coinflips, so there’s no statistical reason to rule out complete randomness. In this case we’d predict that A vs. B is literally a toss-up.
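(To spell that arithmetic out: three games have 2^3 = 8 possible win/loss patterns, and under pure coinflips every pattern has probability exactly 1/8. The observed results are precisely as likely as any other pattern, so they carry no evidence against the coinflip model.)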

Neither answer is conclusively wrong, but neither answer is conclusively right, either. And, without knowing anything about the game, there’s no mathematical reason to choose one over the other (or something in between).