Probability of one team beating another based only on winning percentage

Something else I just thought of, so apologies if this has already been said in the thread.

You would need the number of games as well. I don’t think the probability of a 1-0 team beating a 0-1 team would be the same as that of a 30-0 team beating a 0-30 team.

True enough.

In this case, I am trying to determine probabilities of post-season playoff outcomes. There are 15 teams. Each team plays every other exactly once during the regular season. So each team facing each other in post-season play has played each other once, as well as all other teams once, therefore the same total number of games.

That might be worth delving into; separating the game between those two vs. all the other games. However, I would think that would have to be an empirical calculation rather than a theoretical one, determining the consistency of outcomes between repeated games between the same two teams. But because the team only lasts one season, essentially, the data isn’t there.

That should be incorporated into your measure of uncertainty, not your point estimate.

In his 1981 Baseball Abstract, Bill James introduced what he called the log5 method for determining the predicted win percentage of Team A over Team B. It is given by
Wpct = (A - A * B) / (A + B - 2 * A * B)

Where A represents Team A’s winning percentage and B represents Team B’s winning percentage.

This formula says a .600 team should beat a .400 team 69.2% of the time.
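For anyone who wants to try the numbers, here is a minimal Python sketch of the formula as quoted above. The function name `log5` and the error handling are my own choices, not anything from the Abstract.

```python
def log5(a: float, b: float) -> float:
    """Log5 estimate of the probability that a team with winning
    percentage `a` beats a team with winning percentage `b`."""
    denominator = a + b - 2 * a * b
    if denominator == 0:
        # Only happens when both teams are .000 or both are 1.000,
        # where the formula is undefined.
        raise ValueError("log5 is undefined when a and b are both 0 or both 1")
    return (a - a * b) / denominator


print(round(log5(0.600, 0.400), 3))  # 0.692
print(log5(0.500, 0.500))            # 0.5
```

Swapping the arguments gives the complementary probability, so `log5(a, b) + log5(b, a)` comes out to 1.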

The site where I found this does not show how he arrived at this formula, but he apparently spent several pages describing and justifying it in his abstract.

It should have a greater effect on the uncertainty measure, but it could still be used in the point estimate. We may not want to give 100% weight to the record, and instead give some weight to a “no-record prior”. For example, we could modify my formula to be:

Est_Prob_A_Beats_B = (1 + Num_Games_A_Won + Num_Games_B_Lost) / (2 + Num_Games_A_Played + Num_Games_B_Played)

This gives a “no-record prior” win-loss of 1-1. So, if we have two gameless teams playing, the estimate is 50% for each. If a 1-0 team plays a 0-1 team, the estimate is 75% for the winning team to win again, reflecting that one game doesn’t represent much information about either team. For a 30-0 playing a 0-30 team, we give 98% chance for the winning team to do so again, reflecting our greater confidence because of the larger history.
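A small Python sketch of this estimate, reproducing the three cases just mentioned. The function name and argument order are made up for illustration, and it assumes no ties, so that games lost equals games played minus games won.

```python
def est_prob_a_beats_b(a_won: int, a_played: int, b_won: int, b_played: int) -> float:
    """Estimate with a 1-1 "no-record prior" added to the combined record.

    Assumes no ties: B's losses are b_played - b_won.
    """
    b_lost = b_played - b_won
    return (1 + a_won + b_lost) / (2 + a_played + b_played)


print(est_prob_a_beats_b(0, 0, 0, 0))                 # 0.5  (two gameless teams)
print(est_prob_a_beats_b(1, 1, 0, 1))                 # 0.75 (1-0 vs 0-1)
print(round(est_prob_a_beats_b(30, 30, 0, 30), 3))    # 0.984 (30-0 vs 0-30)
```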

Interesting formula. I see two problems with it. First, it does not take into account how many games the teams have played. Win-loss records are much less informative early in the season. This isn’t a big problem if the formula is only used when both teams have a long record.

The second problem is illustrated by your example. Team A has a 60% win record and team B has a 40% win record. That’s the same as team B having a 60% loss record. It seems strange to predict that a team that wins 60% of the time playing a team that loses 60% would have a 70% chance of winning. Why predict something different when both records predict the same thing?

Working backwards from this formula, I think it can be rewritten as

A*(1-B) / (1 - (A*B + (1-A)*(1-B)))

Which is

P(A wins and B loses) / (1 - P(A and B both win, or A and B both lose))

which is basically the game suggested by Pasta in post 16, scenario 1, although his calculations for the 60/40 split don’t seem to be right.
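The rewriting above is easy to check numerically. This is only a sanity check in Python, not a proof, and both function names are mine. The algebra behind it is that 1 - A*B - (1-A)*(1-B) expands to A + B - 2*A*B.

```python
def log5(a: float, b: float) -> float:
    # The formula as quoted from the 1981 Abstract.
    return (a - a * b) / (a + b - 2 * a * b)


def conditional_form(a: float, b: float) -> float:
    # P(A wins and B loses), divided by the probability that the two
    # independent results are not the same (both win or both lose).
    same_result = a * b + (1 - a) * (1 - b)
    return a * (1 - b) / (1 - same_result)


for a, b in [(0.6, 0.4), (0.75, 0.55), (0.3, 0.9)]:
    assert abs(log5(a, b) - conditional_form(a, b)) < 1e-12
    print(a, b, round(log5(a, b), 4))
```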

In answer to Pleonast’s concerns: in order to make use of the length of the season, you need to begin with some assumptions about the underlying distribution of a team’s win likelihood and use the observed record to update that estimate, i.e. a Bayesian approach. That is fine, and may make sense in this case, but it has the disadvantage of requiring extra assumptions. From the frequentist point of view, you just use the best data you have in front of you and go at it, in which case your best bet is to assume that the observed win/loss records exactly reflect the true ones.

Regarding the win 60% vs lose 60%, you must remember that a team that loses 60% of the time is a below average team. So it makes sense that a team that wins 60% of the time against the average team, would win more than 60% of the time against one that is sub-par. Similarly a team that loses 60% of the time against an average team, would lose more than 60% against a good team with a 60% record.

Incidentally there is information that can be used to distinguish between the various game models, namely a look at the distribution of team records. If there was a real ranking of the teams in which a higher ranked team always beat a lower ranked team, then you would expect the distribution of records to be uniform. If on the other hand, there was no real ranking of teams and the whole thing was a crap shoot, then the team records would be distributed in a binomial fashion.
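To make the two extremes concrete, here is a rough Python simulation of a single 15-team round robin, the setup described earlier in the thread. The function name and the `p_better_wins` parameter are hypothetical; setting it to 1.0 gives the strict ranking and 0.5 gives the crap shoot.

```python
import random
from itertools import combinations


def round_robin_wins(n_teams: int, p_better_wins: float, rng: random.Random) -> list:
    """Win totals after every team plays every other team once.

    Team 0 is ranked highest. In each game the higher-ranked team
    wins with probability p_better_wins.
    """
    wins = [0] * n_teams
    for i, j in combinations(range(n_teams), 2):  # i < j, so i is ranked higher
        winner = i if rng.random() < p_better_wins else j
        wins[winner] += 1
    return wins


rng = random.Random(0)
strict = round_robin_wins(15, 1.0, rng)   # higher rank always wins
coin = round_robin_wins(15, 0.5, rng)     # every game is a coin flip

print(sorted(strict))  # every total from 0 to 14 appears exactly once
print(sorted(coin))    # totals bunch up around 7
```

With a strict ranking the win totals are spread evenly from 0 to 14, while in the coin-flip league each team's total is a Binomial(14, 0.5) draw. A real league should land somewhere in between.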

I actually have a model in mind, but it’s so statistically dense that it wouldn’t be helpful to anyone not well versed in statistics. If anyone wants to see it, I could put it up.

It is certainly possible; I did them quickly. However, keep in mind that “Team P60” does not have a coin with p=0.6. On the contrary, to win 60% of the time against a “uniform” field, you need p=0.647. (That is, you need p such that 0.6 = Integral from q=0 to q=1 of p(1-q) / [ p(1-q) + q(1-p) ] dq.)
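The p=0.647 figure can be reproduced with a few lines of Python. This is a sketch using a plain midpoint rule for the integral and bisection for the root; both are my choices for simplicity, and the function names are made up.

```python
def win_rate_vs_uniform_field(p: float, steps: int = 20_000) -> float:
    """Midpoint-rule estimate of the integral over q in [0, 1] of
    p(1-q) / [p(1-q) + q(1-p)], i.e. the average chance that a coin-p
    team beats an opponent whose coin q is uniform on [0, 1]."""
    total = 0.0
    for k in range(steps):
        q = (k + 0.5) / steps
        total += p * (1 - q) / (p * (1 - q) + q * (1 - p))
    return total / steps


def coin_for_record(target: float, tol: float = 1e-6) -> float:
    """Bisection for the p whose win rate against the uniform field is target."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if win_rate_vs_uniform_field(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


p60 = coin_for_record(0.6)
print(round(p60, 3))  # 0.647
```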

That is interesting. I would not interpret a 60% win percentage as “wins 60% of the time against the average team”, but as “wins 60% of the time against an unknown team”. I don’t think either interpretation is wrong–my formula reflects my interpretation and the other formula represents the other.

Go ahead and post it.

Here is a Baseball Prospectus version of a simulation. Good luck with it.

Pasta:

You are correct that the formula doesn’t actually use the correct winning percentage, instead using what amounts to the probability of beating a team with a 50% win likelihood. I didn’t bother checking to see whether this worked for a distribution. Of course, we have no idea whether uniform is the correct baseline, but then again, for this problem we are going to have to make unwarranted assumptions, so why not? Still, as I show below, there might be some way to use the observed distribution of team winning percentages to figure out what this should be.

**Pleonast**
You are partially right, for the reasons suggested by Pasta: the win percentage should be versus a team drawn from a distribution rather than versus a team with a 50% record. However, no matter what the distribution is, a team with a win percentage of 60% should be better than a randomly picked team (otherwise its record would be 50% or less), and so a 40% team should have less chance of beating it than it would a random team (which it beats 40% of the time). The probability of its victory should therefore definitely be less than 40%.

ultrafilter, you asked for it, so don’t say you weren’t warned:

**Notation:**

I will use **Phi(x;s)** to indicate the cumulative distribution function, evaluated at x, of a normal distribution with mean 0 and variance s, so Phi(0;1)=0.5 and Phi(1.96;1)=0.975.

I will use InvPhi(p;s) to denote the inverse of this function, so InvPhi(0.5;1)=0 and InvPhi(0.975;1)=1.96.

**Assumptions:**

I assume that each team has an underlying average number of points that it scores in a game, and that across teams this average has distribution F. This seems reasonable.

I further assume that if a team with an average number of points equal to x meets a team with an average number of points equal to y, then score1 - score2 = x - y + z, where z is a symmetric random variable drawn from a distribution G with mean 0. If this is greater than 0, team 1 wins; if it is less than 0, team 2 wins. *This is a big assumption: it assumes that there are no relative strengths or weaknesses or increases in variability between teams.*

I also assume that there is enough history that the observed winning percentage is equal to the exact winning percentage. There may be further calculations that can take this into account, but it’s too much trouble to bother with now. Bayesians can eat my shorts.

For this exercise I will make the final assumption that F is distributed as N(0,1), while G is N(0,s) for some unknown variance s. A small s indicates that luck plays little role and a good team will almost always beat a poor team. A large s indicates that the results are largely random and each team will have close to an even chance of winning. *Another ginormous assumption. The distribution F doesn’t matter too much and only does so relative to G, but assuming that they are related in this way requires some faith. Other models can be assumed, but the calculations are easier for the normal.*
**Results:**

The probability of a team with win percentage A beating a team with win percentage B is

Phi(InvPhi(A;1+s) - InvPhi(B;1+s); s)

where s = [1 - Var(InvPhi(p_i;1))] / Var(InvPhi(p_i;1)), for p_i the observed winning percentages of all teams in the league.
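For those who would rather see this as code, here is a minimal Python sketch of the result using the standard library's `statistics.NormalDist`. The helper names `phi`, `inv_phi` and `win_probability` are mine, s is taken as given and must be positive, and the winning percentages must be strictly between 0 and 1 for the inverse to exist.

```python
from math import sqrt
from statistics import NormalDist


def phi(x: float, variance: float) -> float:
    """Phi(x; variance): normal CDF with mean 0 and the given variance."""
    return NormalDist(0.0, sqrt(variance)).cdf(x)


def inv_phi(p: float, variance: float) -> float:
    """InvPhi(p; variance): inverse of phi, defined for 0 < p < 1."""
    return NormalDist(0.0, sqrt(variance)).inv_cdf(p)


def win_probability(a: float, b: float, s: float) -> float:
    """Phi(InvPhi(a; 1+s) - InvPhi(b; 1+s); s) for luck variance s > 0."""
    strength_gap = inv_phi(a, 1 + s) - inv_phi(b, 1 + s)
    return phi(strength_gap, s)


# The .600 vs .400 matchup for a few illustrative values of s.
for s in (0.1, 1.0, 10.0):
    print(s, round(win_probability(0.6, 0.4, s), 3))
```

The smaller s is, the more the better record is trusted, so the .600 team's chances go up as s goes down.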

**Proof:**

Given a baseline average of x for a team, the probability A of winning against a random team will be (F*G)(x), where * denotes convolution. For the models we are using this will be Phi(x;1+s). So if we observe a team with a winning percentage A, then the average point count of that team will be InvPhi(A;1+s).

If we have a team with point count x vs a team with point count y, then the probability of team 1 winning is P(x-y+z>0) = P(z<x-y) = G(x-y), where the middle step uses the symmetry of z. So if we have a team with record A vs a team with record B, then the probability of team 1 winning will be Phi(InvPhi(A;1+s) - InvPhi(B;1+s); s).

This is ok so far, but we still need to know what s is.

Let us look at all of the teams in the league, and suppose that V(p) is the cumulative distribution of winning percentages. Since the underlying score is distributed as N(0,1) across the teams, we know that for p~V,
InvPhi(p;1+s) ~ N(0,1), so (because InvPhi(p;1+s) = sqrt(1+s)*InvPhi(p;1))
InvPhi(p;1)*sqrt(1+s) ~ N(0,1), so
InvPhi(p;1) ~ N(0,1/(1+s)).

So all we have to do is look at the distribution of InvPhi(p;1) for all observed winning percentages p, and we should end up with a normal distribution with variance 1/(1+s), from which we can solve for s. Of course, if we get a distribution that looks nothing like a normal, that will indicate a problem with our assumptions.
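Here is a rough Python sketch of that last step. The function name `estimate_s` is mine, the win totals are invented purely for illustration (15 teams, 14 games each, 105 wins in total), and using the population variance rather than the sample variance is an arbitrary choice on my part. An undefeated or winless team would break it, since InvPhi is not defined at 0 or 1.

```python
from statistics import NormalDist, pvariance


def estimate_s(win_pcts) -> float:
    """Solve Var(InvPhi(p; 1)) = 1 / (1 + s) for s.

    Every winning percentage must be strictly between 0 and 1.
    A result of zero or below means the model does not fit.
    """
    z_scores = [NormalDist().inv_cdf(p) for p in win_pcts]
    var = pvariance(z_scores)
    return (1 - var) / var


# Hypothetical final standings for a 15-team single round robin.
hypothetical_wins = [12, 11, 10, 9, 8, 8, 7, 7, 7, 6, 6, 5, 4, 3, 2]
win_pcts = [w / 14 for w in hypothetical_wins]

s_hat = estimate_s(win_pcts)
print(round(s_hat, 2))
```

With only 15 teams the estimate will be noisy, and it is worth eyeballing the z-scores to see whether they look even roughly normal before trusting it.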