Baseball question. Does run differential indicate if a team is better or worse than their record?

Looking at the AL West standings on ESPN, I see that the Astros have a +113 run differential. The Rangers and Angels are one game ahead and behind the Astros. They have run differentials of +11 and -7 respectively. Does this indicate the Astros have had exceptionally bad luck, or does it not really mean much at all? For the record, I’m an Astros fan, and it does seem like they’ve had bad luck this year.

Well, it means nothing in terms of tiebreakers, like in the NFL or soccer. If two teams end the season in a tie and that tie needs to be broken to determine who goes to the playoffs, then they play an extra game to figure that out.

As for general statistical purposes, there’s something called the Pythagorean expectation, which is S²/(S² + A²), where S is runs scored and A is runs allowed (over the course of the season). This formula gives you the number of expected wins. If the number of expected wins is much lower than actual wins, there are many possible causes, one of which is luck, but I don’t think you can say for sure.

Short answer is yes, typically. Run differential will be a better indicator of a team’s actual quality (and a better predictor of their future results) than their winning percentage.

The team’s record is what it is. Run differential DOES give you a hint as to how they’ll do in the future, the Blue Jays being an example this year. Their 50-51 start was flukey.

Even better would be to also figure in a team’s basic stats (homers, walks, doubles, K’s) vs. those of their opponents. Baseball Prospectus does exactly that: what they call “1st Order” win % is Pythagorean, 2nd Order uses the base stats, and 3rd Order also factors in the quality of one’s opponents. Note that the Indians, for example, look much better in 3rd than they do in 1st.

Am I reading this incorrectly, or would that formula give the expected percentage of wins?

Yes, pythag gives the expected winning percentage, not number of wins.
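For anyone who wants to plug in numbers, here is a minimal Python sketch of the formula as described above. The function names and the 700/600 run totals are made up for illustration (not any real team's line), and the exponent of 2 is simply the one quoted in this thread; other exponents can be passed in.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage: S^e / (S^e + A^e)."""
    s = runs_scored ** exponent
    a = runs_allowed ** exponent
    return s / (s + a)


def expected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Turn the expected winning percentage into an expected win total."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


# Hypothetical team: 700 runs scored, 600 allowed over 162 games.
pct = pythagorean_win_pct(700, 600)
wins = expected_wins(700, 600)
print(f"expected win pct: {pct:.3f}, expected wins: {wins:.1f}")
```

Comparing `expected_wins(...)` to the actual win total gives the "games over/under expected record" figure people quote.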

“You are what your record says you are.” - Bill Parcells

A good policy for players & coaches, to cut down on excuses, but not very accurate otherwise IMO. A 6-10 team with a tough schedule that outscores its opponents but loses 4 out of 4 overtime games is likely a better team than one that wins a weak division at 9-7 with a negative point differential.

Think of it in Bayesian terms; how much information do you have to guess how good a team is? Their run differential is part of it.

There is now, however, no point in worrying about it. Texas had a bad run differential but they won their division so they get to keep playing. Their run differential suggests they are an underdog, but frankly, everyone figured they were anyway, and in a short playoff series anyone can win.

There’s also the fact that run differential (and W-L record) might not reflect how a team is constituted NOW - Texas and Toronto being two very dramatic examples. For most of the season those teams didn’t have Cole Hamels and David Price, but now they do.

As to whether the difference between your record and your expected record is luck: yes, it’s usually just luck. If at the halfway mark a team is REALLY unlucky or REALLY lucky, you should bet on them going the other way, and if someone will take your bets you will make a lot of money in the long run. But in truth, over a 162-game season, not very many teams are far from their projected record; the breaks even out. Here are the ten playoff teams and how far they were from their expected record:

Toronto: 9 games under expected record (they went 93-69, projection was 102-60)
New York Yankees: 1 game under
Kansas City: Even
Texas: 5 games over
Houston: 7 games under
New York Mets: Even
St. Louis: 4 games over
Pittsburgh: 5 games over
Chicago Cubs: 7 over
Los Angeles: 3 over

Toronto is the biggest outlier, and Texas and Houston are sort of switched from where you’d expect, but as you can see teams are basically as good as you’d expect. Only two teams, Toronto and Houston, are SUBSTANTIALLY under their prediction, which makes total sense because the teams that make the playoffs are usually going to be the ones who got lucky, not the ones who got unlucky. But anyway, no team gets outscored by 120 runs and goes 92-70. It just doesn’t happen.

So is there a common thread between teams that are way better or worse than their projected record? I don’t have detailed numbers in front of me but here is what you would expect from a team that UNDERPERFORMS its projection:

  1. Very hitting-dependent
  2. Shitty bullpen

A heavy-hitting team has the opportunity to really jack up their projection by winning games by stupid scores like 13-4 and then blowing the close ones with a lousy bullpen. Every additional run you score has less value than the last one. If you score 13 runs you’re almost certainly going to win and if you score 15 runs you’re really not much more certain to win; the ability to pile on a beaten team is not as valuable as the marginal difference between 4 runs and 5.

Who underperformed the most of the playoff teams? Toronto, by nine games. And what kind of team are they? Well, gosh, wouldn’t you know; they’re the best hitting team in recent baseball history, and they had a terrible bullpen for most of the first half of the season. That is not a coincidence.
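To put rough numbers on the point that each additional run is worth less than the last, here is a toy Python sketch. It rests on two loud assumptions that are mine, not from any real data: the opponent's runs follow a Poisson distribution with a mean of 4.5 per game, and a game tied after regulation is a coin flip. Real run distributions are lumpier than Poisson, so treat the output as directional only.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events under a Poisson(mean) model."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def win_prob_given_runs(runs, opp_mean=4.5):
    """Chance of winning when you score `runs`.

    Assumption: opponent scoring is Poisson(opp_mean); ties count as 50/50.
    """
    p_opp_scores_fewer = sum(poisson_pmf(k, opp_mean) for k in range(runs))
    p_tie = poisson_pmf(runs, opp_mean)
    return p_opp_scores_fewer + 0.5 * p_tie


for r in (4, 5, 13, 15):
    print(r, round(win_prob_given_runs(r), 4))

# Value of the 5th run versus the 14th and 15th runs combined.
gain_4_to_5 = win_prob_given_runs(5) - win_prob_given_runs(4)
gain_13_to_15 = win_prob_given_runs(15) - win_prob_given_runs(13)
print(gain_4_to_5 > gain_13_to_15)
```

Under this toy model, going from 4 runs to 5 moves the win probability far more than going from 13 to 15, which is the shape of the argument above.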

A team can lose their first game 22-0, then win 21 in a row by 1 run. They’re 21-1 while getting outscored by a run. There are too many other deeper, reliable measures to assess a team.
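That extreme case is easy to check with a few lines of Python. This is only a sketch: the post says the 21 wins are each by one run, so the 1-0 scores below are my assumption to keep the arithmetic simple, and `summarize` is a made-up helper name.

```python
def summarize(games):
    """games is a list of (runs_scored, runs_allowed) tuples, one per game."""
    wins = sum(1 for rs, ra in games if rs > ra)
    losses = sum(1 for rs, ra in games if rs < ra)
    scored = sum(rs for rs, _ in games)
    allowed = sum(ra for _, ra in games)
    pythag = scored ** 2 / (scored ** 2 + allowed ** 2)
    return {
        "wins": wins,
        "losses": losses,
        "run_diff": scored - allowed,
        "actual_pct": wins / len(games),
        "pythag_pct": pythag,
    }


# One 22-0 loss, then 21 straight wins (assumed here to be 1-0 each).
season = [(0, 22)] + [(1, 0)] * 21
result = summarize(season)
print(result)
```

The record comes out 21-1 with a run differential of -1, so the Pythagorean percentage lands below .500 while the actual winning percentage is above .950.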

First level is W/L record.

Second level is Pythagorean/Run differential, where you use runs scored to gauge the quality of the team.

The third level is where most of the good analysts/websites lie. In the third level, you look deeper than runs to the underlying events. Did a team string together its hits in an unlikely fashion? If so, their runs scored would lead you to believe they had a more potent offense than would be expected in the future.

Fangraphs uses BaseRuns, which generates an expected win total from the sum total of events that happened to that team. From this, you can see which teams were relatively unlucky (Astros, Nationals, Blue Jays) and which were lucky (Cardinals, Royals, Pirates).

The run differential only shows the difference in runs. It assumes a bell curve distribution of runs, but if a team does not have a bell curve, then it’s meaningless. A team that wins close games but gets blown out in its losses is going to have a misleading run differential.

This can happen when there are three ace pitchers, and two dogs.

However, the Astros underperformed by almost as much and had a mediocre-to-good offense (5th in the league) and good and deep pitching (1st in the league, with both a depth of good-to-decent starters and a good bullpen, albeit one that was bad in September).

This thread was started two weeks ago and the last reply was 10 days ago. Therefore, I hope to interject my own question about earned runs without hijacking.

Scenario: Start of an inning—out, out, runner reaches on error. The pitcher could now allow infinity+1 runs in the inning and they would all be unearned; the theory is that the error should have been the third out and the pitcher should never have had the opportunity to give up those runs.

Scenario: the away pitcher gives up an unearned run in the bottom of the ninth, tying the game and sending it to extra innings. Shouldn’t any run given up by any pitcher in extra innings be unearned, the theory being that the error in the bottom of the ninth should have been an out that would have ended the game, not allowing any pitcher to give up any runs in extra innings? I know the rules state otherwise, but doesn’t it make sense?

While this is true, run differential will give you a better prediction of future success than actual wins will. Which doesn’t change the fact that success is judged by wins at the end of the day (or season).

It would not be consistent with the manner in which runs are assigned as earned or unearned (or for that matter in terms of how baseball is played) which is to take every inning as its own universe; things that happen in previous or subsequent innings aren’t relevant.

If you follow the logical path you’re on why would you stop at extra innings? Suppose the Blue Jays are up in the top of the eighth and this happens:

Colabello strikes out
Tulowitzki singles
Martin pops out
Pillar reaches on a fielding error, E-6. Tulowitzki to third.
Goins reaches on a fielding error, E-3. Tulowitzki scores. Pillar to second.
Revere grounds out 4-3.

Okay, so the run is unearned because Tulowitzki wouldn’t have scored had the third out been made on Pillar. Now the ninth inning:

Donaldson singles
Bautista home run to LF, Donaldson scores, Bautista scores

Why’re those runs both earned? Had the two errors not happened in the eighth, the ninth would have seen Goins, Revere and Donaldson hit, and we don’t know if Bautista would ever have come up. EVERYTHING that happens after an error is irreversibly altered by that error happening.
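The "every inning is its own universe" bookkeeping can be sketched in Python. This is a deliberately simplified model, not the official scoring rule: it treats every error as an out that should have been made, resets that count each inning, and calls a run unearned once three such "reconstructed" outs are on the board. It ignores everything else the real rule covers (passed balls, runners who reached on an error scoring later, pitching changes, and so on). The event labels and the `classify_runs` name are made up for illustration.

```python
def classify_runs(innings):
    """innings: list of innings, each a list of (event, runs_on_play) tuples.

    event is "out", "error", or "hit". Returns (earned, unearned).
    Simplification: an error counts as an out that should have been made.
    """
    earned = 0
    unearned = 0
    for plays in innings:
        reconstructed_outs = 0  # reset: earlier innings are not relevant
        for event, runs in plays:
            if event in ("out", "error"):
                reconstructed_outs += 1
            if reconstructed_outs >= 3:
                unearned += runs  # inning "should" already be over
            else:
                earned += runs
    return earned, unearned


# The eighth and ninth innings from the example above.
eighth = [
    ("out", 0),    # Colabello strikes out
    ("hit", 0),    # Tulowitzki singles
    ("out", 0),    # Martin pops out
    ("error", 0),  # Pillar reaches on E-6
    ("error", 1),  # Goins reaches on E-3, Tulowitzki scores
    ("out", 0),    # Revere grounds out
]
ninth = [
    ("hit", 0),    # Donaldson singles
    ("hit", 2),    # Bautista homers, two runs score
]
earned, unearned = classify_runs([eighth, ninth])
print(earned, unearned)
```

Because the counter resets at the top of each inning, the eighth-inning run is unearned and both ninth-inning runs are earned, matching how the example is scored.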