Statistical Analysis, Baseball, and Unbridled Hostility

Armstrong seems to be saying that “old school” managers like Casey Stengel and Earl Weaver would have scoffed at the use of stats. If I’m reading him right, and if that IS what he’s saying, he’s an even BIGGER idiot than the OP thinks.

Earl Weaver was a PIONEER in the use of stats! He was not an old school guy playing hunches; he studied stats religiously, though he usually had to rely on low tech methods (like index cards). He studied tendencies, he knew which batters fared best against given pitchers, and was constantly using numbers to give himself and his team an edge. If Weaver were managing today, he’d be among the first managers with a laptop.

Moreover, MOST of the “new” stats thrown around by baseball geeks are not complicated formulas. You’re free to argue that some aren’t really good for much (I think VORP is a waste of time), but you CAN’T argue that they’re so complicated as to require a Harvard degree, and they’re NOT so abstract as to detract from anyone’s enjoyment of the game. Anyone who’s smart enough to calculate a pitcher’s ERA is smart enough to understand what WHIP and OPS are.
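To show how little arithmetic is involved, here is a rough Python sketch of the three stats. The function names and the sample stat lines are made up for illustration; the formulas are the standard definitions (OBP here includes hit-by-pitches and sacrifice flies).

```python
def era(earned_runs, innings):
    """Earned run average: earned runs allowed per nine innings."""
    return 9 * earned_runs / innings


def whip(walks, hits, innings):
    """Walks plus hits per inning pitched."""
    return (walks + hits) / innings


def obp(hits, walks, hbp, at_bats, sac_flies):
    """On-base percentage: times on base divided by (AB + BB + HBP + SF)."""
    return (hits + walks + hbp) / (at_bats + walks + hbp + sac_flies)


def slg(singles, doubles, triples, homers, at_bats):
    """Slugging percentage: total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    return total_bases / at_bats


def ops(on_base, slugging):
    """OPS is simply on-base percentage plus slugging percentage."""
    return on_base + slugging


# Invented pitcher: 200 innings, 80 earned runs, 60 walks, 180 hits.
print(round(era(80, 200), 2))        # 3.6
print(round(whip(60, 180, 200), 2))  # 1.2

# Invented hitter: 500 AB, 100 singles, 30 doubles, 5 triples, 15 homers, 50 walks.
hitter_obp = obp(hits=150, walks=50, hbp=0, at_bats=500, sac_flies=0)
hitter_slg = slg(100, 30, 5, 15, at_bats=500)
print(round(ops(hitter_obp, hitter_slg), 3))  # 0.834
```

If you can do the division for ERA, none of the other lines should scare you.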

Moreover, MOST of the new statistical concepts are just ways to adjust for simple facts that any Little Leaguer can understand. When you played Little League, how many times did your coach yell encouragingly, “Good eye, good eye! A walk’s as good as a hit, a walk’s as good as a hit”?

Little Leaguers can understand that a walk’s as good as a hit (well, not QUITE… if you’re down by a run in the 9th with 2 out and a man on 3rd, a walk is NOT as good as a hit)… but “old school” writers don’t grasp that. They don’t get that a .300 hitter who rarely walks is a worse leadoff man than a .270 hitter who walks a lot.
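A quick sketch of the leadoff comparison, with invented numbers: both hitters get 600 plate appearances, and hit-by-pitches and sacrifice flies are ignored to keep the arithmetic simple. The function names are just for illustration.

```python
def batting_average(hits, at_bats):
    """Hits per at-bat; walks do not count as at-bats."""
    return hits / at_bats


def simple_on_base_pct(hits, walks, at_bats):
    """Simplified OBP that ignores hit-by-pitches and sacrifice flies."""
    return (hits + walks) / (at_bats + walks)


# Hypothetical .300 hitter who rarely walks: 580 AB, 174 hits, 20 walks.
print(round(batting_average(174, 580), 3))         # 0.3
print(round(simple_on_base_pct(174, 20, 580), 3))  # 0.323

# Hypothetical .270 hitter who walks a lot: 500 AB, 135 hits, 100 walks.
print(round(batting_average(135, 500), 3))          # 0.27
print(round(simple_on_base_pct(135, 100, 500), 3))  # 0.392
```

Under these made-up numbers the .270 hitter is on base about 41 more times over the season, which is the whole job of a leadoff man.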

And what’s really screwy is, Joe Morgan, who was BRILLIANT at drawing walks, is one of the commentators quickest to dismiss the value of walks! He was a magnificent baseball player who doesn’t appreciate the things he himself was so good at!

Now, I consider myself a conservative, a traditionalist and an “old school” guy in countless ways. It would be absolutely fine by me if Joe Morgan (or Rick Reilly or Bill Plaschke or Murray Chass or Armstrong or whoever) would stand up and say, “BABIP is a bullshit statistic, and here’s why…” or “VORP doesn’t really prove what stat nerds think it does, and here’s why…” or “I’ve read Moneyball, and I think Billy Beane’s approach is fatally flawed because…”

But there never seems to be a “Here’s why” or a “because.” Hatred of newfangled stats invariably comes down to “I don’t understand it, so it must be stupid” or “I know plenty about baseball already, and I don’t need to learn anything more.”

Murray Chass is a perfect (and sad) example of what I mean. He summed up his feelings about new stats by saying:

“To me, VORP epitomized the new-age nonsense. For the longest time, I had no idea what VORP meant and didn’t care enough to go to any great lengths to find out. I asked some colleagues whose work I respect, and they didn’t know what it meant either. Finally, not long ago, I came across VORP spelled out. It stands for value over replacement player. How thrilling. How absurd. Value over replacement player. Don’t ask what it means. I don’t know.”

Now, I happen to AGREE that VORP is a pretty useless statistic, but I still find Chass’ attitude absurd. In what other field would a journalist be so… so PROUD that he doesn’t know anything about major trends? In what other field would a reporter proclaim so happily that he’s an ignoramus AND that he’s too lazy to do a teensy weensy bit of digging to learn a little bit more?

Worst of all, the people who are quickest to dismiss stats are usually hung up on stats themselves. I GUARANTEE that Armstrong and Chass know how many games their favorite pitchers won last season, how many RBIs their favorite hitters had last year, and what their favorite players’ batting averages were. Why doesn’t anyone call THEM nerds or geeks for knowing and caring about such numbers?

Imagine how people would laugh if I wrote “Babe Ruth was overrated. Mario Mendoza was much better. What? Ruth had more homers, more runs, more hits, more RBIs and a higher average? Don’t bore me with numbers, you geek. Statistics are for losers who live in their parents’ basement. Mendoza PLAYED THE GAME THE RIGHT WAY, and DID ALL THE LITTLE THINGS THAT DON’T SHOW UP IN THE STATS.”

Thing is, even the most casual fan KNOWS that wins and losses are a stupid way to judge pitchers. NOBODY thinks Jim Merritt was a Cy Young candidate in 1970, despite his 20 wins, because his ERA was over 5! He was a terrible pitcher who won a lot of games because he was lucky enough to play for the Big Red Machine, a team that usually scored a lot of runs.

Modern stats are just a way of getting around problems like that. A pitcher’s wins are determined at least as much by the quality of his teammates as by his own talent. A slugger’s RBIs depend on the quality of the hitters ahead of him, and a leadoff man’s runs scored depend on the talent of the batters behind him. New stats are designed to help us figure out, “Is this guy REALLY good, or are his numbers inflated because his teammates are so good?”

Again, that’s NOT a hard concept to understand. It’s so simple, even a sportswriter should be able to grasp it.

Because it so often seems that statheads are simply, obtusely, missing what the game is all about. “We do not always measure what we value, but we value what we measure”, as the management consultants say. Having figures available for crunching can lead to a mindset where the crunched numbers are seen as actually reflecting what is happening on the field, where not having them leads the unmeasured things to be dismissed or ignored.

The game is both fun and beautiful to watch, and that should be its purpose - but where are the stats for fun and beauty? Statheads get derided so often, including by me, for seemingly lacking the perspective to appreciate what matters most about being a fan of the game, in fact for seemingly deriding it themselves.
“Baseball is not statistics. Baseball is DiMaggio rounding second.” - Jimmy Breslin

Because if you want to know whether Joe Blow is a better player than John Doe, fun and beauty won’t give you the answer. But in any case there’s no reason why you can’t have it both ways. Knowing the stats doesn’t make you incapable of appreciating the esthetics of the game. And outright dismissing the usefulness of statistics wouldn’t increase your enjoyment of the game either.

I have two points to make.

  1. I wouldn’t get too worked up about what some hack journalist thinks. Sports “analysts” generally just repeat each other’s same old hackneyed analysis and demonize anyone who dares to apply a novel thought and come up with an intelligent analysis.

  2. I generally like statistical analysis in baseball; WHIP is a spectacular analytical tool, for example. That said, sometimes it seems like Sabermetricians forget that, as well as baseball lends itself to being quantified statistically, statistics can’t reflect everything that happens in the game. Therefore, I think some people get a bit annoyed when Sabermetricians insist that their explanation is the only one.

But that’s because you don’t know what you’re talking about. It’s your observation that’s wrong; there are no “Statheads” who don’t appreciate the game’s beauty. Can you point to any human beings you’ve ever encountered who actually are obsessed with baseball statistics but don’t love the game of baseball and are fans of the game, first and foremost?

Of course not. It’s bullshit.

Let me ask… do you think someone who’s studied music theory at Juilliard for years has LESS appreciation for Mozart’s symphonies than an uneducated fan?

Do you think a female botanist who’s spent years studying flowers is LESS appreciative when her husband buys her a dozen roses?

Do you think a master chef who’s spent years studying new cooking techniques and new recipes enjoys a good meal LESS than you do?

If so, you’re way off the mark.

I dunno, is he using a statistical analysis of Mozart’s music to “prove” that his music is actually derivative crap, and the people in the classical music industry, who have been praising his work (and giving him awards every year), are actually idiots who can’t tell good work from bad? That’s part of what makes traditionalists mad, the implication that their opinions, formed from a lifetime of involvement in the sport, are made null and void by a statistical analysis.

I’d also like to defend the Win as a statistic. You can cherry-pick and find examples of mediocre performances that have great individual raw statistics, like wins, ERA and WHIP. You have 1970 Jim Merritt, with 20 wins, who really wasn’t very good; I can give you 2006 Pedro Martinez. He had a WHIP of 1.1, so was that a good year? He also had a 4.48 ERA, averaged fewer than 6 innings per start, and earned 9 wins against 8 losses. I would call that mediocre at best, regardless of what his WHIP said.

In any individual year, the Win statistic can be uninformative, but over a career, Wins has a tendency to separate out guys who can start 30+ games every year, pitch deep into games, and keep the runs down so your offense has a chance to score enough to win.

Seriously, I’ve heard people poo-poo the Win as a meaningless statistic that “only tells us the pitcher’s team was winning when he came out of the game.” :eek:

Isn’t winning the game the most important thing?

Winning is a great stat if your goal is to determine whether a team scored more runs than its competition. It is not a useful stat if you wish to determine how well a pitcher pitched, or how well he is likely to pitch in the future.

It tells you he held the opposing team to fewer runs than his team. If this happens more often than not, the pitcher can be assumed to be an effective pitcher.

This is why people get frustrated with statheads. They want to overthink everything, even something as simple (and as useful) as the W.

No, he can’t.

Last year, Mike Abel* had a 7-16 record. He held the opposing team to fewer runs than his team in fewer than one-third of his opportunities to do so. Would you hire Mike?

Or would you rather hire Beelzebub Jones (11-10), Joe Johnson (13-12), or Tommy Seasonkiller (13-8)?

According to the reasoning above, the three guys in the second group could be assumed to be effective pitchers. The first guy is an also-ran, an ineffective pitcher.

But I’ll take Matt Cain, and you can have Mike Mussina, Doug Davis, and Tom Glavine. Because it’s a lot easier to win more than you lose when your closer is Mariano Rivera, and your lineup is leading the league in runs.

* Not his real name
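For the record, here is what the "more often than not" test does with the four won-lost records quoted above. This is only a sketch: the function names are invented, and the rule coded here is simply "winning percentage above .500."

```python
def win_pct(wins, losses):
    """Share of a pitcher's decisions that were wins."""
    return wins / (wins + losses)


def effective_by_wins_alone(wins, losses):
    """The rule being disputed: more wins than losses means 'effective'."""
    return win_pct(wins, losses) > 0.5


# Won-lost records as quoted in the post, under the pseudonyms used there.
records = {
    "Mike Abel": (7, 16),
    "Beelzebub Jones": (11, 10),
    "Joe Johnson": (13, 12),
    "Tommy Seasonkiller": (13, 8),
}

for name, (w, l) in records.items():
    print(name, round(win_pct(w, l), 3), effective_by_wins_alone(w, l))
# Mike Abel 0.304 False
# Beelzebub Jones 0.524 True
# Joe Johnson 0.52 True
# Tommy Seasonkiller 0.619 True
```

The rule happily files the 7-16 pitcher under "ineffective" without looking at a single run he allowed, which is exactly the problem.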

The existence of an exception to the rule doesn’t mean the rule isn’t generally true. Take any single raw statistic you want and I bet there’s someone with “good” stats who isn’t that good, and someone with “bad” stats who is good. I gave an example above with Pedro, who had a stellar WHIP in 2006, but could not possibly have been considered better than 99% of the other starting pitchers in the league.

When did I ever say the W should be the only pitching stat used?

There’s no mystical “one stat to rule them all” and looking at Matt Cain, the only way you can prove he is a better pitcher than another is by saying “Well, he played for a terrible team, but all of his other stats say he would have gotten more wins on a better team.”

Maybe I’m just not invested enough in the stats, but to that I say “Well duh.”

Let’s say you have two medicines. One works 60% of the time. One works 90% of the time. Which one would you take? The fact that no one can develop perfect baseball stats doesn’t mean that some aren’t better than others.

Justin, your reasoning is circular, because your conclusion (“he would have gotten more wins on a better team”) carries the implication that “wins” are how you’re measuring the pitcher.

Like many before me, I’ll try to explain this in an easy-to-understand fashion:

  • It is the job of teams to win games.

  • Individual players make contributions to wins by doing the things demanded of their positions. A hitter’s job is to contribute to as many runs scored as possible, while consuming the fewest number of outs while doing so. A pitcher’s job is to prevent as many runs as possible from scoring. All players (except the DH) also contribute fielding, which is a small but significant part of run-prevention.

  • Given that in order for a team to win, it has to outscore its opponent, a starting pitcher’s pitching is about 43% responsible for the outcome. The collective hitters’ run-scoring ability is exactly 50% responsible. Fielding and relief pitching account for the remaining 7% or so.

Obviously there is some correlation between a pitcher’s skill, and the wins his team gets, but it’s not always a very strong correlation. In fact, if you were trying to guess a pitcher’s win percentage, you’d get just as good a correlation by looking at his offensive run support, as by looking at his ERA. Every baseball season is littered with pitchers who don’t pitch very well but get lots of wins because their team scores a gazillion runs when they pitch. There are also many pitchers, like Matt Cain, who are among the very best at preventing runs, but don’t end up with “wins” to show for it because their offense can’t score.
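One rough way to see the run-support effect is Bill James's Pythagorean expectation, which estimates winning percentage as RS² / (RS² + RA²). Nobody in this thread cited it; it is borrowed here purely as an illustration, and it is normally applied to whole teams rather than to one pitcher's starts. The runs-per-game figures below are invented.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Estimated winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical good pitcher: 3.5 runs allowed per game, with three levels of support.
for support in (3.0, 4.5, 6.0):
    print(support, round(pythagorean_win_pct(support, 3.5), 3))
# 3.0 0.424
# 4.5 0.623
# 6.0 0.746

# Hypothetical mediocre pitcher: 5.0 runs allowed per game, but 6.0 runs of support.
print(round(pythagorean_win_pct(6.0, 5.0), 3))  # 0.59
```

Under these assumptions the better pitcher with weak support projects to a losing record while the worse pitcher with strong support projects to a winning one. The formula treats runs scored and runs allowed symmetrically, which is the point: the offense moves the win column as much as the pitching does.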

It’s not a difficult concept. Wins are a team result. No player contributes more than about 43% to whether his team wins. Therefore, a “wins” statistic is a poor way to measure any individual’s contribution to team wins.

It would be like deciding the best chef in the world wasn’t really the best, because the restaurant at which he cooks has a terrible wait-staff and blares loud obnoxious music, and therefore only gets 4 stars instead of 5. It just doesn’t make any sense.

Cheesesteak, I wanted to reply to you specifically, because you raise some interesting (and even good! :wink: ) points.

First, the correlation between a pitcher’s ability to prevent runs, and how many wins he gets, is there, but not particularly strong. In a typical year there’s not just a single exception – lots of pitchers get “win” totals out of whack with their abilities because of the vagaries of run support. I think you’ll find that the correlation between WHIP and run-prevention is much stronger, and the exceptions you find like your Pedro example above are much rarer.

Also, regarding your point about lifetime wins: yes, as you add more and more seasons to the sample, the correlation between ability to prevent runs and “wins” grows stronger, because those vagaries of run support tend to smooth out over time.

But still, not always. Bert Blyleven is the most famous example of this: the poor offenses behind him, as well as the poor pitcher’s parks in which he worked, never evened out, and as a result he’s got about 30 fewer wins over his career than he (probably) would have, had he played for just an average offensive team in an average park.

I also want to add that I agree with you that there’s a correlation between wins and innings pitched, because a pitcher who starts more games and throws more innings will get more decisions, both wins and losses. And since there’s obvious value in throwing lots of innings (unless the pitcher really stinks), that adds a little more value to “wins.” Still not enough, I think, to make it a statistic worth worrying about if you’re trying to decide if you want a guy to pitch for your team.

To get back to the OP, if you haven’t yet, go read Moneyball. It really gives a lot of insight into the cultural resistance to Sabermetrics.
In short:

I’d just add that making it to the big leagues is gaining membership in a very exclusive club (one that takes a lot of work to join), and so members of that club will naturally tend to overvalue membership in it and disparage anyone not in it. Journalists get this snobbery second-hand, but it’s still there.

Plus the whole ‘hate the pencil necked nerds’ thing, which journalists can buy into super-hard because they have to prove they’re not pencil-necks themselves.

Of course you really don’t necessarily have to think that Joe Morgan’s feelings are hurt by stat-heads to understand his hostility. You just need to know that Joe Morgan can’t understand stats, so if they’re widely accepted as part of baseball analysis, he’s out of a job. No real surprise that he’s against them, is there?

In terms of his win-loss record, why would the park he played in matter? If he played in hitter’s parks, then that would help his team as much as it helped his opponents.

And the extent to which he was saddled with bad teams has been exceptionally overstated. The Twins teams he played on from 1969 to 1976 weren’t bad; one division title, and overall a bit above .500 with some decent offenses. He played for Texas in 1977 when Texas went 94-68. Then he spent three years in Pittsburgh and they played very well, including winning the World Series. Then Cleveland from 1980 to 1985 and okay that was a pretty bad team, but after that mostly okay teams. In his entire career he only played on one REALLY bad team, the 1985 Indians, and he wasn’t there the whole season.

I’m left wondering what he DOES understand. Sunday night, I heard him cite the “tie goes to the runner” rule. Unfortunately Jon Miller said the same thing.

OK, fight my ignorance. Isn’t that the rule?