What stats are best for judging a baseball player?

While I’ve enjoyed this thread, it’s largely devolved into a debate between Gonzomax and others regarding how best to judge a ballplayer’s performance. From what I gather, Gonzo figures that Batting Average, RBIs, and Runs are the best indicators of hitting, and anything else is a manipulative attempt at justifying one’s pre-determined opinions (Gonzomax, please correct this typification if it’s incorrect).

I tend to embrace the evolution of statistics in baseball. Indeed, one of the greatest aspects of the sport is the degree to which it can be measured. However, I’m basically a casual baseball fan (my reverence for the game waned about 15 years ago; while I’m pretty well versed in the history, I’m not really up on the performances of the past decade or so), so I can’t make a particularly cogent argument regarding sabermetrics and other such modern analyses.

So, I pose it to the Dope:
What stats do you think best gauge a player’s performance (be it in the field, on the mound, or at the plate)?
Please explain how those stats are calculated (OPS, for example, is lost on me), and please explain the benefits (and limitations) of your preferred measure.

There are more statistical evaluations of baseball players than there are baseball players, I think.

I would say that the current leader in the field is VORP, or Value Over Replacement Player, invented by a guy named Keith Woolner.

This is hard to describe, but it values a player statistically based upon how much better he is than a guy you would likely bring in to replace him (that’s a terrible definition.) A “replacement player” is something less than an “average player” because an average player is really quite good, i.e. there are more below average players in baseball than above average ones. Because professional baseball players operate at the top margin of all players, they do not fit into a normal distribution curve of skill.

What it measures statistically is how many more runs an individual player is responsible for vs. a replacement level player. One of the reasons it is good is that it normalizes for eras and ballparks - it doesn’t matter if the player played in the dead-ball era or if he plays for the Rockies or Red Sox. One drawback is that it is primarily an offensive measurement (there is a version for pitchers), though there are some basic adjustments for defense (good catchers are harder to find than good infielders.)

Here is a nice explanation if you want to dig into it further.

ETA: From the Horse’s Mouth

There’s no one measure. A smart baseball fan looks with honesty at all the evidence.

OPS is just your on base percentage plus your slugging percentage. If your on base percentage is .400 and your slugging percentage is .500, your OPS is .900. I assume you know what on base percentage and slugging percentage are, since those statistics are a hundred and twenty years old or so.
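Since the question was how these get calculated, here is a minimal Python sketch of the arithmetic, using the standard definitions of on base percentage and slugging percentage. The function names and the stat line at the bottom are made up for illustration; they are not any real player's numbers.

```python
def on_base_percentage(h, bb, hbp, ab, sf):
    """Times on base (hits, walks, hit by pitch) per qualifying plate appearance."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)


def slugging_percentage(singles, doubles, triples, homers, ab):
    """Total bases per at bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    return total_bases / ab


def ops(obp, slg):
    """OPS is just the two numbers added together."""
    return obp + slg


# Hypothetical stat line, for illustration only.
obp = on_base_percentage(h=150, bb=70, hbp=5, ab=500, sf=5)
slg = slugging_percentage(singles=95, doubles=30, triples=5, homers=20, ab=500)
print(f"OBP {obp:.3f}  SLG {slg:.3f}  OPS {ops(obp, slg):.3f}")
```

Note that the two pieces have different denominators (plate appearances vs. at bats), which is one reason OPS is a shorthand rather than a rigorous measure.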

OPS is thus a quick shorthand to describe how good a player is at getting on base AND hitting for power. It is, however, not perfect; a player with an OPS of .800 is not necessarily better than a player with an OPS of .770. Perhaps the .770 guy is a Gold Glove catcher and the .800 guy is a fat DH. Perhaps the guy with the .800 OPS played in Cincinnati, where more runs are scored, while .770 guy played in Shea Stadium, where they are much harder to come by. Perhaps .800 OPS guy played in 1999, when .800 would have been just a little bit better than average, while .770 man played in 1968, when .770 would have been a LOT better. Or maybe .770 guy had a .400 on base percentage and a .370 slugging, while .800 man was .290 OBP, .510 SLG; on base percentage is actually a lot more important, so being SLG-heavy is bad. Also, it doesn’t account for basestealing.

If you want a slightly more refined number, you go to OPS+, which is a number that says “Okay; suppose an average ballplayer in this league and playing in this park is 100. What is this guy?” So an OPS+ of 115 means your OPS is 15% better than the average dude in that league and park. Again, though, it doesn’t account for one guy being a Gold Glover and the other being Frank Thomas, or maybe one guy played 162 games and the other missed 45 games.
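As a rough sketch, the version of OPS+ that I understand to be commonly published compares each half of OPS to the league average separately, rather than dividing one OPS by another. The function below assumes the league figures passed in (`lg_obp`, `lg_slg`, both hypothetical parameter names) have already been adjusted for the player's home park; how that park adjustment is done is not covered here.

```python
def ops_plus(obp, slg, lg_obp, lg_slg):
    """Sketch of OPS+: 100 is league average, higher is better.

    Assumes lg_obp and lg_slg are league averages that have ALREADY been
    adjusted for the player's park; the park adjustment itself is not shown.
    """
    return 100 * (obp / lg_obp + slg / lg_slg - 1)


# Hypothetical numbers: a league-average hitter comes out at 100.
print(round(ops_plus(0.330, 0.420, lg_obp=0.330, lg_slg=0.420)))
# A hitter who beats the league in both halves comes out above 100.
print(round(ops_plus(0.380, 0.500, lg_obp=0.330, lg_slg=0.420)))
```

Because the two ratios are added, the result is close to, but not exactly, "percent better than league OPS."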

Another useful one is Runs Created. This basically just calculates “alright, when you take everything this dude did - hits, triples, homers, walks, getting caught stealing - how many runs did he actually make happen?” Last year Alex Rodriguez created 166 runs, a staggering number. You can then use a really cool number, RC/27, which is how many runs you created for every 27 outs you use. That answers a cool question; if everyone on the team hit like this guy, how many runs would that team score? For Alex Rodriguez, his 2007 RC/27 was 10.4, which is astounding; obviously a team that scored 10.4 runs a game would be almost invincible. Even if you had the worst pitching in baseball you’d win the pennant running away.
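To give a feel for the arithmetic, here is a sketch of the simplest ("basic") Runs Created formula and of RC/27. The published figures quoted above come from more elaborate versions that also fold in stolen bases, caught stealing and so on, so this will not reproduce them exactly. The stat line is hypothetical, and the way outs are counted here is one common convention, not the only one.

```python
def runs_created_basic(h, bb, tb, ab):
    """Basic Runs Created: (times on base) x (total bases) / (opportunities)."""
    return (h + bb) * tb / (ab + bb)


def outs_used(ab, h, cs=0, gidp=0, sh=0, sf=0):
    """Outs a hitter consumed: outs on at bats plus outs made on the bases and on sacrifices."""
    return (ab - h) + cs + gidp + sh + sf


def rc_per_27(rc, outs):
    """Runs created per 27 outs, i.e. per nine-inning game's worth of outs."""
    return 27 * rc / outs


# Hypothetical stat line, for illustration only.
rc = runs_created_basic(h=150, bb=70, tb=250, ab=500)
outs = outs_used(ab=500, h=150, cs=5, gidp=10, sf=5)
print(f"RC {rc:.1f}  outs {outs}  RC/27 {rc_per_27(rc, outs):.2f}")
```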

Now, you can get into some pretty hefty stats that try to work this out. VORP is a real egghead stat, “Value Over a Replacement Player,” which pretty much tries to answer the question; “If this guy had died in spring training and been replaced with the average schlub you could pick up for the cheap off someone’s bench or AAA, what would the difference be in runs?” All-Stars will generate 60, 70, 80 runs in VORP; guys close to zero, well, you have to ask why they’re in the majors. Calculating VORP is really complex. So is EQA, Win Shares, blah blah blah. The smart thing to do is take all the numbers and sift through them and ask honest questions about what they might mean.
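The real VORP calculation is far more involved than anything that fits in a forum post, but the core idea can be sketched in a few lines: give a replacement-level hitter the same number of outs and see how many fewer runs he would produce. To be clear, this is NOT the actual VORP formula; it leaves out the position, park and league adjustments, and the replacement-level rate below is an invented number used only to make the example run.

```python
def runs_over_replacement(player_runs, player_outs, replacement_runs_per_out):
    """Toy illustration of the 'over replacement' idea (not real VORP).

    player_runs: runs the player produced (e.g. from a Runs Created estimate)
    player_outs: outs the player used while producing them
    replacement_runs_per_out: ASSUMED rate for a freely available bench/AAA hitter
    """
    replacement_runs = replacement_runs_per_out * player_outs
    return player_runs - replacement_runs


# Hypothetical: 110 runs produced in 400 outs, against an assumed
# replacement rate of 0.12 runs per out.
print(round(runs_over_replacement(110, 400, 0.12), 1))
```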

Certainly, gonzomax is just going with “Average, homers and RBIs” in that thread because those are the things Ozzie Smith did worst and he just wants to prove Ozzie Smith isn’t a Hall of Famer, but given the complexity of baseball it’s quite obvious that you can’t rely on a few counting stats. To take a really extreme example, Tom Seaver only batted .154 in his career with 12 homers and 86 RBI. But if I told you that meant he wasn’t a Hall of Famer you’d laugh at me and say “yeah, but what about his PITCHING stats?”

You did ask for an answer, though, and the first number I usually go to is VORP, because it’s a nice, all-inclusive number; however, there’s still a lot to argue about.

If I’m just looking at a current player during the season, I generally just look at OPS. It’s not perfect, but it gives a good idea. The other stats are neat and measure all sorts of things, but they are also harder to figure in your head!

You can go crazy finding stats that honestly evaluate a player. Billy Beane, the Oakland GM, does not have the cash to compete for stars. His evaluation system is different and very successful. He stresses OBP (on base percentage) and total bases on offense. He sees a successful player as one who is efficient: get on base, move along and score.
The reverse is true for pitching. He wants a pitcher who predominantly throws strikes. He pays particular attention to first pitch strikes. You are much more adept at achieving outs if you start with a strike. I am sure Bonds, with his incredible OBP and run scoring, would have been peachy if Beane could afford him.
Home run hitters that strike out a lot or hit a lot of them with empty bases are devalued.
Batting average, RBIs and home runs have always been great comparison stats.
If you reject batting average, slugging percentage, HRs and RBIs, you would be mistaken. You will win more if your team gets on base. Batting average does at least partly give that to you. RBIs cannot be skipped. If runs win, the guy driving them in is valuable. Runs scored is a good stat.

What about the amount of drugs ingested?

What are good defensive stats? Assists/9? Fielding %? Is there any measure that takes into account everything?

IIRC, one of the reasons you rejected many of the defensive stats for Ozzie Smith was because he played on turf, so the hops were easier to field. Don’t such variables also affect batting average (I recall that in Oakland, at least back in the day, they had a huge foul territory, which is why it was so rare for an Oakland A to be up near the lead in batting average; Carney Lansford was a rare exception)? And don’t RBI and Runs have at least as much to do with a player’s teammates as with their own performance (after all, you can’t score unless someone else knocks you in, and you can’t drive in as many runs when nobody else is on base)?

Given these issues, aren’t the “traditional” stats more outdated and less informative than something like OPS+? I’m not talking about team stats (yes, clearly a team that scores lots of runs is going to be successful), but how we judge an individual performer. Is there a way to distinguish a player’s performance from the way his team performs?

Not easily. RBIs, for instance, are dependent on how good the teammates batting in front of you are at getting on base.

There is no one statistic that’s best. VORP is useful as a crude measurement, but it’s really not much better than any of a dozen other measurements. Baseball statheads pick and choose which measurement they like best, and use different measurements in order to back up their points. Despite their pretending to be mathematically rigorous, they are usually filled with sloppy reasoning and bad assumptions.

In addition, most statistical measurements are of limited use in predicting a player’s performance, which is what baseball general managers are interested in.

You have to look at multiple measurements in order to detect a trend instead of picking one and going by it.

Why are you not a professional sportswriter? Seriously, you write about baseball more clearly and intelligently than 90% of the people who get paid to write about baseball.

Because for the most part, scientific analysis of players’ performance doesn’t sell as well as those colorful gut feelings random commentators like Joe Morgan have about the game. It’s the entertainment biz, yo.

This is not actually true, although it’s an understandable misconception. Beane doesn’t stress OBP and total bases in and of themselves. He actually is more interested in obtaining competitive advantage through the exploitation of market inefficiencies - wherever they lie. When he started as Oakland’s GM, the market inefficiencies were connected to walks and OBP more than anything else. Statistics demonstrated that OBP had value X. But most other teams hadn’t recognized that value, so they weren’t willing to pay for it at a rate equal to its value. So Beane might look at two players: one hits for a bit more power, has a higher batting average, and “looks good” as a hitter; the other has more patience and a higher OBP. When Beane started, Player #1 would be much more expensive on the open market, because teams valued his skills more than those of Player #2 - even though their actual value was equal.

So Beane pursued Player #2 - not because he’s obsessed with OBP and walks per se, but because Player #2 could be acquired for less money than his actual value should have demanded.

Now that other teams are catching up, and recognizing the value of OBP, the monetary cost associated with signing high-OBP players is increasing. So Beane is looking for other market inefficiencies. Interestingly enough, given the conversation in the other thread, for a while he seemed to be trying to exploit the widespread misunderstanding of the value of defensive play. Lots of people - fans and front office people alike - don’t really grasp that a run saved is the same as a solo home run - they assign more value to hitting than to defense, without regard for how the two things actually affect the game. So good defensive players are cheaper than good hitters.

Only because other teams overpay for those sorts of players. If the baseball sea changed, and teams started overpaying for guys who walk a lot, Billy Beane would be first in line to sign the power hitters who strike out a lot.

Disagree across the board. All of these stats are flawed, in some cases terribly, and instinctively you know this even if you won’t admit it:

Batting Average: Come on, you already made this argument while attempting to devalue Ozzie Smith’s defensive ability. Coors Field is huge, relatively speaking; outfielders get to fewer fly balls. Batting average for a guy playing half his home games at Coors is going to be elevated. Kaz Matsui hit .288 at Coors last year, which looks pretty darn good. He hit .272 in his first season at Shea Stadium, which was considered tremendously disappointing. But given the difference in ballparks (and other factors), Matsui’s 2007 and his rookie season were actually basically identical, in terms of real production.

Then, of course, there’s the fact that batting average doesn’t take into account the variety of outcomes that can happen when a player comes to the plate.

Classic example: In 2000, Neifi Perez hit .287 for those same Rockies. But he didn’t walk and he had no power apart from that given him by Coors, so that .287 was completely meaningless; Neifi Perez was a well-below average hitter in 2000 (and throughout his career). In that same 2000 season, Edgar Renteria hit “only” .278. He had 16 home runs to Perez’s 10, and 76 RBI to Perez’s 70 RBI. Based on your three stats, you’d say that their performances were roughly equal, right?

But of course, they weren’t. Edgar Renteria was a pretty useful offensive player that year, while Neifi Perez was one of the two or three worst hitting shortstops in baseball (Pat Meares, who hit .240 for the Pirates that year, was on about the same level as Perez as an offensive player).

And that’s the problem with relying on batting average, home runs, and RBI to draw conclusions about baseball: it leads you to believe that Neifi Perez was about as good an offensive player as Edgar Renteria in 2000.

First of all RealityChuck is 100% right. Don’t rely on one stat; look at them all and see the big trend.

On another issue, RealityChuck, you’re 100% wrong, or maybe it’s 50% right; you say baseball statheads pick and choose the stats they want to use to support their arguments. Actually, MOST baseball fans do that, and I would submit to you that statheads might be a bit less prone to doing it. I’m an avowed stathead but I’m not going to sit here and tell you something I know isn’t true by leaning on one number.

This is off topic, but…

There are a lot of different kinds of sportswriting. I’d divide it into three kinds:

  1. **The guys who write for your local paper** tend to be pretty sketchy in the analysis, but for the most part that is not their job; their job is to interact with the team’s staff and players and extract inside information to turn that into interesting columns. I agree a lot of it is tripe (it depends on which sportswriter you’re talking about; Richard Griffin, who writes for the Toronto Star, is awful) but it’s a skill set I don’t possess. I have no journalism background, don’t know anyone in the major leagues and thus have no contacts, so on and so on. I’m 36 and have a family to support and a good job as it is so I ain’t goin’ back to school to become a journalist and make peanuts for fifteen years while I do the High School SportsBeat for the Shittsburgh Ass-Tribune.

  2. Famous ex-players and managers and such get paid to write because they’re famous ex-players and such. Obviously, most of their stuff (a la Joe Morgan) is terrible. Doug Glanville has written some smart material, though even in his essays you can tell he’s not an experienced journalist. Since I am not a famous ex-player or manager, that’s out.

  3. Analysts, like Bill James, Will Carroll et al., are in my category… but they’re better than I am, so why would anyone read my stuff when they could read James, Keith Law and Baseball Prospectus? You can even get a lot of that in the traditional media (Law writes for ESPN.) It’ll be a frosty day in hell before I’m half the baseball writer some of those guys are.

BTW, Keith, if you’re a Doper and are out there, I was right about the Lopez-for-Arnold trade.

I like writing about baseball for fun. Hopefully people enjoy it here. That’s enough for me.

For offense, my favorite stat is probably WARP. It is similar to VORP, but it includes defense. It is also a nice handy way to see how a move would affect a team. Upgrade from a 3 WARP player to a 5 and you will gain two wins in the standings. The main downside is you can’t really watch a game and know what a player’s WARP (or VORP) is.

As for pitching, the key is to consider things the pitcher can control. So you want to look at strikeout rate, walk rate, and home run allowed rate. If you want a fourth factor add ground ball to fly ball ratio. Everything else factors in issues that a pitcher can’t control. He can’t cause his team to score more runs (Wins), have his fielders make plays (ERA, OBA), or make sure he is in the game in save situations. Those numbers will thus fluctuate year to year. By looking at the core pitching numbers, you will get a better idea of how good a pitcher actually is.
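One common way to express those core rates is per nine innings. The sketch below assumes that convention (rates per batter faced are another option), and it expects innings as a true decimal, so 200 and a third innings is 200.333, not the box-score "200.1". The function names and the pitcher's line are hypothetical.

```python
def per_nine(events, innings_pitched):
    """Scale a counting stat to a nine-inning rate."""
    return 9 * events / innings_pitched


def pitcher_core_rates(so, bb, hr, ip, ground_balls=None, fly_balls=None):
    """Strikeout, walk and home run rates, plus GB/FB when batted-ball counts are given."""
    rates = {
        "K/9": per_nine(so, ip),
        "BB/9": per_nine(bb, ip),
        "HR/9": per_nine(hr, ip),
    }
    if ground_balls is not None and fly_balls:
        rates["GB/FB"] = ground_balls / fly_balls
    return rates


# Hypothetical pitcher line, for illustration only.
line = pitcher_core_rates(so=200, bb=50, hr=20, ip=200.0,
                          ground_balls=260, fly_balls=200)
for name, value in line.items():
    print(f"{name}: {value:.2f}")
```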

Defense still is significantly behind the other two in how we measure value, but we are getting closer. MLB has started measuring the trajectory of balls hit and eventually we will be able to measure just how many balls any player can get to. For now, there are a number of defensive measurements available and it is best just to take the consensus. Range factor isn’t a bad number if you want something simple, but it is highly dependent on a player’s environment. Fielding percentage is a nearly useless number. It is based on the inconsistent opinion of scorers and does not take into account how many extra plays a player makes.
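For reference, the two simple numbers mentioned here work out roughly as follows, using the usual definitions (range factor per nine innings; some sources compute it per game instead). The two shortstops are invented to show how a player can lead in fielding percentage while making far fewer plays.

```python
def fielding_percentage(po, a, e):
    """Share of chances (putouts + assists + errors) handled without an error."""
    return (po + a) / (po + a + e)


def range_factor_per_nine(po, a, innings):
    """Plays made (putouts + assists) per nine innings in the field."""
    return 9 * (po + a) / innings


# Two hypothetical shortstops over the same 1300 innings.
sure_hands = {"po": 220, "a": 380, "e": 6}    # fewer errors, fewer plays
rangy = {"po": 250, "a": 450, "e": 10}        # more errors, many more plays

for name, s in (("sure hands", sure_hands), ("rangy", rangy)):
    fp = fielding_percentage(s["po"], s["a"], s["e"])
    rf = range_factor_per_nine(s["po"], s["a"], 1300)
    print(f"{name}: fielding pct {fp:.3f}, range factor {rf:.2f}")
```

Neither number knows how many balls were actually hit toward each player, which is the environment problem described above.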

Defense is difficult to measure. That is why I do not blindly accept the new tools as anything other than tools. Baseballs travel in a 3 dimensional universe with spin and wind and many variables acting on them. Almost every SS is a terrific defensive ball player. Otherwise he would play first base.
Omar Vizquel fans claim he was as good defensively as Smith. They are not crazy; if you watched him play daily you might believe it.
When you see a play and say no other SS could have made it, that is opinion. I have seen lots of incredible SS plays in my time. Those shortstops would feel insulted by the suggestion that Ozzie was so much better that he belongs in the Hall and they do not.
Defense does not lend itself to numerical analysis like offense does. It clearly has opinion as a component, which is the reason Gold Gloves are dismissed as a political award.

BABIP is also proving fairly useful in understanding year to year progressions of a pitcher’s career. While luck can help a pitcher over the course of a year, BABIP tends to give a pretty good idea of whether or not a pitcher is poised to plummet to earth the following year. As Rob Neyer mentioned in a chat earlier today, only excellent pitchers and knuckleballers tend to be immune to the forces of BABIP.
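For anyone who has not run into it, BABIP is batting average on balls in play: it strips out strikeouts and home runs, which the fielders never get a chance at. Here is a sketch using the commonly cited formula; for a pitcher, the inputs are the totals he allowed. The numbers are hypothetical.

```python
def babip(h, hr, ab, so, sf=0):
    """Batting average on balls in play: non-homer hits per ball put in play."""
    return (h - hr) / (ab - so - hr + sf)


# Hypothetical totals allowed by a pitcher, for illustration only.
print(f"{babip(h=200, hr=20, ab=800, so=150, sf=5):.3f}")
```

A figure far from the pitcher's own career norm or the league norm is the kind of thing that hints at luck, good or bad, in that season.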

Defense is clearly the most fertile future frontier of understanding, but what exists now is a heck of a lot better than what used to be the only options for measurement. And, what exactly is an incredible play? Not to pick on the defensively maligned Derek Jeter too much, but he’s pretty clearly had his issues at shortstop over the years. He may make a diving snag to get a grounder and get hailed for making a great play, but a shortstop with much better range, like Troy Tulowitzki, probably gets there, makes a generic backhanded stab, and easily throws the runner out.

And Cal Ripken would have been playing ten feet deeper than either guy and likely wouldn’t even have had to backhand it.

I’ll be the first person to admit that defensive performance is hard to measure, and every measurement seems to have a different opinion. Baseball Prospectus asserts that Steve Garvey was about an average first baseman; the Win Shares system says he was fantastic; Linear Weights claims he was indescribably awful. The methods even differ in terms of how much weight they assign to the importance of defense; to continue using first basemen as an example, some first basemen in FRAA can be very valuable, while in Win Shares there isn’t very much difference between the good and the bad.

But that’s why it’s important to look at the totality of the evidence. When you examine Ozzie Smith, to reuse the topic of the other thread, almost ALL the methodologies place him at #1, and many of them by Wayne Gretzkyish margins. His range numbers are superficially sensational - they are, in fact, so astounding that you’re almost forced to assume it’s partially an illusion of context or else you’d have to conclude he was a greater player than Mickey Mantle. His FRAA is the highest of any player at any position I’m aware of, his Win Shares are much higher than any contemporary, and (and you have to consider this) certainly everyone who saw him said he was amazing, and to my eyes he was as technically proficient an infielder as I have ever seen. I’m forced by the weight of evidence to conclude that Smith was indeed a player of truly awesome defensive skill; there just isn’t any evidence to the contrary.

All the defensive systems, which we agree are flawed by an inability to ensure common measurement and are still heavily weighted by opinion, are in agreement about Ozzie. Is that proof, or is it fulfilling a preconception about Smith’s fielding? If someone who hates Ozzie and thought he was overrated did the measurements, would they come out the same?
I am old and have seen a lot of great fielders in my time. I did not see Ozzie every day since I usually watched American League players. Very little artificial turf.
But to tell me these defensive greats are way behind Ozzie will not fly. I have seen many incredible plays. Shortstops digging the ball out past second base. Not buying Ozzie was that much better.

I would think that would be the case only if a person crafted a metric with the express purpose of making Ozzie look good or if their basis for what it told them was done by using Ozzie as a test subject and “knowing” where he should end up. I doubt that happened with any of them and certainly not with all of them. The only logical conclusion is that if every system tells you the same thing, it’s time to put some weight behind that conclusion.

I’ve seen Ryan Theriot make some really nice diving stops. I’ve seen a couple of plays he made behind second base. I also know that a blind squirrel can find a nut sometimes. Ozzie was great not because of the number of flashy, memorable plays he made (and there were many). He was great because his ability and knowledge helped him make a difficult play look easy and because of the sheer volume of plays he made that every other shortstop didn’t make.

Batting average = hits divided by at bats. RBIs = runs knocked in. Runs scored = number of times you crossed home plate. Stolen bases, on base percentage, total bases. These are all measurable. As soon as you measure defense, opinion slips in. Guillen had an errorless season last year. He got hometown calls on disputable plays. Everybody does. Any defensive measurement is suspect because it is someone’s opinion whether he should have caught the ball or not. I can accept defensive numbers as useful tools but not seminal measurements.
I ran into a site a couple days ago where the people were debating whether Omar Vizquel was a better defensive player than Ozzie. That argument can be made because defense is rated by opinion. Fans watching Omar were impressed by his play day after day. Were they wrong? You will say yes. I will say the difference between really good defensive shortstops is minimal.
You can to some degree compare baseball through the ages using offensive numbers. Some corrections are needed for the dead-ball era and the institution of relievers.