Can statistics be used to show probable PED users in MLB?

Given the large number of MLB seasons, players, and at-bats, can we use that data to say with any statistical confidence who were PED users? For example, totally making numbers up, a player’s offensive “output” typically peaks at 28 and declines by 7% per year until 38, after which it drops by 14% per year. Players several standard deviations above that curve, i.e. those who got significantly better in their 30s, are probably users.
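To make the idea concrete, here is a rough Python sketch of that check. It uses the made-up curve from the question (peak at 28, 7% decline per year to 38, 14% after), and everything else is an assumption for illustration: the function names, the single "output" number per player, the flat curve before the peak, and the 2.0 cutoff. It is not a real aging model.

```python
from statistics import mean, stdev


def expected_fraction(age, peak_age=28, slow=0.07, fast=0.14, cliff_age=38):
    """Fraction of peak output expected at `age` under the made-up curve.

    Assumption: output is treated as flat (1.0) up to the peak age.
    """
    if age <= peak_age:
        return 1.0
    slow_years = min(age, cliff_age) - peak_age   # years declining at 7%
    fast_years = max(age - cliff_age, 0)          # years declining at 14%
    return (1 - slow) ** slow_years * (1 - fast) ** fast_years


def flag_outliers(players, threshold=2.0):
    """Return names whose output beats the curve by more than `threshold` SDs.

    `players` is a list of (name, peak_output, age, actual_output) tuples.
    The residual is the relative gap between actual and expected output,
    and the z-score is taken across all players passed in.
    """
    residuals = {}
    for name, peak, age, actual in players:
        expected = peak * expected_fraction(age)
        residuals[name] = actual / expected - 1.0
    mu = mean(residuals.values())
    sigma = stdev(residuals.values())  # needs 2+ players with differing residuals
    return [n for n, r in residuals.items() if (r - mu) / sigma > threshold]
```

A flag from something like this only says a season is unusual relative to the assumed curve. It says nothing about why, which is the part the rest of the thread argues about.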

Although the question is about statistics, I think the folks in the Game Room may have more info at hand to help answer this question.

Colibri
General Questions Moderator

I am almost positive a computer model could be used to, at the very least, “red flag” anomalies that warrant further investigation. I did a quick Google search, and one poster even suggested this is a “Great White Whale” of baseball sabermetrics.

I’m guessing it hasn’t been that hotly pursued for obvious legal reasons. Also, let’s say you discover a 36-year-old player who, after averaging 9 home runs a year for his entire career, knocks 29, 27, and 30 from 2000-2002, then drops to 6 in 2003 when MLB enacted PED testing, and never hits 10 homers in a season again before he retires. YET, there is absolutely ZERO evidence said player tested positive for PEDs.

Even though you know that he knows that you know he knows, is it really fair to publish those statistics and say he may have used PEDs those three years?

A topic for another day, but this is why I don’t give two shits about whether players used steroids or not: only the ones that get caught seem to get punished.

Very unlikely. While the database of statistics in baseball is huge, you also have to consider that PEDs of some sort have been prevalent in MLB for decades. Baseball statisticians maintain that the largest impact of the recent testings/bannings/etc. isn’t going to come from the Barry Bonds-esque bulking-up drugs, but from the rampant and pervasive amphetamine use that’s been in baseball since the 70s. We’ll be able to see season-wide trends going forward on offense (fewer games played per season, fewer plate appearances, etc.). Yeah, it’s easy to see aberrations like Brady Anderson’s 50 HR season, Ken Caminiti’s MVP year, or Bonds’ ridiculous 5-year stretch, but do players who used greenies every day just to make the roster show up in such analyses?

I once compared the career stats of Barry Bonds, Henry Aaron, and Willie Mays. All three tracked pretty well through about age 34. At that point, Aaron and Mays started a downward trend until the end of their career. Bonds, on the other hand, increased his power by a quantum leap, and maintained it for several years. To me, it was pretty obvious what year Bonds started to use PEDs. Tough to put a p-value on it, though.

For players who use PEDs for their entire career, there wouldn’t be an obvious jump in performance.

If you accept weight, or specifically muscle mass, as a statistic, then sure you can. With Bonds or other heavy HGH users, skull size is a good indicator too.

I think you could make a compelling case, as with the Bonds example. I’d be interested to see how Ken Griffey Jr. tracks compared to those guys, too. Bret Boone is another guy who hit about a dozen HRs a year until he turned 30 and suddenly he’s hitting 35+ HRs for a few years.

But I still would not trust statistics to be 100% accurate as an indicator.

No you can’t.

Look, baseball has no shortage of stats geeks. Plenty of very smart and obsessive people have tried to answer this question and no one has gotten anywhere.

There are a lot of issues that make it really difficult:

The signal-to-noise ratio is way too low. People haven’t really been able to show that steroids have impacted the game in general, let alone how they affect individual players. For example, offense is down from the steroid era mostly because strikeout rates are way up due to higher velocities. How do you tie that to steroids?

We do not know who did what when. Given the hostile response anyone who admits steroid use gets, we are unlikely to get this information.

The group of players who we know have done steroids is all over the map. Yes, Bonds hit a lot of home runs. Marvin Benard did not. There doesn’t seem to be any correlation between the players who we know did steroids.

We do not know what we are looking for. Do they increase longevity or decrease it? Do they cause injuries or heal them? Increase power? Speed?

Baseball is weird. There were lots of strange career paths long before steroids were a concern. Guys having odd seasons doesn’t necessarily mean they were enhanced.

Steroids come in many types and players used them to different extents. We can’t expect a universal effect.

If you ask “***Should*** statistics be used to show probable PED users in MLB?”, that may be up for debate; but if the question is “Can they?”, the answer is unequivocally “yes”.

Barry Bonds is the best example. He never failed a drug test, and there is no direct evidence that he ever took PEDs, but everybody pretty much accepts that he did (myself and most other Giants fans included); the only ‘argument’ is how much, if at all, we should care.

Theoretically, perhaps. If those data even exist, it would likely be difficult, if not impossible, for sabermetricians to access them. It’s possible that the Giants (or other teams) did routine measurement of their players’ biometrics in those years, but even if they did, they may not have been standardized, and the data may not have been kept, either.

Or you could just, you know, *look* at 'em.

Well, yes, but the “eye test” doesn’t address the OP, which asked, “can we use that data to say with any statistical confidence who were PED users?”

Yes, obviously the Barry Bonds who hit 73 home runs in 2001 looked like a completely different (and much larger) person than the Barry Bonds of the early 1990s (when it seems unlikely that he was using PEDs), but we’re talking about statistics here.

If you’re going to use changes in muscle mass to argue that certain players were likely PED users, you need the data to answer that question “with any statistical confidence”.

Idle speculation. Bonds is completely innocent.

Haters gonna hate.

Right? Right?

Well, if you check out Aaron’s stats, three of his eight 40-homer seasons came after age 34, including his career high of 47 at age 37. He hit 40 homers in 1973 at age 39, the year before he broke Ruth’s record. Mays did kinda fall off a cliff, so that part’s right.

One of the best posts I’ve ever seen on this forum. From Hack Wilson to Roger Maris, there were plenty of ballplayers way before the steroid era whose performance suddenly jolted up mid-career before falling way back. Davey Johnson hit only 78 homers in seven years, then in 1973 he hit 43. Two years later, he was out of baseball. He certainly didn’t use steroids, but today everybody would instantly red flag him.

Well, who knows, maybe Davey Johnson was an early adopter. Statistics alone won’t tell us.

Did Brady Anderson juice to hit 50 homers? Here’s the problem with that theory; if steroids gave him 50-homer power, why did he stop taking them? As a regular player from 1992 to 1995 he averaged 15 homers a year. Then he hit 50. Then he went back down to averaging about 20 homers a year. Did he decide “wow, steroids sure worked but I don’t like hitting home runs”? Players who just chug along hitting 35-40 home runs a year (or, really, any number of home runs a year) aren’t normal, even amongst power hitters; guys like Mike Schmidt who basically had the same year over and over for a decade or more are rare exceptions.

In some cases the evidence is overwhelming, as with Bonds, but the “just look at them” theory being advanced by ElvisLives doesn’t help you detect Rafael Palmeiro, does it? He looked ordinary his whole career.

Nothing is foolproof, no. The pitchers who started using mid-career didn’t look like it either, so pitchers were never the butt of jokes for it while it was going on. Mike Piazza did look like he was using heavily, by comparison, but there are Dopers who hotly deny it.

To find it by stats, you’d have to have a baseline non-using set of data, then a known-using set of data, and run through the various stats tests to determine if the difference was significant and due to that effect. Sometimes a player has a step change in performance simply because he’s worked out a flaw in his swing or something, and that couldn’t be distinguished from other causes.

So no.
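For what it's worth, the comparison described above (a baseline non-using set against a known-using set) could be sketched as a simple permutation test on the difference in means. This is only an illustration under loud assumptions: the function name and parameters are made up, each player is reduced to a single number, and it assumes you actually have trustworthy "known user" and "known clean" labels, which is exactly what nobody has.

```python
import random
from statistics import mean


def permutation_p_value(baseline, known_users, n_iter=2000, seed=0):
    """One-sided permutation test: are known users' numbers higher on average?

    Shuffles the pooled values `n_iter` times and counts how often a random
    split produces a mean gap at least as large as the observed one.
    """
    rng = random.Random(seed)            # fixed seed so runs are repeatable
    observed = mean(known_users) - mean(baseline)
    pooled = list(baseline) + list(known_users)
    k = len(known_users)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_iter + 1)
```

Even a tiny p-value here would only say the two groups differ. It can't separate a PED effect from a fixed swing flaw or any other cause, so the objection above still stands.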

I remember Brady’s big breakout year, but not the details. Was it a contract year? Maybe he wanted the extra pop for a year, but not the long term ramifications of steroids.

It doesn’t look like it. According to his page on Baseball-Reference.com, he was a free agent after the '97 season (he hit his 50 HRs in '96), and re-signed with Baltimore.

I dunno. When I was in Austin in 2004 and 2005, the local sports talk guy was joking about, if you were going to build a stereotypical ‘steroid pitcher’, he’d look a lot like Roger Clemens was looking that year. 18-4, 2.98 ERA (145 ERA+), 218 SO for '04; 13-8, 1.87 ERA, (226 ERA+!), 185 SO for '05. At age 41 and 42.

So it wasn’t just hitters under scrutiny at the time.