Perfect games in baseball.

Yesterday, Mark Buehrle pitched the sixteenth perfect game since 1900. For your perusal, here is a list of all of them, along with the date:

1904-05-05 Cy Young
1908-10-02 Addie Joss
1922-04-30 Charlie Robertson
1956-10-08 Don Larsen
1964-06-21 Jim Bunning
1965-09-09 Sandy Koufax
1968-05-08 Catfish Hunter
1981-05-15 Len Barker
1984-09-30 Mike Witt
1988-09-16 Tom Browning
1991-07-28 Dennis Martínez
1994-07-28 Kenny Rogers
1998-05-17 David Wells
1999-07-18 David Cone
2004-05-18 Randy Johnson
2009-07-23 Mark Buehrle

Anyway, I am not strong in statistics, but it seems that perfect games are becoming more common. Is there a way to prove this statistically? If so, what is it? And finally, why might this be the case?

Moving to The Game Room from GQ.

Colibri
General Questions Moderator

Everyone watches the World Series. Don Larsen showed them all how it’s done.

There’s always a push and pull between the strength of baseball’s offense and defense over time. Changes in the rules (mound height) and coaching strategies (pitching rotation) will alter the balance. My first guess for answering your question would be to just plot all pitchers’ ERAs (or ERA-like stats) over the history of baseball. The lower the average ERA, the better the pitching that year. But you’d want to email Nate Silver for the best answer. This blog post measures each of these guys’ probability of a perfect game over their career. Me, I’d want to pitch in mid-May or late July for the best chance…

Another obvious point (that just occurred to me): there are twice as many games played each year today as before 1960. But while I’m here, let me relate a story I remember from Catfish Hunter’s book. When Catfish Hunter attended the 1987 All-Star game as part of a traveling Old-Timer’s game taking place that season, it was his first return to Oakland-Alameda since his retirement eight years earlier. The All-Star game was a light-hitting affair (2-0 Nationals in 13), and afterward, some reporter asked Hunter something like “this game began about the same time of day as your perfect game, didn’t it?”

Hunter knew where the reporter was going, but answered innocently. “No, that was 6 PM and this was 6:30” (or vice-versa. Or thereabouts).

“Oh, still, that’s pretty close. Do you think the batters had trouble seeing the ball at that hour?”

“Well, maybe they did. But I saw it well enough to get three hits, including a home run”.

Well, there is a longer season than at some times, and more teams, which does make for more overall major league games. Perfect games are such a rarity that it’s hard to see what is just an anomaly and what is statistically significant. There are around 24,650 games per current decade (depending on playoff series), with around 2 perfect games. It’s a pretty small sample on which to detect meaningful trends.
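To put a rough number on how noisy such a small count is, here is a minimal sketch that treats perfect games per decade as a Poisson count with an expected value of 2 (the "around 2" figure above). The Poisson model is an assumption for illustration, not something established in this thread.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when lam events are expected."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Assumption: about 2 perfect games expected per ~24,650-game decade.
expected_per_decade = 2.0

# Chance of seeing exactly 0, 1, 2, ... perfect games in one decade.
for k in range(6):
    print(k, round(poisson_pmf(k, expected_per_decade), 3))

# Chance of a decade with 4 or more, purely by luck.
p_four_or_more = 1.0 - sum(poisson_pmf(k, expected_per_decade) for k in range(4))
print("P(4 or more):", round(p_four_or_more, 3))
```

Under that assumption a decade with four perfect games turns up about one time in seven with no change in the underlying rate, so a single busy decade is weak evidence of a trend.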

Looking at the no-hitters by decade, the number does seem to be generally increasing, but much more gradually, and quite possibly about in line with the games played per time period. It would be interesting to see the number of no-hitters per games played by decade to see the trend on a larger sample size. It should be relevant because a perfect game is a subset of no-hitter.

The more games per year per team thing is, of course, one of those obvious things that slipped my mind. Anyway, here is a rough estimate of the number of games between each game on the original list.

22,136
42,925
10,280
2,328
3,762
25,766
16,181
5,743
6,585
8,208
3,267
11,340
10,596

As far as I can tell, that does even things out a bit. Provided of course I didn’t make a mistake.

Also, sorry for originally posting this where it did not belong.

There have been 16 perfect games in MLB since 1900. This site (which did the math) says there’ve been a total of 174,206 MLB games since 1900. That was before Buehrle’s game, so the stats are for 15 perfect games.

Statistically that puts the odds of any game being a perfect game at approximately 1 in 11,614. According to the National Weather Service, you have a 1 in 5,000 chance of being struck by lightning at some point in your life.

In other words, a perfect game is a random event. We could see another one tomorrow, in 11,614 games (that’s about the length between Cone, Johnson and Buehrle’s perfect games) or never.
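As a rough check of that arithmetic, the sketch below recomputes the 1-in-11,614 figure and then asks how likely at least one perfect game is over a given stretch, assuming every game has the same independent chance (a simplification). The 2,430 figure for games in one current regular season is my own addition, not a number from this thread.

```python
total_games = 174_206   # MLB games since 1900, per the site cited above
perfect_games = 15      # count before Buehrle's game

p_per_game = perfect_games / total_games
one_in = total_games / perfect_games
print("about 1 in", round(one_in))

def prob_at_least_one(n_games, p):
    """Chance of one or more perfect games in n_games independent games."""
    return 1.0 - (1.0 - p) ** n_games

# Assumption: 2,430 games in one 30-team, 162-game regular season.
for n in (2430, 11614, 2 * 11614):
    print(n, "games:", round(prob_at_least_one(n, p_per_game), 3))
```

Note that "1 in 11,614" does not mean one is due every 11,614 games: under this model there is only about a 63% chance of seeing one in a stretch of that length.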

RadicalPi, clearly your list of the approximate number of games between each of the perfect games is wrong. First of all, there are sixteen perfect games and you give only thirteen numbers of games for the times between perfect games in your list, so you’ve obviously missed two numbers. Second, if I just put those numbers of games down in the gaps between perfect games, they clearly don’t match. It looks like this:

1904-05-05 Cy Young
22,136
1908-10-02 Addie Joss
42,925
1922-04-30 Charlie Robertson
10,280
1956-10-08 Don Larsen
2,328
1964-06-21 Jim Bunning
3,762
1965-09-09 Sandy Koufax
25,766
1968-05-08 Catfish Hunter
16,181
1981-05-15 Len Barker
5,743
1984-09-30 Mike Witt
6,585
1988-09-16 Tom Browning
8,208
1991-07-28 Dennis Martínez
3,267
1994-07-28 Kenny Rogers
11,340
1998-05-17 David Wells
10,596
1999-07-18 David Cone
missing number
2004-05-18 Randy Johnson
missing number
2009-07-23 Mark Buehrle

So let me try to fix this. I presume that you failed to calculate the number of games between Young and Joss, so let me assume you skipped a number there. We get this if we move everything from there on down one space:

1904-05-05 Cy Young
missing number
1908-10-02 Addie Joss
22,136
1922-04-30 Charlie Robertson
42,925
1956-10-08 Don Larsen
10,280
1964-06-21 Jim Bunning
2,328
1965-09-09 Sandy Koufax
3,762
1968-05-08 Catfish Hunter
25,766
1981-05-15 Len Barker
16,181
1984-09-30 Mike Witt
5,743
1988-09-16 Tom Browning
6,585
1991-07-28 Dennis Martínez
8,208
1994-07-28 Kenny Rogers
3,267
1998-05-17 David Wells
11,340
1999-07-18 David Cone
10,596
2004-05-18 Randy Johnson
missing number
2009-07-23 Mark Buehrle

My next guess would be that the numbers between Barker and Witt and between Witt and Browning are wrong. I’m going to guess that you just skipped Witt entirely when you counted the number of games. So drop Witt (for the purposes of figuring out what you did) and move everything from there on down one space:

1904-05-05 Cy Young
missing number
1908-10-02 Addie Joss
22,136
1922-04-30 Charlie Robertson
42,925
1956-10-08 Don Larsen
10,280
1964-06-21 Jim Bunning
2,328
1965-09-09 Sandy Koufax
3,762
1968-05-08 Catfish Hunter
25,766
1981-05-15 Len Barker
16,181 *
1988-09-16 Tom Browning
5,743
1991-07-28 Dennis Martínez
6,585
1994-07-28 Kenny Rogers
8,208
1998-05-17 David Wells
3,267
1999-07-18 David Cone
11,340
2004-05-18 Randy Johnson
10,596
2009-07-23 Mark Buehrle

*This assumes that the 1984-09-30 game by Mike Witt was missed when you counted.

That’s as close as I can get from what you’ve written.

Wendall Wagner, your guesses are surprisingly accurate. Thanks for taking the time to look at my numbers. Anyway, I have redone the calculations, and here is the new (and hopefully more correct) number of games between perfect games. What happened was that I more or less ignored the two perfect games that happened toward the end of the regular season, thus jumbling them together. I didn’t do this with the World Series game, since I knew it would cause problems with my estimation process. Anyway, voilà:

1904-05-05 Cy Young
5924

1908-10-02 Addie Joss
16211

1922-04-30 Charlie Robertson
42925

1956-10-08 Don Larsen
10280

1964-06-21 Jim Bunning
2328

1965-09-09 Sandy Koufax
3762

1968-05-08 Catfish Hunter
25766

1981-05-15 Len Barker
7918

1984-09-30 Mike Witt
8263

1988-09-16 Tom Browning
5743

1991-07-28 Dennis Martínez
6585

1994-07-28 Kenny Rogers
8208

1998-05-17 David Wells
3267

1999-07-18 David Cone
11340

2004-05-18 Randy Johnson
13026

2009-07-23 Mark Buehrle
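For anyone who wants to play with these gaps, here is a minimal sketch that summarizes them. The comparison against an exponential distribution (what purely random, constant-rate events would give, where the median gap is the mean times ln 2) is my assumption for illustration and is not a formal test.

```python
import math
import statistics

# Estimated games between consecutive perfect games, copied from the list above.
gaps = [5924, 16211, 42925, 10280, 2328, 3762, 25766, 7918,
        8263, 5743, 6585, 8208, 3267, 11340, 13026]

total_games = sum(gaps)
mean_gap = statistics.mean(gaps)
median_gap = statistics.median(gaps)

# Assumption: constant-rate random events have exponentially distributed gaps,
# for which the median equals mean * ln(2).
expected_median = mean_gap * math.log(2)

print("total games spanned:", total_games)
print("mean gap:", round(mean_gap, 1))
print("median gap:", median_gap)
print("median expected if exponential:", round(expected_median, 1))
```

The observed median lands close to the exponential prediction, which is at least consistent with randomly scattered events, though fifteen rough estimates cannot settle the question.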

Also, Kunilou, I know that perfect games are rare, but I’m not sure that means their distribution is random. I first wondered about their becoming more common because I noticed that, of these 16, 3 are before 1950 and 13 are after; there are 4 between 1950 and 1975 and 9 after; the 1980s have 3, but the 90s have 4. The increase in the number of games per season definitely changes what this trend may be, but I still think they’re becoming more common; with my lack of statistical abilities, I just don’t know for sure. Wolfman mentioned that there are probably too few to judge any trend accurately, and suggested looking at no-hitters per decade, if only to have something similar but more common to look at.

Also, here are the data Wolfman was wondering about. They are ordered by decade, then the number of no-hitters in that decade, the estimated number of games in the decade, and finally the fraction of games that are no-hitters.

Decade; No hitters; Games; Fraction
1901–1910; 21; 12320; 0.0017
1911–1920; 28; 12320; 0.0023
1921–1930; 8; 12320; 0.0006
1931–1940; 10; 12320; 0.0008
1941–1950; 12; 12320; 0.0010
1951–1960; 19; 12320; 0.0015
1961–1970; 35; 16614; 0.0021
1971–1980; 28; 20088; 0.0014
1981–1990; 19; 21060; 0.0009
1991–2000; 24; 22842; 0.0011
2001–2008; 13; 19440; 0.0007

According to Excel’s linear trendline function thing, no hitters are becoming less common. So, perhaps this indirectly answers the question of a trend in perfect games as well.
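For anyone without Excel, the same kind of trendline can be sketched as an ordinary least-squares fit of the no-hitter fraction against the decade index. This is an unweighted fit over the table above, so the short 2001–2008 period counts as much as a full decade; that simplification is mine.

```python
# (decade label, no-hitters, estimated games), copied from the table above.
decades = [
    ("1901-1910", 21, 12320), ("1911-1920", 28, 12320),
    ("1921-1930", 8, 12320), ("1931-1940", 10, 12320),
    ("1941-1950", 12, 12320), ("1951-1960", 19, 12320),
    ("1961-1970", 35, 16614), ("1971-1980", 28, 20088),
    ("1981-1990", 19, 21060), ("1991-2000", 24, 22842),
    ("2001-2008", 13, 19440),
]

xs = list(range(len(decades)))                       # 0, 1, 2, ... per decade
ys = [no_hitters / games for _, no_hitters, games in decades]

x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)

# Ordinary least-squares slope and intercept.
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
sxx = sum((x - x_mean) ** 2 for x in xs)
slope = sxy / sxx
intercept = y_mean - slope * x_mean

print("slope per decade:", slope)
print("fitted fraction for first decade:", intercept)
```

The slope comes out negative, which matches the trendline result, but with counts this small the fit says little about whether the decline is real or just noise.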

Great, thank you for doing the legwork. Kind of hard to get a crisp, clear picture, isn’t it? It seems that no-hitters are also so rare as to be hard to correlate; still a pretty random event. But on the surface it does seem to go along with the offensive and defensive swings, with the pitching era of the 50s to 80s having a higher number of them, and the block from the 80s to the present, with more offense, launching-pad ballparks and designated hitters, making it a bit harder to throw one.

The least accomplished pitcher on the PG list, by a fair margin (although Larsen and Barker weren’t great), is Charlie Robertson, who had the only PG in White Sox history until last week.

Amazingly enough, Robertson pitched his PG in only his fourth major league start. He blew out his arm in 1924 and limped on to a 49-80 career record.

It would be a “random event” if the chances that I threw one were the same as this Buehrle guy. But I don’t play baseball!

As for the lightning analogy, if you were trying to get struck by lightning, you could probably get the odds down to about 1 in 2. Simply hang out on a large granite dome in a summer storm with a metal rod strapped to your head!

IMO, at least two of these games should be considered “perfect” for any statistical analysis.
Numbers 2 and 3.
(1) On June 23, 1917, Babe Ruth, then a pitcher with the Boston Red Sox, walked the Washington Senators’ first batter, Ray Morgan, on four straight pitches. Ruth, who had already been shouting at umpire Brick Owens about the quality of his calls, became even angrier and, in short order, was ejected. Enraged, Ruth charged Owens, swung at him, and had to be led off the field by a policeman. Ernie Shore came in to replace Ruth. Morgan was caught stealing by Sox catcher Pinch Thomas on the first pitch by Shore, who proceeded to retire the next 26 batters. All 27 outs were made while Shore was on the mound. Once recognized as a perfect game by Major League Baseball, this still counts as a combined no-hitter.

(2) On May 26, 1959, Harvey Haddix of the Pittsburgh Pirates pitched one of the greatest games in baseball history. Haddix carried a perfect game through an unprecedented 12 innings against the Milwaukee Braves, only to have it ruined when an error by third baseman Don Hoak allowed Felix Mantilla, the leadoff batter in the bottom of the 13th inning, to reach base. A sacrifice by Eddie Mathews and an intentional walk to Hank Aaron followed; the next batter, Joe Adcock, hit a home run that became a double when he passed Aaron on the bases. Haddix, and the Pirates, had lost the game 1-0. This is seen as one of the most agonizing of all baseball defeats, especially as the Pirates had 12 hits in the game but could not bring a run home. The 12 perfect innings (36 consecutive batters retired in a single game) remain a record.[35]

(3) On June 3, 1995, Pedro Martínez of the Montreal Expos had a perfect game through nine innings against the San Diego Padres. The Expos scored a run in the top of the tenth inning, but in the bottom, Martínez gave up a leadoff double to Bip Roberts, and was relieved by Mel Rojas, who retired the next three batters. Martínez was therefore the winning pitcher in a 1-0 Expos victory.[36]

A point that hasn’t been made so far:

Errors.

Errors are much, much rarer today than they used to be. To give you some idea of the difference, in 1910, the average American League team made 290 errors, about two a game. In 1920 that number had dropped to 215 errors per team, about one and a half errors per team per game. It was just below 200 in 1930, just below 150 per team in 1950, and so on. This year the average AL team has made just 56 errors, and one, Toronto, has made just 35 (in over half a season.)

Obviously, reducing the number of errors VASTLY increases the likelihood of a perfect game. In 1915 it was unusual for a team to play a game in which it did not make an error; today it’s commonplace. Far fewer errors means far fewer perfect games turned into no-hitters as the result of errors. Buehrle’s White Sox are actually relatively fumble-fingered by 2009 standards, but are still achieving a fielding percentage that in the days of Christy Mathewson would have been ludicrously impossible. Had Buehrle put on the same performance for the 1909 White Sox - who made 247 errors, more than any two AL teams made COMBINED last year - there is a much better than even chance his teammates would have blown it with an error.
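A back-of-the-envelope sketch of that "better than even chance": if a team's errors are treated as random events spread evenly over the season (a Poisson assumption, and a crude one), the chance of at least one error in a given game follows from errors per game. The 154-game schedule is also an assumption on my part; the error totals are the ones quoted above.

```python
import math

GAMES_PER_SEASON = 154  # assumption: the standard schedule length of that era

# Season error totals quoted in the post above.
errors_per_team_season = {
    "1909 White Sox": 247,
    "1910 AL average": 290,
    "1920 AL average": 215,
    "1950 AL average (just below)": 150,
}

def prob_at_least_one_error(errors_in_season, games=GAMES_PER_SEASON):
    """Chance of one or more errors in a single game under a Poisson model."""
    errors_per_game = errors_in_season / games
    return 1.0 - math.exp(-errors_per_game)

for label, errors in errors_per_team_season.items():
    print(label, round(prob_at_least_one_error(errors), 2))
```

For the 1909 White Sox this gives roughly a four-in-five chance of at least one error in any given game, in line with the "much better than even" estimate, though it ignores that a dominant outing involves fewer fielding chances than an average game.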

How much of the difference is an improvement in defense, and how much of it is a difference in how a game is scored?

It’s almost certainly wholly a difference in defensive skill. Anecdotal evidence of how baseball was scored in the past suggests it’s always been more or less the same as today. There’s a common belief that scorers are more generous than they used to be, but it’s not so; they’ve always been generous.

Improvements in defensive skill have been going on for a century - coaching is much better today, understanding of the components of defensive skill is better, the talent base is wider. And of course fielders have better gloves, which makes a lot of difference, and play on much more reliably even diamonds.

Errors and fielding percentage are tremendously flawed statistics in more ways than I can explain in one post, but there’s no doubt at all that fielders today make fewer out-and-out flubs. That isn’t to say Rabbit Maranville (the 20th century error leader) was a bad fielder; I mean, we’re talking about a difference of one error every forty plays. But it makes a big difference in the likelihood of perfect games taking place.

Interestingly, the Wiki page on perfect games notes that the Japanese League has had 15 perfect games thrown since 1950, which would be a much higher percentage of occurrence, as the JLs only have twelve teams.

It’s plausible that there are more perfect games in the Japanese League, if we believe that League is generally inferior to American MLB*. As Stephen Jay Gould pointed out, as leagues get better, the gap between the best and worst players narrows. So in inferior leagues, the best players will be relatively farther ahead of their peers. So you’d expect a few more truly dominant pitchers in a lesser league, and the chances of a perfect game go up.

*Now, I don’t know enough to say whether the JL really is inferior in this sense, but it’s at least plausible that in the 1950’s, the overall level of baseball coaching and knowledge in Japan was low compared to MLB at nearly any time.

Artificial turf has to have a positive effect on the number of errors in a baseball game.

That would seem to be a fairly easy statistical analysis.

Errors per game on artificial turf versus errors per game on real grass.