Hitting streaks and probability

Here’s one for the probability experts that’s always stumped me:

One of the most famous numbers in baseball is 56 - the major league record for the longest consecutive-game hitting streak, set by Joe DiMaggio. The next highest is 44 (by Pete Rose, if I recall correctly).

However, rarely, if ever, has a season ended without a hitter amassing 200 hits - an average of more than one per game.

Now, obviously there’s some statistical variation…there will be games in which a good hitter (say, Wade Boggs or Tony Gwynn in their primes) gets no hits, and some in which he gets two or three.

But why should hitting streaks greater than 30 games in length be so rare? If we accept as a given that at least one hitter will get 200 hits in a year, assuming he plays each of his team’s games (162), and assuming he gets the same number of hittable at-bats each game (I suppose that this assumption could explain why in real life, it doesn’t happen), what’s the statistical probability of that hitter stringing together a hitting streak of, say, 100 games in a row? 80? 60 (yes, I know Joe D. did that in the minors)?

Basically, does raw probability support the unlikelihood of long hitting streaks, or does it make more sense to attribute the rarity to things such as change in opponent strategy, increase in pressure, etc.?


Chaim Mattis Keller
ckeller@kozmo.com

“Sherlock Holmes once said that once you have eliminated the
impossible, whatever remains, however improbable, must be
the answer. I, however, do not like to eliminate the impossible.
The impossible often has a kind of integrity to it that the merely improbable lacks.”
– Douglas Adams’s Dirk Gently, Holistic Detective

Let’s take a simplified example. Suppose a hitter bats .350 and has 4 at bats per game. Assume that all at bats are independent. Then his chance of going hitless in a game would be .650^4 = .18. His chance of getting at least one hit in the game is .82. Then the chance of a 50-game hitting streak, starting today, is .82^50, or about 1 in 20,000.

For a .300 hitter, the chance would be much lower, only 1 in 916,000.
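For anyone who wants to check the arithmetic, here is a rough Python sketch of the same calculation. It keeps the assumptions above (independent at bats, exactly 4 per game). The function names are made up for illustration. The last function goes a step further than "starting today" and estimates the chance of at least one streak of a given length somewhere in a 162-game season, which is closer to what the original question asks; it is a simple dynamic-programming pass over the current streak length, not an official formula.

```python
def p_hit_in_game(avg, at_bats=4):
    """Chance of at least one hit in a game, assuming independent at bats."""
    return 1 - (1 - avg) ** at_bats


def p_streak_from_today(avg, games, at_bats=4):
    """Chance of hitting safely in each of the next `games` games."""
    return p_hit_in_game(avg, at_bats) ** games


def p_streak_in_season(p_game, streak, games=162):
    """Chance of at least one hitting streak of length `streak` in a season.

    state[j] is the probability that the current streak is j games long
    and no streak of the target length has occurred so far.
    """
    state = [0.0] * streak
    state[0] = 1.0
    done = 0.0  # probability the target streak has already happened
    for _ in range(games):
        new = [0.0] * streak
        # A hitless game resets the current streak to zero.
        new[0] = sum(state) * (1 - p_game)
        # A game with a hit extends the current streak by one.
        for j in range(streak - 1):
            new[j + 1] = state[j] * p_game
        # A hit while sitting one game short completes the target streak.
        done += state[streak - 1] * p_game
        state = new
    return done


if __name__ == "__main__":
    for avg in (0.350, 0.300):
        p_game = p_hit_in_game(avg)
        from_today = p_streak_from_today(avg, 50)
        in_season = p_streak_in_season(p_game, 50)
        print(f"{avg:.3f} hitter: 1 in {1 / from_today:,.0f} starting today, "
              f"1 in {1 / in_season:,.0f} at some point in 162 games")
```

Note that the 1 in 20,000 figure comes from rounding the per-game chance to .82; carrying the unrounded value through gives somewhat shorter odds.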

In reality, not all at bats are independent. There will be days when the hitter is not feeling well or when he is facing a particularly tough pitcher. Also, there will be days when he gets fewer than 4 at bats. Therefore, the odds are even worse than those shown above.
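To see why uneven at-bat counts hurt, here is a small Python sketch. The 50/50 split between 3-at-bat games and 5-at-bat games is a made-up assumption chosen only so that the average stays at 4; the function name is hypothetical too. The games with fewer at bats cost more than the games with extra at bats gain back, so the per-game chance of a hit drops, and the streak chance drops with it.

```python
def p_hit_mixed(avg, ab_dist):
    """Chance of at least one hit in a game when the number of at bats varies.

    ab_dist maps a number of at bats to the fraction of games with that
    many at bats (the fractions should add up to 1).
    """
    return sum(weight * (1 - (1 - avg) ** at_bats)
               for at_bats, weight in ab_dist.items())


if __name__ == "__main__":
    steady = p_hit_mixed(0.350, {4: 1.0})          # always 4 at bats
    uneven = p_hit_mixed(0.350, {3: 0.5, 5: 0.5})  # assumed split, still averages 4
    print(f"per-game chance, steady: {steady:.4f}, uneven: {uneven:.4f}")
    print(f"50-game streak, steady: 1 in {1 / steady ** 50:,.0f}")
    print(f"50-game streak, uneven: 1 in {1 / uneven ** 50:,.0f}")
```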

On the other hand, a player in the middle of a hitting streak may be able to try extra hard to get a hit each game and keep it going.

Actuaries rule!

Also, a high-average hitter, especially one who is on a hot streak, has a better than normal chance of being walked, which doesn’t count as an at-bat for the hitter.

IIRC, toward the end when DiMaggio worked his magic 56, pitchers made sure to pitch to him, lest they rob him of his fair shots by walking him. Same thing with McGwire & Sosa when they were shooting for 62 HR…

But maybe you’ve got someone who has hit in 30 games in a row or has hit 45 HR by August, so you’re a little more likely to walk 'em. It’s not like a 31-game streak or 46 HR are anything special, compared to your ERA… As a pitcher, you don’t care until they get close, so you’re really going to try to ruin it.