Probability and the NCAA Tourney

A mathematically impaired co-worker and I have been grappling with the question of how many possible outcomes there can be in the NCAA basketball tournament. More specifically, we wondered what the odds are of correctly picking every single game before the tournament starts. For purposes of this example, let’s assume that the winner of each game is randomly decided, so that rankings, etc. don’t complicate it.

It seems to me that in the first round of 32 games, there are 2^32 possible combinations of winners and losers. In the second round of 16 games, there are 2^16 possible outcomes, etc., down to the final game that has 2^1 (or 2) possible outcomes. Since the person must correctly pick all of the games, the odds of picking the correct winners in each individual round should be multiplied together to reach the final calculation of the odds of picking the whole tourney correctly in advance.

My math gives me 2^32 x 2^16 x 2^8 x 2^4 x 2^2 x 2^1, or an overall chance of nailing it of 1 in 9,223,372,036,854,775,808.

Except that this number seems ridiculously large, orders of magnitude higher than the odds of winning the lottery. Is there a flaw in my formula? If so, what is the correct formula to use? Should I have saved my $5 and not gotten involved in a pool?

Actually, I think your formula is right, although a bit overcomplicated. An easier way to do it would be to remember that there are 63 games total so you have 2^63 possible permutations.

On my cheap-o calculator I get 9.22x10^18. Which looks like your answer.
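For anyone who wants to check both versions of the formula, here is a minimal sketch. It assumes a 64-team single-elimination bracket with every game a coin flip, as in the original question; the variable names are just for illustration.

```python
# Games per round in a 64-team single-elimination bracket.
games_per_round = [32, 16, 8, 4, 2, 1]

# Round-by-round product: 2^32 x 2^16 x ... x 2^1.
by_round = 1
for games in games_per_round:
    by_round *= 2 ** games

# Shortcut: one factor of 2 for every game played.
total_games = sum(games_per_round)
shortcut = 2 ** total_games

print(total_games)           # 63
print(by_round == shortcut)  # True
print(f"{shortcut:,}")       # 9,223,372,036,854,775,808
```

The two agree because multiplying powers of 2 adds the exponents, and the exponents are just the game counts.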

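As to the astronomical odds, you are also correct. Look at it this way: in the lottery you pick 6 numbers from 1 - 36. Even counting generously at 36^6, that is 2,176,782,336 total lottery numbers (and if the six numbers must all be different and order doesn’t matter, it is only 36 choose 6, or 1,947,792).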

So you are far more likely to win the Lotto than pick all 63 games correctly. Of course, you don’t have to pick them all, just more than everybody else in the pool. So your odds of winning are determined more by the number of people in the pool than by the number of games.
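A rough sketch of the lottery comparison. It assumes a pick-6-from-36 game, and shows two ways of counting: 36^6 (order matters, repeats allowed) and 36 choose 6 (six different numbers, order ignored). Which one applies depends on the actual lotto rules, which aren't given here.

```python
import math

bracket_outcomes = 2 ** 63

# Two ways of counting a pick-6-from-36 lottery (assumed rules).
ordered_with_repeats = 36 ** 6         # 2,176,782,336
unordered_distinct = math.comb(36, 6)  # 1,947,792

for label, count in [("36^6", ordered_with_repeats),
                     ("36 choose 6", unordered_distinct)]:
    ratio = bracket_outcomes / count
    print(f"{label}: {count:,} tickets; bracket is {ratio:.2e} times harder")
```

Either way, the perfect bracket is billions of times less likely than the winning ticket.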

gEEk

There is an online contest on CNNSI’s web site in which contestants predict the outcome of the NCAA tourney. The payout is $10 million. They have already said that no one will win, as all of the 600,000 contestants have already missed at least one game.

The permutation totals given are correct if you assume that everyone has a 50-50 chance of winning a particular game.

I’ve seen slightly lower odds for picking all 63 games from a math professor who assumed that an “expert” will be right 75% of the time.

Nevertheless, the orders of magnitude that you get makes the chances of getting all 63 games correct extraordinarily remote.

I doubt that CNN/SI had to pay very much for an insurance policy to pay out the $10 million.

Yeah, if you assume you’ll pick the winner 75% of the time, the odds are a lot better that you’ll pick them all correctly–about 1 in 74,325,939 ((4/3)^63). (Obviously the odds still suck, though).
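Here is a small sketch of that calculation. It assumes every game is picked independently with the same success rate, which is a simplification; one_in is just a helper name made up for this example.

```python
def one_in(p_correct, games=63):
    """Return N such that a perfect bracket is a 1-in-N shot."""
    return (1 / p_correct) ** games

print(f"coin flip:  1 in {one_in(0.50):.3g}")
print(f"75% expert: 1 in {one_in(0.75):,.0f}")  # roughly 74 million
```

Going from 50% to 75% per game knocks about eleven orders of magnitude off the odds, which is why the per-game assumption matters so much.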

BobT wrote: “I doubt that CNN/SI had to pay very much for an insurance policy to pay out the $10 million.”

The local paper had an article about a web site that would pay $10 million for a correct answer. (I think it was a different site than CNN/SI). Anyway, the article said the company covering the contest payout was charging the web site 25 cents per bet to cover the possibility that someone would win. If I’ve done the math right, then this is in effect paying out at a 40 million to 1 rate (25 cents can win 10 million). I assume that the covering company includes its profit in the odds, so it really thinks that the chances of someone actually winning are something like 50+ million to 1.
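To put numbers on the insurance math, here is a rough sketch. The prize and the 25-cent premium come from the article as described above; the per-game success rates of 50% and 75% are just the two assumptions already floated in this thread, and the function name is made up.

```python
prize = 10_000_000
premium_per_entry = 0.25

# Payout rate implied by the premium: 25 cents can win $10 million.
implied_odds = prize / premium_per_entry
print(f"implied rate: {implied_odds:,.0f} to 1")  # 40,000,000 to 1

def expected_payout(p_game, games=63):
    """Insurer's expected payout per entry if each pick hits with p_game."""
    return prize * p_game ** games

for p_game in (0.50, 0.75):
    print(f"p(game) = {p_game:.2f}: ${expected_payout(p_game):.3g} per entry")
```

Under the coin-flip assumption the expected payout per entry is a tiny fraction of a cent; under the 75% assumption it is roughly 13 cents, which is in the same ballpark as a 25-cent premium.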

The article also said the site was making a nice profit on the contest, because it only cost them 25 cents to sign people up, but they were getting more than that on average from commissions from other sites which had banner ads on the sign-up page.

Well, the CNN/SI folks got lucky what with all the early round upsets. They’re also lucky I didn’t sign up. I picked N.C. to upset Stanford in the second round, and everyone thought I was crazy. Who’s laughing now? . . . [Evil villain laugh]Ha ha ha haha[Evil villain laugh/] ha ha he he heh he. . . yeah.


Well, either you’re closing your eyes to a situation you do not wish to acknowledge or you are not aware of the power of the presence of a pool table in your community. Ya’ got trouble my friends! -
Prof. Harold Hill
Gary Conservatory
Gold Medal Class
'05

Ha ha ha - I wish most people had that good of an understanding of probability and expected payout.

Luckily, there was no entry fee, so any return you might get (i.e. betting nothing and getting $10 million) is infinite, so that’s a great bet to take. Now, if they charged a penny entry fee, then the expected value would be negative and it would no longer be a good bet. Funny how that works out.

I’m not an expert in probability or gambling, but if the covering firm was charging 25 cents per bet, I would assume that that expected payout was on the order of 20 cents per bet. Thus, only if you paid more than approximately 20 cents would there be a “negative expected value”.

You’ll improve your odds if you factor out all the outcomes where a #16 seed beats a #1 seed, which has yet to happen in the NCAA tourney.

As a side note, there are no ‘experts’ who can pick winners at a rate of 75%. The best sports handicappers in the world are happy to hit 55-58%. You need to be able to pick about 53% to cover the ‘vig’ that most sports books have.
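The "about 53%" figure can be sketched as follows. This assumes the common pricing where you risk $110 to win $100 on a spread bet; that pricing is an assumption for illustration, not something stated above, and individual books may differ.

```python
def break_even_win_rate(risk, win):
    """Win rate at which risking `risk` to win `win` has zero expected profit."""
    # p * win - (1 - p) * risk = 0  =>  p = risk / (risk + win)
    return risk / (risk + win)

rate = break_even_win_rate(110, 100)
print(f"{rate:.1%}")  # 52.4%
```

So a handicapper hitting 55-58% is only a few points above the rate needed just to break even.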

Sports handicapping is about long-term win rates that are only a few percentage points above pure chance. If anyone could pick even 65% winners he’d break every sportsbook in the nation in a very short time.

I read about this contest in ESPN magazine. They said the site was “Sandbox.com”. The odds of winning were said to be 10 quintillion to 1, so I think your numbers look correct.

It depends on whose payout you’re looking at. I was talking about the expected payout of a person entering the contest, not CNN or the insurance company.

Expected value can be defined simply as what you would end up with (net) if you ran the contest a (very) large number of times.

I’m going to cheat and round the actual odds up to an even number.

Let’s say we play the NCAA pool 10x10^18 times (I’ll call this X) - in that case, then by definition, the winning entry will be drawn one time. So when these X trials are over, I’m going to get $10 million. With no entry fee, I am paying 0, but winning $10 million, so my expected value is positive (I’m receiving more than I am paying).

If CNN charged a penny entry fee, then over the life of our experiment, I will end up paying 10x10^16 dollars. This is a lot greater than what I will bring in (1x10^7 dollars, i.e. the $10 million prize), so my expected value is now negative.
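The same experiment, as a minimal sketch. It uses the rounded odds from above (one winner in 10x10^18 plays) and exact fractions so the tiny numbers don't get lost in rounding; the function name is made up for the example.

```python
from fractions import Fraction

def expected_value(prize, p_win, entry_fee):
    """Average net result per play: expected winnings minus the fee."""
    return prize * p_win - entry_fee

prize = 10_000_000
plays = 10 * 10**18           # X, the rounded number of trials
p_win = Fraction(1, plays)    # one winning entry in X plays

free = expected_value(prize, p_win, Fraction(0))
penny = expected_value(prize, p_win, Fraction(1, 100))

print(float(free))    # tiny but positive
print(float(penny))   # just under -0.01

# Totals over all X plays with a penny fee.
total_fees = plays * Fraction(1, 100)  # 10^17 dollars paid in
total_winnings = prize                 # 10^7 dollars won once
print(total_fees > total_winnings)     # True
```

Per play, the free entry is worth a trillionth of a dollar, so any fee at all swamps it.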

It’s calculations like this that make the lottery such a bad bet - when you realize just how bad of a bet it is, you don’t play it. That’s why a lot of people call the lottery a “tax on the stupid”, because only stupid people will take those odds with the honest expectation of eventually winning.

As far as CNN goes - just holding the contest gives them a negative expected value within the confines of this experiment, and then paying a quarter in insurance for each entry just makes the expectations more negative. The reason that they’re willing to do it becomes an economic reason - the outside revenue (merchandise etc) that they bring from an average participant in the contest will outweigh the cost (payout and insurance) of that person playing the tournament. They don’t need to charge an entry fee because these outside sales are covering the cost of the contest.

It’s a lot more complicated for the insurance company, but suffice to say, they’re in the business of making calculations like the one I did above every day and with a lot more sophistication. The $0.25 is a number that they came up with that, when combined with all of their other policies and businesses, will help to eventually give the company a positive expected value in the long run.

Sam Stone wrote <<As a side note, there are no ‘experts’ who can pick winners at a rate of 75%. The best sports handicappers in the world are happy to hit 55-58%. You need to be able to pick about 53% to cover the ‘vig’ that most sports books have.
Sports handicapping is about long-term win rates that are only a few percentage points above pure chance. If anyone could pick even 65% winners he’d break every sportsbook in the nation in a very short time.>>

I’m assuming you are speaking about betting against the spread. That’s not what we’re doing here. To fill out a tournament bracket, one only needs to pick winners and losers. I would imagine that if one picked the higher seeded team in every game every year, one would pick more than 53-58% of the winners. I picked 34 out of 48 games thus far, about 71%. Had I picked all favorites for each game I would have picked 36 out of 48, 75%. The fellow in first place in the ESPN challenge picked 43 games, 90%.

My understanding of the seeding process is that it looks at several factors all based on past performance including win-loss record, strength of schedule, etc. What they do not take into account is how the strengths and weaknesses of two paired teams (or, for later rounds, potential pairings) stack up. It was possible that Arizona, a #1 seed, would meet up with LSU, a #4 seed, despite the fact that LSU cleaned Arizona’s clock earlier this season. I would imagine that a careful analysis of this can result in the 90% score.

It’s always entertaining to compare my own picks vs. a “favorites only” bracket. Some years I do better, this year I’ve done worse.

More on betting against the spread. My understanding is that for point spread bets, the spread is set to make the action on both teams identical (for every dollar bet on one team, a dollar is bet on the other team). The house takes the vig from every bet so it has zero risk. It pays off the winners with the losers’ money and collects a service fee.

It’s a lot like buying futures or options (the derivatives market isn’t called the casino economy for nothing).

Now, my question is, how efficient is this market? If my memory is correct, in the John Elway-led Bronco Super Bowl losses, the Broncos were favorites in the point spread, despite the fact that most analysts thought they were overmatched and would get beat. If this is the case it would indicate that the Denver fans, out of irrational loyalty, were placing bets on their team. A bet on the NFC team (I’m thinking of the Redskin game specifically) would then be a positive expected value proposition for the bettor.

In horse race betting there is a strategy to bet on horses that go off at higher payouts than the initial handicapped odds, with the assumption being that those odds best represent the true odds, i.e. a horse with initial odds of 2-1 would win that race one out of three times (given that the track needs its cut, this isn’t quite true). If it goes off at 5-1, then the expected value of the bet is positive (probability of win = 1/3, payoff of win = $5, probability of loss = 2/3, payoff of loss = -$1, so E(Bet) = 1/3 x $5 + 2/3 x (-$1) = $1).

Is this how professional point spread bettors make their money, or is the real money made by folks with info unavailable to the general public? (DeNiro in Casino)

A correction, I think I got the math wrong. The payout is $4 (cash in = 1, cash out = $5, net cash = $4 for win, net cash still -1 for loss) so E(Bet) = .67. E(Bet) if the horse goes off at 2-1 is -$.33.
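A small sketch of the corrected numbers. It follows the convention used in this post, where "5-1" means $5 comes back on a $1 bet (net $4) and "2-1" means $2 comes back (net $1), and it assumes the horse’s true chance of winning is 1 in 3. The function name is made up for the example.

```python
from fractions import Fraction

def bet_ev(p_win, net_win, stake=1):
    """Expected net result of one bet: win net_win or lose the stake."""
    return p_win * net_win - (1 - p_win) * stake

p_win = Fraction(1, 3)   # assumed true chance of winning

ev_at_5 = bet_ev(p_win, net_win=4)  # cash out $5 on $1, net $4
ev_at_2 = bet_ev(p_win, net_win=1)  # cash out $2 on $1, net $1

print(f"{float(ev_at_5):.2f}")   # 0.67
print(f"{float(ev_at_2):.2f}")   # -0.33
```

If the track quotes odds as net winnings instead (so 2-1 pays $2 plus the stake back), net_win would be 2 and 5 and the results shift accordingly.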

It should also be noted that the odds shift as the house tries to balance the bets, kind of like changing the point spread. I think the difference is that at the track the horses pay based on final odds no matter when the bet was placed; in sports gambling, you can get your bet down at a spread which can later change, but you keep the spread you bet at.

Sloth: You’re right, I was speaking of betting against the spread.

As for how the handicapping is originally done - the bookie or oddsmaker will set the line based on his analysis of both teams’ chances of winning. There is some art to this, but in general these guys are very accurate. A lot of sports books and private bookies simply use the lines generated by the big ones. Roxy Roxborough has perhaps the biggest service that provides these lines.

The potential for profit comes from two places - first, by being a better handicapper for that particular game than the sports book. This is possible because the professional sports bettor is only looking at a small handful of games, whereas the sports book has to come up with lines on all of them and can’t necessarily put in the same amount of effort. This is one reason why a lot of the pros specialize in college basketball and football - there are a LOT of games, and the sports book probably didn’t analyze each one very well. By going over the best-looking matchups very carefully, you can find value.

Second, the sports book may change their line slightly from what they believe is ‘correct’ if they think there will be a big imbalance of action on one side or another. As you said, they would ideally like to balance the action to eliminate risk. So if all the money starts going to one side, they’ll move the line to punish those bettors and to get more action on the other side. But don’t take this idea too far - if the book thinks the public is wrong it’ll often just let them keep betting. In essence, the book chooses to gamble with its own money.

Then there are other, rarer ways to make money such as obvious mistakes in setting the lines, mathematical errors the house might make in setting up things like parlay cards, etc.

Pro sports bettors hope to beat the spread by maybe 3-5 points. And they may only identify 3-10 betting opportunities a week out of dozens of games they look at. So to make a living, they have to bet very big money on those games. This takes a significant bankroll, and the variance is large because the number of bets you place a year is small. Let’s say you bet 300 games in a year, with an average of beating the spread by 5%. The luck factor here then becomes significant. Your total edge comes from picking 15 games out of 300 better than chance. So a few bad beats, a fumble here or there, unexpected injuries, and the like can wipe out an entire year’s profits. It’s not a profession for the faint-of-heart.
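To get a feel for how big the luck factor is, here is a rough sketch. It reads "beating the spread by 5%" as a 55% win rate and treats the 300 bets as independent; both are simplifying assumptions, not part of the example above.

```python
import math

n_bets = 300
p_win = 0.55   # assumed: 5 points better than a coin flip

expected_wins = n_bets * p_win               # about 165
edge_in_wins = expected_wins - n_bets / 2    # about 15 games
sd_wins = math.sqrt(n_bets * p_win * (1 - p_win))  # about 8.6 games

# Chance that a genuine 55% picker finishes the year at 150 wins or fewer,
# i.e. no better than coin-flipping (exact binomial sum).
p_no_better_than_chance = sum(
    math.comb(n_bets, k) * p_win**k * (1 - p_win)**(n_bets - k)
    for k in range(n_bets // 2 + 1)
)

print(f"edge: {edge_in_wins:.0f} wins, one standard deviation: {sd_wins:.1f} wins")
print(f"chance of a coin-flip year or worse: {p_no_better_than_chance:.1%}")
```

The whole year’s edge is less than two standard deviations of ordinary luck, so a skilled bettor can have a flat or losing year, and a lucky one can look skilled.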

Also, because your edge is so small it’s very hard to see if you are making correct decisions or not. A lot of ‘pros’ out there are people who have just gotten lucky for a year or two. To tell whether your decision-making is correct requires a lot of post-game analysis. Too many guys make 5 picks, go 4-1, and pat themselves on the back for being a genius. Analysis of the way the wins happened may reveal that the actual reasons for the win were opposite of the reasons the handicapper chose.

So, it requires a lot of work, including a lot of boring post-game number crunching when there’s no money on the line. It’s really a full-time job, and you’d better already know your sports very well. And you have to be very good at mathematics to apply expectations and values to statistical events and such.

Hey Sam -

Thanks, it didn’t really occur to me that the house would bet its own money.

What do you know about how the action is spread around? I know that if, say, the action at the Mirage is 60% for one team and 40% for another, and it is vice versa at Caesar’s, they (along with the other casinos) lay off their imbalances on each other.

It seems to me, though, that with illicit gambling in other places, for the market for a bet to be efficient, and so as not to create no-lose situations between the various legal and illegal sports books, the spread has to be pretty even around the country. This implies that this laying off of action probably includes some interaction with illicit gambling interests.

On a side note, I think it was screwing up some posting of odds, creating a no-lose situation for a bright gambler, that led Sonny Corleone to beat his brother-in-law with a trash can lid.

Arbitrage! :)

Yes true arbitrage, as differentiated from the risk arbitrage which became so famous in the 80’s.