# What Are The Odds?

I often hear odds being given for various events. The lottery says there is a 1 in 6 chance of winning with their scratcher tickets. This totally makes sense because they know how many tickets there are and they know not only how many winners there are but also the prize amounts.

But what about for less obvious things? For example, I just googled the odds of winning the Powerball with one ticket: 1 in 175 million. (I would argue that every Powerball is won with just one ticket, but that’s another topic. Or is it?)

Where do they derive these numbers? A few years ago there was a kerfuffle over a Time Magazine (?) article that claimed women over 30 had a greater chance of being kidnapped by terrorists than of marrying. How do you quantify stuff like that?

Are they just pulling numbers out of the air? And shouldn’t there be a fair number of old geezers running around who have been struck by lightning?

A reasonable number of people may have been struck by lightning, and never reported it to anyone.

The odds are based on the best available data. A lot of people have been struck by lightning, and saw a doctor for burns; this got reported upward, and someone amassed the stats.

Odds of picking the right numbers in a lottery are quirky, but you can either just assume people choose random numbers, or, with some data collection, you can actually see patterns in how people pick numbers. (A slight bias in favor of odd numbers, for instance.) The overall odds are going to resemble random choices fairly closely.

Kidnapping stats are kept pretty accurately, and marriage stats are public record. I’d have no problem accepting that the actual value of the statistic is known. Whether the statistic cited is accurate…I don’t know, but it really shouldn’t be too awfully hard to calculate it.

One of the sad things about this is that there are “Sunday Supplement” media that don’t bother to check their facts. Anybody remember Paul Harvey? I’m convinced he had a writing and research staff who threw darts at a big number board.

What are my odds of hitting a tree with my car on the way to work? Trees and miles can be quantified. How do they quantify unknowable stuff? Like lightning strikes?

If there are 300 million Americans and 600 are struck by lightning every year, then the chances are about 1 in 500,000 per year. Divide that by 80 years of life and you get 1 in 6,250 over a lifetime. (That is not quite right since it ignores the possibility of being struck more than once, but these are approximate anyway.) I assume the original numbers are correct. Leaving aside the people who didn’t survive, there probably are a fair number of people walking around who have been hit by lightning, but they don’t wear sandwich boards advertising that fact.
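The arithmetic above fits in a few lines, using the same assumed figures (300 million people, 600 strikes a year, an 80-year lifespan):

```python
# Back-of-envelope lightning odds, using the figures assumed in the post:
# 300 million Americans, 600 strikes per year, an 80-year lifespan.
population = 300_000_000
strikes_per_year = 600
lifespan_years = 80

annual_odds = population / strikes_per_year   # "1 in X" per year
lifetime_odds = annual_odds / lifespan_years  # "1 in X" over a lifetime

print(f"Per year:  1 in {annual_odds:,.0f}")   # 1 in 500,000
print(f"Lifetime:  1 in {lifetime_odds:,.0f}") # 1 in 6,250
```

As noted, this ignores repeat strikes; the exact lifetime figure would use 1 − (1 − p)^80, but at probabilities this small the difference is negligible.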

The one about kidnapping and marriage is absurd. You all know women who got married after 30; how many do you know who have been kidnapped?

How people pick their numbers shouldn’t have any effect on their odds of winning—i.e. matching the numbers chosen by the lottery machines—as long as those numbers are chosen completely at random (which they are). Although it might have an effect on their odds of having to share the prize with another winner.

Calculating the odds of matching a set of randomly chosen numbers is a relatively straightforward exercise in probability/combinatorics, because (unlike things like getting struck by lightning) all the relevant information is known (unless you want to get really picky and try to predict the chaotic behavior of a bunch of balls in a lotto machine). One explanation is here. (I don’t know whether it’s the best, but it’s one of the first that came up on Google.)
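As a sketch of that combinatorics, here is the counting for a hypothetical 6-of-49 game (an illustration only, not any real lottery's rules), including the odds of partial matches:

```python
from math import comb

# Hypothetical 6-of-49 draw: how many ways are there to match exactly
# m of the 6 drawn numbers? (hypergeometric counting)
n, k = 49, 6
total = comb(n, k)  # all equally likely ways to pick 6 numbers from 49

for m in range(k + 1):
    ways = comb(k, m) * comb(n - k, k - m)  # m matches, k-m non-matches
    print(f"match {m}: 1 in {total / ways:,.1f}")
```

The last line of output (match 6) is the jackpot case: 1 in 13,983,816 for this format.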

Of course. But you are missing my point. I am not asking about the veracity of the claims. I am asking how they make them. Women marrying over 30 is pretty straight forward. We can know how many marriage licenses have been issued to women in that age range over the last x number of years vs how many women in that age range there were in those same years and have a pretty accurate projection of future activity.

But.

And this is the question: how do you quantify the completely random odds of the terrorist kidnapping? There can’t be a high enough sample group to have any kind of meaningful statistics.

Broadly speaking there are three main ways of determining the probability of something:

For things like the lottery, it is based on combinatorics. If you know how many possible outcomes there are overall, and how many outcomes match a certain condition (e.g. you winning) you can relate the two.

For things like being hit by a car, it is based on epidemiology. We know how big the population is, and how many are hit by cars each year. There will always be some uncertainty in the figures, mainly based on how we count them. If we only count hospital admissions, for example, we miss the people who are hit by cars but don’t end up recorded as a hospital admission. We can combine different ways of studying it to remove some of this uncertainty. For example, we can survey part of the population to find out who has been hit by a car in the past year, and whether they ended up admitted to hospital. This gives us a second way of arriving at the raw number, and it also lets us judge the error in the first number.
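A toy version of that correction, with every figure invented purely for illustration: survey a sample to learn what fraction of car-strike victims actually show up as hospital admissions, then scale the admissions count accordingly.

```python
# All figures below are made up to illustrate the method, not real data.
hospital_admissions = 40_000  # recorded car-strike admissions in a year

surveyed = 10_000     # people asked "were you hit by a car this year?"
survey_hit = 25       # said yes
survey_admitted = 20  # of those, were admitted to hospital

# Fraction of victims the admissions count actually captures:
capture_rate = survey_admitted / survey_hit  # 0.8

# Correct the admissions-based count for the people it misses:
estimated_total = hospital_admissions / capture_rate
print(f"Estimated people hit per year: {estimated_total:,.0f}")  # 50,000
```

The survey also tells you how noisy the correction itself is, which is the second benefit described above.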

As you suggest, epidemiology gets worse for really low numbers. What are the odds of being killed in a terrorist attack? One major attack more or less each year will change the estimate considerably. A slight variation in personal circumstances will change the actual odds considerably.

The figure used informally in news reporting isn’t what people who study risk actually use. It’s a point estimate which represents a probability distribution. A more formal way to say “the odds of being killed by a terrorist are one in ten thousand” would be to say that the 95% confidence interval puts your probability of death by terrorist attack in any given year between one in one thousand and one in fifty thousand.
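A rough sketch of why that interval is so wide for rare events, using a normal approximation to a Poisson count and entirely made-up numbers:

```python
from math import sqrt

# Made-up figures: a handful of observed deaths in a large population.
deaths = 9
population = 300_000_000

rate = deaths / population                     # point estimate of annual risk
half_width = 1.96 * sqrt(deaths) / population  # ~95% interval half-width

low, high = rate - half_width, rate + half_width
print(f"Point estimate: 1 in {1 / rate:,.0f}")
print(f"95% interval:   1 in {1 / high:,.0f} to 1 in {1 / low:,.0f}")
```

With only nine events, the interval spans roughly a factor of five, and one major incident more or less moves the point estimate substantially, which is exactly the instability described above.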

The third way is to recognise that risk is not just about probability but about uncertainty. This is where you get weird effects like “the odds of rolling a six might be one in six, but it is possible I’m using a loaded die and the odds are not actually one in six”.

Under this framework when we talk about the chance of something happening, we’re really just putting bounds on how much we know. Either you are or you are not going to die by terrorist attack next year. One of those two outcomes is entirely certain based on the type of world we are in. Unfortunately, we don’t know which world we are in. Are we in the one where you die, or the one where you don’t? When we say the odds are one in ten thousand, we’re saying that we are pretty confident, but not certain, that we’re in the universe where you don’t die.

NB: Your odds of death in a terrorist attack are not one in ten thousand next year. Used for illustrative purposes only. As is hopefully clear from my answer, even stating the odds in that way is not even wrong.

I think my odds of winning the Lottery are better if I’ve bought a ticket. Other than that, I got no idea.

Actually, a pretty good mathematically legitimate argument can be made that buying a ticket doesn’t improve your odds of winning the big jackpot. With a ticket, your odds are so close to zero that the difference between them and the true zero of not buying a ticket is negligible.

Going back to the OP: Dioptre nailed it. Some numbers can be measured & computed 100% accurately. Some numbers can be estimated with at least estimable degrees of (in-)accuracy. And some numbers are pulled directly from the author’s nether regions. Knowing which is which can be challenging.

Ok, if you want to know how they calculate the odds of winning the lottery, this is a good site. I’m going to cut and paste from my own (random and probably off-the-wall, since I have no idea how your local lottery works and just made up the numbers) problem, which was: there are 10 numbers you need to get right, each drawn from 1 to 99:

This is the mathy part, but I think it’s pretty easy to follow:

So, they calculate the odds for any given lottery based on the above, simply changing the variables based on the type of lottery it is. They are making some assumptions in this simple problem, so making other assumptions will change the odds.
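Since the pasted worked example didn’t survive the quote, here is a generic sketch of the “change the variables” idea, assuming a simple match-k-distinct-numbers-from-n game (real lotteries add bonus balls and other wrinkles):

```python
from math import comb

def jackpot_odds(pool_size: int, picks: int) -> int:
    """'1 in N' jackpot odds for a simple game where you must match
    `picks` distinct numbers drawn from 1..`pool_size`, order ignored."""
    return comb(pool_size, picks)

# The poster's made-up example: 10 numbers from 1 to 99.
print(f"10-of-99: 1 in {jackpot_odds(99, 10):,}")
# Change the variables and the odds change, e.g. a 6-of-49 game:
print(f"6-of-49:  1 in {jackpot_odds(49, 6):,}")  # 1 in 13,983,816
```

Different assumptions (ordered picks, repeated numbers allowed, bonus balls) change the counting formula, which is why the same basic method yields different odds for different games.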

Some of your other questions were more about calculating probability, and that’s a bit more, um, fluid and difficult to explain. Here is a site that has a simple calculation of the probability of being struck by lightning in the US:

Notice the assumptions and caveats. If you refine the question from a basic ‘Are we likely to be struck by lightning in the US?’ to something more specific (age, amount of time spent outside, occupation, region lived in, etc.), then the complexity of the calculation goes up quite a bit (I had 2 years of this stuff in college and I couldn’t do the math on some of this if my life depended on it anymore :p). I think most of the media assertions of probability/odds are based on really basic calculations using very simple models and assumptions…and that’s generally good enough to convey whatever they are trying to discuss.

Too late to add, but thought I’d link to this site talking about the difference between odds and probability:

So my system for doubling my chances of winning the lottery by buying two tickets isn’t going to pay off?

Mathematically, there is (at least in certain contexts) a huge difference between “zero” and “close to zero.” See: Calculus.

Technically, it will double your chances. On the other hand, if you don’t buy a ticket, you could double your chances of winning by not buying two tickets.

I understand fully about calculus, limits, etc.

The argument is essentially that “expected value” only becomes a meaningful concept in a fairly long run. IOW, when the number of trials is sufficient to expose a high likelihood of obtaining all the possible outcomes. In the very short run, “expected value” is meaningless; you either win or you don’t.

If a given example lottery has a 1-in-10-billion chance for the jackpot, and you could play 1 trillion times, then you could meaningfully talk about how playing more or less often changes your likelihood of winning the jackpot.

But for a one-in-100-billion-chance jackpot game that you can play once a week, 52 weeks per year, for your entire life of 80 years, your roughly 4,000 chances in 100 billion are close enough to zero that your expected value is indeed zero. At which point not playing has the same expected value but saves you the cost of 4,000 tickets.
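That lifetime figure can be checked directly for the hypothetical game above (1-in-100-billion odds per ticket, one ticket a week for 80 years):

```python
# Hypothetical game from the post: 1-in-100-billion odds per ticket,
# playing once a week for 80 years.
p_ticket = 1 / 100_000_000_000
tickets = 52 * 80  # 4,160 tickets over a lifetime

# Probability of at least one jackpot across independent draws:
p_ever = 1 - (1 - p_ticket) ** tickets
print(f"Lifetime: about 1 in {1 / p_ever:,.0f}")  # roughly 1 in 24 million
```

For odds this long, the "at least once" probability is essentially just tickets × p, which is the back-of-envelope figure used above.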

Ah, I see the issue now. The problem is that you actually know TOO MUCH statistics. You actually understand the idea of needing a big group to get meaningful predictions, which puts you way ahead of 93%* of people who use statistics in any way.

I mean, there’s no Federal Bureau of Statistic Usage who goes around preventing people from making statistically ludicrous claims [“X sports player is a choker because they lost 2 of 3 playoff games!”].

For the kidnappings thing, I imagine the author just took the population of the US, divided by the number of terrorist kidnappings in the US each year, and, after taking a rest from all the heavy mathematical work they just did, went with that number as the odds. Which isn’t very precise, but it at least is useful for getting some idea of whether one should spend time worrying about terrorist kidnapping versus drunk drivers or something.

*Figure estimated by author.

Ok, rabbit trail from my original question. Does it really double my chances? I speak of those times when the lottery gets crazy big, and co-workers pool their money and buy 100 tickets. Or 1000. Only one combination of numbers wins. If you quick-pick all 100 tickets, you have 100 different combinations of numbers, but it is still only one combination that wins. If my 10 co-workers get the winning combo, the win is divided among them. If some other person also has the winning combo, it is split 11 ways. If you have all 100 tickets with the same winning combo, there would be an advantage to buying multiple tickets (the other winner now getting 1% of the winnings as opposed to 10%).

I have always said I will take my chances and buy my one single ticket. Are my co-workers better off by pooling?

Yes, it does, but keep in mind that double zero is zero, and double really really close to zero is still really close to zero. I can’t disagree with what LSLGuy said above.

If you mean that there are two winning tickets, one of which is one of the 10 your co-workers bought, wouldn’t the pot be split two ways (between the two tickets), and your co-workers’ half be further split 10 ways among them?

Depends on how you measure “better off.” If you buy a single ticket, and your 10 co-workers collectively buy 10 tickets, (1) you’re both out the same amount if (and almost inevitably, when) you lose; (2) your co-workers have 10 times the probability of winning, but (3) if they do win, they get one-tenth the amount you would have won, so (4) their expected value is the same as yours.
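Points (2)–(4) fit in one sketch, with the jackpot and odds invented for illustration and shared jackpots and smaller prize tiers ignored:

```python
# Made-up jackpot and odds; ignores shared prizes, taxes, and minor tiers.
jackpot = 100_000_000
p_win = 1 / 200_000_000  # per ticket

solo_ev = p_win * jackpot                # one ticket, whole prize
pool_ev = (10 * p_win) * (jackpot / 10)  # 10x the chance, 1/10 the share

print(solo_ev, pool_ev)  # identical: the pool trades prize size for probability
```

The pool changes the shape of the gamble (more frequent, smaller wins), not its expected value.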