# a very simple question about statistics

I find statistics very interesting. It’s amazing how a pollster can interview only a thousand people and so accurately predict how millions of people will vote in a particular election. It confuses me though and I wonder if someone could enlighten me to a degree about what the numbers mean sometimes.

Recently, I was in my doctor’s office and he was explaining to me how this new blood thinner, Xarelto, is so much better than the old standard of Coumadin. He showed me stats on recurrences of heart attacks and pulmonary embolism. I don’t remember the numbers exactly, but they were very low for each drug. So let’s just say that Xarelto patients who have heart attacks were about 1.7% and Coumadin patients with the same issues were 2.5%. What exactly is the difference? To me both numbers are fairly negligible.

When I was on anti-depressants, I had certain side effects. I had the typical complaints that other users have when it came to sexual issues and dry mouth. So why is it that the pre-trial studies found that only about 15% of people taking the drug (as opposed to the placebo) had dry mouth or sexual issues? That number does not seem very significant, but I’m assuming it is considered high, which signifies that this particular side effect is a common one. But since when is 15% a common issue? I would think the number should be at least a majority, or approaching one (about 40% minimum).

I assume that these numbers are relative in some way. If a batter gets a hit 33.3% of the time, he’d be among the best in the league, but if a basketball player hits 33.3% of his free throws, he should consider a change of profession. But nevertheless, if someone came to me and said that this pill you’re going to take may give you, say, blurry vision, and the chances are 14 in 100 that it might happen, I wouldn’t be happy but I’d still like my chances. But I’d be wrong, right?

If those numbers were real, it’s a good illustration. They both sound low, but 1.7% is only about two-thirds of 2.5%.

Put another way, the new drug reduces the chance of a recurrent heart attack by nearly a third.
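To make the two framings concrete, here is a quick sketch in Python using the thread’s hypothetical 1.7% and 2.5% figures (made-up numbers, not actual trial data):

```python
# Hypothetical recurrence rates from the thread (not actual trial data)
xarelto_rate = 0.017   # 1.7% of patients had a recurrence
coumadin_rate = 0.025  # 2.5% of patients had a recurrence

# Absolute risk reduction: the raw gap between the two rates
arr = coumadin_rate - xarelto_rate          # 0.8 percentage points

# Relative risk reduction: what fraction of the old risk is eliminated
rrr = arr / coumadin_rate                   # about a third

print(f"Absolute reduction: {arr:.1%}")     # 0.8%
print(f"Relative reduction: {rrr:.0%}")     # 32%
```

Both lines describe the same data; which framing sounds impressive depends on which one you quote.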

You understand the number, it sounds like - 14% means 14 in 100 will experience it, on average.

You just have a much higher risk tolerance than most. If something had a 14% chance of giving me blurry vision, it better be amazing at what it does, and solve a really bad problem.

86.4% of statistics are made up.

Wrong about what? Wrong not to be happy? How could you be wrong about that? Wrong to be willing to take the pill? How could you be wrong about that?

Your risk tolerance for blurry vision is your own to decide. If you think the benefits of the pill are tolerable even with a 14% chance of blurry vision, great; it works out for you. If not, that’s also fine; different people have different preferences.

Is the blurred vision a permanent result? If discontinuing the medication would clear it up there would seem to be little actual risk involved.

Just start the new medication at a period of time when you know you’ll be spending the next several days around home. If no side effects, cool, you now have a more effective treatment.

The skill for predicting elections is finding a representative sample. All of the professional polling companies spend a ton of time searching for the correct mix of demographics that are representative of the population they are studying. If you are able to collect data on everyone, that is a census. Most of the time, though, you only can work on a limited number of subjects; that is called a sample. Statisticians study a discipline called “Sampling” in their coursework. They use those techniques to find the correct mix of race/ethnicity, blue collar, white collar, etc. to estimate the results; hopefully the results are extensible to the larger population. As you can imagine, sampling is often imprecise, but it’s useful to have estimates under certain conditions.
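A toy simulation shows why a well-drawn sample of 1,000 is usually enough (the population size and the 54% support level are hypothetical, purely for illustration):

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 1,000,000 voters, 54% of whom favor candidate A
population = [1] * 540_000 + [0] * 460_000

# A simple random sample of just 1,000 voters...
sample = random.sample(population, 1_000)

# ...usually lands within a few points of the true 54%
estimate = sum(sample) / len(sample)
print(f"True share: 54.0%, sample estimate: {estimate:.1%}")
```

Of course, real pollsters face the much harder problem described above: making sure the sample is actually representative, not merely random.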

That’s a good observation. The difference between the numbers only becomes important when you’re the one having a heart attack.

The numbers may sound insignificant, but they are super-duper significant to the US FDA or the EMEA or other drug regulatory agencies. Cardiac clinical trials are massive. It’s not uncommon to have >10K patients participating in cardiac drug studies from all over the world. With massive numbers like that, it’s entirely possible to detect whether Xarelto patients have a lower risk of heart attack than Coumadin patients. If you’re talking about the Pearson study that compared Xarelto (rivaroxaban) to Coumadin (warfarin), here is the relevant comparison:

This was a ~15K patient study. What that is saying is that there were slightly fewer MIs (heart attacks) among the Xarelto patients, but the difference was small enough that it could plausibly have happened by chance.

“Common” may not be the right word, but it’s the correct idea in context.

When a new drug is under review by FDA or EMEA, the drug is scrutinized with respect to its side effects. Side effects are always listed in terms of frequency among the study patients, but they are also listed in terms of severity. For example, dry mouth (xerostomia) can range in severity from “well, I could sure use some water” to “this patient can’t produce enough saliva so we have to feed him through a tube.”

**sylmar** also brought up the issue of whether the side effects are permanent or not. That’s another important piece of the puzzle. If they’re irreversible, that’s an important thing to know, right?

A drug that is available on the market has to list all of the most common side effects in its label so that an MD and a patient can make an informed decision about whether the benefits of taking the drug outweigh the risks of side effects. The most “common” side effect for your antidepressant may be “nothing,” but it’s more likely that a given patient will have at least one of the multiple side effects that are listed. For full disclosure, all of the side effects need to be listed in the drug’s label. The label also gives you information with which to compare different antidepressants to each other. The early ones (e.g., MAOIs) had a long list of side effects, but the subsequent generations are much cleaner.

Like I said above, those numbers are there so that you have the full picture of the up-to-date data. Given the profile of side effects, each MD and patient can try to make the best personal decision. Maybe the antidepressant was suitable for you, but for someone who had salivary cancer and has terrible trouble with dry mouth, it would exacerbate an already significant physical problem.

The point of the numbers is to enable consumers and physicians to make informed choices.

Off the top of my head, when messing with statistics, you’ll see 3 basic “types” of percentages.

1 - The overall %. That’s the 1.7%, 2.5%, etc. And you’re right. In the grand scheme of things, 1.7% and 2.5% are not far off from one another.

2 - The relative %. The nominal difference between 1.7% and 2.5% is only 0.8 percentage points, but in relative terms, going from 1.7 to 2.5 is a 47% increase.

3 - Significance/confidence. This edges toward more complex statistics. A lot of stats deals with whether a difference exists at all; how big that difference is, is a separate question. Someone can say they’re 99% sure that pill B is more likely to cause heart attacks, but that does not mean the same thing as “pill B is 99% more likely.” Confused? The first statement only claims that a difference exists: I’m 99% sure there is a difference, even if the difference is just 1%. The second describes the size of the difference: it claims a HUGE difference, but says nothing about how confident anyone is in that claim.
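Point 3 can be made concrete with a back-of-the-envelope z-test (all numbers hypothetical): with a large enough sample, you can be about 99% sure a difference exists even when the difference itself is tiny.

```python
from statistics import NormalDist

# Hypothetical: pill A causes an event in 10.0% of patients, pill B in
# 10.1% -- only a 1% relative increase, a genuinely tiny difference.
p_a, p_b, n = 0.100, 0.101, 1_000_000  # n = patients per arm (hypothetical)

# Standard error of the difference between the two proportions
se = ((p_a * (1 - p_a) + p_b * (1 - p_b)) / n) ** 0.5
z = (p_b - p_a) / se

# One-sided confidence that a real difference exists
confidence = NormalDist().cdf(z)
print(f"z = {z:.2f}, confidence that B is worse: {confidence:.0%}")
```

With these made-up numbers, the result is roughly z ≈ 2.35 and ~99% confidence: 99% sure a difference exists, while the difference itself stays tiny.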

Just one warning. My doctor discussed switching from Coumadin to another blood thinner whose name I have forgotten (it wasn’t the one you mentioned either). One downside is that if you are in an accident, the effects of Coumadin can be reversed essentially instantaneously by injections of vitamin K; the possible replacement, aside from being an order of magnitude more expensive, could not be swiftly neutralized. You might want to discuss this with your doctor before switching (and for all I know yours is a new one without that problem, so just discuss it).

I see most of the responses are focusing on diseases and medications, and not so much on basic statistical precepts.

“A statistician is a person who thinks, if half your body is on fire and half is frozen in a block of ice, on the average you are very comfortable.”

The FIRST statistics lesson that every lay (non-mathematician) person needs to know:

First, I note the OP’s questioning of what’s significant. It should be emphasized, right from the start, that the word “significant” has a very specific and technical meaning in statistics, which doesn’t quite correspond to the common everyday meaning.

“Significant” means, basically, that some measurable result is reliable and repeatable, and thus that measurable result is based on actual causes (known or unknown) rather than being simply random variation that could happen in any experiment. One of the fundamental problems of statistics is to distinguish meaningful results in any experiment or any measure, from random variations, and there are whole bodies of theory and formulas for doing that. There is no requirement that a measure be very large or have any practical importance to be “significant” – only that it be demonstrably not (probably) just random noise.

Example: A certain new experimental cancer therapy seems to keep patients alive longer. A thousand patients are given the treatment, and a thousand others are given a placebo. Their remaining times alive are recorded. It is found that on the average, the treated patients lived 3 (three!) weeks longer. The treatment costs \$450,000 per patient.

Is this “significant”?

Well, some patients lived longer and some died sooner. If you ran the same experiment again, you might just as well come up with 6 weeks next time. Or 4 weeks. Or the treated patients on average died 2 weeks earlier. There’s so much variation in how long people live, you could have gotten just about any result, right? (Say, maybe, anywhere from -10 weeks to +10 weeks.)

Well, suppose you ran the experiment again, with another 1000 treated and 1000 untreated patients, and got a similar result – anywhere from 2 to, say, 5 weeks longer life on average. And suppose you ran the experiment yet again and got similar results. And again. (For a treatment that costs \$450,000 we’re just talking theory, of course.) If you can get more-or-less consistently repeatable results like that, then you would be forced to conclude that the treatment is in fact doing something to keep patients alive longer – even if it isn’t doing very much. And that is what statisticians mean when they use the word “significant”.

But that doesn’t mean that the treatment is necessarily doing very much. Just that it’s demonstrably doing something.
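The “repeatable small effect” idea can be sketched with a simulation (made-up numbers: a true 3-week benefit buried in very noisy survival times):

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

def run_trial(n=1000, effect=3.0, sd=30.0):
    """One simulated trial: noisy survival times in weeks; the treated
    group truly lives `effect` weeks longer on average."""
    control = [random.gauss(52, sd) for _ in range(n)]
    treated = [random.gauss(52 + effect, sd) for _ in range(n)]
    return statistics.mean(treated) - statistics.mean(control)

# Rerun the experiment 20 times: the observed difference bounces around,
# but it keeps coming out near +3 weeks. That repeatability, not the
# size of the effect, is what "significant" refers to.
diffs = [run_trial() for _ in range(20)]
print([round(d, 1) for d in diffs])
```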

And the concept of “significant” doesn’t take any practical usefulness into account. What is it worth to live (a probable average of) three weeks longer? Does the treatment have obnoxious side effects? Is an extra (probable) three weeks worth \$450,000?

The term “significant” is extensively abused, especially in advertisements for drugs and other medical or quasi-medical treatments – taking advantage of this difference between technical meaning and common meaning. How often do you hear that some drug “significantly” improves your cholesterol readings, or your weight loss numbers, or whatever?

It could well be bullshit. A diet supplement that consistently causes users to lose (on the average) 2 pounds every six months could truthfully be said to “significantly help users shed pounds” even if that only means 2 pounds every six months, and even if 43.6% of users actually gained weight. But that’s not what most people would call “significant” in common everyday usage.

In my opinion, you’re looking at it pretty well: 1.7% vs. 2.5% is a pretty small difference: that’s one person in 125. For something minor, I’d probably ignore the difference, too.
On the other hand, heart attacks aren’t minor things; if I could reduce my chance of a heart attack by 1 in 125 just by changing medication, that’s probably worth it, right (assuming no difference in side effects or cost)?

Well, now we’re not talking statistics, we’re talking about what word you want to use to describe 15%: is that ‘common’ or ‘sort of common’ or ‘rare’ or whatever.
Of course you’re right that it depends on the context: nobody would say that a guy who hits a home run 15% of his at bats ‘rarely’ hits home runs. On the other hand, probably most people would agree that a guy (must be a pitcher) who is batting 0.150 is someone who ‘rarely’ gets hits.
So, whether 15% of patients (that’s about 1 in 8) getting a side effect is ‘common’ or not is really a bit of opinion. My personal opinion is that I wouldn’t call it ‘rare’; I’d probably call it ‘fairly common’ or ‘common’.

The 15% by itself is not enough to make a decision. You have to consider the risk/reward ratio.

If you have a drug that will cure whatever incurable fatal disease I may have, but there’s a 15% chance my hair will fall out, I’ll take those chances in a minute (in fact, I did take those chances with my chemo treatment, although the chance of hair loss was much higher than 15%).

If you have a drug that will make my astigmatism better, but there’s a 15% chance that I’ll die, I will pass on that drug, thankyouverymuch.

I think there was a misunderstanding with a few early replies to my “blurry vision” example. I just made that up to express my confusion at how on one hand people who take a certain drug (Zoloft, for example) mention having a certain issue – sexual problems are the most obvious and common – but then, when you look at the stats “only” 14 percent have this side effect. It would just seem to me to be a relatively low number when in reality quite a few people complain about it.

I’ma disagree with you here and posit an even more important lesson for everyone to understand - margin of error. It’s a mathematical expression of how much the result of an experiment or poll can differ solely due to sample variation. It does not, in any way, express how reliable the poll is. The pollster can ask leading questions, can select samples poorly, or extrapolate results to something other than indicated, but none of that would change the margin of error.

I say that’s more important because, I’m guessing, MoE is encountered more than significance in everyday life.
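For reference, the familiar ±3-point figure for a 1,000-person poll falls out of a one-line formula (worst case p = 0.5, 95% confidence):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a proportion estimated from a simple
    random sample of size n; p = 0.5 is the worst case (largest margin)."""
    return z * math.sqrt(p * (1 - p) / n)

# A 1,000-person poll: about +/-3.1 points, regardless of whether the
# population is 100,000 or 100,000,000 -- that's why a small sample works.
print(f"{margin_of_error(1000):.1%}")  # 3.1%
```

As noted above, this only covers random sampling variation; it says nothing about leading questions or a badly drawn sample.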

Okay, there are a few fundamental statistical ideas that everyone should learn about, lest we all be bamboozled by the marketers, pollsters, and slanted news shows. There’s room to argue about which is “Most Important”. I argue for an understanding of what “significance” means because the term is so routinely abused in health product marketing and advertising.

Understanding margin of error is important too, no argument. People will tend to dismiss meaningful results as being just chance, or will believe chance outcomes are meaningful, often without knowing anything about how to decide. An understanding of the idea of significance plays here too.

Drawing a valid sample that is representative of the population entails a whole body of theory, technology, and art form unto itself. The OP was wondering how a sample of just 1000 can validly represent a population of millions. Infamous blunders are possible, as Thomas Dewey and The Chicago Tribune discovered in 1948!