Lying with statistics

We all know the old quote (Twain or whoever). What are your favorite (or most hated) examples? Many will be political, but I’d like to keep value judgements out of it and focus on how the statistics are fudged to make them say something they don’t really support.

My current one is, of course, unemployment is higher now than when Obama took office. I don’t go a day without hearing it. Well yes, it was 7.8% when he was inaugurated. But it was 6.1% in September 2008 and 9.4% by May 2009 and over 10% by July so the trend wasn’t exactly positive.

One of my favorites from the past was “Al Gore’s house in TN uses 254 times more energy in one year than the average person’s house does per month.” People of course focus on the 254 and completely miss the juxtaposition of a year and a month. So, the statement is true, but it’s really a lie. (There were lots of other fudges that brought the actual value down to like 3X average, but the year vs month was the really egregious part.)
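To see how much of the 254 comes from the year-vs-month mismatch alone, here is a minimal sketch that puts both sides on the same time unit. The function name is made up for illustration, and the only input is the 254 figure quoted above.

```python
# Sketch: put both sides of the "254 times" claim on the same time unit.
MONTHS_PER_YEAR = 12


def same_period_ratio(claimed_ratio: float, months: int = MONTHS_PER_YEAR) -> float:
    """Convert a 'one year vs one month' ratio into a like-for-like ratio."""
    return claimed_ratio / months


ratio = same_period_ratio(254)
print(f"{ratio:.1f}x")  # 254 / 12, prints 21.2x
```

So the mismatch by itself inflates the headline number twelvefold; the other fudges mentioned above are not modeled here.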

It’s misdirection. The media is always using misdirection, and I don’t credit any statistic unless I can see the data and its collection process.

I think “lie” could fairly be taken to include everything said with the intent to mislead, so I think the 254 comment could count; yes, it’s a judgement, but being too fussy with the definition of lying encourages saying things intended to mislead in such a clever way that they are not lies in a purely logical sense.

Since a confidence level of 95% is a very common criterion used by statisticians, if you can dig up 2000 statistical statements, you should have about 100 incorrect ones to choose from, for which the data and its collection process are proper.
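The arithmetic behind the "about 100" can be sketched as below. It takes the post's reading at face value (each statement has a 5% chance of being wrong) and adds the assumption that the statements are independent, which real-world statistics often are not. The function name is hypothetical.

```python
import math


def expected_misses(n_statements: int, confidence: float) -> tuple[float, float]:
    """Mean and standard deviation of the number of wrong statements,
    assuming each one independently misses with probability 1 - confidence."""
    p_miss = 1 - confidence
    mean = n_statements * p_miss
    sd = math.sqrt(n_statements * p_miss * (1 - p_miss))
    return mean, sd


mean, sd = expected_misses(2000, 0.95)
print(f"about {mean:.0f} misses, give or take {sd:.1f}")
```

Under those assumptions the count is a binomial variable, so "about 100" really means something like 100 plus or minus ten.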

My favorite now is a poll that shows 15% in favor of something, 20% against, and 65% not sure. Now the pros say 80% don’t oppose it, and the cons say 85% aren’t in favor of it.
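Both spins come from the same three numbers; each side just folds the "not sure" group into its own camp. A quick sketch using the figures from the post (the dictionary keys are only labels chosen for illustration):

```python
# Poll results from the post, in percent.
poll = {"in favor": 15, "against": 20, "not sure": 65}

# The pro side counts everyone who did not say "against".
dont_oppose = poll["in favor"] + poll["not sure"]

# The con side counts everyone who did not say "in favor".
arent_in_favor = poll["against"] + poll["not sure"]

print(f"{dont_oppose}% don't oppose it")       # 80% don't oppose it
print(f"{arent_in_favor}% aren't in favor")    # 85% aren't in favor
```

The 65% undecided are counted twice, once by each side, which is why the two headline figures add up to well over 100%.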

I’ve mentioned this old story before: A group of seminary students were polled about whether or not it was alright to smoke while praying. They responded 100% that it was not. When the question was rephrased as whether or not it was alright to pray while smoking, 100% agreed that it was.

I seem to recall a news story years ago (and I doubt it is unique) along the lines of “20 percent of Americans live in poverty!”

Which sounds bad until you realize that the definition of poverty was literally the bottom 20 percent (economically) of the population.

Or whatever the actual percentage was…

I like to point out that half of all kids are performing below average at any given school.

People have come to think of “below average” as worse than it really is and that “below average” is unacceptable. Oh my god, how can half the kids be performing at an unacceptable level?

And then the modal score of Australian cricketer Don Bradman was (probably - I don’t have all the figures to hand) 0.

He was out 70 times in his international career, of which 8 were for 0. Since his other scores spanned a wide range, up to 334, it’s perfectly possible he was never out for any given non-zero score more than once, and very likely no other score as many as eight times, making zero his commonest. The more usual measure of a batsman’s prowess is the mean; Bradman’s is 99.94 and almost any other cricketer would be delighted to have an average literally half as good.

Mark Twain’s full quote says something pretty different from what people normally think he said:

I like to point out that the above statement is incorrect. And changing “average” to “median” or “below” to “no better than” won’t fix the problem.


60% it works EVERY time!

My favorite has always been the Pepsi taste challenge from many years ago on TV.

What I admired was the clever wording of the stated facts at the end. Having shown a few avowed Coke lovers choosing Pepsi in a blind test, the voice over said, “In recent blind taste tests the majority of Coke lovers preferred the taste of Pepsi.”

So what does that mean? Let’s say I break up all my Coke lovers into groups of 3 and call each group a “test.” Whenever 2 or 3 people prefer Pepsi that is one test where “the majority of Coke lovers preferred the taste of Pepsi.” So how many do I need to make my tagline true? Only two, even if in 2000 cases it’s not true.
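Here is that scenario as a small sketch. The results are entirely made up to match the numbers above (2 tests where Pepsi wins the group of three, 2000 where nobody picks it); nothing here reflects real taste-test data.

```python
def majority_prefers_pepsi(group: list[bool]) -> bool:
    """True when more than half of the tasters in one 'test' picked Pepsi."""
    return sum(group) > len(group) / 2


# Hypothetical results: each inner list is one "test" of 3 Coke lovers,
# and True means that taster picked Pepsi.
tests = [[True, True, False]] * 2 + [[False, False, False]] * 2000

winning_tests = sum(majority_prefers_pepsi(group) for group in tests)
pepsi_picks = sum(sum(group) for group in tests)
tasters = sum(len(group) for group in tests)

print(f"tests where the majority preferred Pepsi: {winning_tests}")
print(f"tasters who preferred Pepsi: {pepsi_picks} of {tasters}")
```

With these invented numbers the tagline is literally true (there are "tests", plural, where the majority preferred Pepsi), even though only 4 of 6006 tasters chose it.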

Beautiful. An apparently compelling stat that tells you nothing at all.

That’s great. I’m stealing this.

Similar to the OP, there was this critique of a claim by the Obama campaign on Good Math Bad Math not too long ago:

By looking alternately at the number of donors vs. the amount of donations they’re obscuring the actual differences between the campaigns. It’s entirely possible that both of those statistics could describe the same campaign.

One that I used to see in charts for managed funds or brand vs brand performance is fooling with the Y axis. The variance might only be 2%, so change the unit to dollars, say, or days, whatever, and then start the axis at a few points below the worst performer. That way the 2% variance might take up 70% of the axis.

Not sure if I explained it very well and don’t feel like looking for an example, sorry.
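For what it's worth, the axis trick can be put into numbers without drawing anything. This is only a sketch with invented values: two performers at 100 and 102 (a 2% gap), and a hypothetical helper that reports how much of the visible Y axis the gap occupies.

```python
def share_of_axis(low: float, high: float, axis_min: float, axis_max: float) -> float:
    """Fraction of the visible Y axis taken up by the gap between two values."""
    return (high - low) / (axis_max - axis_min)


# Invented example values: the best performer is 2% above the worst.
worst, best = 100.0, 102.0

# Axis starting at zero: the gap is barely visible.
honest = share_of_axis(worst, best, axis_min=0.0, axis_max=best)

# Axis starting just below the worst performer: the same gap dominates.
truncated = share_of_axis(worst, best, axis_min=99.15, axis_max=best)

print(f"axis from 0:     gap fills {honest:.1%} of the axis")
print(f"axis from 99.15: gap fills {truncated:.1%} of the axis")
```

Same data both times; only the starting point of the axis changes, and the gap goes from roughly 2% of the chart height to roughly 70%.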

"Figures don’t lie, but liars figure. "

Any report of a medical finding that says X “doubled the risk” of Y, without revealing that this means it took the risk from 0.000004% to 0.000008% over the course of a lifetime. Technically, it’s true. Functionally, it’s scare-mongering.
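The gap between "doubled" and "barely moved" is the difference between relative and absolute risk. A rough sketch using the two percentages from the post (the function name and the per-billion framing are my own additions for illustration):

```python
def risk_change(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (relative risk, absolute increase in percentage points)."""
    relative = after_pct / before_pct
    absolute = after_pct - before_pct
    return relative, absolute


relative, absolute = risk_change(0.000004, 0.000008)

# Convert the absolute increase from percent to extra cases per billion people.
extra_cases_per_billion = absolute / 100 * 1_000_000_000

print(f"relative risk: {relative:.1f}x")
print(f"absolute increase: {absolute:.6f} percentage points")
print(f"extra cases per billion people: {extra_cases_per_billion:.0f}")
```

With those figures the risk really is doubled, and it also amounts to about 40 extra cases per billion people over a lifetime.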

Seems like a good place to drop in this XKCD.

It varies from subject to subject, but typically, 90% or more of the kids at my local school are performing above average. Really. See if you can work out how.

It appears to me as if Shark was comparing students at one school only against each other while you are using some standardized performance chart or evaluation.