Exactly. As someone who works with statistics quite a bit, I always find it annoying to see the common and flagrant misuse of statistics that occurs every day. Most of this is because the average person really doesn’t understand statistics that well.
First of all, correlation does not mean causality when there are other independent variables. I learned this in a very early statistics class where the professor statistically “proved” that lipstick causes breast cancer.
Consider the following analysis. Go survey a large number of people who have had breast cancer and find out how many of them use lipstick. Now, go survey a large number of people (of similar age, economic bracket and so on) who don’t have breast cancer and find out how many of them use lipstick.
You will find that many more of the people with breast cancer use lipstick than people who don’t. Therefore, there is a correlation between lipstick use and breast cancer.
Does this mean lipstick causes breast cancer? No. The analysis is flawed. Notice I never mentioned the gender of the people in the survey. The random sample with breast cancer will be mostly made up of women. The random sample of people without the cancer will contain both men and women (actually, slightly more men). Since women are more likely to wear lipstick than men, of course the sample that contains more women will have a higher percentage of lipstick users.
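To put rough numbers on that, here is a minimal Python sketch. Every figure in it is made up for illustration (the share of women in each sample and the lipstick rates are my assumptions, not survey data). Lipstick use depends only on gender, never on cancer, yet the cancer sample still shows about twice the lipstick use.

```python
# All numbers are hypothetical, chosen only to illustrate the argument.
# Lipstick use depends on gender alone; it is the same in both samples.
LIPSTICK_RATE = {"women": 0.60, "men": 0.01}

# Assumed fraction of women in each survey sample.
SHARE_WOMEN = {"cancer": 0.99, "no_cancer": 0.49}


def lipstick_share(sample):
    """Overall fraction of lipstick users in a sample, given its gender mix."""
    women = SHARE_WOMEN[sample]
    men = 1 - women
    return women * LIPSTICK_RATE["women"] + men * LIPSTICK_RATE["men"]


for sample in SHARE_WOMEN:
    print(f"{sample}: {lipstick_share(sample):.1%} use lipstick")
```

Split each sample by gender and the difference disappears, because within each gender the lipstick rate is identical by construction.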
You can do a similar analysis to prove that after-shave lotion causes prostate cancer. The details are left as an exercise for the reader.
OK, this one is fairly obvious. But this sort of thing happens all the time. By carefully choosing your samples and your variables you can imply a correlation between almost anything and anything. And it works for several reasons.
First, most people don’t realize how statistics really work. Second, most people also don’t think much beyond what they are told, especially if what they are being told reinforces beliefs they already have.
In my example above I’m sure at least a few of you didn’t catch the gender omission in my description. If the conclusion I was reaching was similar to one you already held you would probably have ignored it anyway.
Here’s a better example. Suppose I’m a computer manufacturer. I (obviously) want to sell computers. I do a survey of households with school age children that have home computers and those that do not. I find that a greater percentage of those households with computers have their children go on to college than those who do not. (Let’s say, for example, that 75% of children from homes with computers go on to college while only 50% of those without computers do.) I start running ads saying that “children with computers in their home are 50% more likely to go to college than those without”.
I probably sell quite a few computers this way. I may even convince the government to buy a bunch of my computers and make them available to schools and homes so more children have access to computers.
Valid survey? No. What is the economic situation of the households in my survey? The higher the household income, the more likely the household is to have a computer. Similarly, the higher the household income, the more likely the children are to go to college. (College is expensive; you have to be able to afford it.) What is the real cause: computers or income? Correlation, yes; but causality?
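Here is a small Python sketch of how that can play out. The household counts are invented so that the totals match the 75% and 50% figures above; the income split and the per-income college rates are purely my assumptions. Within each income group the computer makes no difference at all, but the combined numbers still produce the "50% more likely" headline.

```python
# Hypothetical survey counts, invented for illustration only.
# Key: (income group, has computer) -> (households, children who went to college)
SURVEY = {
    ("high", True): (700, 630),
    ("low", True): (300, 120),
    ("high", False): (200, 180),
    ("low", False): (800, 320),
}


def college_rate(has_computer, income=None):
    """College rate for a computer group, optionally restricted to one income group."""
    rows = [
        counts
        for (inc, comp), counts in SURVEY.items()
        if comp == has_computer and income in (None, inc)
    ]
    households = sum(n for n, _ in rows)
    college = sum(c for _, c in rows)
    return college / households


# Combined numbers: this is the comparison the ad is based on.
with_pc, without_pc = college_rate(True), college_rate(False)
print(f"overall: {with_pc:.0%} vs {without_pc:.0%} "
      f"({with_pc / without_pc - 1:.0%} more likely)")

# Same comparison inside each income group: the gap vanishes.
for income in ("high", "low"):
    print(f"{income} income: {college_rate(True, income):.0%} "
          f"vs {college_rate(False, income):.0%}")
```

In this made-up data the whole gap comes from the fact that computer households are mostly high income. Real data would be messier, but the check is the same: compare like with like before claiming a cause.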
Want me to do the same survey to prove that children in single parent homes are less likely to go to college than those in two parent homes? I didn’t think so.
My point here is that these last two examples are probably topics that at least some of you have opinions on. Possibly strong opinions. If a survey backs up an opinion you already have, you are likely to accept it as true without questioning the details of how the conclusion was reached.
I could go into the psychology of statistics as well, but suffice it to say that, in general, people attribute more validity to data that supports their point of view and less validity to data that contradicts it. (It is also much easier to cause a person to form an initial opinion than it is to change one they already have, but that is way outside our main topic here.)
However, psychology does lead into my third reason that people accept bad statistical analysis. There is a strong urge in people to see patterns in things. I once participated in a study where dot patterns were projected onto a screen briefly (about a second) then we were asked to write down what the patterns were. Some of them were obvious, but others were more obscure and a few were simple random collections of dots. We saw patterns in the majority of them and a lot of us saw the same pattern in the random dots. The results were interesting.
Tying this back to the tornadoes at the start of the thread… Assume that the occurrence of tornadoes is random and evenly distributed through the year (it isn’t) and that an average of, say, six tornadoes occur per year in your area. (I don’t know where Notthemama lives, and that number is probably low for Kansas and high for Alaska, but we’ll use it.)
So, with the numbers above, on any given day there is roughly a 1 in 60 chance of having a tornado in your area. That’s one every other month on average. (Pretty good odds, actually.)
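For anyone who wants to check the arithmetic, here is a quick Python sketch using the same assumed six tornadoes a year, spread evenly and independently over 365 days. The function name is just something I made up for this post.

```python
TORNADOES_PER_YEAR = 6  # the assumed average from above
DAYS_PER_YEAR = 365

# Expected tornadoes per day; for a rate this small it is a fair
# stand-in for the chance of a tornado on any given day.
daily_chance = TORNADOES_PER_YEAR / DAYS_PER_YEAR
print(f"about 1 in {1 / daily_chance:.0f} on any given day")


def chance_on_specific_date(n_tornadoes=TORNADOES_PER_YEAR, days=DAYS_PER_YEAR):
    """Chance that at least one of n independent, evenly spread
    tornadoes lands on one particular calendar date."""
    return 1 - (1 - 1 / days) ** n_tornadoes


print(f"chance of a tornado on one chosen date this year: "
      f"{chance_on_specific_date():.1%}")
```

The two numbers come out nearly identical because the daily rate is so small. Plug in your own area's yearly average to get your own odds.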
Now, how many historical events do you think I can find for any day of the year? Checking a “this day in history” page for today (April 21) I find that today is supposedly the day Rome was founded by Romulus and Remus (753 BC), that the battle of San Jacinto occurred (Texan war of independence, 1836), the Red Baron was shot down (1918), Stalin’s daughter visited New York (1967) and the protests started in Tiananmen Square (1989). Today is also the birthday of Charlotte Bronte, Anthony Quinn and Queen Elizabeth II, just to name a few.
Oh yeah, it’s Good Friday too.
So, if an earthquake hits somewhere today, will it be because of Tiananmen Square? Doubtful. But someone, somewhere will note it. And a few people will believe it.
Finally, to answer the original question. Find the average number of tornadoes (or days with tornado warnings) your area has had per year over the past few years. Divide that number by 365. That gives the rough odds of having a tornado warning on any given day.
There is a list of historical occurrences for any given day.
Coincidences happen.
Sorry for the long rambling post. Hope this is useful to someone.
“Sometimes I think the web is just a big plot to keep people like me away from normal society.” — Dilbert