# I may not be standard, but I'm a real deviant (math question)

Talk to me like I’m four. Use small words. Talk to me like I need to count on my fingers.

I tried using this formula to figure out the standard deviation of an arbitrary set of numbers, and checked my math with a coworker.

She said it looked right, but forwarded it to a statits… stastist… Real Smart Guy. He said it was wrong, and that the answer should be 11.970797801316333. A test in SQL bears this out. But he won’t explain why.

Where did I go wrong? Can it be explained to me without using fancy mathematical nomenclature?

Here:

You need to include the differences between all the data points and the mean.

So, the differences between the data points and the mean are 11.6, 6.6, 3.6, 2.4, and 19.4. The variance is (11.6² + 6.6² + 3.6² + 2.4² + 19.4²)/(5 − 1) = 143.3,

and then the standard deviation is √143.3 ≈ 11.97.
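The same arithmetic can be checked with a few lines of Python (working straight from the five differences above):

```python
from math import sqrt

# Differences between each data point and the mean, from the example above.
deviations = [11.6, 6.6, 3.6, 2.4, 19.4]

# Sample variance: sum of squared deviations divided by (n - 1).
variance = sum(d ** 2 for d in deviations) / (len(deviations) - 1)

# Standard deviation is the square root of the variance.
std_dev = sqrt(variance)

print(variance)  # ≈ 143.3
print(std_dev)   # ≈ 11.9708, matching the SQL result
```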

Ah. Thanks!

I was wondering how just the lowest and highest values could show anything meaningful.

As I have datasets with over 1000 values, I think I’ll just let SQL do the hard work for me.

Use a table of random numbers or some other method of selecting a random sample, and choose a sample of 20 or so. Compute the statistical parameters of that sample to get a close estimate of the population parameters.

I suspect that will be more accurate than keying in so many data points by hand, what with data-entry errors and all.
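A minimal sketch of the sampling idea, using a made-up population (the numbers here are invented purely for illustration):

```python
import random
import statistics

random.seed(1)

# A made-up "population" of 1000 values, just for illustration.
population = [random.gauss(50, 12) for _ in range(1000)]

# Draw a random sample of 20 and use its statistics as estimates
# of the population's mean and standard deviation.
sample = random.sample(population, 20)

print(statistics.mean(sample))   # should land near the population mean
print(statistics.stdev(sample))  # should land near the population std dev
```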

IIRC, the Mean is the average value of all the numbers, and the Standard Deviation is the average “distance” from that Mean.

Sort of. To get the standard deviation, you first square all of the distances, then average them, and then take the square root. There are pros and cons of using this method compared to averaging the absolute values, and folks who use statistics a lot will argue about which to use when. The same is also true for mean vs. other measures of “typical value”.
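A quick side-by-side of the two spread measures just described, run on the five distances from the worked example:

```python
from math import sqrt

deviations = [11.6, 6.6, 3.6, 2.4, 19.4]
n = len(deviations)

# Square, average (with the n - 1 divisor), then square-root:
std_dev = sqrt(sum(d ** 2 for d in deviations) / (n - 1))

# Plain average of the absolute distances (mean absolute deviation):
mad = sum(abs(d) for d in deviations) / n

print(std_dev)  # ≈ 11.97 -- squaring weights the big 19.4 more heavily
print(mad)      # ≈ 8.72  -- every distance counts equally
```

The two disagree because squaring emphasizes points far from the mean, which is one of the things the statistics folks argue about.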

Try saying “abominable statistics” ten times in a row, fast.