Quick and Easy Explanation of Standard Deviation

I’m trying to wrap my head around the concept of Standard Deviation. I’m not exactly a math person, so a lot of the explanations that I’ve been reading have gone right over my head. Can anyone out there give me a quick, easy and concise explanation of what Standard Dev is and how I can explain it to other, non-math, non-technical people? Thanks!

Basically, the standard deviation of a set of data is roughly the average amount by which individual pieces of data differ from the mean.

Here’s an oversimplified example you can use to help explain the concept:

If your teacher announced that the average score on the latest test was an 80% and the standard deviation was 5, then about two-thirds of the students would get between a 75 and 85 on the test.
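If you want to check the "about two-thirds" figure yourself, here is a minimal sketch. It assumes the test scores follow a normal (bell curve) distribution, which the example above doesn't actually state, and the variable names are just for illustration.

```python
from statistics import NormalDist

# Assumption: test scores are normally distributed with mean 80 and SD 5.
scores = NormalDist(mu=80, sigma=5)

# Share of students expected to land within one SD of the mean (75 to 85).
share_within_one_sd = scores.cdf(85) - scores.cdf(75)
print(f"Share between 75 and 85: {share_within_one_sd:.1%}")
```

Under that assumption the share comes out a little over two-thirds, which is where the rule of thumb comes from.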

Basically, it’s how tightly the scores cluster around the average.

It measures how spread out the individual numbers are from the mean. For example, consider these two sets of numbers:
8, 9, 10, 11, 12
and
2, 6, 10, 14, 18

Both sets of numbers have a mean of 10, but clearly the second set is more spread out. To summarize these two sets of numbers just by giving the mean deprives us of a lot of information, then.

The standard deviation is technically “the mean square deviation from the mean”. The first set of numbers above has a standard deviation of 2, the second a standard deviation of 32. So summarizing a set of numbers by giving both the mean and the standard deviation gives us a fuller picture of the numbers. Basically, the larger the standard deviation, the more spread out the numbers are.

A quick-and-dirty way of explaining the standard deviation is that it’s the “average distance from the average.” Normally, when you take a bunch of statistical samples, they’re spread out over a range of possible values; roughly speaking, the width of this spread is the standard deviation.
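To see how close the "average distance from the average" shortcut gets, here is a rough sketch that compares the literal average distance with the standard deviation for one of the sets quoted earlier in the thread (2, 6, 10, 14, 18). It uses the population formula (dividing by the number of values), which is an assumption on my part.

```python
from statistics import mean, pstdev

data = [2, 6, 10, 14, 18]
center = mean(data)

# Literal "average distance from the average" (mean absolute deviation).
avg_distance = mean(abs(x - center) for x in data)

# Standard deviation: square the distances, average them, take the square root.
std_dev = pstdev(data)

print(f"average distance:   {avg_distance:.2f}")
print(f"standard deviation: {std_dev:.2f}")
```

The two numbers are in the same ballpark but not identical; squaring before averaging gives the far-out values a bit more weight.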

For example: if I had 10 froobles, each of which was 10 cm long, then the mean length of my froobles is 10 cm and the standard deviation is zero (since each frooble happens to have the mean length, there’s no “spread” involved). If, on the other hand, five of my froobles were 8 cm long and five were 12 cm long, the mean length would still be 10 cm, but the standard deviation wouldn’t be zero any more, since none of the froobles would have the mean length any more. If five of my froobles were 5 cm long and five were 15 cm long, the standard deviation would be even bigger.
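The frooble example can be run as a quick sketch. The list names are made up for illustration, and I'm assuming the population formula (divide by the number of froobles) rather than the sample one.

```python
from statistics import mean, pstdev

# Three collections of ten froobles each, lengths in cm.
frooble_sets = {
    "all 10 cm": [10] * 10,
    "five at 8, five at 12": [8] * 5 + [12] * 5,
    "five at 5, five at 15": [5] * 5 + [15] * 5,
}

# Mean and (population) standard deviation for each collection.
results = {name: (mean(lengths), pstdev(lengths))
           for name, lengths in frooble_sets.items()}

for name, (avg, sd) in results.items():
    print(f"{name}: mean = {avg} cm, SD = {sd} cm")
```

All three means come out at 10 cm, while the standard deviation grows as the lengths move further from the mean.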

Here is a non-technical explanation of the standard deviation (written for journalists, not mathematicians.)

If you’re talking telegram short:

It’s how sawed-off the shotgun is.

Suppose you’re throwing darts at a dartboard. They form some kind of pattern, centered on the target. The standard deviation can be thought of as a measure of how close, on average, they are to the target. (Making various assumptions here in order to pull a number out of my butt…). If you take the circle on the dartboard that encloses about 68% of the darts, that’s a measure of the standard deviation of the dart position - a smaller standard deviation means your throws are more accurate and you are a better darts player.

There is an error in my earlier post. The numbers I gave (2 and 32) are the variances of the sets of data. In each case, the standard deviation is the square root of the variance (about 1.41 and 5.66 respectively).
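For anyone who wants to check the correction, here is a small sketch using Python's standard library. It assumes the population formulas (divide by the number of values), since those reproduce the 2 and 32 quoted above; the sample versions (divide by one less) would give 2.5 and 40 instead.

```python
from statistics import pstdev, pvariance

tight = [8, 9, 10, 11, 12]
spread = [2, 6, 10, 14, 18]

# Variance: the mean of the squared deviations from the mean.
var_tight, var_spread = pvariance(tight), pvariance(spread)

# Standard deviation: the square root of the variance.
sd_tight, sd_spread = pstdev(tight), pstdev(spread)

print(f"tight:  variance = {var_tight}, SD = {sd_tight:.2f}")
print(f"spread: variance = {var_spread}, SD = {sd_spread:.2f}")
```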

Although in almost every case where a non-technical person has to deal with standard deviation it is used with reference to a normal (or nearly normal) sampled distribution (AKA ‘bell curve’), the term is not tied to that case. In a general sense, though, it always represents the ‘range’ of values from the mean (in mathematical terms, it’s the square root of the variance). There’s little to worry about as long as you don’t attempt to play with the numbers when you don’t understand the background. It’s just that some people* seem to think it’s bound up with the distribution; it’s not.

* This parallels the people who think RMS == multiply by 0.707.

Not what mc said

Just as a quick hijack…

When I was taking business statistics from a particularly terrible teacher (who didn’t speak English very well by the way), I frantically searched for an alternative source to teach myself.

I ended up coming across the Cartoon Guide to Statistics by Larry Gonick.

I won’t say that it’s the best book I’ve ever read… not even top 10, but it did a pretty good job of explaining everything. I think that it gives a decent overview of the subject. (although now that I’ve passed I cannot remember anything…oh well).

— Peter Wiggen

If someone tells you the average (mean) temperature in Chicago is 58 degrees, well then you figure any time of year a sweater and light jacket should be enough cover. But if they also say the standard deviation is 12 degrees, then all we can say is the temperature is almost always between 34 and 82 degrees (within 2 standard deviations of the mean). So, better have a look outside before you decide on what clothes to wear in Chicago :slight_smile:
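The arithmetic behind the 34 to 82 range is just the mean plus or minus two standard deviations. Here is a rough sketch; the "almost always" part only holds if you assume the temperatures are roughly normally distributed, which is an assumption and not something real Chicago weather is guaranteed to follow.

```python
from statistics import NormalDist

mean_temp = 58  # degrees, from the example above
sd_temp = 12    # degrees, from the example above

# Two standard deviations either side of the mean.
low = mean_temp - 2 * sd_temp
high = mean_temp + 2 * sd_temp

# Assumption: temperatures follow a normal distribution.
temps = NormalDist(mu=mean_temp, sigma=sd_temp)
share_in_range = temps.cdf(high) - temps.cdf(low)

print(f"Range: {low} to {high} degrees")
print(f"Share of days in that range: {share_in_range:.1%}")
```

Under the normal assumption, roughly 95% of days fall inside that range, so about one day in twenty would still surprise you.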

The standard deviation qualifies your mean. If your standard deviation is large, then your mean isn’t a very reliable guide to any individual value. I believe that for a normal distribution, about 95% of the values fall within 2 standard deviations of the mean.

The normal distribution is bell shaped. The larger the standard deviation, the flatter the bell.
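One way to see the "flatter bell" point is to look at how tall the curve is at its peak for a few different standard deviations. This is only an illustrative sketch; the mean of 0 and the particular SD values are arbitrary choices of mine.

```python
from statistics import NormalDist

# Height of the bell curve at its peak (the mean) for several SDs.
peaks = {sd: NormalDist(mu=0, sigma=sd).pdf(0) for sd in (1, 2, 4)}

for sd, peak in peaks.items():
    print(f"SD = {sd}: peak height = {peak:.3f}")
```

Each time the standard deviation doubles, the peak height halves, because the same total area has to stretch over a wider range.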

I like that. =)

My first stats prof. had us memorize this definition.

Standard deviation is the average number of points that scores in a distribution deviate (or vary) from their mean.

Thanks, Dr. Kersher @SHSU I still got it. J :wink:

Thanks to all that responded, you’ve been very helpful.

Here’s another question then: I’ve got a software program that looks at the number of cars stolen (for example) in the different zip codes of my city. I can arrange all the zip codes side by side and do a standard deviation calculation on them. Of the 12 zip codes, 11 fall within the standard dev lines (i.e. above and below the mean) but 1 extends well above the standard dev line. Obviously, this means that more car thefts are occurring in that one zip code, but does it tell me anything else?

(if that’s not very clear, please let me know and I’ll try again)

thanks!
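To make the setup concrete, here is a rough sketch of that kind of calculation. The zip labels and theft counts are entirely made up, and I'm guessing that the "standard dev lines" are the mean plus and minus one (population) standard deviation; the actual software may do something different.

```python
from statistics import mean, pstdev

# Hypothetical theft counts for 12 made-up zip code labels (not real data).
thefts = {
    "Z01": 20, "Z02": 22, "Z03": 19, "Z04": 21, "Z05": 23, "Z06": 20,
    "Z07": 18, "Z08": 22, "Z09": 21, "Z10": 19, "Z11": 20, "Z12": 60,
}

counts = list(thefts.values())
avg = mean(counts)
sd = pstdev(counts)

# Assumed "standard dev lines": one SD above and below the mean.
lower, upper = avg - sd, avg + sd

# Zip codes that fall outside the band, and how many SDs each one is from the mean.
outside = [z for z, n in thefts.items() if n < lower or n > upper]
z_scores = {z: (n - avg) / sd for z, n in thefts.items()}

print(f"mean = {avg:.2f}, SD = {sd:.2f}, band = {lower:.2f} to {upper:.2f}")
for z in outside:
    print(f"{z}: {thefts[z]} thefts, {z_scores[z]:.1f} SDs from the mean")
```

Note that the one high zip code also inflates the mean and the standard deviation themselves, which is part of why a single extreme value can make the other eleven look tightly bunched.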

If that same area is consistently at a higher level then there are several assumptions you could make. The specifics would depend on several factors and variables that could be analyzed for causation. Variables such as local economics, business demographics (perhaps there are just more cars in the area), available law enforcement, etc.

Otherwise it might be considered an outlier (random aberration) and would generally be excluded in any general stats. pertaining to the population as a whole. A lot of stats. only include the middle range of the distribution when analyzing for correlations.

Gotta be careful when you take a correlational study and attempt to show causation. Correlation doesn’t prove causation. As I said, often the interquartile ranges are more reliable for showing a correlation because they exclude any outliers, both high and low.
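Here is a small sketch of why the interquartile range is less sensitive to an outlier than the standard deviation. The theft counts are hypothetical, and the quartiles use the default method of Python's `statistics.quantiles`; other software may compute quartiles slightly differently.

```python
from statistics import pstdev, quantiles

# Hypothetical theft counts: eleven ordinary zip codes plus one extreme one.
ordinary = [20, 22, 19, 21, 23, 20, 18, 22, 21, 19, 20]
with_outlier = ordinary + [60]

def iqr(values):
    """Interquartile range: distance between the 1st and 3rd quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1

iqr_without, iqr_with = iqr(ordinary), iqr(with_outlier)
sd_without, sd_with = pstdev(ordinary), pstdev(with_outlier)

print(f"IQR: {iqr_without:.2f} without outlier, {iqr_with:.2f} with")
print(f"SD:  {sd_without:.2f} without outlier, {sd_with:.2f} with")
```

With these made-up numbers the interquartile range barely moves when the extreme value is added, while the standard deviation jumps several-fold.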

As in the example you gave. You could’ve had a bunch of kids go joyriding or on a road trip and they did it with “borrowed” cars. Maybe some crooked used car salesman is pulling an insurance fraud. Who knows? You’ve got to have some data to go with the figures. Unless you can verify the reason for the aberration it’s usually not worth much.

I hope I’m not helping someone with their homework. :rolleyes:

Thanks, t-keela,

No worries though; I’m well out of school. You’re helping me with my job :wink:

Not a problem then. I hope it has helped. If you want to get detailed you’ll need some data. I’m a bit rusty, but stats. was one of my best/favorite subjects. It’s really pretty cool once you get the hang of it.

later~