How do you find the average standard deviation amongst a group of standard deviations

Cagey_Drifter · April 2, 2007, 2:01am

I have a list of items, each associated with a mean and a standard deviation. I want to calculate the average standard deviation. How do I do this? Is it as simple as taking the average of the standard deviations given?

Here are the numbers, if it helps:

Item 1: Mean 115, StDev 282
Item 2: Mean 332, StDev 266
Item 3: Mean 1006, StDev 605
Item 4: Mean 31, StDev 24
Item 5: Mean 449, StDev 556

Thanks for any help. This is driving me crazy.

Shagnasty · April 2, 2007, 2:12am

You can’t do that without some more info. The sample size of each is the most important missing piece but the methodology seems flawed and odd in any case. Is there any reason you want to do that? I basically got bitch slapped once by a statistics guru by trying to average aggregate data and the problem has stuck with me ever since. There are special meta-analysis statistics to do things like this but they fall under a special category and are rather advanced.

Cagey_Drifter · April 2, 2007, 2:22am

The reason I want to do this is because I’m doing some work on inventory management, risk pooling, and delayed differentiation. The idea is that I’m trying to release a product, “item A,” that is supposed to single-handedly replace items 1-5, and I’m trying to work out how my overall standard deviation will change with this transition. But it seems from your response that I’m not even thinking about this concept correctly!

Cagey_Drifter · April 2, 2007, 2:37am

As I thought more about what you said, Shagnasty, I started to understand how my concept really didn’t make much sense. Would the proper way to think about this be to say that the aggregate standard deviation I had from items 1-5 was simply the sum of their standard deviations?

Crescend · April 2, 2007, 2:41am

You have N random variables, X1 through XN. For each of these, you have a mean (mu(Xi)), and a standard deviation (stdev(Xi)). You want to find the mean of the standard deviations. A way of thinking about this problem would be that you want the formula of an estimator of the mean of a random variable, whose distribution is the standard deviation distribution.

Assuming that the standard deviations are of normally-distributed random variables, you’d want the mean of this distribution. At least, if my half-remembered statistics are correct.

Mikemike2 · April 2, 2007, 3:13am

It has been forever since I took statistics, but to me standard deviations as large as the ones you are getting would just mean that the results are random (not predictable).

David_Simmons · April 2, 2007, 3:27am

I agree. And I don’t think you can combine standard deviations usefully unless you have reason to believe that the instrument error and scatter are the same in all of the experiments. That doesn’t appear to be the case in your data.

It’s a little hard to believe that a mean of 449 and a std dev of 556 represents a Gaussian distribution.

Crescend · April 2, 2007, 3:34am

Maybe it’s just not standardized?

jovan · April 2, 2007, 3:34am

Not necessarily. Those numbers are meaningless if we don’t know what they represent. For instance, I use the standard deviation often for pattern analysis/recognition:

Item 1: Mean 115, StDev 282

This would tell me I can realistically classify input data between -449 and 679 (the mean ± 2 standard deviations) as a positive match. If my test data is uniformly distributed between, say, -10,000,000 and 10,000,000, that’s actually a very narrow filter.
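The bounds quoted above match the mean plus or minus two standard deviations, so a quick check of the arithmetic might look like the sketch below. The variable names are made up for illustration, and the uniform range is just the hypothetical one from the post.

```python
# Item 1 from the original question.
mean, sd = 115, 282

# Acceptance window: mean +/- 2 standard deviations.
low, high = mean - 2 * sd, mean + 2 * sd

# Hypothetical test data range taken from the post above.
range_low, range_high = -10_000_000, 10_000_000

# Share of the uniform range that falls inside the window.
fraction = (high - low) / (range_high - range_low)

print(low, high, fraction)
```

Under those assumptions the window covers only a few thousandths of a percent of the range, which is the sense in which the filter is narrow.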

pulykamell · April 2, 2007, 3:38am

Is there a possibility those numbers are variance and not standard deviation? Because, like you say, those are some wacky numbers. Mean 115, standard deviation 282? It’s been years since I’ve taken stats, too, but if I remember right, that data would basically look like a lot of low numbers, and some very high numbers to push the mean up to 115 and get a standard deviation that high. To give you an idea, the data set: {0, 0, 0, 0, 600} has a mean of 120, standard deviation of 268 (population standard deviation of 240).

Those seem like rather unusual numbers to me.
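For anyone who wants to check the example data set above, here is a small sketch using Python's standard `statistics` module. It only reproduces the three figures quoted in the post.

```python
import statistics

data = [0, 0, 0, 0, 600]

mean = statistics.mean(data)
sample_sd = statistics.stdev(data)       # divides by n - 1
population_sd = statistics.pstdev(data)  # divides by n

print(mean, round(sample_sd), round(population_sd))
```

The difference between the two standard deviations is just the divisor (n - 1 versus n), which matters a lot with only five values.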

David_Simmons · April 2, 2007, 3:39am

If you really want to do it, though, here’s how.*

Multiply each sample average by the number of trials in the sample. Add them together and divide by the total number of trials in all the samples combined. This is the aggregate average. Using the aggregate average and all of the individual readings, measurements, or whatever, compute an overall std dev in the usual way.

All of the samples must be of the same thing, such as men’s hat sizes in various cities, the samples must be random and the measurements taken in the same way for this to have any meaning.

*Practical Statistics, Russell Langley, Drake Publishers, Inc., New York, NY
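The procedure described above needs the individual readings, but the same overall std dev can also be recovered from each sample's size, mean, and std dev alone. The sketch below is one way to do that, not the method from the cited book: `combine_groups` is a hypothetical helper, and the two tiny samples are invented purely to check it against the direct calculation, since the thread gives no sample sizes.

```python
import math
import statistics

def combine_groups(groups):
    """Combine (n, mean, sample_sd) summaries into one mean and sample sd."""
    total_n = sum(n for n, _, _ in groups)
    # Aggregate average: each sample mean weighted by its number of trials.
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    # Within-sample scatter plus the scatter of sample means around the aggregate.
    sum_sq = sum(
        (n - 1) * sd ** 2 + n * (m - grand_mean) ** 2
        for n, m, sd in groups
    )
    return grand_mean, math.sqrt(sum_sq / (total_n - 1))

# Hypothetical raw readings, used only to verify the helper.
sample_a = [1, 2, 3]
sample_b = [10, 20]
summaries = [
    (len(s), statistics.mean(s), statistics.stdev(s))
    for s in (sample_a, sample_b)
]

grand_mean, combined_sd = combine_groups(summaries)
direct_sd = statistics.stdev(sample_a + sample_b)  # "the usual way" on all readings

print(grand_mean, combined_sd, direct_sd)
```

The two std devs agree, which is why the sample sizes are the key missing piece for the original five items.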

pulykamell · April 2, 2007, 3:39am

Good point. I had not thought of negative values.

nivlac · April 2, 2007, 6:32am

I think the other replies are ignoring the supply chain problem you are tackling. The answer to your question is fairly simple. Assuming the demands for the five products are independent, the std dev of demand for the new product is the square root of the sum of the variances (squares of your std devs) of the five replaced products. Now look at the resulting coefficient of variation of demand for item A. You can easily determine the impact on inventory safety stock.
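Applied to the five items from the original question, that calculation could be sketched as follows. It assumes the five demands are independent, and it assumes the mean demand for item A is simply the sum of the five means; the variable names are illustrative.

```python
import math

# Means and std devs for items 1-5 from the original question.
means = [115, 332, 1006, 31, 449]
sds = [282, 266, 605, 24, 556]

# Assumption: item A's mean demand is the sum of the five means.
pooled_mean = sum(means)

# Independence assumption: variances add, so take the root of the summed squares.
pooled_sd = math.sqrt(sum(s ** 2 for s in sds))

# Coefficient of variation of demand for item A.
cv = pooled_sd / pooled_mean

# For comparison: the plain sum of the five std devs.
sum_of_sds = sum(sds)

print(pooled_mean, round(pooled_sd, 1), round(cv, 2), sum_of_sds)
```

The pooled std dev comes out well below the plain sum of the five std devs, which is the risk pooling effect in a nutshell.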

Trunk · April 2, 2007, 12:43pm

Assume X has mean mx and variance vx, and Y has mean my and variance vy.

If X and Y are INDEPENDENT and NORMAL, then the random variable Z=X+Y is normal and has mean mx+my and variance vx+vy.
X/N has mean mx/N and variance vx/N^2.

If you want to find the mean and variance of

(X1+X2+X3+X4+X5)/5, you can use those two bits of information.

But, you didn’t indicate that these things are normal, or that your parameters come from the same sample size, or if they’re independent samples.

Your problem might be more complicated than you’ve let on here.
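As a rough illustration of combining those two rules, the sketch below computes the mean and variance of (X1+X2+X3+X4+X5)/5 from the numbers in the original question. It assumes the five quantities are independent, which the thread has not established.

```python
import math

# Means and std devs for items 1-5 from the original question.
means = [115, 332, 1006, 31, 449]
sds = [282, 266, 605, 24, 556]
n = len(means)

# Rule 1: means and variances of independent variables add.
sum_mean = sum(means)
sum_var = sum(s ** 2 for s in sds)

# Rule 2: dividing by N divides the mean by N and the variance by N^2.
avg_mean = sum_mean / n
avg_var = sum_var / n ** 2
avg_sd = math.sqrt(avg_var)

print(avg_mean, avg_var, round(avg_sd, 2))
```

Note that this describes the average of the five quantities, which is a different thing from the average of their five std devs.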

ultrafilter · April 2, 2007, 3:07pm

I asked a similar question a while back, and got a good formula for the total variance (i.e., the variance of all items taken as one group). As has been noted, the meaningfulness of such numbers depends on methods of collection among other things.

Chronos · April 2, 2007, 5:03pm

Above and beyond all else, remember that, though you can define a standard deviation for most distributions, it’s really only useful for a Gaussian distribution. Now, many distributions one actually encounters can be reasonably approximated by a Gaussian, but not all, and sometimes, that can get you into trouble. Unfortunately, from just the data you gave us, we have no way of telling how Gaussian your distributions actually are.

Chronos · April 2, 2007, 5:37pm

Come to think of it, another word of warning: Even if these things are all (approximately) Gaussian, the addition in quadrature that nivlac talks about assumes that they’re all independent. But if these distributions are demands for products which are sufficiently similar that they can all be replaced by the same thing, then it’s likely that the demands will be highly correlated. That is to say, when demand for product A is high, it’s likely that demand for products B, C, D, and E will also all be high. In this case, you want to just add the standard deviations together straight.
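The two extremes can be seen as special cases of one formula. The sketch below uses a hypothetical helper, `sd_of_sum`, and makes the simplifying assumption that every pair of products shares the same correlation `rho`; real demand data would need the actual pairwise correlations.

```python
import math

def sd_of_sum(sds, rho):
    """Std dev of a sum when every pair has the same correlation rho."""
    total_var = sum(
        (1.0 if i == j else rho) * si * sj
        for i, si in enumerate(sds)
        for j, sj in enumerate(sds)
    )
    return math.sqrt(total_var)

# Std devs for items 1-5 from the original question.
sds = [282, 266, 605, 24, 556]

independent = sd_of_sum(sds, 0.0)       # addition in quadrature
fully_correlated = sd_of_sum(sds, 1.0)  # std devs add straight
in_between = sd_of_sum(sds, 0.5)        # illustrative middle value

print(round(independent, 1), round(fully_correlated, 1), round(in_between, 1))
```

With `rho` at 0 this gives the quadrature result, with `rho` at 1 it gives the plain sum of the std devs, and anything in between lands between the two.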
