Yet another statistics question: combining tests of significance

brossa · September 28, 2008, 8:48pm

I’ve been doing some reading about meta-analysis techniques that combine tests of significance from multiple studies, all of which are testing the same null hypothesis. Fisher’s combined probability test is perhaps the first such technique, but it is asymmetric: it is more sensitive to studies with low p-values than to studies with high p-values. Stouffer’s Z-transform, on the other hand, is symmetrically sensitive to studies with low and high p-values. However, it gives every study equal weight, as though they all had the same power.
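To make the asymmetry concrete, here is a minimal sketch of both methods using only the Python standard library. It assumes independent studies and one-sided p-values strictly between 0 and 1; the function names are made up for illustration and do not come from any particular package.

```python
import math
from statistics import NormalDist


def fisher_combined(p_values):
    """Fisher's method: X = -2 * sum(ln p_i) is chi-square with 2k df under H0."""
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2
    # For even df = 2k the chi-square upper tail has a closed form:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


def stouffer_combined(p_values):
    """Stouffer's method: Z = sum(z_i) / sqrt(k), with z_i = -Phi^{-1}(p_i)."""
    nd = NormalDist()
    z = -sum(nd.inv_cdf(p) for p in p_values) / math.sqrt(len(p_values))
    return nd.cdf(-z)  # one-sided combined p-value


# Two studies that point in opposite directions equally strongly.
demo = [0.001, 0.999]
print("Fisher:  ", fisher_combined(demo))
print("Stouffer:", stouffer_combined(demo))
```

With that pair of p-values, the two z-scores cancel and Stouffer's method returns essentially 0.5, while Fisher's method still returns roughly 0.008, because the very small p-value dominates the sum of logs.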

The weighted Z-method allows tests to be given a weighting factor so that they have an unequal influence on the final combined p-value. In reading about weighting, it seems that there is no consensus as to what the weighting factor should be. To quote a few references: “to be decided upon by the investigator”, “based on elegance, internal validity, and ecological validity [of the various studies]”, “arbitrary”, “weight each by its degrees of freedom”, “ideally…weighted proportional to the inverse of its error variance”, and “calculated from the sample sizes of each study”.
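For reference, the weighted Z statistic is Z_w = sum(w_i * z_i) / sqrt(sum(w_i^2)). The sketch below (standard library only, one-sided p-values, independent studies assumed) shows how the combined p-value moves under different weighting choices. The p-values and sample sizes are invented purely for illustration, and `weighted_z` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def weighted_z(p_values, weights):
    """Weighted Z-method: Z_w = sum(w_i * z_i) / sqrt(sum(w_i ** 2))."""
    nd = NormalDist()
    z_scores = [-nd.inv_cdf(p) for p in p_values]  # small p -> large positive z
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return nd.cdf(-numerator / denominator)  # one-sided combined p-value


# Hypothetical inputs, not taken from any real studies.
demo_p = [0.01, 0.20, 0.40]
demo_n = [20, 200, 50]

schemes = {
    "equal weights": [1.0] * len(demo_p),
    "w = n": [float(n) for n in demo_n],
    "w = sqrt(n)": [math.sqrt(n) for n in demo_n],
}
for name, w in schemes.items():
    print(f"{name:14s} combined p = {weighted_z(demo_p, w):.4f}")
```

Two properties are worth noting: equal weights reduce this to the unweighted Stouffer test, and only the relative sizes of the weights matter, since multiplying every weight by the same constant leaves Z_w unchanged.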

So, are there any guiding principles that would allow one to make an informed decision about which weighting factor to choose for a particular meta-analysis? Are some weighting schemes more ‘conservative’ than others? If I want to weight studies based on n/s^2 instead of n, for example, how might I justify it, or vice versa? Does one perform an unweighted test of combined significance first, and then decide whether to involve weighting?

Chronos · September 28, 2008, 10:00pm

Weighting by degrees of freedom is the one I’ve seen done most often in physics. The idea is that, if you fit a function to a data set, and then generalize the function to add degrees of freedom, you’ll always improve the fit. For instance, you’ll always get a better fit with a higher-order polynomial than with a lower-order one. But the features in the data which make the higher-order fit better are likely to just be noise, so you don’t want to use the higher-order fit unless it’s really justified. A higher-order polynomial has more degrees of freedom, so by normalizing to the number of degrees of freedom, you’re penalizing the higher-order fit and won’t choose it unless the improvement is large enough to earn it.
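One common form of this normalization is the reduced chi-square: chi-square divided by (number of data points minus number of fitted parameters). Whether that is exactly the statistic meant above is an assumption. The sketch below fits polynomials of increasing order to synthetic data that is truly linear plus Gaussian noise. The data, the noise level, and the helper names are all invented for illustration, and the fit uses plain normal equations, which is only reasonable for low orders.

```python
import random


def polyfit(xs, ys, order):
    """Least-squares polynomial fit via the normal equations (low orders only)."""
    m = order + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    aug = [[sum(x ** (i + j) for x in xs) for j in range(m)]
           + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]  # lowest power first


def chi_square(xs, ys, coeffs, sigma):
    """Sum of squared residuals in units of the measurement error sigma."""
    return sum(((y - sum(c * x ** k for k, c in enumerate(coeffs))) / sigma) ** 2
               for x, y in zip(xs, ys))


random.seed(0)
sigma = 0.5                                   # assumed known measurement error
xs = [i / 10 for i in range(-10, 11)]         # 21 points on [-1, 1]
ys = [1.0 + 2.0 * x + random.gauss(0, sigma) for x in xs]

results = []
for order in range(1, 6):
    coeffs = polyfit(xs, ys, order)
    chi2 = chi_square(xs, ys, coeffs, sigma)
    dof = len(xs) - (order + 1)               # data points minus fitted parameters
    results.append((order, chi2, chi2 / dof))
    print(f"order {order}: chi2 = {chi2:.2f}, chi2/dof = {chi2 / dof:.2f}")
```

The raw chi-square can only go down (or stay the same) as the order increases, because each lower-order polynomial is a special case of the next one. The per-degree-of-freedom value does not have to go down, which is what lets it act as a penalty on the extra parameters.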

brossa · September 29, 2008, 4:04pm

Thanks, Chronos.
