How valid is this statistical study?

“Men And Women Use Different Scales To Weigh Moral Dilemmas” (Shots - Health News, NPR)

Study abstract (full paper behind paywall): http://psp.sagepub.com/content/early/2015/03/12/0146167215575731.abstract

I know almost nothing about statistics. How significant is this apparent difference between men and women?

Slow server ate my post; apologies if this becomes a double post.

I only read the abstract, and I don’t know much about meta-analyses. But the abstract does not mention significance, which is denoted by “p.” Properly speaking, you cannot say “how significant” something is, because a result either is or isn’t significant at the chosen threshold; a smaller p only makes you more confident that you made the right decision in rejecting (or failing to reject) the null hypothesis.

The “d” is presumably Cohen’s d, which is a measure of effect size. Significance tells you whether the means are likely different (or not). Effect size tells you how large the difference is. Generally speaking, though, reporting d means that they also found a significant difference. By Cohen’s conventions, d ~= 0.2 is “small,” d ~= 0.5 is “medium,” and d ~= 0.8 is “large.” Because it is a meta-analysis, and based upon what I expect from this type of research (studies with few subjects and therefore low power), a medium effect size is substantial.

This says nothing about the validity of the analysis, or whether this is a good study, or whether the results actually mean anything that we care about. I gather that they are saying males and females differ in their affective responses rather than their cognitive ones. I’m not sure I’d use d to make that conclusion personally, but that may be better justified in the body of the paper.

The conclusion is consistent with that of the book “In a Different Voice” by Carol Gilligan, in the early or mid 80’s, IIRC.

The effect size d is simply the difference between two group means divided by the (pooled) standard deviation. Wiki has a handy figure showing what d = 0.5, 1, 2, or 3 looks like. Note that at d = 0.5 there is a lot of overlap between the two distributions.
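To make that definition concrete, here is a minimal Python sketch of the calculation. The function name cohens_d and the two score lists are made up for illustration and have nothing to do with the study's data; it also assumes the usual pooled standard deviation built from the sample variances.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Sample variances (n - 1 in the denominator).
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    # Pooled SD: variances weighted by each group's degrees of freedom.
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Invented scores, purely illustrative (not from the paper).
men = [5.0, 6.0, 7.0, 8.0, 9.0]
women = [4.0, 5.0, 6.0, 7.0, 8.0]
print(round(cohens_d(men, women), 2))
```

Here the means differ by 1 point and both groups have the same spread, so d is just 1 divided by that common standard deviation.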

Going back to the study (and not knowing a damn about psychology), it seems that there may be a real difference between the way men and women make moral judgements. However, there is still a lot of overlap: assuming both groups are normally distributed with the same spread, with some “utilitarian” score on the x axis and the men’s mean 0.5 standard deviations above the women’s, about 30% of women are at least as “utilitarian” as the average man.
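For anyone who wants to check the 30% figure, here is a rough sketch using Python's standard library. It assumes two normal distributions with equal standard deviations whose means are d standard deviations apart; the function name fraction_above_other_mean is just an illustrative label, and d = 0.5 is the “medium” benchmark rather than a number taken from the paper.

```python
from statistics import NormalDist


def fraction_above_other_mean(d):
    """Share of the lower-scoring group at or above the higher group's mean.

    In units of the common SD, the lower group is N(0, 1) and the higher
    group's mean sits at d, so the share is the upper tail beyond d.
    """
    return 1.0 - NormalDist().cdf(d)


for d in (0.5, 1.0, 2.0, 3.0):
    lower = NormalDist(mu=0.0, sigma=1.0)
    higher = NormalDist(mu=d, sigma=1.0)
    # overlap() gives the shared area under the two density curves (0 to 1).
    print(d, round(fraction_above_other_mean(d), 3), round(lower.overlap(higher), 3))
```

At d = 0.5 the first number comes out at about 0.31, which matches the rough 30% quoted above, and the shared area under the two curves is about 0.80.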

(Man I love working in a field where I can say “nope too ambiguous! don’t care!” to d < 2)