Need quick statistics help on averages

Cagey_Drifter · February 5, 2013, 8:30pm

Let’s say we’re talking about comparing average test scores between boys (group A) and girls (group B), and the overall group (Group C, where Group C = Group A + Group B), and we’re looking over the period of two tests.

Without knowing how many people are in each group, is it possible for the average scores of both Group A and Group B to go down since the previous test, while the overall average of Group C goes up at the same time?

Cagey_Drifter · February 5, 2013, 8:50pm

Also, I should mention: the number of people from Test 1 to Test 2 may change as well.

RedSwinglineOne · February 5, 2013, 9:10pm

Yes, see Simpson’s paradox.

Doctor_Jackson · February 5, 2013, 9:17pm

No, there is no way for the average of the entire group to rise when the averages of both subgroups decline. One or both of the subgroup averages must increase in order for the main group average to increase.

Bob_X · February 5, 2013, 9:22pm

Ask yourself whether the totals could go down for Group A and for Group B (if the numbers in A and B don’t change, the totals must go up and down when the averages do) and yet the total for Group C goes up (but… C = A + B), and you see that this is impossible.

Doctor_Jackson · February 5, 2013, 9:24pm

I don’t think Simpson’s paradox covers the question in the OP, where there are 2 distinct groups making up one whole. Weird things could happen within the data sets due to the paradox, but I can’t think of any way that both can decline while the main set average increases.

I am, however, prepared to be proven worng!

Bob_X · February 5, 2013, 9:25pm

Ah, well then it can be thrown off: if the two groups are of widely different average scores, and the group which does worse has much larger representation the second time around.

Evil_Economist · February 5, 2013, 9:35pm

I do not understand how this sequence of answers could happen.

Andy_L · February 5, 2013, 9:42pm

Let’s start with 3 girls and 5 boys: all the girls have test scores of 90, so the average score for the girls is 90, and all the boys have test scores of 70, so the average for the boys is 80. The average score for boys and girls is 77.5.

For test 2, there are 20 girls and 2 boys. The girls each get a score of 85 (so the girls’ average score is 85 - less than the previous test result of 90), and the boys each get a 65 - so the average boys’ score goes down as well (from 70 to 65). The group average, though, goes up from 77.5 to 83.2.
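If anyone wants to check the arithmetic, here is a minimal Python sketch. The helper `mean` and the variable names are only illustrative; the scores are the ones from the example, taking each boy's Test 1 score as 70.

```python
def mean(scores):
    """Plain arithmetic mean of a list of scores."""
    return sum(scores) / len(scores)

# Test 1: 3 girls at 90, 5 boys at 70
test1_girls = [90] * 3
test1_boys = [70] * 5

# Test 2: 20 girls at 85, 2 boys at 65
test2_girls = [85] * 20
test2_boys = [65] * 2

overall1 = mean(test1_girls + test1_boys)
overall2 = mean(test2_girls + test2_boys)

print(f"girls:   {mean(test1_girls):.1f} -> {mean(test2_girls):.1f}")
print(f"boys:    {mean(test1_boys):.1f} -> {mean(test2_boys):.1f}")
print(f"overall: {overall1:.1f} -> {overall2:.1f}")
```

Both group averages fall by 5 points, yet the overall average rises, because the higher-scoring group went from 3 of 8 test takers to 20 of 22.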

mozchron · February 5, 2013, 9:51pm

Yes, it can happen - AndyL gave a very nice example (it has a mistake in the first boys’ group average, but that’s obviously a typo - the result is correct). I believe this would fall under Simpson’s paradox.

An average, by itself, tells you nothing. You need to know how variable the data are around the average, and the sample size. As a scientist, I deal with this sort of thing all the time during data analysis.

leahcim · February 5, 2013, 9:51pm

I think the confusion comes from the fact that the populations have to be different sizes for this to work. When I hear “average test scores”, I implicitly think about a class where the boys and girls doing the second test are pretty much the same as the boys and girls doing the first test.

Snarky_Kong · February 5, 2013, 9:51pm

Whoops.

Andy_L · February 5, 2013, 9:52pm

Oops. Thanks for noting the typo - I modified the example midstream, but missed a spot.

mozchron · February 5, 2013, 9:55pm

???

Doctor_Jackson · February 5, 2013, 9:58pm

Yep, my bad. I missed post #2.

leahcim · February 5, 2013, 10:02pm

Sure it is. What specifically do you take exception to?

It helps to consider a “test” that girls consistently, deterministically, do better at, and you are certain that girls will get 70%, but boys will get 50% every single time. Then the overall average can be as high as 70% (for an all-girl class) or as low as 50% (for an all-boy class). Increase the proportion of girls, and even if you don’t do anything else, the class average goes up solely due to the change in class composition.

Simpson’s paradox simply happens when a change due to the composition of the class overwhelms a change due to an actual change in the results of the test.
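To make the composition effect concrete, here is a rough Python sketch of the class average as a weighted mix of the two group averages. The function `class_average` and the "both groups drop 2 points" second test are made-up illustrations, not numbers from anyone's actual data.

```python
def class_average(girls_share, girls_avg, boys_avg):
    """Overall average as a weighted mix of the two group averages."""
    return girls_share * girls_avg + (1 - girls_share) * boys_avg

# Composition alone moves the class average anywhere between 50 and 70.
for share in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"girls {share:.0%}: class average {class_average(share, 70, 50):.1f}")

# Hypothetical second test: both groups score 2 points lower,
# but the share of girls rises from 20% to 80%.
before = class_average(0.2, 70, 50)
after = class_average(0.8, 68, 48)
print(f"before {before:.1f}, after {after:.1f}")
```

The shift in class composition is worth far more than the 2-point drop inside each group, so the overall number still rises.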

Giles · February 5, 2013, 10:13pm

Here’s a simple example. Population size of 3 stays the same, but between test 1 and test 2, boy 2 drops out and girl 2 is added:

Test 1:
Boy 1 = 50; Boy 2 = 50; Girl 1 = 20
Average for class is 40

Test 2:
Boy 1 = 55; Girl 1 = 25; Girl 2 = 25
Average for class is 35
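The same numbers as a small Python sketch, in case it helps to see the group averages next to the class average. The `(group, score)` layout and the `average` helper are just one assumed way to set it up.

```python
# (group, score) records taken from the example above
test1 = [("boy", 50), ("boy", 50), ("girl", 20)]
test2 = [("boy", 55), ("girl", 25), ("girl", 25)]

def average(records, group=None):
    """Mean score for one group, or for the whole class when group is None."""
    scores = [score for g, score in records if group is None or g == group]
    return sum(scores) / len(scores)

for label in ("boy", "girl", None):
    name = label or "class"
    print(f"{name}: {average(test1, label):.1f} -> {average(test2, label):.1f}")
```

Here the direction is reversed from the original question: both group averages go up by 5 while the class average goes down by 5.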

Cagey_Drifter · February 5, 2013, 10:47pm

Thanks, guys. This is really helpful. I suspected that this could work, but couldn’t quite think it through.

Andy_L · February 5, 2013, 11:55pm

Nice example - just about the minimum case for demonstrating the issue.

Snarky_Kong · February 6, 2013, 12:36pm

Nothing, which is why I edited my post. I misread it and thought he was taking an average of averages.
