The Straight Dope

  #1  
Old 02-05-2013, 03:30 PM
Cagey Drifter Cagey Drifter is offline
Guest
 
Join Date: Apr 2005
Need quick statistics help on averages

Let's say we're talking about comparing average test scores between boys (group A) and girls (group B), and the overall group (Group C, where Group C = Group A + Group B), and we're looking over the period of two tests.

Without knowing how many people are in each group, is it possible for the average scores of both Group A and Group B to go down since the previous test, while the overall average of Group C goes up at the same time?
  #2  
Old 02-05-2013, 03:50 PM
Cagey Drifter Cagey Drifter is offline
Guest
 
Join Date: Apr 2005
Also, I should mention: the number of people from Test 1 to Test 2 may change as well.
  #3  
Old 02-05-2013, 04:10 PM
RedSwinglineOne RedSwinglineOne is offline
Guest
 
Join Date: Jan 2007
Yes, see Simpson's paradox.
  #4  
Old 02-05-2013, 04:17 PM
Doctor Jackson Doctor Jackson is offline
Guest
 
Join Date: Mar 1999
No, there is no way for the average of the entire group to rise when the averages of both subgroups decline. One or both of the subgroup averages must increase in order for the main group average to increase.
  #5  
Old 02-05-2013, 04:22 PM
Bob X Bob X is offline
Guest
 
Join Date: Feb 2013
Ask yourself whether the totals could go down for Group A and for Group B (if the numbers in A and B don't change, the totals must go up and down when the averages do) and yet the total for Group C goes up (but... C = A + B), and you see that this is impossible.
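
To spell out the fixed-size case as a worked inequality (the symbols are notation introduced here, not from the thread: $n_A, n_B > 0$ are the unchanging group sizes, and $\bar a_t$, $\bar b_t$, $\bar c_t$ are the averages of Groups A, B and C on test $t$):

```latex
\[
\bar c_t = \frac{n_A \bar a_t + n_B \bar b_t}{n_A + n_B}, \qquad t = 1, 2
\]
\[
\bar c_2 - \bar c_1
  = \frac{n_A(\bar a_2 - \bar a_1) + n_B(\bar b_2 - \bar b_1)}{n_A + n_B} < 0
\quad \text{whenever } \bar a_2 < \bar a_1 \text{ and } \bar b_2 < \bar b_1 .
\]
```

Both terms in the numerator are negative, so the overall average has to fall. The argument relies on $n_A$ and $n_B$ being the same on both tests.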
  #6  
Old 02-05-2013, 04:24 PM
Doctor Jackson Doctor Jackson is offline
Guest
 
Join Date: Mar 1999
I don't think Simpson's paradox covers the question in the OP, where there are 2 distinct groups making up one whole. Weird things could happen within the data sets due to the paradox, but I can't think of any way that both can decline while the main set average increases.

I am, however, prepared to be proven worng!
  #7  
Old 02-05-2013, 04:25 PM
Bob X Bob X is offline
Guest
 
Join Date: Feb 2013
Quote:
Originally Posted by Cagey Drifter View Post
Also, I should mention: the number of people from Test 1 to Test 2 may change as well.
Ah, well then it can be thrown off: if the two groups are of widely different average scores, and the group which does worse has much larger representation the second time around.
  #8  
Old 02-05-2013, 04:35 PM
Evil Economist Evil Economist is offline
Guest
 
Join Date: Jan 2009
Quote:
Originally Posted by RedSwinglineOne View Post
Quote:
Originally Posted by Doctor Jackson View Post
No, there is no way for the average of the entire group to rise when the averages of both subgroups decline. One or both of the subgroup averages must increase in order for the main group average to increase.
Quote:
Originally Posted by Bob X View Post
Ask yourself whether the totals could go down for Group A and for Group B (if the numbers in A and B don't change, the totals must go up and down when the averages do) and yet the total for Group C goes up (but... C = A + B), and you see that this is impossible.
Quote:
Originally Posted by Doctor Jackson View Post
I don't think Simpson's paradox covers the question in the OP, where there are 2 distinct groups making up one whole. Weird things could happen within the data sets due to the paradox, but I can't think of any way that both can decline while the main set average increases.

I am, however, prepared to be proven worng!
I do not understand how this sequence of answers could happen.
  #9  
Old 02-05-2013, 04:42 PM
Andy L Andy L is offline
Member
 
Join Date: Oct 2000
Posts: 2,845
Quote:
Originally Posted by Cagey Drifter View Post
Also, I should mention: the number of people from Test 1 to Test 2 may change as well.
Let's start with 3 girls and 5 boys: all the girls have test scores of 90, so the average score for the girls is 90, and all the boys have test scores of 70, so the average for the boys is 80 [sic; should be 70]. The average score for boys and girls is 77.5.

For test 2, there are 20 girls and 2 boys. The girls each get a score of 85 (so the girls' average score is 85, less than the previous test result of 90), and the boys each get a 65, so the boys' average score goes down as well (from 70 to 65). The group average, though, goes up from 77.5 to 83.2.
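
A quick way to check the arithmetic above is a small Python sketch. The helper name `overall_average` is made up for illustration, and the boys' first-test average is taken as 70, since every boy scored 70:

```python
def overall_average(groups):
    """Weighted mean over (count, group_average) pairs."""
    total_score = sum(count * avg for count, avg in groups)
    total_count = sum(count for count, _ in groups)
    return total_score / total_count

# Test 1: 3 girls averaging 90, 5 boys averaging 70
test1 = overall_average([(3, 90), (5, 70)])
# Test 2: 20 girls averaging 85, 2 boys averaging 65
test2 = overall_average([(20, 85), (2, 65)])

print(test1)            # 77.5
print(round(test2, 1))  # 83.2
```

Both group averages fall by 5 points, yet the overall figure rises because girls go from 3 of 8 test-takers to 20 of 22.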
  #10  
Old 02-05-2013, 04:51 PM
mozchron mozchron is offline
Guest
 
Join Date: Feb 2001
Yes, it can happen - AndyL gave a very nice example (it has a mistake in the first boys' group average, but that's obviously a typo - the result is correct). I believe this would fall under Simpson's paradox.

An average, by itself, tells you nothing. You need to know how variable the data are around the average, and the sample size. As a scientist, I deal with this sort of thing all the time during data analysis.
  #11  
Old 02-05-2013, 04:51 PM
leahcim leahcim is online now
Member
 
Join Date: Dec 2010
Location: New York
Posts: 2,014
Quote:
Originally Posted by Andy L View Post
For test 2, there are 20 girls and 2 boys. The girls each get a score of 85 (so the girls' average score is 85, less than the previous test result of 90), and the boys each get a 65, so the boys' average score goes down as well (from 70 to 65). The group average, though, goes up from 77.5 to 83.2.
I think the confusion is due to the fact that it is necessary for the populations to be different sizes. When I hear "average test scores", I implicitly think about a class where the boys and girls doing the second test are pretty much the same as the boys and girls doing the first test.
  #12  
Old 02-05-2013, 04:51 PM
Snarky_Kong Snarky_Kong is offline
Guest
 
Join Date: Oct 2004
Quote:
Originally Posted by Andy L View Post
Let's start with 3 girls and 5 boys: all the girls have test scores of 90, so the average score for the girls is 90, and all the boys have test scores of 70, so the average for the boys is 80 [sic; should be 70]. The average score for boys and girls is 77.5.

For test 2, there are 20 girls and 2 boys. The girls each get a score of 85 (so the girls' average score is 85, less than the previous test result of 90), and the boys each get a 65, so the boys' average score goes down as well (from 70 to 65). The group average, though, goes up from 77.5 to 83.2.
Whoops.

Last edited by Snarky_Kong; 02-05-2013 at 04:55 PM..
  #13  
Old 02-05-2013, 04:52 PM
Andy L Andy L is offline
Member
 
Join Date: Oct 2000
Posts: 2,845
Quote:
Originally Posted by mozchron View Post
Yes, it can happen - AndyL gave a very nice example (it has a mistake in the first boys' group average, but that's obviously a typo - the result is correct). I believe this would fall under Simpson's paradox.

An average, by itself, tells you nothing. You need to know how variable the data are around the average, and the sample size. As a scientist, I deal with this sort of thing all the time during data analysis.
Oops. Thanks for noting the typo - I modified the example midstream, but missed a spot.
  #14  
Old 02-05-2013, 04:55 PM
mozchron mozchron is offline
Guest
 
Join Date: Feb 2001
Quote:
Originally Posted by Snarky_Kong View Post
That's not how averages work.
???
  #15  
Old 02-05-2013, 04:58 PM
Doctor Jackson Doctor Jackson is offline
Guest
 
Join Date: Mar 1999
Quote:
Originally Posted by leahcim View Post
I think the confusion is due to the fact that it is necessary for the populations to be different sizes. When I hear "average test scores", I implicitly think about a class where the boys and girls doing the second test are pretty much the same as the boys and girls doing the first test.
Yep, my bad. I missed post #2.
  #16  
Old 02-05-2013, 05:02 PM
leahcim leahcim is online now
Member
 
Join Date: Dec 2010
Location: New York
Posts: 2,014
Quote:
Originally Posted by Snarky_Kong View Post
That's not how averages work.
Sure it is. What specifically do you take exception to?

It helps to consider a "test" that girls consistently, deterministically, do better at, and you are certain that girls will get 70%, but boys will get 50% every single time. Then the overall average can be as high as 70% (for an all-girl class) or as low as 50% (for an all-boy class). Increase the proportion of girls, and even if you don't do anything else, the class average goes up solely due to the change in class composition.

Simpson's paradox simply happens when a change due to the composition of the class overwhelms a change due to an actual change in the results of the test.
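
The deterministic "test" described above can be sketched in a few lines of Python. The function name `class_average` and the sampled fractions are illustrative choices, not anything from the thread; the 70% and 50% scores are the ones given in the post:

```python
def class_average(girl_fraction, girl_score=70.0, boy_score=50.0):
    """Class average when scores are fixed and only the class mix changes."""
    return girl_fraction * girl_score + (1.0 - girl_fraction) * boy_score

# Sweep the share of girls from an all-boy class to an all-girl class.
for girl_fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(girl_fraction, class_average(girl_fraction))
```

Nobody's score changes anywhere in the sweep; the average moves from 50 to 70 purely because the weights do.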
  #17  
Old 02-05-2013, 05:13 PM
Giles Giles is offline
Charter Member
 
Join Date: Apr 2004
Location: Newcastle NSW
Posts: 12,021
Here's a simple example. Population size of 3 stays the same, but between test 1 and test 2, boy 2 drops out and girl 2 is added:

Test 1:
Boy 1 = 50; Boy 2 = 50; Girl 1 = 20
Average for class is 40

Test 2:
Boy 1 = 55; Girl 1 = 25; Girl 2 = 25
Average for class is 35
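
For anyone who wants to check this from the raw scores, here is a minimal Python sketch (the `summarize` helper and the dictionary layout are assumptions for illustration). Note that this example runs in the mirror-image direction from the OP's question: both group averages go up while the class average goes down.

```python
from statistics import mean

# Raw scores per group, as listed in the example above.
test1 = {"boys": [50, 50], "girls": [20]}
test2 = {"boys": [55], "girls": [25, 25]}

def summarize(scores_by_group):
    """Return per-group means plus the mean over everyone pooled together."""
    pooled = [s for scores in scores_by_group.values() for s in scores]
    summary = {group: mean(scores) for group, scores in scores_by_group.items()}
    summary["class"] = mean(pooled)
    return summary

s1 = summarize(test1)
s2 = summarize(test2)
print(s1)
print(s2)
```

The boys' mean goes from 50 to 55 and the girls' from 20 to 25, while the pooled class mean drops from 40 to 35.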
  #18  
Old 02-05-2013, 05:47 PM
Cagey Drifter Cagey Drifter is offline
Guest
 
Join Date: Apr 2005
Thanks, guys. This is really helpful. I suspected that this could work, but couldn't quite think it through.
  #19  
Old 02-05-2013, 06:55 PM
Andy L Andy L is offline
Member
 
Join Date: Oct 2000
Posts: 2,845
Quote:
Originally Posted by Giles View Post
Here's a simple example. Population size of 3 stays the same, but between test 1 and test 2, boy 2 drops out and girl 2 is added:

Test 1:
Boy 1 = 50; Boy 2 = 50; Girl 1 = 20
Average for class is 40

Test 2:
Boy 1 = 55; Girl 1 = 25; Girl 2 = 25
Average for class is 35
Nice example - just about the minimum case for demonstrating the issue.
  #20  
Old 02-06-2013, 07:36 AM
Snarky_Kong Snarky_Kong is offline
Guest
 
Join Date: Oct 2004
Quote:
Originally Posted by leahcim View Post
Sure it is. What specifically do you take exception to?

It helps to consider a "test" that girls consistently, deterministically, do better at, and you are certain that girls will get 70%, but boys will get 50% every single time. Then the overall average can be as high as 70% (for an all-girl class) or as low as 50% (for an all-boy class). Increase the proportion of girls, and even if you don't do anything else, the class average goes up solely due to the change in class composition.

Simpson's paradox simply happens when a change due to the composition of the class overwhelms a change due to an actual change in the results of the test.
Nothing, which is why I edited my post. I misread it and thought he was taking an average of averages.
  #21  
Old 02-06-2013, 11:12 AM
cjepson cjepson is offline
Guest
 
Join Date: Oct 2007
It seems that, in order for the OP's proposed scenario to happen, group membership must change so significantly from Time 1 to Time 2 that the concepts of "Group A" and "Group B" (as entities that persist across both time points) become rather tenuous. (Or else the differences across groups and across time have to be really tiny.)
  #22  
Old 02-06-2013, 04:30 PM
leahcim leahcim is online now
Member
 
Join Date: Dec 2010
Location: New York
Posts: 2,014
Quote:
Originally Posted by cjepson View Post
It seems that, in order for the OP's proposed scenario to happen, group membership must change so significantly from Time 1 to Time 2 that the concepts of "Group A" and "Group B" (as entities that persist across both time points) become rather tenuous.
That is why the groups are the more nebulous categories "boys" and "girls", where the groupings are still well-defined even if all the members change. The effect cannot manifest if the groups are fixed in advance for all tests.
All times are GMT -5.

Copyright 2013 Sun-Times Media, LLC.