02-21-2005, 10:42 PM
wevets
Join Date: Mar 2000
Location: hobgoblin of geographers
Posts: 4,207
Regression towards the mean is a little bit more conditional than that: it does not occur in all data sets.

For example, think about regression towards the mean in its original context: heights of fathers and sons. It describes the fact that in Galton's studies, very tall fathers tended to have sons shorter than themselves (and very short fathers tended to have sons taller than themselves). Now let's imagine that something changes over time in the population... for example, better nutrition allows people to grow taller. If that change also increases the variance of height in the next generation, it changes the slope on the scatterplot of fathers' height vs. sons' height (the slope is the correlation times the ratio of the sons' standard deviation to the fathers'), and regression towards the mean might no longer be observed.
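
If it helps to see the slope argument with numbers, here is a rough simulation sketch. Everything in it is made up for illustration: the function name `simulate_slope`, the correlation of 0.5, and the standard deviations are my assumptions, not Galton's figures. Heights are expressed as deviations from each generation's own mean, so a slope below 1 means sons of extreme fathers sit closer to their mean (in raw units) than their fathers did, and a slope above 1 means they do not.

```python
import random

def simulate_slope(son_sd, father_sd=2.5, r=0.5, n=20000, seed=0):
    """Return the least-squares slope of sons' height on fathers' height.

    Heights are deviations from each generation's mean. All parameter
    values are illustrative assumptions, not real data.
    """
    rng = random.Random(seed)
    fathers, sons = [], []
    for _ in range(n):
        z_father = rng.gauss(0, 1)
        # Son's z-score is correlated with the father's at roughly r.
        z_son = r * z_father + (1 - r ** 2) ** 0.5 * rng.gauss(0, 1)
        fathers.append(father_sd * z_father)
        sons.append(son_sd * z_son)
    mean_f = sum(fathers) / n
    mean_s = sum(sons) / n
    cov = sum((f - mean_f) * (s - mean_s) for f, s in zip(fathers, sons))
    var_f = sum((f - mean_f) ** 2 for f in fathers)
    return cov / var_f

# Same spread in both generations: slope is near r, well below 1.
print(simulate_slope(son_sd=2.5))
# Much larger spread among sons: slope is near r * 6.0 / 2.5, above 1.
print(simulate_slope(son_sd=6.0))
```

Note that the correlation is the same in both runs; only the sons' spread changes. In standard-deviation units the sons are still closer to their mean, so the effect disappears only when you look at raw heights.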

Related to that is the notion that heights, IQs, etc. in human populations will show more extreme outliers over time, rather than fewer (which is what you might expect if you thought regression towards the mean applied). Most successive cohorts are larger, environments change, and over long periods of time the genes contributing to these characteristics change as well. So even if you could somehow control for nature v. nurture (I'm assuming here that by "nature" you mean genetics), you will find more extreme outliers over time.
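
The population-size part of that is easy to check for yourself. A minimal sketch, assuming the trait is drawn from the same standard normal distribution in every cohort (so no change in environment or genes at all); `mean_extreme` and the cohort sizes are hypothetical names and numbers picked for the example:

```python
import random

def mean_extreme(cohort_size, trials=20, seed=1):
    """Average, over several simulated cohorts, of the largest value
    seen in a cohort drawn from a standard normal distribution."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0, 1) for _ in range(cohort_size))
    return total / trials

small = mean_extreme(100)
large = mean_extreme(10000)
# The larger cohort's most extreme member is further from the mean,
# even though the underlying distribution is identical.
print(small, large)
```

So a bigger cohort alone is enough to produce more extreme individuals, before any of the other changes come into play.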

Think of regression towards the mean as describing the phenomenon that without a mechanism, it is unlikely that successive data points from the same source will be outliers.
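
Here is a small sketch of that last point, under assumptions I'm making up for the example: each "source" has a fixed true value, every measurement adds independent noise, and an "outlier" is any first measurement above a cutoff of 2.5. The name `regression_demo` and all of the numbers are hypothetical.

```python
import random

def regression_demo(n=50000, cutoff=2.5, seed=2):
    """Measure each source twice; return the mean first and second
    measurement among sources whose first measurement was an outlier."""
    rng = random.Random(seed)
    firsts, seconds = [], []
    for _ in range(n):
        true_value = rng.gauss(0, 1)          # stable part of the source
        first = true_value + rng.gauss(0, 1)  # noisy measurement 1
        second = true_value + rng.gauss(0, 1) # noisy measurement 2
        if first > cutoff:
            firsts.append(first)
            seconds.append(second)
    return sum(firsts) / len(firsts), sum(seconds) / len(seconds)

mean_first, mean_second = regression_demo()
# The second measurement is still above average, but noticeably less
# extreme than the first one that got the source selected.
print(mean_first, mean_second)
```

Nothing pushes the second measurement back towards the mean; the first one was simply selected for being extreme, and part of that extremeness was noise that does not repeat.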