I. Just to recap this effort…
A claim was made that there were only about 400 people who were posting on the SDMB. I thought this claim was ludicrously low, so I set out to see if I could determine what the real number of current regularly active Dopers was. A regularly active Doper population would be those Dopers not metaphorically dead, hibernating or migrated to somewhere else.
After I began this effort, Ed Zotti stated that there were 7000 Dopers who have posted in the last 30 days. I don’t know if this was determined via a database query or is simply an educated guess.
Regardless, I had decided to take a scientific sample of Dopers. The methodology was to capture the Dopers who participated in threads that were on the front page of each forum at the time of the sample-taking. The sampling proceeded as follows:[ul]
[li]capture a subset of the whole population, count them, mark them and release[/li][li]two weeks later, capture a second subset of the population, and count them[/li][li]determine how many of the first sample also appeared in the second sample[/li][li]using the Peterson Mark-Recapture method, calculate how many active Dopers there are in the population[/li][/ul]
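The steps above can be sketched with plain sets of user-names. This is only an illustration: the names below are invented, and the helper `mark_recapture_counts` is a hypothetical function, not something that was used for the actual samples.

```python
def mark_recapture_counts(first_sample, second_sample):
    """Return (M, C, R) for two samples of user-names.

    M = size of the first sample (the "marked" Dopers)
    C = size of the second sample
    R = how many of the second sample were already marked
    """
    first = set(first_sample)    # a set removes duplicate user-names
    second = set(second_sample)
    return len(first), len(second), len(first & second)


# Hypothetical user-names, invented for illustration only.
first_sample = ["alice", "bob", "carol", "dave", "erin", "frank", "alice"]
second_sample = ["carol", "dave", "erin", "grace", "heidi"]

M, C, R = mark_recapture_counts(first_sample, second_sample)
N = C * M / R  # estimated total population

print(M, C, R, N)
```

Using sets means a Doper who posted in several front-page threads is still counted only once per sample.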
As a secondary benefit, I was able to determine how many Dopers had “big antlers” by examining the post-counts of the first sample.
II. The results, which I hereby christen as Algernon’s Reckoning…
(First of all, I’d like to thank don’t ask and Colibri for their help regarding population estimating methodologies.)
The formula for the Peterson Mark-Recapture method of population estimating is:
N = CM/R
Where:
M = the number of individuals (individual user-names) observed in the first sample
C = the total number of individuals (individual user-names) observed in the second sample
R = the number of individuals in the second sample that are the same as those in the first sample
N = the total population size
The first sample (M) of Dopers was captured on March 11, 2004. This sample contained 2139 unique Dopers. (Note: This differs from the sample count provided earlier in this thread. The earlier count contained an error. (i.e. – I screwed up).)
A second sample (C) of Dopers was captured on March 23, 2004. As with the first sample, anyone who had posted in any of the threads on the front page of each forum was caught. An exception was made regarding “sticky” threads. Those sticky threads that were part of the first population sample were not used in this second sample. This second sample contained 2240 unique Dopers.
The first and second samples were compared and it was determined that there were 1259 individuals in the first sample that also were captured in the second sample. This is the recaptured group (R).
Therefore, N = (2139 * 2240) / 1259.
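For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function name `petersen_estimate` is my own label for the N = CM/R formula; the three numbers are the sample counts given above.

```python
def petersen_estimate(marked, caught, recaptured):
    """Mark-recapture estimate N = C * M / R."""
    if recaptured <= 0:
        # With no recaptures the formula divides by zero and gives no estimate.
        raise ValueError("at least one recaptured individual is required")
    return caught * marked / recaptured


estimate = petersen_estimate(marked=2139, caught=2240, recaptured=1259)

print(f"{estimate:.2f}")  # 3805.69
print(round(estimate))    # 3806
```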
Using this method of estimating total population, the number of regularly active Dopers is reckoned to be 3806. This is far lower than Ed Zotti’s stated population of 7000. Granted, according to Ed, his number represents the Dopers who had posted within the last 30 days, whereas my reckoning takes two samples and projects a total population. Still, I would’ve thought the two methods would generate more similar results. One possible flaw in the sampling methodology I used is that two weeks may not be a long enough interval between samples. One hypothesis is that many Dopers post in streaks separated by long periods of absence, so they might not get captured and counted as part of a regularly active population.
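The streak hypothesis can be explored with a toy simulation. Everything in this sketch is an assumption made up for illustration: the population of 7000 is borrowed from Ed’s figure only as a convenient “true” size, and the capture probabilities (0.275, 0.5 and 0.05) are invented, not measured. It compares a population where every Doper is equally likely to be caught with one where half are frequent posters and half are rarely seen.

```python
import random


def capture(probabilities, rng):
    """Return the set of individuals caught in one sample."""
    return {i for i, p in enumerate(probabilities) if rng.random() < p}


def simulate(probabilities, seed=0):
    """Take two samples and return the N = C * M / R estimate."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    first = capture(probabilities, rng)
    second = capture(probabilities, rng)
    recaptured = len(first & second)
    if recaptured == 0:
        raise ValueError("no recaptures, so no estimate is possible")
    return len(first) * len(second) / recaptured


TRUE_POPULATION = 7000  # assumed for illustration only

# Everyone equally likely to be caught (invented probability).
uniform = [0.275] * TRUE_POPULATION
# Half frequent posters, half rarely seen (invented probabilities).
streaky = [0.5] * 3500 + [0.05] * 3500

uniform_estimate = simulate(uniform)
streaky_estimate = simulate(streaky)

print(round(uniform_estimate), round(streaky_estimate))
```

In this toy setup the equal-probability estimate lands near 7000, while the uneven population yields a much lower estimate, because the frequent posters dominate both samples and inflate the recapture count. That is consistent with the hypothesis, but it does not show that this is what happened with the actual samples.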
It is important to note that this represents the pre-subscription regularly active Doper population.
III. Environmental changes are going to affect the population…
Moving to a subscription mode at the SDMB effectively changes the environment. One could think about this as a significant and relatively permanent climate change. Akin to an Ice Age perhaps. Only those Dopers who are able and willing to expend the additional energy to survive in this new environment will remain. The others will either die or wander off to new “hunting grounds”. The harsher environment may also limit the growth of the population by reducing the birth rate and influx of new Dopers.
It is too early to tell what effect this environmental change will have on the ongoing viable Doper population. I hereby invoke the First Law of Frisbee Throwing – namely, “say nothing more predictive than ‘Watch this’.”
IV. Histogram of Doper post-counts of those caught in the first sample…
Using the following percentages as a proxy for the entire calculated regularly active Doper population, the number of “over 1000 post” Dopers calculates to 1311. (3806 Dopers times 34.46%)
(Incidentally, no hamsters were harmed in the gathering of these post-count statistics. I deliberately did this research either very late at night or very early in the morning. If I observed any indication of a slowness in response time, I stopped the queries and waited until the next opportune time.)
Post count      Dopers   % of sample
0000-0999         1402      65.54%
1000-1999          306      14.31%
2000-2999          182       8.51%
3000-3999           89       4.16%
4000-4999           51       2.38%
5000-5999           26       1.22%
6000-6999           29       1.36%
7000-7999           17       0.79%
8000-8999            9       0.42%
9000-9999           12       0.56%
10000-14999         14       0.65%
>= 15000             2       0.09%
Total >= 1000      737      34.46%
A breakdown of the under 1000 group…
Post count      Dopers   % of sample
000-099            447      20.90%
100-199            212       9.91%
200-299            173       8.09%
300-399            129       6.03%
400-499            105       4.91%
500-599             95       4.44%
600-699             71       3.32%
700-799             71       3.32%
800-899             51       2.38%
900-999             48       2.24%
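The percentages and the “over 1000 post” projection can be re-derived from the raw counts. This is a minimal sketch: the counts are copied from the first table above, while the variable names are my own.

```python
# Counts from the first-sample histogram (post-count range, number of Dopers).
first_sample_bins = [
    ("0000-0999", 1402),
    ("1000-1999", 306),
    ("2000-2999", 182),
    ("3000-3999", 89),
    ("4000-4999", 51),
    ("5000-5999", 26),
    ("6000-6999", 29),
    ("7000-7999", 17),
    ("8000-8999", 9),
    ("9000-9999", 12),
    ("10000-14999", 14),
    (">= 15000", 2),
]

sample_size = sum(count for _, count in first_sample_bins)
# Everything except the first bin has 1000 or more posts.
over_1000 = sum(count for _, count in first_sample_bins[1:])
over_1000_share = over_1000 / sample_size

for label, count in first_sample_bins:
    print(f"{label:<12} {count:>5} {100 * count / sample_size:6.2f}%")

ESTIMATED_POPULATION = 3806  # the reckoning from section II
projected_over_1000 = round(ESTIMATED_POPULATION * over_1000_share)

print(sample_size, over_1000, f"{100 * over_1000_share:.2f}%", projected_over_1000)
```

The projection uses the unrounded share (737 / 2139), which gives 1311. Multiplying by the rounded 34.46% instead gives about 1311.5, so the last digit depends on how the rounding is done.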