Paying for a membership has somewhat brought me out of hibernation. Eventually I made my way to ATMB and what do I find?
A fantastic thread. I enjoyed it very much. I even learned the name for a population-capture technique I have used a number of times in the past. And it’s interesting to see that the 10% estimate, as of pre-subscription, was still holding up pretty well.
(Note to some: the definition of “active member” is crucial. IIRC, I originally proposed two measures. One involved the number of posts in the past month. The other involved posting rates, on the grounds that “dead” members would see that gradually decline.)
But of course we now live in interesting times. The entire membership model is currently being shaken to its core. So let’s consider what may be appropriate for the future.
Ignoring the question of active members for now, let’s consider the pure growth of the board. Basically, we have two factors to consider in the board’s future growth. One is the birth rate and the other is the death rate. Combine these and we get:
dP/dt = b(t,P) - d(t,P)
where b and d are functions that depend on the time t and the population at time t.
Here’s a common model: assume that birthrates and deathrates, and therefore the per capita growth rate, remain constant over time. This gives you the exponential growth model:
dP/dt = k.P
This is Thomas Malthus’ much-maligned approach. It sucks as a model of competition for resources but may just about be appropriate in the short to medium term for this board.
Solving for P is pretty straightforward: P = exp(kt). (We can bring in a boundary condition at time t=0 to solve for the constant of integration, but let’s ignore that as it doesn’t affect what we’re doing for now.) We can identify when we passed a given size by looking at the join date of member number x. So (dates being day/month/year) we passed 10k on 7/10/2000, we passed 20k on 13/3/2002 and we passed 30k on 17/1/2003.
If our model is correct then P(t)/P(s) = exp(k[t-s]). So, using months as our base time unit, 20,000/10,000 = exp(17.k), which makes k= 0.0408.
If this is correct, we should be able to test for 20k to 30k, which is a 50% growth. Exp(0.0408 x 10) = 1.50, which is 50% growth! So apparently the growth rate was spot on for this period too!
We passed 40,000 on 3/10/2003, a 33% increase since 17/1/2003. exp(0.0408 x 8.5) = 1.41, which is also nearly spot on. (More accurately, we should probably use the entire time period and state that k = ln(4)/36 = 0.0385. In this case, and using more accurate times, we get 94%, 48% and 39% expected growths compared to the 100%, 50% and 33% observed.)
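For anyone who wants to re-run the fit, here is a minimal sketch in Python. It assumes the dates are day/month/year and that a month is 30.4375 days on average; both are my assumptions, and the helper names are made up for illustration.

```python
from datetime import date
from math import exp, log

# Milestones quoted in the post, read as day/month/year (assumption).
milestones = [
    (date(2000, 10, 7), 10_000),
    (date(2002, 3, 13), 20_000),
    (date(2003, 1, 17), 30_000),
    (date(2003, 10, 3), 40_000),
]

DAYS_PER_MONTH = 30.4375  # average month length (assumption)

def months_between(start, end):
    """Elapsed time between two dates, in average months."""
    return (end - start).days / DAYS_PER_MONTH

# Rate from the first doubling, using the rounded 17 months from the text.
k_first = log(2) / 17
# Rate over the whole 10k -> 40k stretch, taken as 36 months.
k_all = log(4) / 36

def growth_check(k):
    """Return (start, end, observed growth, model growth) per interval."""
    rows = []
    for (d0, p0), (d1, p1) in zip(milestones, milestones[1:]):
        t = months_between(d0, d1)
        rows.append((p0, p1, p1 / p0 - 1, exp(k * t) - 1))
    return rows

for p0, p1, observed, expected in growth_check(k_all):
    print(f"{p0:>6} -> {p1:>6}: observed {observed:.0%}, model {expected:.0%}")
```

Swapping `k_all` for `k_first` in the last loop shows how much the answer depends on which stretch you fit.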
I suggest that we take time zero to be the 10k-point for now. The full equation should really be
P = P[sub]0[/sub].exp(0.0385.t)
So we take P[sub]0[/sub] to be 10,000 and t is measured from 7/10/2000. The nice thing about this is that we can simply rescale by 1E-4 and ignore the P[sub]0[/sub] completely.
At the moment we are at time 40.25. The model predicts 47,000 members. We have 45,000. So the model is about 4.4% too high. I’d say that’s pretty good. Of course, it also means that we should probably rescale before making any predictions to reflect most recent knowledge.
Now how about active members? Well, why do members stop posting? One possibility is the competition for natural resources, in this case bandwidth and, of course, attention. In this case we might suppose a logistic model. But I suspect that a more likely case is boredom. In this case, I propose that the dormant membership also increases according to the exponential model, with rate constant d. The active membership will then be given by:
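The comparison with today's membership can be sketched in a few lines of Python. The function and variable names are mine; the numbers are the ones quoted above. Run without rounding, the model comes out at roughly 47,100, a touch above the rounded 47,000.

```python
from math import exp

K = 0.0385      # per-month growth rate fitted above
P0 = 10_000     # membership at t = 0 (the 10k point)

def predicted_members(t, k=K, p0=P0):
    """Exponential model P = P0 * exp(k t), t in months since the 10k point."""
    return p0 * exp(k * t)

t_now = 40.25
actual_now = 45_000

model_now = predicted_members(t_now)
overshoot = model_now / actual_now - 1   # fraction by which the model is too high
rescale = actual_now / model_now         # factor that pins the model to today's count

print(f"model: {model_now:,.0f}  overshoot: {overshoot:.1%}  rescale: {rescale:.4f}")
```

Multiplying the model by `rescale` is the same idea as the 45/47 factor, just without rounding the prediction first.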
{exp(0.0385.t) - exp(d.t)}
and it remains to find “d”.
We could go ass-backward and assume what we want to find. i.e. assume that 10% are active at any given time. In this case:
{exp(0.0385.t) - exp(d.t)}/exp(0.0385.t) = 0.1 for all t.
so 1 - exp((d - 0.0385)t) = 0.1
(d - 0.0385)t = ln(0.9) ( = -0.105)
… but then d = 0.0385 - 0.105/t, so the solution for d depends on t, which is not appropriate for an exponential model.
So if the exponential models are appropriate for membership and dormancy, there can’t be a static proportion of active members. One of these two assumptions is broken. I’d be more inclined to believe the exponential growth: as demonstrated, it is pretty good for membership growth and there is no reason to assume otherwise for dormancy. It’s just one of those little quirks of nature that it tends to work pretty well. So back to the drawing board.
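To see the problem numerically, here is a small Python sketch that solves the 10% condition for d at a few different times. The function name and the sample times are my own choices for illustration.

```python
from math import exp, log

K = 0.0385  # membership growth rate per month

def d_for_constant_active(t, active_fraction=0.1, k=K):
    """Solve 1 - exp((d - k) t) = active_fraction for d at a given t."""
    return k + log(1 - active_fraction) / t

for t in (6, 12, 24, 48):
    d = d_for_constant_active(t)
    # Plugging d back in recovers the 10% active share at this t only.
    share = 1 - exp((d - K) * t)
    print(f"t = {t:>2}: d = {d:.4f}, active share = {share:.2f}")
```

Each t needs its own d, which is exactly what a constant-rate exponential model does not allow.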
Instead, let’s suppose that at the time of kabbes’ hypothesis (5/11/2001, or t = 13), there really were 10% of posters active. As it happens, that was pretty much exactly the date Algernon joined. He was poster number 18,799. Call it a nice round 18,800. So 1,880 were active.
And as of mid-March 2004 (t = 40.25, say), there are precisely 3,806 active out of 45,000 members.
So we have: at the beginning of November 2001, 16,920 dormant members and at mid-March 2004, 41,194 dormant members. As before, d = ln(4.1194/1.6920)/27.25 = 0.0395.
How many were active at time t = 0? Who knows. The current trend, however, suggests that it would have been in excess of 10%, say 15% at a guess. 85% dormant is 8,500. So dormant at time t is:
8,500 x exp(0.0395.t)
This means that the number of active members at time t is:
10,000 x {45/47 x exp(0.0385.t) - 0.85 x exp(0.0395.t)}
(Rescaling by 45/47 to reflect today’s membership)
Note that at time t = 40.25, this gives about 3,400 active, within roughly 10% of the 3,806 we wanted, indicating that the 15% estimate is in the right area.
An interesting feature of this is that it will peak. Differentiating and setting to zero, this would have been at time t = 93.35, or approximately 4.5 years hence. After that it would decline. But 4.5 years is a lifetime in internet terms. It’s more likely to be an inadequacy of the model at longer periods than a true reflection of reality.
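The peak can be checked without any calculus software. A minimal Python sketch, using the constants from the active-member formula (the variable names are mine): setting the derivative to zero gives a closed form for the peak time, which lands at about 93.4 months.

```python
from math import exp, log

K = 0.0385     # membership growth rate per month
D = 0.0395     # dormancy growth rate per month
A = 45 / 47    # rescaled membership coefficient
B = 0.85       # guessed dormant fraction at t = 0

def active(t):
    """Active members at month t: 10,000 x {A exp(K t) - B exp(D t)}."""
    return 10_000 * (A * exp(K * t) - B * exp(D * t))

# d(active)/dt = 0  =>  A*K*exp(K t) = B*D*exp(D t)
# =>  t = ln(A*K / (B*D)) / (D - K)
t_peak = log((A * K) / (B * D)) / (D - K)

print(f"peak at t = {t_peak:.1f} months, "
      f"{(t_peak - 40.25) / 12:.1f} years after t = 40.25")
```

Because D - K is only 0.001, the peak time is very sensitive to small changes in either rate, which is another reason not to read too much into it.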
Anyway – I can end with something to test. If subscription had not happened, I would have anticipated pretty much 71,500 members by exactly one year hence. Of these, I would expect that 4,624 would have been active (6.5% of the membership). So let’s wait and see what subscription does to it and we can discuss the effects explicitly.
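The one-year-ahead figures come straight out of the same formula. A short Python sketch of the arithmetic, with function names that are mine rather than anything standard:

```python
from math import exp

K, D = 0.0385, 0.0395   # membership and dormancy growth rates per month
RESCALE = 45 / 47       # pins the model to 45,000 members at t = 40.25
DORMANT_0 = 0.85        # guessed dormant fraction at t = 0

def members(t):
    """Total membership at month t under the rescaled exponential model."""
    return 10_000 * RESCALE * exp(K * t)

def dormant(t):
    """Dormant membership at month t."""
    return 10_000 * DORMANT_0 * exp(D * t)

def active(t):
    """Active membership is whatever is not dormant."""
    return members(t) - dormant(t)

t_next_year = 40.25 + 12
share = active(t_next_year) / members(t_next_year)

print(f"members: {members(t_next_year):,.0f}")
print(f"active:  {active(t_next_year):,.0f} ({share:.1%})")
```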
pan
PS: Ed’s estimate of 7,000 active instead, giving 38,000 dormant, intuitively doesn’t feel right – I would expect the proportion dormant to be growing, not shrinking, with age as the board becomes more and more impersonal. But we can certainly use Ed’s figures: d = ln(3.8/1.6920)/27.25 = 0.0297. However, in order to get this equation:
10,000 x {45/47 x exp(0.0385.t) - k x exp(0.0297.t)}
correct for time t = 40.25, we have to assume that k > 1 (k here being the dormant fraction at time zero, not the growth rate), meaning that there were more dormant members than members at time zero. This is clearly ridiculous. Therefore either the exponential growth model is inaccurate, despite all the above evidence to the contrary, or Ed’s figure is way optimistic. I’m inclined to believe the latter. FWIW, Algernon’s figure seems to me to fit about right.
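For the curious, the k > 1 claim can be checked directly. A minimal Python sketch, with variable names of my own choosing: it recomputes d from Ed's figures, then solves the equation at t = 40.25 for the time-zero dormant fraction.

```python
from math import exp, log

K = 0.0385          # membership growth rate per month
RESCALE = 45 / 47   # pins the model to 45,000 members at t = 40.25
T_NOW = 40.25

# Ed's figures: 7,000 active now, so 38,000 dormant; 16,920 dormant at t = 13.
d_ed = log(3.8 / 1.6920) / 27.25

# Solve RESCALE*exp(K t) - k0*exp(d_ed t) = 0.7 for k0 (units of 10,000 members).
active_now = 0.7
k0 = (RESCALE * exp(K * T_NOW) - active_now) / exp(d_ed * T_NOW)

print(f"d = {d_ed:.4f}, dormant fraction at t = 0: {k0:.2f}")
```

A dormant fraction above 1 at time zero is the impossibility referred to above.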