03-28-2020, 09:45 PM
Settimo is offline
Originally Posted by DSeid
...from a practical POV you are best off considering any individual outside your immediate household circle as if they were contagious and limit your time within their 6 foot circle and the frequency of those contacts as best you can. Along with washing your hands, etc.
Totally agree with those precautions. And re the following, I wouldn't want anyone to make important health decisions based on a random person's calcs on the internet. If anything, reasonable estimates along these lines should give people good reason to follow the precautions. With that disclaimer in mind, here's one method (this gets slightly mathy):

From post #11, an estimate is needed of the number of new infections between 3 days ago and 6 days ago (inclusive)--this approximates the # of infected people in circulation (not isolating). If today is day zero, yesterday is day minus 1, etc., then we want those infected on days -6 to -3.

The dashboard site(s) (which I think are global) give regional daily numbers of confirmed cases (all the way to district/province/county, in some cases).

The lag: the dashboard sites give tested/confirmed cases. The lag is the amount of time from getting infected to getting the test result onto the dashboard. The article mentioned above suggests between 6 and 10 days from infectiousness to positive test report, so add 3 for approx start of infection (that is, roughly 9 to 13 days from infection to report). I'm going to conservatively estimate an average lag of 10 days from infection to reporting--i.e. today's dashboard value estimates total people infected as of approx day -10. The approach is to accumulate a few days' worth of dashboard totals, and extrapolate.

To give an example, suppose I recorded the dashboard site amounts for my county/district/whatnot over the past 6 days (incl today), to give daily totals {48,71,92,110,172,221}. These would be approx cumulative # of people infected by days -15 to -10 (respectively).

Plot it: it looks exponential, so use linear regression on the logarithms of the values (if the plot looks more linear, just apply linear regression to the raw values, no logs). The logs (base "e") are approx {3.9, 4.3, 4.5, 4.7, 5.1, 5.4}. Linear regression gives the equation Y=0.3X+3.9 (X=days from day -15). To get back to normal scaling, take exp: exp(0.3X+3.9)=49*1.35^X. The fitted values for days -15 to -10 track the recorded totals; the values for days -9 through 0 (relative to today) are extrapolations, i.e. estimates of what the dashboard will say over the next 10 days:
{49,67,90,122,164,221,299,403,545,735,992,1339,1808,2441,3294,4447} (day -15 to day 0)
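For anyone who wants to redo the fit on their own county's numbers, here is a rough sketch of the log-linear regression in plain Python. The variable and function names (totals, projected_total) are just mine for illustration, and it assumes the same 6 recorded values and the 10-day lag from above. Because it keeps the unrounded slope and intercept, the later days come out a bit lower than the list above, which was built from the rounded Y=0.3X+3.9.

```python
import math

# Dashboard totals from the example, treated as cumulative infections on days -15..-10.
totals = [48, 71, 92, 110, 172, 221]
xs = list(range(len(totals)))        # X = days counted from day -15
ys = [math.log(v) for v in totals]   # natural logs of the totals

# Ordinary least-squares fit of ys = slope * xs + intercept.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x


def projected_total(day):
    """Fitted cumulative total for a day relative to today (day -15 up to day 0)."""
    return math.exp(intercept + slope * (day + 15))


print(f"Y = {slope:.2f} X + {intercept:.2f}")
for day in range(-15, 1):
    print(day, round(projected_total(day)))
```

If the raw plot looks more like a straight line than a curve, the same least-squares step can be run on the raw totals instead of their logs, dropping the exp at the end.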

The other variable is the efficiency in testing--how many of those with the disease actually get tested. I've seen ratios of actual to confirmed cases anywhere from 6 to 11 (not sure if this includes the asymptomatic cases; the "11" was from a NYT article I can't relocate); to be conservative, use 8.

To estimate the new infections that occurred in the window, take the difference between days -7 and -3 (trailing difference--we want new infections on days -6 thru -3 inclusive; could also do between -6 and -2, or average trailing and leading). From the list above that is 1808 - 545 = 1263, and multiplying by the testing ratio of 8 gives roughly 10,100 people.

If the county has pop 500K, then that's an estimated 2% of the population who are out and contagious. (Yikes.)
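Here is the arithmetic behind that percentage as a small Python sketch, in case it helps to plug in other numbers. It uses the rounded fit exp(0.3X+3.9), the ratio of 8, and the 500K population from this post; the names (fitted_total, UNDERCOUNT, POPULATION) are made up for the example, and every input is an assumption rather than a measured value.

```python
import math


def fitted_total(day, slope=0.3, intercept=3.9):
    """Fitted cumulative total for a day relative to today, using the rounded fit."""
    return math.exp(slope * (day + 15) + intercept)


UNDERCOUNT = 8         # assumed ratio of actual to confirmed cases
POPULATION = 500_000   # example county population

# Trailing difference: new infections on days -6 through -3 inclusive.
new_confirmed = fitted_total(-3) - fitted_total(-7)
in_circulation = UNDERCOUNT * new_confirmed
share = in_circulation / POPULATION

print(f"{in_circulation:.0f} people, {share:.1%} of the population")
```

Swapping UNDERCOUNT between 6 and 11 gives a feel for how much the answer moves with the testing ratio alone.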

The combination of the lag between infection and test results, the exponential growth, and the testing undercount has a big effect. I would not have thought such small numbers in confirmed disease totals could translate into that high a percentage in circulation.