Can anyone explain the math to me?

DSeid · April 4, 2020, 5:25pm

The uncertainty goes beyond even the broad confidence intervals, because there is no way to know how much confidence to place in some of the basic assumptions, which themselves span very broad ranges.

cmosdes · April 4, 2020, 8:11pm

I’m assuming the models that the OP linked to were using each state as an isolated set. But you’d have to ask the people doing the data crunching by state.

Is there ANY model you know of that is taking into account travel into and out of the area? Projection rates for the UK, US and every other country probably rely on very little perturbation from outside influences.

I don’t know what the model is using to determine the peak, but my guess would be transmission rates as more are infected and recover. I just can’t see this thing going to near 0 with just social distancing, nor why there would be such a drastic difference between states otherwise.

I screwed up because I summed up the number of cases wrong and assumed the peak would happen at something near 50% immunity.

I’d like to learn if the transmission rate drops as infection/recovery rises and if that would explain the drop in the curves, not just social distancing.
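On whether transmission falls as infection and recovery rise: in the textbook SIR compartmental model it does. This is only a sketch under that assumption, since nothing in the thread says which model the linked projections actually use. Here S, I and R are the susceptible, infectious and recovered counts, N is the population, β is the transmission rate and γ is the recovery rate:

```latex
\frac{dS}{dt} = -\beta \frac{S I}{N}, \qquad
\frac{dI}{dt} = \beta \frac{S I}{N} - \gamma I, \qquad
\frac{dR}{dt} = \gamma I
\\
R_0 = \frac{\beta}{\gamma}, \qquad
R_t = R_0 \frac{S}{N}, \qquad
\frac{dI}{dt} = 0 \iff \frac{S}{N} = \frac{1}{R_0}
```

In this idealized picture infections peak once the susceptible share falls to 1/R0, so the peak sits near 50% no longer susceptible only if R0 is about 2. Social distancing acts by lowering β, which lowers R0 and with it the share that has to be infected before the curve turns over.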

Stranger_On_A_Train · April 4, 2020, 8:18pm

I would go beyond that and say that owing to a lack of testing of a representative sample across the population, we know that the data that is available regarding the number of infected people is a gross underestimate, perhaps by an order of magnitude. And any model based upon strict adherence to isolation guidelines is going to underestimate serious cases of COVID-19 and the number of deaths because many people simply aren’t practicing isolation or distancing even in states where there are orders and guidance in place.

This epidemic could be stopped in its tracks inside of a month if all actually non-essential people just self-isolated and stayed in the home, and we deployed accurate antibody testing that was readily available so people would know when they are immunized. Neither of these things is happening, and so the unknowns about how this epidemic will trend and when presentations start leveling off is total guesswork at this point. All we can really do is look at countries that responded with approximately the same degree and speed of isolation measures as the US and qualitatively extrapolate from there. And on that basis, we are looking more like Italy and Spain than South Korea or New Zealand.

Stranger

scr4 · April 4, 2020, 10:40pm

If you know how many have been tested, what criteria were used to select people to be tested, and how many of those tested positive, you could estimate the true number of infected people. It wouldn’t be precise, but it may be reasonably accurate.
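One way to read that suggestion is as a stratified estimate: split the population by the selection criteria, apply each group's positivity rate to the size of that group, and add the results. The sketch below assumes the stratum sizes are known and that people tested within a stratum are representative of it. Neither assumption comes from the thread, and every number is made up for illustration.

```python
def estimate_infected(strata):
    """Stratified estimate: sum of (stratum population x stratum positivity)."""
    total = 0.0
    for stratum in strata:
        positivity = stratum["positive"] / stratum["tested"]
        total += positivity * stratum["population"]
    return total


def naive_estimate(strata):
    """Pooled positivity applied to everyone, ignoring who was selected for testing."""
    tested = sum(s["tested"] for s in strata)
    positive = sum(s["positive"] for s in strata)
    population = sum(s["population"] for s in strata)
    return positive / tested * population


# Hypothetical numbers, for illustration only.
strata = [
    {"name": "symptomatic", "population": 50_000, "tested": 5_000, "positive": 1_000},
    {"name": "no symptoms", "population": 950_000, "tested": 1_000, "positive": 10},
]

print("stratified:", round(estimate_infected(strata)))
print("naive pooled:", round(naive_estimate(strata)))
```

The gap between the two figures shows why the selection criteria matter: when testing concentrates on people with symptoms, pooled positivity overstates prevalence, while raw case counts understate it.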

Euphonious_Polemic · April 4, 2020, 11:00pm

I have heard that the newest models are having to account for the tendency of people in certain areas to believe in nutty conspiracy theories, which will cause them to behave in ways contrary to what the experts are telling them (e.g., if a group believes that 5G networks cause the virus, they will feel they can congregate in groups with no problem).

In the past, models operated under the assumption that most people were intelligent, and would do what medical people told them. Not anymore. Models have to be adjusted accordingly.

Andy_L · April 4, 2020, 11:23pm

No problem. Am I right in thinking that your question is about how a model can have a result that shows a peak in infections before all people have been infected (i.e. how a model can have a final total number of infected people that is significantly less than the total population)?

If so, one way to think about the situation is that social isolation creates many many subpopulations that don’t interact (or really, interact very rarely compared to how people interact inside each subpopulation) - so while some subpopulations end up with almost everyone getting the disease, other subpopulations never encounter the disease at all (yes, this is a bit of a handwaving explanation, but captures the benefits of social distancing well enough, I think).
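A rough numerical illustration of a peak arriving before everyone is infected. It uses a well-mixed SIR model with a reduced contact rate, which is a simpler stand-in for the subpopulation picture above rather than an implementation of it. The function name, the 5-day infectious period and the R0 values are assumptions chosen for the sketch, not figures from any model discussed here.

```python
def simulate_sir(r0, infectious_days=5.0, population=1_000_000,
                 initial_infected=10, days=1000, dt=0.1):
    """Integrate a basic SIR model with forward-Euler steps.

    Returns (peak_infected, susceptible_at_peak, total_ever_infected).
    """
    gamma = 1.0 / infectious_days  # recovery rate per day (assumed)
    beta = r0 * gamma              # transmission rate per day
    s = float(population - initial_infected)
    i = float(initial_infected)
    peak_i, s_at_peak = i, s
    for _ in range(int(days / dt)):
        new_infections = beta * s * i / population * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        if i > peak_i:
            peak_i, s_at_peak = i, s
    return peak_i, s_at_peak, population - s


# Hypothetical R0 values: one without distancing, one with contacts halved.
for r0 in (2.5, 1.25):
    peak, s_peak, total = simulate_sir(r0)
    print(f"R0={r0}: susceptible share at peak={s_peak / 1_000_000:.2f}, "
          f"share ever infected={total / 1_000_000:.2f}")
```

In both runs the infectious count turns over when the susceptible share is close to 1/R0, and the epidemic ends with part of the population never infected. Lowering R0 shrinks both the peak and the final total.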

Chronos · April 5, 2020, 12:13am

There hasn’t been extensive testing in the US. But there has been in other populations, such as Korea and the Diamond Princess, and the data from that extensive testing can be used to calibrate what we know from the limited testing we’ve done here.

Stranger_On_A_Train · April 5, 2020, 12:26am

Well, the big unknown is the number of asymptomatic or marginally symptomatic people who have not gotten (and for the most part, cannot get) an antigen test. I did some casual modeling early on, with one of the parameters being the share of spreaders who were asymptomatic, or presymptomatic with an ultimately mild presentation. Even assuming half of all infected people were asymptomatic or had symptoms not significant enough to report, I got ridiculously high R0 values. When I set the symptomatic share to only 10%-20% of those infected, the R0 values started to become more plausible (although still higher than the estimates coming from authoritative sources based upon the data available at the time, mostly from the Wuhan outbreak), but then I ended up with very high rates of overall infection to get the geographical spread that was being shown at the time.

In retrospect, having talked anecdotally with so many people who had “a little cold” or “a bit of flu” in late January or February, and seeing how much the asymptomatic positive tests rise with more available testing, I’m inclined to believe that the virus has spread very widely and most infected people don’t realize that they are carrying and spreading the virus. But I wouldn’t even want to hazard a rough guess at the total number of infected people except to say that it is likely at least an order of magnitude greater than testing would indicate. We really need post-infection antibody testing to give a credible estimate at this point because many people who may have been infected have probably cleared the virus from their systems.

I don’t know how you would even go about trying to build this kind of behavior into a model. I’m sure that there are sociologists working on models of how conspiracies spread and how they affect behaviors, but I don’t know how you would then tie that into some kind of compartmental model to predict the effect of behavior on infection spread on a local level. I guess you’d have to use some kind of heuristic modeling technique to capture feedbacks, but then information, misinformation, and guidance from government officials have been changing so quickly you wouldn’t even be able to have some kind of equilibrium state to compare the model predictions to (mostly non-existent) sample data. There are socio-epidemiological models on how vaccination or the lack thereof affects infection rates for diseases like the measles, but they are post hoc models that look at the effectiveness of vaccination campaigns and public education efforts, not predictive models to trend epidemic outbreaks.

As for the possibility of reinfection, since this is a ‘novel’ virus we really don’t know how much immunity exposure conveys, but for every other infectious virus that the immune system develops a response to, having been exposed provides at least a significant immune response for at least a few years, and often for the patient’s lifetime. However, that response can be inadequate if the patient has a compromised immune system or the virus mutates sufficiently that the T cells no longer recognize the pathogen-derived proteins associated with the virus, which is why people can present with shingles from the normally dormant varicella zoster (chickenpox) virus when stressed or immunocompromised.

FiveThirtyEight: “Why It’s So Freaking Hard To Make A Good COVID-19 Model”:

*So, imagine a simple mathematical model to predict coronavirus outcomes. It’s relatively easy to put together — the sort of thing people on our staff do while buzzed on a socially isolated conference call after work. The number of people who will die is a function of how many people could become infected, how the virus spreads and how many people the virus is capable of killing.

See? Easy. But then you start trying to fill in the blanks. That’s when you discover that there isn’t a single number to plug into … anything. Every variable is dependent on a number of choices and knowledge gaps. And if every individual piece of a model is wobbly, then the model is going to have as much trouble standing on its own as a data journalist who has spent too long on a conference call while socially isolated after work.*

Stranger

Carryon · April 5, 2020, 12:33am

The numbers are only predicting what we KNOW at the time the numbers are being calculated.

For instance, we knew about the price of gold and what it was likely to do. Then the gold was found in California and the gold rush changed everything rendering that obsolete.

Even in places like South Korea, the testing isn’t that complete, and there are things like false positives, false negatives and lifestyle choices to account for.

Add to that, we in America still are not treating symptoms but rather telling people to go home and, when they get sick enough, go to the hospital. There are a lot of symptoms of pneumonia which can be treated early on. We’re not doing that, whereas places like Germany are.

Then we have the whole question of asymptomatic carriers. On the Princess Cruise, 20% of passengers that were positive showed no symptoms then and still haven’t gotten ill.

Is this a statistical anomaly or does it account for why pockets of people falling ill are seemingly springing up out of nowhere?

Bottom line is garbage in, garbage out. Your results are only as good as the numbers going into it. Expect it to change as we get more information.

It’s going to be a good five years after this ends, before we get meaningful accurate analysis of what happened.

HoneyBadgerDC · April 5, 2020, 12:35am

I think they are grossly underestimating the amount of people infected but not sick enough to seek treatment. Even those who call in sick are told to shelter in place and never are tested or become data. We have no reliable data on the virus to base our decisions on. Mass testing on the general public needs to be carried out.

Stranger_On_A_Train · April 5, 2020, 12:52am

The Diamond Princess is a highly biased population sample: not only is it older, wealthy people from a country where a lot of people have specific chronic underlying health issues due to lifestyle choices (COPD, diabetes, obesity, et cetera), but it is also people who were in close contact with other infected people for a significant length of time. The data from the Diamond Princess might inform a model of how the virus spreads in that specific type of environment (and thus, might be useful for estimating contagion in the case of the USS Roosevelt) but it is really not a good representation of the US or other national populations as a whole.

South Korea is a better sample because they have done widespread testing of basically anyone who wanted to be tested, so they have a lot of data with a presumably broad demographic representation (although I would assume the people who got tested either had some symptoms or were in close contact with someone else who did, so there may still be bias), but the more significant issue is that South Korea took the sort of early steps to limit contagion like distancing and closing down public venues, while the US did not do these things (and in many states still is not doing them). So the distribution in a representative sample of South Koreans tells us what the infection rate and R0 look like in what is essentially the best case response, but it is essentially an unachievable lower bound per capita for what we will see in the United States, or what other countries that were slow to respond like Great Britain will look like.

What is really needed is geographically extensive sample testing of both the ‘hot spots’ and the places that are not yet hot but are going to be soon (which is basically going to be everywhere in the country), both antigen testing to see who is infected and antibody testing to see who was infected but is now immunized, so you can make good estimates of how likely the remaining uninfected are to be exposed.

To address the question of the o.p., we’ll have a reasonable estimate of when peak deaths will occur once the rate of hospitalizations flattens out, because there is a pretty consistent correlation between hospitalizations with critical symptoms and fatality rates, with some compensation for the number of people who will die because they cannot receive critical care for lack of ventilators and medication. When new hospitalizations flatten out, peak deaths will occur 2-3 weeks later because that is about the time it takes to either succumb or recover sufficiently to be removed from ICU care. That is really the best answer anyone has that doesn’t come with absurdly wide error bars.
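The lag rule above can be sketched as a shifted series. This is a deliberate oversimplification: it uses one fixed lag where a real model would use a distribution of times to death, and the 17-day lag (the midpoint of 2-3 weeks), the fatality fraction and the admissions series are placeholder assumptions, not data.

```python
def project_daily_deaths(daily_admissions, lag_days=17, fatality_fraction=0.2):
    """Shift admissions forward by a fixed lag and scale by an assumed fatality fraction."""
    deaths = [0.0] * (len(daily_admissions) + lag_days)
    for day, admitted in enumerate(daily_admissions):
        deaths[day + lag_days] += admitted * fatality_fraction
    return deaths


def peak_day(series):
    """Index of the first maximum in the series."""
    return max(range(len(series)), key=lambda day: series[day])


# Hypothetical daily critical-care admissions, for illustration only.
admissions = [10, 20, 40, 80, 60, 30, 10]
deaths = project_daily_deaths(admissions)

print("admissions peak on day", peak_day(admissions))
print("projected deaths peak on day", peak_day(deaths))
```

With a single fixed lag the projected deaths peak trails the admissions peak by exactly `lag_days`. Spreading the lag over a range of days would flatten and widen that peak without changing the basic timing.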

Stranger

cmosdes · April 6, 2020, 2:10am

Stranger_On_A_Train:

FiveThirtyEight: “Why It’s So Freaking Hard To Make A Good COVID-19 Model”:

*So, imagine a simple mathematical model to predict coronavirus outcomes. It’s relatively easy to put together — the sort of thing people on our staff do while buzzed on a socially isolated conference call after work. The number of people who will die is a function of how many people could become infected, how the virus spreads and how many people the virus is capable of killing.

See? Easy. But then you start trying to fill in the blanks. That’s when you discover that there isn’t a single number to plug into … anything. Every variable is dependent on a number of choices and knowledge gaps. And if every individual piece of a model is wobbly, then the model is going to have as much trouble standing on its own as a data journalist who has spent too long on a conference call while socially isolated after work.*

Stranger

Would a Monte Carlo analysis be of use in a situation like this?

Stranger_On_A_Train · April 6, 2020, 2:36am

So, I have quite a bit of experience with Monte Carlo methods because this is exactly how you do launch vehicle stability simulations and certain types of sensitivity analysis in reliability engineering. Monte Carlo methods are useful because they allow you to test models that have far more parameters than you could compare with some kind of ANOVA analysis, particularly when you have some kind of time-varying model when the behavior in the future has a strong dependence to previous conditions, and you are testing to see how likely a violation of some stability or reliability threshold is. However, like any simulation or analysis, if you put garbage in you get garbage out. And with epidemiology, the types of models they run tend to be compartmental models that look at aggregated behavior because of the sheer amount of data and difficulty trying to represent individual incidences of transmission, so in general they really aren’t interested in trying to predict individual transmissions at a global level. (There are “track and trace” efforts to follow index patients in an outbreak but once a disease is at the epidemic level there are simply too many infections to follow.)

In the case of the SARS-CoV-2 pandemic, there is so little reliable random sample test data that trying to estimate parameters to a model is pure guesswork. The current estimate from the White House that total deaths from COVID-19 will be 100,000 to 240,000 is without apparent basis other than being the low end of the confidence interval for the current best estimates.

Washington Post: “Experts and Trump’s advisers doubt White House’s 240,000 coronavirus deaths estimate”

At a task force meeting this week, according to two officials with direct knowledge of it, Anthony S. Fauci, director of the National Institute of Allergy and Infectious Diseases, told others there are too many variables at play in the pandemic to make the models reliable: “I’ve looked at all the models. I’ve spent a lot of time on the models. They don’t tell you anything. You can’t really rely upon models.”

From Marc Lipsitch, Harvard T.H. Chan School of Public Health:

Washington Post: “Far more people in the U.S. have the coronavirus than you think. We aren’t testing enough to control the outbreak. The real count could be 10 times higher.”

Stranger

SayTwo · April 6, 2020, 3:00am

So, it’s hard to model this thing, but we are told over and over again to ‘trust the experts’ because they spend their whole lives…trying to model things that are hard to model?

cmosdes · April 6, 2020, 3:41am

Stranger_On_A_Train:

So, I have quite a bit of experience with Monte Carlo methods because this is exactly how you do launch vehicle stability simulations and certain types of sensitivity analysis in reliability engineering. Monte Carlo methods are useful because they allow you to test models that have far more parameters than you could compare with some kind of ANOVA analysis, particularly when you have some kind of time-varying model when the behavior in the future has a strong dependence to previous conditions, and you are testing to see how likely a violation of some stability or reliability threshold is. However, like any simulation or analysis, if you put garbage in you get garbage out. And with epidemiology, the types of models they run tend to be compartmental models that look at aggregated behavior because of the sheer amount of data and difficulty trying to represent individual incidences of transmission, so in general they really aren’t interested in trying to predict individual transmissions at a global level. (There are “track and trace” efforts to follow index patients in an outbreak but once a disease is at the epidemic level there are simply too many infections to follow.)

In the case of the SARS-CoV-2 pandemic, there is so little reliable random sample test data that trying to estimate parameters to a model is pure guesswork. The current estimate from the White House that total deaths from COVID-19 will be 100,000 to 240,000 is without apparent basis other than being the low end of the confidence interval for the current best estimates.

Washington Post: “Experts and Trump’s advisers doubt White House’s 240,000 coronavirus deaths estimate”

At a task force meeting this week, according to two officials with direct knowledge of it, Anthony S. Fauci, director of the National Institute of Allergy and Infectious Diseases, told others there are too many variables at play in the pandemic to make the models reliable: “I’ve looked at all the models. I’ve spent a lot of time on the models. They don’t tell you anything. You can’t really rely upon models.”

From Marc Lipsitch, Harvard T.H. Chan School of Public Health:

Washington Post: “Far more people in the U.S. have the coronavirus than you think. We aren’t testing enough to control the outbreak. The real count could be 10 times higher.”

Stranger

I work far more on the failure analysis end of things than the design end. There are overlaps, but my job doesn’t usually entail predicting fail rates, although occasionally there is some of that. In other words, I know of Monte Carlo, but nothing in detail. Thanks for the clarification. I always thought of it as a way to take a broad range of variable inputs and look to see how those variations stack up in the output. That seems to fit what is happening here, where things like R0, hospitalization rates, asymptomatic rates, etc. all make it rather difficult to accurately predict outcomes. I realize it will always be garbage in = garbage out, but sometimes the noise can cancel out.

So again, I appreciate the response. I think I understand a little better.
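That reading of Monte Carlo, drawing each uncertain input from a range and seeing how the variations stack up in the output, can be sketched on the simple "infected times fatality rate" structure from the FiveThirtyEight quote. The input ranges, the uniform distributions and the population figure are illustrative assumptions only. They are not estimates from any source in this thread.

```python
import random


def simulate_deaths(n_draws=10_000, population=330_000_000, seed=0):
    """Draw uncertain inputs at random and return the sorted outcomes."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_draws):
        attack_rate = rng.uniform(0.05, 0.60)  # share ever infected (assumed range)
        fatality_rate = rng.uniform(0.002, 0.02)  # deaths per infection (assumed range)
        results.append(population * attack_rate * fatality_rate)
    results.sort()
    return results


def percentile(sorted_values, p):
    """Nearest-rank style percentile of an already sorted list."""
    index = min(len(sorted_values) - 1, int(p / 100 * len(sorted_values)))
    return sorted_values[index]


outcomes = simulate_deaths()
for p in (5, 50, 95):
    print(f"{p}th percentile: {percentile(outcomes, p):,.0f}")
```

The spread between the low and high percentiles is entirely a product of the input ranges, which is the garbage in, garbage out point: sampling shows how the uncertainty propagates but cannot make it smaller than what goes in.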

Stranger_On_A_Train · April 6, 2020, 4:03am

“The experts” have empirical knowledge about what has worked or not worked in the past, what can be done to try to avert the worst case scenarios and reduce harms, and how to look for trends. No actual expert can make any good predictions or reliable models without quality data, and because of the appalling lack of testing in the US and elsewhere, there is little in the way of data to be had.

Stranger

UltraVires · April 6, 2020, 4:18am

cmosdes:

I’m assuming the models that the OP linked to were using each state as an isolated set. But you’d have to ask the people doing the data crunching by state.

Is there ANY model you know of that is taking into account travel into and out of the area? Projection rates for the UK, US and every other country probably rely on very little perturbation from outside influences.

I don’t know what the model is using to determine the peak, but my guess would be transmission rates as more are infected and recover. I just can’t see this thing going to near 0 with just social distancing, nor why there would be such a drastic difference between states otherwise.

I screwed up because I summed up the number of cases wrong and assumed the peak would happen at something near 50% immunity.

I’d like to learn if the transmission rate drops as infection/recovery rises and if that would explain the drop in the curves, not just social distancing.

I understand, but U.S. states are far different than nation-states. There are no border controls and people are free to come and go as they please. I just don’t know how you make a state by state model when people from hot spot areas like NY can just go to their hunting cabins in WV or WY and carry the virus with them unknowingly.

cmosdes · April 6, 2020, 3:27pm

I would think there is a bigger problem with border cities. Kansas City is a really good example. NYC and the NYC area border 2 other states.
Yes, there is a lot of fluidity between states, but what else are you going to do? I really don’t know how they model these things, so anything I proffered would be a guess, at best. In the end, I still think the inter-state/inter-country mixing is going to be small compared to what is already in state, in addition to it likely averaging out to a net zero (people going into NY might roughly match the people coming out, for example).

Ulfreida · April 6, 2020, 4:49pm

Here are two anecdotes to give an idea of how loosey goosey all this is.

My good friend in Berkeley Ca, working in an international field at Cal, came down with “something” on March 12, immediately self-isolated, did not get tested (she would not have qualified for a test anywhere in the USA that I know of). She did not have a fever but had a terrible dry cough, lost her sense of smell and taste for several days, felt awful for a week. By that point UC had been closed down and she along with everyone else in Berkeley was stuck in her flat anyway. She is someone who will not show up on anyone’s data.
My sister-in-law in San Jose Ca, a public school aide, came down with “something” in mid March. Fever, cough. Thought it was the flu. They had guests in their house, one of whom went home to Massachusetts, ill with the same thing. No test. Nothing.

Since I have a pretty small circle of family and friends, I would extrapolate to say this must be happening everywhere. People are getting somewhat sick and getting over it but no one is tracking it at all. This contributes not just to the spread – hey I LIVE in Massachusetts, thanks a pile! – but also to the flattening of the curve, as they are now, presumably, on the other side of it. Correct?

The_wind_of_my_soul · April 6, 2020, 6:25pm

Yes, that’s basically my question. I did watch the video you posted on Friday evening, and it was helpful. Your comment brings up another intriguing thought experiment, which is how the conclusion of social distancing laws will be determined. If the idea is to restrict the disease to small subpopulations, then we may be seeing social distancing laws in place well after the disease has appeared to peak and recede. And indeed, that site I linked to in the OP mentioned that the projections assumed social distancing through August.
