Why it’s so hard to make a good Covid-19 model
What this article misses is that simple models of complex systems in science are most useful for understanding the dynamics of phenomena, not for making accurate quantitative predictions. This is not a model of a mass accelerating in a vacuum where Newton's laws are sufficient to a high degree of accuracy, or even a numerical model of the aerodynamics of an airplane, where the physics are well understood but there are far too many particles to solve analytically. This is a parameterized model for the behavior of millions of people, each one's reaction to exposure to viral particles, legal and social norms, personal and economic situations, and so on. Of course we are not going to be able to predict the future of an unprecedented event like this quantitatively. It's not just that data are hard to obtain accurately; the models themselves are so simplified that they can't capture much of the important dynamics going on.
It's pretty simple: there isn't any decent data.
The data we have is strongly biased to older and sicker people.
There is no systematic surveillance of a geographic area, only panic testing of those who are showing symptoms.
Until there is sampling of a borough, city or town, from start to finish, we will have wildly wrong models.
The only thing that we can plot reasonably accurately is the exponent of the fatalities, but even then it's because it's based on mostly hard data (unless it's China...)
Note that this article is from March 31st, and while that wouldn't normally be very long ago, things are moving extremely quickly.
The article is a long listing of ways to say GIGO. It's hard to get good clean accurate data.
In jest, I am imagining a bunch of data scientists working late into the night, running the numbers and various scenarios, and then one person finally stands up and says, "FK it, let's just shut down the whole country and hope for the best."
> Why it’s so hard to make a good Covid-19 model
Because there's no downside to the authors for overpredicting deaths and resource use, and _a lot_ of downside for underpredicting. So all the "bad" things get taken into account, and all the "good" things are ignored.
One thing I've reinforced in my view of the world is that common sense is very uncommon indeed. "2 million deaths" my ass. I hope people reconsider their trust in other models that are "hard to make".
Good luck trying to predict how much of a population is going to refuse to isolate and interact anyway.
It also relies on governments and politicians not fudging testing and death counts, which are never going to be accurate.
btw the Financial Times has maybe the best graph on the stats, however flawed:
http://com.ft.imagepublish.upp-prod-us.s3.amazonaws.com/2251...
I've found this topic pretty interesting, and I've enjoyed trying my hand at it myself.
One of the things I've been playing with is Insight Maker https://insightmaker.com/ This site is a totally free platform where you can set up the kinds of simulations this article describes (stock and flow models). You can even specify your uncertainty in your baseline assumptions and run sensitivity analysis to see what the relative impact of each factor is on the model, and the range of potential outputs you could have. This system is very much like https://www.getguesstimate.com/, except much more flexible and way less intuitive.
Insight Maker isn't a professional tool, it's really more of an advocacy and outreach platform, but despite that it's really quite powerful.
I think after you've read this article and internalized the difficulties in modeling pandemics (and have re-affirmed to yourself that you are not an epidemiologist, unless you are, more power to you if so), you might have some fun trying to build the model this article describes.
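For anyone who would rather poke at this in code than in a GUI, here is a minimal sketch of a stock-and-flow model with a crude sensitivity sweep. It is a plain SIR model (susceptible, infected, recovered), which is my assumption and not necessarily the exact structure the article describes. The function name `simulate_sir`, the 10-day infectious period, the population size and the R0 values swept are all made up for illustration.

```python
def simulate_sir(beta, gamma, population=1_000_000, initial_infected=10,
                 days=365, dt=0.1):
    """Euler-integrate a basic SIR stock-and-flow model.

    Returns (peak_infected, ever_infected). All parameter values are
    illustrative assumptions, not estimates for Covid-19.
    """
    s = float(population - initial_infected)  # stock: susceptible
    i = float(initial_infected)               # stock: infected
    r = 0.0                                   # stock: recovered
    peak = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i / population * dt  # flow S -> I
        recoveries = gamma * i * dt                      # flow I -> R
        s -= new_infections
        i += new_infections - recoveries
        r += recoveries
        peak = max(peak, i)
    return peak, population - s


if __name__ == "__main__":
    gamma = 1 / 10  # assumed 10-day average infectious period
    # Crude sensitivity sweep: vary R0 and watch how much the outputs move.
    for r0 in (1.5, 2.5, 3.5):
        peak, total = simulate_sir(beta=r0 * gamma, gamma=gamma)
        print(f"R0={r0}: peak infected {peak:,.0f}, ever infected {total:,.0f}")
```

Even in a toy like this, small changes to the assumed R0 swing the outputs a lot, which is the article's point about uncertain inputs.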
I'm still curious about the relationship between R0 and doubling period, because you can kind of see a relationship between the two by examining contagion period.
In general we've seen doubling periods that seem to suggest 5-6 days, but occasionally as fast as 3 days; it's unclear how distorted those numbers are by testing, mitigation and misattributed deaths.
If it has a natural R0 of 6, then it means that an infected person will infect 6 others within that contagion period.
If we're thinking R0 is not 2-3 but is instead 5-6, then to make that consistent with the doubling periods we are seeing, it'd mean that people are contagious for a longer period of time than we first thought.
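A rough way to put numbers on that: in the early phase of a simple SIR model (an assumption on my part, with no latent period and an exponentially distributed infectious period), cases grow at rate r = (R0 - 1) / D, where D is the average infectious period, and the doubling time is ln(2) / r. Rearranged, D = doubling time × (R0 - 1) / ln(2). The function names below are hypothetical.

```python
import math


def implied_infectious_period(r0, doubling_days):
    """Infectious period (days) implied by R0 and an observed doubling time.

    Assumes simple SIR early growth: growth_rate = (R0 - 1) / infectious_period.
    """
    growth_rate = math.log(2) / doubling_days  # per-day exponential growth rate
    return (r0 - 1) / growth_rate


def doubling_time(r0, infectious_days):
    """Doubling time (days) implied by R0 and an infectious period, same assumption."""
    return infectious_days * math.log(2) / (r0 - 1)


if __name__ == "__main__":
    for r0 in (2.5, 6.0):
        d = implied_infectious_period(r0, doubling_days=5)
        print(f"R0={r0}, doubling every 5 days -> infectious for about {d:.1f} days")
```

Under that simplification, a 5-day doubling period with R0 = 2.5 implies roughly 11 infectious days, while R0 = 6 implies roughly 36, so a higher R0 at the same doubling period does mean a longer contagious window. Real epidemics have latent periods and non-exponential timing, so treat this as a back-of-the-envelope check only.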
I tried to do some exploration with the data to get some idea of the final number. One of the best plots I found for estimating the final number is plotting the cases in the next week, as a percentage of the cases so far, against the cases so far. It looks like half of a downward parabola, and the point where it meets the x axis gives the final number per country.
This is the final plot: https://i.imgur.com/5p4Xife.png. You could make a good guess from the data at how many people it will affect in total per country by continuing the same pattern until it reaches the x axis.
I really like the Neher lab prediction tool.
You can choose different scenarios and compare them with real data. So you can match the parameters of your model to the actual measured data (number of deaths), and it also has data about the population and the state of ICUs and hospitals that you can use in your prediction.
In the end it is just a tool which can give completely wrong predictions, but you get a feeling for what could and could not happen.
The results of the lockdowns in Spain and Italy are not in this post, and they provide crucial information. In two weeks, the lockdown in Spain has reduced the daily infection rate from 42% to 4%, so there is hope for the future. But the economy can't cope with the lockdown, so we have a big problem. Also, since R0 can be reduced so much by political decisions, the emphasis should be on the political and economic ground.
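To see what that drop means in practice, here is a small sketch that converts a daily growth rate into a doubling time. It assumes "daily infection rate" means the day-over-day percentage growth in cumulative cases, which is my reading and may not be what was meant; the function name is hypothetical.

```python
import math


def doubling_time_days(daily_growth):
    """Days for cases to double at a constant day-over-day growth rate.

    daily_growth is a fraction, e.g. 0.42 for 42% per day (assumed to be
    growth in cumulative cases).
    """
    return math.log(2) / math.log(1 + daily_growth)


if __name__ == "__main__":
    for rate in (0.42, 0.04):
        print(f"{rate:.0%} per day -> doubles every {doubling_time_days(rate):.1f} days")
```

At 42% per day cases double about every 2 days; at 4% per day it takes about 17.7 days.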
Garbage in, garbage out! Can't build good models with data that's terrible. Inconsistent testing, different reporting methods, lying from governments.
Completely tangential observation: What's the point of using sketch scribbles over the diagrams? They could just make it in PowerPoint and simplify. It would be easier to read as well. Decoration for the sake of decoration? Why?
Don’t all these models ignore the effects of warmer weather? A few studies have suggested there’s at least some effect.
We know it's hard Nate, it's hard for everyone else too.
That being said, I'd pay good money to hear his latest numbers...
It's novelty. Basically all models are based on assumptions in a closed system. As one gets more information, they add it to the closed system, eventually making it easier to predict. As we know, reality is extremely heterogeneous and an open system with too many interactions. This in principle makes all models practically wrong. But it does not mean that they are not useful. (At least they are useful to make scary graphs that convince presidents).
I had a lame model that is currently predicting a lower death rate but a similar trend [1]. In this model my assumption is that the lockdown continues for 45 days. The result, which I regret having seen, shows a scary number of 600k deaths after 39 days. So I'm hoping for something spectacular to happen, such as a vaccine, a drug, the sun etc., that I would use to change my prediction.
Because people have unrealistic expectations.
Best article I've read through this entire pandemic.
What about our models for global warming?
Very odd little diagrams to show model components in this article - why don't they use digraphs like everyone who knows how to do stats?
oh...
Who thought it would be easy? One person can decide to break quarantine and infect other people.