Replies: 5 comments 7 replies
-
Hey Nick. I fully support the idea of adding uncertainty to the model, but my belief is that the stochasticity of the model is potentially more important than parameter variation. I saw that you removed the Poisson-sampling-based code from the codebase and closed #182, which I was a bit disappointed by. I welcome you to model it yourself, but there is ample evidence that modeling the heterogeneity and stochasticity of networks leads to completely different emergent behavior than the deterministic, homogeneous variants. I've done a bit of lit review here: https://github.com/understand-covid/proposal/blob/master/model/Prior%20Art.md

If you end up agreeing, my argument is that adding parameter uncertainty actually creates the potential to use the model more irresponsibly (purporting that something is a worst-case scenario, for example). I'd urge that, if you do end up implementing it, some additional disclaimer language be added to the modal.

If you end up going forward with some sort of sensitivity analysis, I think the bigger implementation problem is interactions between variables; you'll likely have to start doing MCMC, stochastic calculus, or approach things from a Bayesian perspective, all of which will be computationally expensive in the browser. From a UI implementation perspective, I think a toggle that changes each parameter to have two inputs (for a min and a max) would be reasonable UX.
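For concreteness, here is a minimal sketch of what such a min/max toggle could feed into, assuming a simple independent uniform draw per parameter. The names (`ParamRange`, `runModel`, `sensitivityEnsemble`) are hypothetical, not identifiers from the actual codebase, and the independent sampling is exactly where the interaction problem mentioned above shows up, since correlations between parameters are ignored:

```typescript
// Hypothetical sketch only; ParamRange, runModel, and sensitivityEnsemble
// are illustrative names, not identifiers from the actual codebase.

interface ParamRange {
  min: number;
  max: number;
}

type ParamRanges = Record<string, ParamRange>;

// Stand-in for the existing deterministic model run, returning a trajectory.
function runModel(params: Record<string, number>): number[] {
  // ... integrate the ODE system with these point estimates ...
  return [];
}

// One independent uniform draw from a [min, max] range.
function uniformSample(range: ParamRange): number {
  return range.min + Math.random() * (range.max - range.min);
}

// Draw nRuns parameter sets and run the model for each. The draws are
// independent across parameters, so parameter interactions are not captured.
function sensitivityEnsemble(ranges: ParamRanges, nRuns: number): number[][] {
  const trajectories: number[][] = [];
  for (let i = 0; i < nRuns; i++) {
    const sample: Record<string, number> = {};
    for (const [name, range] of Object.entries(ranges)) {
      sample[name] = uniformSample(range);
    }
    trajectories.push(runModel(sample));
  }
  return trajectories;
}
```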
-
Hey John, I fully agree that heterogeneity of the underlying dynamics is an important effect that the model completely ignores right now. However, if one wants to capture the correct effects of stochasticity here, adding a Langevin term (i.e. Poisson sampling) isn't sufficient and will inaccurately capture the statistics of the stochastic process you are trying to simulate. Things one needs to worry about at this level of granularity are spatial processes, social structure, etc. Parameterizing everything becomes difficult, so we opted, for the time being, for an ODE approach that hopefully approximates the mean of such a process.

As you allude to in your comment, doing this properly requires either an MCMC or an agent-based model. I would love to, in the future, have this type of model running in the background on a back-end for users. However, within the context of the model as it is now, I don't think adding Poisson error bars captures the relevant stochastic dynamics at all. So think of this as a future implementation as time permits.
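To make the distinction concrete, here is a toy sketch (illustrative only, not the removed code) of a single S -> I transition over one time step: the deterministic step uses the expected count, while a Poisson ("tau-leaping") step draws a realized count. The mean matches the ODE, but the variance and the low-count behavior (e.g. stochastic extinction) do not:

```typescript
// Toy sketch of the deterministic vs. Poisson-sampled step, not the removed code.

// Knuth's algorithm; adequate for the small expected counts of a single step.
function samplePoisson(lambda: number): number {
  const L = Math.exp(-lambda);
  let k = 0;
  let p = 1;
  do {
    k += 1;
    p *= Math.random();
  } while (p > L);
  return k - 1;
}

// Deterministic (ODE) step: the expected number of new infections in dt.
function expectedNewInfections(beta: number, S: number, I: number, N: number, dt: number): number {
  return beta * S * (I / N) * dt;
}

// Poisson ("tau-leaping") step: draw the realized count with that mean.
// Same mean as the ODE step, different variance and extinction behavior.
function sampledNewInfections(beta: number, S: number, I: number, N: number, dt: number): number {
  return samplePoisson(expectedNewInfections(beta, S, I, N, dt));
}
```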
-
Nick, thanks for the reply. Apologies that I didn't communicate clearly; I guess that is the danger of very late-night posts. I was not trying to suggest that using a Poisson prior was the right approach, but instead trying to suggest that moving in the direction of treating things as an SDE was the right idea. I think that Poisson priors are likely a bad prior, given that most of the empirical data has different mean and variance.

However, let's do a thought experiment for a moment. If you agree that an SDE is the more accurate way of modeling the phenomena at hand, how do we define the current formulation as an SDE? The answer lies pretty clearly in the codebase already: when you removed the Poisson sampling, you left the

While I don't like the idea of doing Ito calculus over breakfast, the truth is that almost every system that we model is more appropriately modeled with a set of SDEs (and in most non-trivial systems, those happen to be stochastic convolutional equations). ODEs can be seen simply as a more compact way of representing SDEs with extremely restrictive priors on the random variables (a delta function) so as to free up computational resources. In places where variance is very low with respect to the mean, this can be a reasonable approximation. In places where that is not true, my belief is that relaxing to even a weakly informative prior is always an improvement.

That being said, my understanding of your proposal is that you would add stochasticity to the initial conditions, and then keep those variables fixed within each run; essentially you'd be swapping the delta function out for another distribution at time t0 only. If the belief is that the variables are in fact near-zero variance, and we simply don't know the exact value, then this is a reasonable way of modeling things. But it sounds like you (like me) believe that social structure and individual-level variation are crucial to understanding the emergent dynamics of this highly heterogeneous system.

I believe that injecting stochasticity into the input parameters at t0 is certainly an improvement to the model. However, my concern is that adding error bars is potentially misleading; someone could interpret it as a fully stochastic model and read the upper bound as an actual 'worst case' scenario, when the stochasticity of network dynamics may completely change the bounds. At least with the deterministic ODE solution, that interpretation is not possible (though it has plenty of opportunity to be misunderstood in other ways). Even more concerning is the fact that SDEs often have different emergent dynamics than their ODE counterparts, so even the shape of the curve could be radically different, not just the bounds.

Let's take a more concrete example of how 'fat tails' emerge in reality. A 'hub'-based network graph, where a few nodes have extremely high connectivity but their child nodes (that are not also hubs) have extremely low degree and form a long serial chain of connections, will have very different dynamics than a homogeneous network in terms of how an infection (especially one with high latency) will spread. In the firewall world, this is why most network topologies have moved to a multi-tier model. In the epidemiological world, this is why viruses often take longer to reach rural areas, but can spread quickly within those rural areas once the virus makes its way there, because of the high degree within that subgraph.

This is the reason why, with COVID-19, one of the first enacted policies was to ban large gatherings. Changing the network topology through policy dramatically changes the system dynamics. As such, my belief is that moving into the SDE world (with at least weakly informative priors, and then moving toward the empirical distributions as data become available) is a worthwhile investment even in the short term. I understand, however, that it may not be in the scope of this project and that there may be higher priorities on your end.
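To make the t0-only versus full-SDE distinction above concrete, here is a toy sketch on scalar dynamics dx = r x dt + sigma x dW. The names (`randn`, `runWithParameterUncertainty`, `runAsSde`) and the toy dynamics are purely illustrative: (a) draws the parameter once at t0 and integrates deterministically, while (b) injects noise at every step via Euler-Maruyama:

```typescript
// Toy scalar dynamics, illustrative only; names are made up for this sketch.

// Standard normal draw via the Box-Muller transform.
function randn(): number {
  const u = 1 - Math.random(); // avoid log(0)
  const v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// (a) Uncertainty injected at t0 only: draw r once, then integrate deterministically.
function runWithParameterUncertainty(rMean: number, rStd: number, x0: number, dt: number, steps: number): number[] {
  const r = rMean + rStd * randn(); // stochastic only here
  const xs = [x0];
  for (let i = 0; i < steps; i++) {
    xs.push(xs[i] + r * xs[i] * dt); // deterministic Euler step
  }
  return xs;
}

// (b) Full SDE, dx = r x dt + sigma x dW, via Euler-Maruyama: noise enters every step.
function runAsSde(r: number, sigma: number, x0: number, dt: number, steps: number): number[] {
  const xs = [x0];
  for (let i = 0; i < steps; i++) {
    const dW = Math.sqrt(dt) * randn();
    xs.push(xs[i] + r * xs[i] * dt + sigma * xs[i] * dW);
  }
  return xs;
}
```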
-
So, given that the model now gives upper and lower bounds, can you describe what they represent?
-
Hey all,
This is a topic that needs to be addressed ASAP. As our model is starting to see more usage, the fully deterministic approach to forecasting is inadequate at best. We need to start forecasting a range of scenarios given distributions of input parameters and reporting the mean and standard deviation of the results (or confidence intervals). While there is also variance associated with the fundamental stochasticity of the dynamics, I think this is less important for now.
We need a way to input our degree of certainty for a given input, e.g. R0 is known to be 3.2 with a 5% error on it. This is a potential UI nightmare if every parameter has another box associated with an error bar. Can we brainstorm a better way for the user to input this -> change to a drag bar for an interval of possible values? I view this as an incredibly important next step for the website.
The alternative is that we don't allow error/uncertainty to be toggled by the user and instead put error bars on the parameters ourselves to report confidence intervals.
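As a rough sketch of the ensemble version of this, assuming R0 is sampled from Normal(3.2, 5% relative error) and `runScenario` stands in for the real model entry point (both names and the distribution are placeholders, not the actual implementation):

```typescript
// Sketch only: sample R0, run the existing deterministic model per draw,
// and report the pointwise mean and standard deviation across runs.

// Standard normal draw via the Box-Muller transform.
function randn(): number {
  const u = 1 - Math.random();
  const v = Math.random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Stand-in for the existing deterministic run, parameterized by R0.
function runScenario(r0: number): number[] {
  // ... deterministic ODE integration ...
  return [];
}

function ensembleStats(r0Mean: number, relError: number, nRuns: number): { mean: number[]; std: number[] } {
  const runs: number[][] = [];
  for (let i = 0; i < nRuns; i++) {
    const r0 = r0Mean * (1 + relError * randn());
    runs.push(runScenario(r0));
  }
  const length = runs[0]?.length ?? 0;
  const mean: number[] = new Array(length).fill(0);
  const std: number[] = new Array(length).fill(0);
  for (let t = 0; t < length; t++) {
    const values = runs.map((run) => run[t]);
    const m = values.reduce((a, b) => a + b, 0) / values.length;
    const variance = values.reduce((a, b) => a + (b - m) ** 2, 0) / values.length;
    mean[t] = m;
    std[t] = Math.sqrt(variance);
  }
  return { mean, std };
}

// Usage, e.g.: const { mean, std } = ensembleStats(3.2, 0.05, 200);
```

The same aggregation would work whether the ranges come from user input or from error bars we set ourselves; only the sampling distribution changes.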