5 research outputs found

    Particle filtering in compartmental projection models

    Simulation models are important tools for real-time forecasting of pandemics. Models help health decision makers examine interventions and secure strong guidance when anticipating outbreak evolution. However, models usually diverge from real observations. Stochastic factors in pandemic systems, such as changes in human contact patterns, play a substantial role in disease transmission and are not usually captured in traditional dynamic models. In addition, models of emerging diseases face the challenge of limited epidemiological knowledge about the natural history of the disease. Even when information about natural history is available -- for example, for endemic seasonal diseases -- transmission models are often simplified and involve omissions. Available data streams can provide a view of the early days of a pandemic, but fail to predict how the pandemic will evolve. Recent developments in computational statistics algorithms, such as Sequential Monte Carlo and Markov Chain Monte Carlo, make it possible to create models based on historical data and to re-ground models based on ongoing observations. The objective of this thesis is to combine particle filtering -- a Sequential Monte Carlo algorithm -- with system dynamics models of pandemics. We developed particle filtering models that can be recurrently re-grounded as new observations become available. To this end, we also examined the effectiveness of this arrangement, which is subject to specifics of the configuration (e.g., frequency of data sampling). While clinically diagnosed cases are a valuable incoming data stream during an outbreak, a new generation of geo-spatially specific data sources, such as search volumes, can serve as a complementary data resource to clinical data. As another contribution, we used particle filtering in a model that can be re-grounded based on both clinical and search volume data.
Our results indicate that particle filtering in combination with compartmental models provides accurate projection systems for the estimation of model states and model parameters (particularly compared to traditional calibration methodologies and in the context of emerging communicable diseases). The results also suggest that more frequent sampling of clinical data markedly improves predictive accuracy. They further show that the assumptions made regarding the parameters of the particle filter itself and changes in contact rate were robust to the amount of empirical data available since the beginning of the outbreak and to the inter-observation interval. Finally, the results support the use of data from the Google search API along with clinical data.
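To make the idea of recurrently re-grounding a compartmental model concrete, the following is a minimal sketch of a bootstrap particle filter wrapped around a discrete-time SIR model. It is not the thesis's model: the Poisson observation model, the random walk on the log contact rate, the parameter values, and all function names are assumptions chosen for illustration.

```python
import math
import random


def poisson_loglik(k, lam):
    """Log-probability of observing k reported cases when lam are expected."""
    lam = max(lam, 1e-9)  # guard against log(0)
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def step(particle, rng, gamma=0.25, sigma=0.1):
    """Advance one particle by one observation interval (Euler step, dt = 1).

    The log contact rate follows a random walk (an assumption), which lets
    the filter track changes in contact patterns over time.
    """
    s, i, r, log_beta = particle
    log_beta += rng.gauss(0.0, sigma)
    n = s + i + r
    new_inf = min(s, math.exp(log_beta) * s * i / n)
    new_rec = gamma * i
    return (s - new_inf, i + new_inf - new_rec, r + new_rec, log_beta), new_inf


def particle_filter(observations, n_particles=500, population=10000.0, seed=0):
    """Return the weighted mean infectious count after each observation."""
    rng = random.Random(seed)
    particles = [
        (population - 10.0, 10.0, 0.0, math.log(rng.uniform(0.2, 1.0)))
        for _ in range(n_particles)
    ]
    estimates = []
    for y in observations:
        proposed, log_w = [], []
        for p in particles:
            new_p, new_inf = step(p, rng)
            proposed.append(new_p)
            log_w.append(poisson_loglik(y, new_inf))
        # Normalise weights in a numerically stable way.
        m = max(log_w)
        weights = [math.exp(lw - m) for lw in log_w]
        total = sum(weights)
        weights = [w / total for w in weights]
        estimates.append(sum(w * p[1] for w, p in zip(weights, proposed)))
        # Resample: particles that explain the new data well are duplicated.
        particles = rng.choices(proposed, weights=weights, k=n_particles)
    return estimates


# Synthetic "reported cases" from a noise-free run with a known contact rate.
gen = random.Random(1)
truth = (9990.0, 10.0, 0.0, math.log(0.6))
obs = []
for _ in range(30):
    truth, new_cases = step(truth, gen, sigma=0.0)
    obs.append(round(new_cases))

est = particle_filter(obs)
```

Each new observation reweights and resamples the particle population, so both the state (S, I, R) and the drifting contact rate are pulled back toward values consistent with the data. Shortening the inter-observation interval in such a sketch means the filter corrects itself more often.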

    Architectures and GPU-Based Parallelization for Online Bayesian Computational Statistics and Dynamic Modeling

    Recent work demonstrates that coupling Bayesian computational statistics methods with dynamic models can facilitate the analysis of complex systems associated with diverse time series, including those involving social and behavioural dynamics. Particle Markov Chain Monte Carlo (PMCMC) methods constitute a particularly powerful class of Bayesian methods combining aspects of batch Markov Chain Monte Carlo (MCMC) and the sequential Monte Carlo method of Particle Filtering (PF). PMCMC can flexibly combine theory-capturing dynamic models with diverse empirical data. Online machine learning is a subcategory of machine learning algorithms characterized by sequential, incremental execution as new data arrives, which can give updated results and predictions with growing sequences of available incoming data. While many machine learning and statistical methods are adapted to online algorithms, PMCMC is one example of the many methods whose compatibility with and adaption to online learning remains unclear. In this thesis, I proposed a data-streaming solution supporting PF and PMCMC methods with dynamic epidemiological models and demonstrated several successful applications. By constructing an automated, easy-to-use streaming system, analytic applications and simulation models gain access to arriving real-time data to shorten the time gap between data and resulting model-supported insight. The well-defined architecture design emerging from the thesis would substantially expand traditional simulation models' potential by allowing such models to be offered as continually updated services. Contingent on sufficiently fast execution time, simulation models within this framework can consume the incoming empirical data in real-time and generate informative predictions on an ongoing basis as new data points arrive. 
In a second line of work, I investigated the platform's flexibility and capability by extending this system to support the use of a powerful class of PMCMC algorithms with dynamic models while ameliorating such algorithms' traditionally stiff performance limitations. Specifically, this work designed and implemented a GPU-enabled parallel version of a PMCMC method with dynamic simulation models. The resulting codebase has readily enabled researchers to adapt their models to state-of-the-art statistical inference methods, and ensures that the computation-heavy PMCMC method can perform significant sampling between the arrival of successive data points. Investigating this method's impact with several realistic PMCMC application examples showed that GPU-based acceleration allows for up to 160x speedup compared to a corresponding CPU-based version not exploiting parallelism. The GPU-accelerated PMCMC and the stream processing system can complement each other, jointly providing researchers with a powerful toolset to greatly accelerate learning and secure additional insight from the high-velocity data increasingly prevalent within social and behavioural spheres. The design philosophy applied supported a platform with broad generalizability and potential for ready future extensions. The thesis discusses common barriers and difficulties in designing and implementing such systems and offers solutions to solve or mitigate them.

    Incorporating Particle Filtering and System Dynamic Modelling in Infection Transmission of Measles and Pertussis

    Childhood viral and bacterial infections remain an important public health problem, and research into their dynamics has broader scientific implications for understanding both dynamical systems and associated methodologies at the population level. Measles and pertussis are two important childhood infectious diseases. Measles is a highly transmissible disease and is one of the leading causes of death among children under 5 globally. Pertussis (whooping cough) is another common childhood infectious disease, which is most harmful for babies and young children and can be deadly. While ongoing surveillance data and, more recently, dynamic models offer insight into measles and pertussis dynamics, both suffer notable shortcomings when applied to outbreak prediction. In this thesis, I apply the Sequential Monte Carlo approach of particle filtering, incorporating reported measles and pertussis incidence for Saskatchewan during the pre-vaccination era, using adaptations of previously contributed measles and pertussis compartmental models. To secure further insight, I also perform particle filtering on age-structured adaptations of the models. For some models, I further consider two different methods of configuring the contact matrix. The results indicate that, when used with a suitable dynamic model, particle filtering can offer high predictive capacity for measles and pertussis dynamics and outbreak occurrence in a low-vaccination context. Using the most competitive models as evaluated by predictive accuracy, I performed prediction and outbreak classification analysis. The prediction results demonstrated that the most competitive models could predict measles and pertussis outbreak patterns and classify whether or not there will be an outbreak in the next month (area under the ROC curve of 0.89 for measles and 0.91 for pertussis).
I conclude that anticipating the outbreak dynamics of measles and pertussis in low-vaccination regions by applying particle filtering with simple measles and pertussis transmission models, and incorporating time series of reported case counts, is a valuable technique to assist public health authorities in estimating the risk and magnitude of measles and pertussis outbreaks. Such an approach offers a particularly strong value proposition for other pathogens with little-known dynamics and important latent drivers, and in the context of the growing number of high-velocity electronic data sources. Strong additional benefits are also likely to be realized from extending the application of this technique to highly vaccinated populations.
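The outbreak classification above is scored by the area under the ROC curve. As an illustration of what that number measures, here is a small sketch that computes AUC as the probability that a randomly chosen outbreak month receives a higher predicted score than a randomly chosen non-outbreak month. The labels and scores are made-up illustrative values, not data from the thesis, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# 1 = outbreak in the following month, 0 = no outbreak (illustrative only).
labels = [0, 0, 1, 0, 1, 1]
# Predicted outbreak scores, e.g. the fraction of particles projecting one.
scores = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7]

auc = roc_auc(labels, scores)  # 8 of 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so values near 0.9 indicate that outbreak months are ranked above non-outbreak months in roughly nine out of ten pairs.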

    Transmission Modeling with Smartphone-based Sensing

    Infectious disease spread is difficult to measure and model accurately. Even for well-studied pathogens, uncertainties remain regarding the dynamics of mixing behavior and how to balance simulation-generated estimates with empirical data. Smartphone-based sensing data promises the availability of inferred proximate contacts, with which we can improve transmission models. This dissertation addresses the problem of informing transmission models with proximity contact data by breaking it down into three sub-questions. Firstly, can proximity contact data inform transmission models? To answer this question, an extended-Kalman-filter-enhanced System Dynamics Susceptible-Infectious-Removed (EKF-SD-SIR) model demonstrated the filtering approach as a framework for informing System Dynamics models with proximity contact data. This combination results in recurrently regrounded system status as empirical data arrive throughout disease transmission simulations---simultaneously considering empirical data accuracy and growing simulation error between measurements, and supporting estimation of changing model parameters. However, as revealed by this investigation, this filtering approach is limited by the quality and reliability of sensing-informed proximate contacts, which leads to the dissertation's second and third questions---investigating the impact of temporal and spatial resolution on sensing-inferred proximity contact data for transmission models. GPS co-location and Bluetooth beaconing are two common measurement modalities for sensing proximity contacts, with different underlying technologies and tradeoffs. However, both measurement modalities have shortcomings and are prone to false positives or negatives when used to detect proximate contacts because unmeasured environmental influences bias the data. Will differences in sensing modalities impact transmission models informed by proximity contact data?
The second part of this dissertation compares GPS- and Bluetooth-inferred proximate contacts by assessing their impact on simulated attack rates in corresponding proximate-contact-informed agent-based Susceptible-Exposed-Infectious-Recovered (ABM-SEIR) models of four distinct contagious diseases. Results show that the inferred proximate contacts resulting from these two measurement modalities are different and give rise to significantly different attack rates across multiple data collections and pathogens. While the advent of commodity mobile devices has eased the collection of proximity contact data, battery capacity and associated costs impose tradeoffs between the frequency and scanning duration used for proximate-contact detection. The choice of a balanced sensing regime involves specifying temporal resolutions and interpreting sensing data---depending on circumstances such as the characteristics of a particular pathogen, accompanying disease, and underlying population. How will the temporal resolution of sensing impact transmission models informed by proximity contact data? Furthermore, how will circumstances alter the impact of temporal resolution? The third part of this dissertation investigates the impacts of sensing regimes on findings from two sampling methods of sensing at widely varying inter-observation intervals by synthetically downsampling proximity contact data from five contact network studies---with each of these five studies measuring participant-participant contact every 5 minutes for durations of four or more weeks. The impact of downsampling is evaluated through ABM-SEIR simulations at both the population and individual levels for 12 distinct contagious diseases and associated variants of concern. Studies in this part find that, for epidemiological models employing proximity contact data, both the observation paradigm and the inter-observation interval configured to collect proximity contact data affect the simulation results.
Moreover, the impact is subject to the population characteristics and to pathogen infectiousness (as reflected, for example, by the basic reproduction number, $R_0$). By comparing the performance of two sampling methods of sensing, we found that in most cases, periodically observing for a certain duration can collect proximity contact data that allows agent-based models to produce a reasonable estimate of the attack rate. However, higher-resolution data are preferred for modeling individual infection risk. Findings from this part of the dissertation represent a step towards providing the empirical basis for guidelines to inform data collection that is at once efficient and effective. This dissertation addresses the problem of informing transmission models with proximity contact data in three steps. Firstly, the demonstration of an EKF-SD-SIR model suggests that the filtering approach could improve System Dynamics transmission models by leveraging proximity contact data. In addition, experiments with the EKF-SD-SIR model revealed that the filtering approach is constrained by the limited quality and reliability of sensing-data-inferred proximate contacts. The following two parts of this dissertation investigate spatial-temporal factors that could impact the quality and reliability of sensor-collected proximity contact data. In the second step, the impact of spatial resolution is illustrated by differences between two typical sensing modalities---Bluetooth beaconing versus GPS co-location. Experiments show that, in general, proximity contact data collected with Bluetooth beaconing lead to transmission models with results different from those driven by proximity contact data collected with GPS co-location. Awareness of the differences between sensing modalities can aid researchers in incorporating proximity contact data into transmission models.
Finally, in the third step, the impact of temporal resolution is elucidated by investigating the differences between results of transmission models driven by proximity contact data collected at varying observation frequencies. These differences are evaluated under alternative assumptions regarding sampling method, disease/pathogen type, and the underlying population. Experiments show that the impact of sensing regimes is influenced by the type of disease/pathogen and the underlying population, while occasional sampling can be a reasonable choice across all situations. This dissertation demonstrated the value of a filtering approach to enhance transmission models with sensor-collected proximity contact data, and explored spatial-temporal factors that affect the accuracy and reliability of sensor-collected proximity contact data. Furthermore, this dissertation suggested guidance for future sensor-based proximity contact data collection and highlighted needs and opportunities for further research on sensing-inferred proximity contact data for transmission models.
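The synthetic downsampling described in this abstract can be illustrated with a small sketch. The abstract does not specify the two sampling methods in detail, so the regimes below (a single snapshot per interval versus a short observation window per interval), the record layout, and the function names are all assumptions made for illustration.

```python
def downsample(contacts, interval, duration=1):
    """Keep contacts whose time slot falls in the observed part of each cycle.

    contacts: iterable of (slot, person_a, person_b), one slot per 5 minutes.
    interval: cycle length in slots (e.g. 6 means one cycle every 30 minutes).
    duration: number of consecutive slots observed at the start of each cycle.
    """
    return [c for c in contacts if c[0] % interval < duration]


def contact_pairs(contacts):
    """Set of distinct unordered pairs that were ever seen in contact."""
    return {frozenset((a, b)) for _, a, b in contacts}


# Illustrative full-resolution records (not data from the dissertation).
full = [
    (0, "a", "b"), (1, "a", "b"), (2, "a", "c"),
    (5, "b", "c"), (6, "a", "b"), (7, "b", "c"), (12, "a", "c"),
]

snapshot = downsample(full, interval=6, duration=1)  # one slot per cycle
window = downsample(full, interval=6, duration=2)    # two slots per cycle

pairs_full = contact_pairs(full)
pairs_snapshot = contact_pairs(snapshot)
pairs_window = contact_pairs(window)
```

In this toy data the snapshot regime never observes the b-c contact, while the two-slot window recovers every pair present in the full-resolution records. Feeding each downsampled contact list into the same agent-based model and comparing attack rates is the kind of experiment the abstract describes.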

    A spatial model with vaccinations for COVID-19 in South Africa

    Mini Dissertation (MSc (Advanced Data Analytics))--University of Pretoria, 2023. Rarely has the world undertaken a public health effort equal in scale or scope to the one it faced in response to the COVID-19 pandemic. Countries around the world implemented government interventions such as lockdowns and national vaccination campaigns to gain control of the COVID-19 pandemic that tore across the globe in 2020. Despite best efforts and determination, thousands within the global population continued to suffer and die from COVID-19 day after day. Nevertheless, much was learned about designing mass vaccination plans and implementing mass vaccination roll-outs throughout the world. When analysing the causes and effects of the pandemic and when proposing intervention and prevention mechanisms to counter it, analysts in the health sector often apply mathematical models. Within the context described above, the main objective of this study is to improve on the previously published spatial SEIR model for South Africa by including a compartment for spatial vaccination. The study further aims to assess the validity, reliability and accuracy of the new model, given a socially heterogeneous and mobile population. The conclusion of this study is that the proposed model shows promising results in predicting the number of cases as well as the peak point and longevity of the wave. The study further concludes that factors such as immunity, lockdown levels, infectiousness and virulence are the main drivers of the spread of COVID-19. Statistics. MSc (Advanced Data Analytics). Unrestricte