thesis

Bayesian inference and model selection for partially observed stochastic epidemics.

Abstract

Over the past decades, statistical models have been established as an important tool for understanding the transmission dynamics of infectious diseases. Inference in such models can be challenging due to the strong dependencies in the actual epidemic process, as well as the fact that observations often rely on diagnostic tests that have imperfect sensitivities. Moreover, samples are often taken with very low temporal resolution, which leads to the actual dynamics being only partially observed. Data augmentation techniques implemented within the framework of Markov chain Monte Carlo (MCMC) methods can tackle these problems by taking into account the unobserved dynamics of transmission and thus have been widely employed in practice. Despite the methodological advances in the context of partially observed epidemic models, several open challenges remain to be addressed. One of the key challenges is the establishment of model comparison techniques that can be efficiently applied in problems involving a large amount of missing information. In this thesis, we describe a framework based on importance sampling which provides estimates of the marginal likelihood and is well suited for applications in this complex setting. Until recently, the study of infectious diseases in large-scale populations has been challenging due to the computationally intensive methods needed to fit these models. One further contribution of this thesis is the development of a data augmentation MCMC algorithm that can be used in both Markovian and non-Markovian epidemic models. Our algorithm achieves good computational efficiency and therefore can be viewed as an alternative to existing approaches, particularly for applications on big datasets. The last part of the thesis is concerned with epidemic data containing additional information regarding the strain of a pathogen with which individuals are infected.
Quantifying the interactions between the different strains of pathogens is crucial in order to obtain a complete understanding of the disease, but statistical methods for this type of problem are still in the early stages of development. Motivated by this demand, we construct a model that incorporates this additional information and propose a statistical algorithm for inference. The model improves upon existing methods in the sense that it allows for both imperfect diagnostic test sensitivities and strain misclassification. Finally, extensive simulation studies are conducted in order to assess the performance of our methods, while the utility of the developed methodologies is demonstrated on data obtained from two longitudinal studies of Escherichia coli in cattle.
