792 research outputs found
πVAE: Encoding stochastic process priors with variational autoencoders
Stochastic processes provide a mathematically elegant way to model complex data.
In theory, they provide flexible priors over function classes that can encode a
wide range of interesting assumptions. In practice, however, efficient
inference by optimisation or marginalisation is difficult, a problem further
exacerbated with big data and high dimensional input spaces. We propose a novel
variational autoencoder (VAE) called the prior encoding variational autoencoder
(πVAE). The πVAE is finitely exchangeable and Kolmogorov consistent,
and thus is a continuous stochastic process. We use πVAE to learn low
dimensional embeddings of function classes. We show that our framework can
accurately learn expressive function classes such as Gaussian processes, but
also properties of functions to enable statistical inference (such as the
integral of a log Gaussian process). For popular tasks, such as spatial
interpolation, πVAE achieves state-of-the-art performance both in terms of
accuracy and computational efficiency. Perhaps most usefully, we demonstrate
that the low dimensional independently distributed latent space representation
learnt provides an elegant and scalable means of performing Bayesian inference
for stochastic processes within probabilistic programming languages such as
Stan
Associated patterns of insecticide resistance in field populations of malaria vectors across Africa.
The development of insecticide resistance in African malaria vectors threatens the continued efficacy of important vector control methods that rely on a limited set of insecticides. To understand the operational significance of resistance we require quantitative information about levels of resistance in field populations to the suite of vector control insecticides. Estimation of resistance is complicated by the sparsity of observations in field populations, variation in resistance over time and space at local and regional scales, and cross-resistance between different insecticide types. Using observations of the prevalence of resistance in mosquito species from the Anopheles gambiae complex sampled from 1,183 locations throughout Africa, we applied Bayesian geostatistical models to quantify patterns of covariation in resistance phenotypes across different insecticides. For resistance to the three pyrethroids tested, deltamethrin, permethrin, and λ-cyhalothrin, we found consistent forms of covariation across sub-Saharan Africa and covariation between resistance to these pyrethroids and resistance to DDT. We found no evidence of resistance interactions between carbamate and organophosphate insecticides or between these insecticides and those from other classes. For pyrethroids and DDT we found significant associations between predicted mean resistance and the observed frequency of mutations in the Vgsc gene in field mosquito samples, with DDT showing the strongest association. These results improve our capacity to understand and predict resistance patterns throughout Africa and can guide the development of monitoring strategies
Modelling adult Aedes aegypti and Aedes albopictus survival at different temperatures in laboratory and field settings.
BACKGROUND: The survival of adult female Aedes mosquitoes is a critical component of their ability to transmit pathogens such as dengue viruses. One of the principal determinants of Aedes survival is temperature, which has been associated with seasonal changes in Aedes populations and limits their geographical distribution. The effects of temperature and other sources of mortality have been studied in the field, often via mark-release-recapture experiments, and under controlled conditions in the laboratory. Survival results differ and reconciling predictions between the two settings has been hindered by variable measurements from different experimental protocols, lack of precision in measuring survival of free-ranging mosquitoes, and uncertainty about the role of age-dependent mortality in the field. METHODS: Here we apply generalised additive models to data from 351 published adult Ae. aegypti and Ae. albopictus survival experiments in the laboratory to create survival models for each species across their range of viable temperatures. These models are then adjusted to estimate survival at different temperatures in the field using data from 59 Ae. aegypti and Ae. albopictus field survivorship experiments. The uncertainty at each stage of the modelling process is propagated through to provide confidence intervals around our predictions. RESULTS: Our results indicate that adult Ae. albopictus has higher survival than Ae. aegypti in the laboratory and field; however, Ae. aegypti can tolerate a wider range of temperatures. A full breakdown of survival by age and temperature is given for both species. The differences between laboratory and field models also give insight into the relative contributions to mortality from temperature, other environmental factors, and senescence and over what ranges these factors can be important. CONCLUSIONS: Our results support the importance of producing site-specific mosquito survival estimates.
By including fluctuating temperature regimes, our models provide insight into seasonal patterns of Ae. aegypti and Ae. albopictus population dynamics that may be relevant to seasonal changes in dengue virus transmission. Our models can be integrated with Aedes and dengue modelling efforts to guide and evaluate vector control, better map the distribution of disease and produce early warning systems for dengue epidemics
Bayesian inference of phylogenetic distances: revisiting the eigenvalue approach
Using genetic data to infer evolutionary distances between molecular sequence pairs based on a Markov substitution model is a common procedure in phylogenetics, in particular for selecting a good starting tree to improve upon. Many evolutionary patterns can be accurately modelled using substitution models that are available in closed form, including the popular general time reversible model (GTR) for DNA data. For more complex biological phenomena, such as variations in lineage-specific evolutionary rates over time (heterotachy), other approaches such as the GTR with rate variation (GTR+Γ) are required, but do not admit analytical solutions and do not automatically allow for likelihood calculations crucial for Bayesian analysis. In this paper, we derive a hybrid approach between these two methods, incorporating Γ(α,α)-distributed rate variation and heterotachy into a hierarchical Bayesian GTR-style framework. Our approach is differentiable and amenable to both stochastic gradient descent for optimisation and Hamiltonian Markov chain Monte Carlo for Bayesian inference. We show the utility of our approach by studying hypotheses regarding the origins of the eukaryotic cell within the context of a universal tree of life and find evidence for a two-domain theory
Deep learning and MCMC with aggVAE for shifting administrative boundaries: mapping malaria prevalence in Kenya
Model-based disease mapping remains a fundamental policy-informing tool in the fields of public health and disease surveillance. Hierarchical Bayesian models have emerged as the state-of-the-art approach for disease mapping since they are able to both capture structure in the data and robustly characterise uncertainty. When working with areal data, e.g. aggregates at the administrative unit level such as district or province, current models rely on the adjacency structure of areal units to account for spatial correlations and perform shrinkage. The goal of disease surveillance systems is to track disease outcomes over time. This task is especially challenging in crisis situations which often lead to redrawn administrative boundaries, meaning that data collected before and after the crisis are no longer directly comparable. Moreover, the adjacency-based approach ignores the continuous nature of spatial processes and cannot solve the change-of-support problem, i.e. when estimates are required to be produced at different administrative levels or levels of aggregation. We present a novel, practical, and easy to implement solution to solve these problems relying on a methodology combining deep generative modelling and fully Bayesian inference: we build on the recently proposed PriorVAE method able to encode spatial priors over small areas with variational autoencoders by encoding aggregates over administrative units. We map malaria prevalence in Kenya, a country in which administrative boundaries changed in 2010.
Bhatt, Ferguson, Flaxman, Gandy, Mishra, and Scott's reply to the discussion of ‘The Second Discussion Meeting on Statistical aspects of the Covid-19 Pandemic’
Seq2Seq Surrogates of Epidemic Models to Facilitate Bayesian Inference
Epidemic models are powerful tools in understanding infectious disease.
However, as they increase in size and complexity, they can quickly become
computationally intractable. Recent progress in modelling methodology has shown
that surrogate models can be used to emulate complex epidemic models with a
high-dimensional parameter space. We show that deep sequence-to-sequence
(seq2seq) models can serve as accurate surrogates for complex epidemic models
with sequence-based model parameters, effectively replicating seasonal and
long-term transmission dynamics. Once trained, our surrogate can predict
scenarios several thousand times faster than the original model, making it
ideal for policy exploration. We demonstrate that replacing a traditional
epidemic model with a learned simulator facilitates robust Bayesian inference
PriorVAE: Encoding spatial priors with VAEs for small-area estimation
Gaussian processes (GPs), implemented through multivariate Gaussian
distributions for a finite collection of data, are the most popular approach in
small-area spatial statistical modelling. In this context they are used to
encode correlation structures over space and can generalise well in
interpolation tasks. Despite their flexibility, off-the-shelf GPs present
serious computational challenges which limit their scalability and practical
usefulness in applied settings. Here, we propose a novel, deep generative
modelling approach to tackle this challenge, termed PriorVAE: for a particular
spatial setting, we approximate a class of GP priors through prior sampling and
subsequent fitting of a variational autoencoder (VAE). Given a trained VAE, the
resultant decoder allows spatial inference to become incredibly efficient due
to the low dimensional, independently distributed latent Gaussian space
representation of the VAE. Once trained, inference using the VAE decoder
replaces the GP within a Bayesian sampling framework. This approach provides
a tractable and easy-to-implement means of approximately encoding spatial priors
and facilitates efficient statistical inference. We demonstrate the utility of
our two-stage VAE approach on Bayesian, small-area estimation tasks
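The "prior sampling" stage described in the abstract above can be illustrated with a minimal sketch. It only generates the GP prior draws that a VAE would subsequently be fitted to; it is not the authors' implementation. The one-dimensional grid, the squared-exponential kernel, the lengthscale range, and the function names `rbf_kernel` and `sample_gp_prior` are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(x, lengthscale, variance=1.0):
    """Squared-exponential covariance matrix on 1-D locations x (assumed kernel)."""
    d = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)


def sample_gp_prior(x, n_draws, rng, lengthscale_range=(0.1, 1.0), jitter=1e-5):
    """Draw n_draws zero-mean GP realisations at fixed locations x.

    Each draw uses its own lengthscale, sampled uniformly from
    lengthscale_range (a hypothetical hyperprior), so the draws cover
    a class of GP priors rather than a single one.
    """
    draws = np.empty((n_draws, x.size))
    for i in range(n_draws):
        ls = rng.uniform(*lengthscale_range)
        # Jitter on the diagonal keeps the Cholesky factorisation stable.
        K = rbf_kernel(x, ls) + jitter * np.eye(x.size)
        L = np.linalg.cholesky(K)
        draws[i] = L @ rng.standard_normal(x.size)
    return draws


# Fixed "spatial setting": 50 locations on the unit interval (assumption).
x = np.linspace(0.0, 1.0, 50)
rng = np.random.default_rng(0)
train = sample_gp_prior(x, n_draws=200, rng=rng)  # one row per prior draw
```

In the two-stage approach the abstract describes, an array such as `train` would serve as training data for the VAE, and the trained decoder would then stand in for the GP during Bayesian sampling.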
A comparison of short-term probabilistic forecasts for the incidence of COVID-19 using mechanistic and statistical time series models
- …
