Bayesian neural networks via MCMC: a Python-based tutorial
Bayesian inference provides a methodology for parameter estimation and
uncertainty quantification in machine learning and deep learning methods.
Variational inference and Markov Chain Monte-Carlo (MCMC) sampling techniques
are used to implement Bayesian inference. In the past three decades, MCMC
methods have faced a number of challenges in being adapted to larger models
(such as in deep learning) and big data problems. Advanced proposals that
incorporate gradients, such as a Langevin proposal distribution, provide a
means to address some of the limitations of MCMC sampling for Bayesian neural
networks. Furthermore, MCMC methods have typically been confined to use by
statisticians and are still not prominent among deep learning researchers. We
present a tutorial for MCMC methods that covers simple Bayesian linear and
logistic models, and Bayesian neural networks. The aim of this tutorial is to
bridge the gap between theory and implementation via coding, given a general
sparsity of libraries and tutorials to this end. This tutorial provides code in
Python with data and instructions that enable their use and extension. We
provide results for some benchmark problems showing the strengths and
weaknesses of implementing the respective Bayesian models via MCMC. We
highlight the challenges in sampling multi-modal posterior distributions, in
particular for the case of Bayesian neural networks, and the need for further
improvement of convergence diagnostics.
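As context for the tutorial's scope, the core MCMC machinery it covers can be sketched in a few lines. The following is a hypothetical illustration (not the tutorial's actual code) of random-walk Metropolis sampling for a toy Bayesian linear regression; all variable names, priors, and tuning constants here are assumptions for the sketch.

```python
import numpy as np

# Synthetic data for a toy Bayesian linear regression y = w*x + b + noise.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
true_w, true_b, sigma = 2.0, -1.0, 0.5
y = true_w * x + true_b + rng.normal(scale=sigma, size=50)

def log_posterior(theta):
    w, b = theta
    log_lik = -0.5 * np.sum((y - (w * x + b)) ** 2) / sigma**2
    log_prior = -0.5 * (w**2 + b**2) / 10.0  # assumed N(0, 10) priors
    return log_lik + log_prior

samples = []
theta = np.zeros(2)                 # arbitrary initial state
log_p = log_posterior(theta)
for _ in range(5000):
    proposal = theta + rng.normal(scale=0.1, size=2)  # random-walk proposal
    log_p_new = log_posterior(proposal)
    if np.log(rng.uniform()) < log_p_new - log_p:     # Metropolis accept step
        theta, log_p = proposal, log_p_new
    samples.append(theta)

posterior = np.array(samples[1000:])  # discard burn-in
print(posterior.mean(axis=0))         # should be near (2.0, -1.0)
```

Gradient-based proposals such as Langevin dynamics, mentioned in the abstract, replace the symmetric random-walk proposal above with one centred on a gradient step of the log posterior, which is what makes MCMC viable for larger models like Bayesian neural networks.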
Partially Exchangeable Networks and Architectures for Learning Summary Statistics in Approximate Bayesian Computation
We present a novel family of deep neural architectures, named partially
exchangeable networks (PENs) that leverage probabilistic symmetries. By design,
PENs are invariant to block-switch transformations, which characterize the
partial exchangeability properties of conditionally Markovian processes.
Moreover, we show that any block-switch invariant function has a PEN-like
representation. The DeepSets architecture is a special case of PEN and we can
therefore also target fully exchangeable data. We employ PENs to learn summary
statistics in approximate Bayesian computation (ABC). When comparing PENs to
previous deep learning methods for learning summary statistics, our results are
highly competitive, both considering time series and static models. Indeed,
PENs provide more reliable posterior samples even when using less training
data.

Comment: Forthcoming in the Proceedings of ICML 2019. New comparisons with
several different networks. We now use the Wasserstein distance to produce
comparisons. Code available on GitHub. 16 pages, 5 figures, 21 tables.
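The abstract notes that DeepSets is the fully exchangeable special case of PEN. A minimal sketch of that special case (not the paper's implementation; the toy weights and layer sizes are assumptions) shows how sum-pooling yields a permutation-invariant learned summary statistic:

```python
import numpy as np

# Toy "inner" network phi (one linear layer + tanh) and "outer" network rho
# mapping the pooled features to 2 learned summary statistics.
rng = np.random.default_rng(1)
W_phi = rng.normal(size=(1, 8))
W_rho = rng.normal(size=(8, 2))

def summary(x):
    h = np.tanh(x[:, None] @ W_phi)  # phi applied to each observation
    pooled = h.sum(axis=0)           # sum-pooling => permutation invariance
    return pooled @ W_rho            # rho on the pooled representation

x = rng.normal(size=100)
s1 = summary(x)
s2 = summary(rng.permutation(x))     # any reordering gives the same summary
assert np.allclose(s1, s2)
```

PENs generalize this by being invariant only to block-switch transformations, the weaker symmetry appropriate for conditionally Markovian data, rather than to arbitrary permutations.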
Simulation-based Inference: From Approximate Bayesian Computation and Particle Methods to Neural Density Estimation
This doctoral thesis in computational statistics utilizes both Monte Carlo methods (approximate Bayesian computation and sequential Monte Carlo) and machine-learning methods (deep learning and normalizing flows) to develop novel algorithms for inference in implicit Bayesian models. Implicit models are those for which calculating the likelihood function is very challenging (and often impossible), but model simulation is feasible. The inference methods developed in the thesis are simulation-based inference methods since they leverage the possibility to simulate data from the implicit models. Several approaches are considered in the thesis: Papers II and IV focus on classical methods (sequential Monte Carlo-based methods), while Papers I and III focus on more recent machine-learning methods (deep learning and normalizing flows, respectively).

Paper I constructs novel deep learning methods for learning summary statistics for approximate Bayesian computation (ABC). To achieve this, Paper I introduces the partially exchangeable network (PEN), a deep learning architecture specifically designed for Markovian data (i.e., partially exchangeable data).

Paper II considers Bayesian inference in stochastic differential equation mixed-effects models (SDEMEMs). Bayesian inference for SDEMEMs is challenging due to their intractable likelihood function. Paper II addresses this problem by designing a novel Gibbs-blocking strategy in combination with correlated pseudo-marginal methods. The paper also discusses how custom particle filters can be adapted to the inference procedure.

Paper III introduces the novel inference method sequential neural posterior and likelihood approximation (SNPLA). SNPLA is a simulation-based inference algorithm that utilizes normalizing flows for learning both the posterior distribution and the likelihood function of an implicit model via a sequential scheme. By learning both the likelihood and the posterior, and by leveraging the reverse Kullback-Leibler (KL) divergence, SNPLA avoids ad-hoc correction steps and Markov chain Monte Carlo (MCMC) sampling.

Paper IV introduces the accelerated-delayed acceptance (ADA) algorithm. ADA can be viewed as an extension of the delayed-acceptance (DA) MCMC algorithm that leverages connections between the two likelihood ratios of DA to further accelerate MCMC sampling from the posterior distribution of interest, although our approach introduces an approximation. The main case study of Paper IV is a double-well potential stochastic differential equation (DWPSDE) model for protein-folding data (reaction coordinate data).
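For context on the two likelihood ratios that ADA exploits, here is a minimal sketch of the standard delayed-acceptance (DA) MCMC scheme that it extends, using toy Gaussian stand-ins for the expensive target and its cheap surrogate (all densities and tuning choices here are assumptions for illustration, not Paper IV's method):

```python
import numpy as np

rng = np.random.default_rng(2)

def log_target(theta):     # "expensive" log posterior (stand-in: N(3, 1))
    return -0.5 * (theta - 3.0) ** 2

def log_surrogate(theta):  # cheap approximation (stand-in: N(2.8, 1.2^2))
    return -0.5 * (theta - 2.8) ** 2 / 1.2**2

theta, samples = 0.0, []
for _ in range(20000):
    prop = theta + rng.normal(scale=1.0)
    # Stage 1: screen the proposal with the cheap surrogate ratio.
    if np.log(rng.uniform()) < log_surrogate(prop) - log_surrogate(theta):
        # Stage 2: correct with the ratio of target to surrogate ratios,
        # which preserves the exact target as the stationary distribution.
        log_alpha2 = (log_target(prop) - log_target(theta)) \
                   - (log_surrogate(prop) - log_surrogate(theta))
        if np.log(rng.uniform()) < log_alpha2:
            theta = prop
    samples.append(theta)

print(np.mean(samples[5000:]))  # should be close to the target mean, 3.0
```

Only proposals that survive the cheap first stage incur an evaluation of the expensive target, which is the source of DA's speed-up; ADA accelerates this further by exploiting the connection between the stage-1 and stage-2 ratios, at the cost of introducing an approximation.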