Search CORE

106,884 research outputs found

Bayesian neural networks via MCMC: a Python-based tutorial

Author: Chandra Rohitash
Chen Royce
Simmons Joshua
Publication venue
Publication date: 01/04/2023
Field of study

Bayesian inference provides a methodology for parameter estimation and uncertainty quantification in machine learning and deep learning methods. Variational inference and Markov Chain Monte-Carlo (MCMC) sampling techniques are used to implement Bayesian inference. In the past three decades, MCMC methods have faced a number of challenges in being adapted to larger models (such as in deep learning) and big data problems. Advanced proposals that incorporate gradients, such as a Langevin proposal distribution, provide a means to address some of the limitations of MCMC sampling for Bayesian neural networks. Furthermore, MCMC methods have typically been constrained to use by statisticians and are still not prominent among deep learning researchers. We present a tutorial for MCMC methods that covers simple Bayesian linear and logistic models, and Bayesian neural networks. The aim of this tutorial is to bridge the gap between theory and implementation via coding, given a general sparsity of libraries and tutorials to this end. This tutorial provides code in Python with data and instructions that enable their use and extension. We provide results for some benchmark problems showing the strengths and weaknesses of implementing the respective Bayesian models via MCMC. We highlight the challenges in sampling multi-modal posterior distributions in particular for the case of Bayesian neural networks, and the need for further improvement of convergence diagnosis

arXiv.org e-Print Archive

Partially Exchangeable Networks and Architectures for Learning Summary Statistics in Approximate Bayesian Computation

Author: Frellsen Jes
Mattei Pierre-Alexandre
Picchini Umberto
Wiqvist Samuel
Publication venue
Publication date: 01/01/2019
Field of study

We present a novel family of deep neural architectures, named partially exchangeable networks (PENs) that leverage probabilistic symmetries. By design, PENs are invariant to block-switch transformations, which characterize the partial exchangeability properties of conditionally Markovian processes. Moreover, we show that any block-switch invariant function has a PEN-like representation. The DeepSets architecture is a special case of PEN and we can therefore also target fully exchangeable data. We employ PENs to learn summary statistics in approximate Bayesian computation (ABC). When comparing PENs to previous deep learning methods for learning summary statistics, our results are highly competitive, both considering time series and static models. Indeed, PENs provide more reliable posterior samples even when using less training data.Comment: Forthcoming on the Proceedings of ICML 2019. New comparisons with several different networks. We now use the Wasserstein distance to produce comparisons. Code available on GitHub. 16 pages, 5 figures, 21 table

arXiv.org e-Print Archive

Lund University Publications

The IT University of Copenhagen's Repository

Chalmers Research

Simulation-based Inference : From Approximate Bayesian Computation and Particle Methods to Neural Density Estimation

Author: Wiqvist Samuel
Publication venue: Lund University (Media-Tryck)
Publication date: 16/08/2021
Field of study

This doctoral thesis in computational statistics utilizes both Monte Carlo methods(approximate Bayesian computation and sequential Monte Carlo) and machine-learning methods (deep learning and normalizing flows) to develop novel algorithms for inference in implicit Bayesian models. Implicit models are those for which calculating the likelihood function is very challenging (and often impossible), but model simulation is feasible. The inference methods developed in the thesis are simulation-based inference methods since they leverage the possibility to simulate data from the implicit models. Several approaches are considered in the thesis: Paper II and IV focus on classical methods (sequential Monte Carlo-based methods), while paper I and III focus on more recent machine learning methods (deep learning and normalizing flows, respectively).Paper I constructs novel deep learning methods for learning summary statistics for approximate Bayesian computation (ABC). To achieve this paper I introduces the partially exchangeable network (PEN), a deep learning architecture specifically designed for Markovian data (i.e., partially exchangeable data).Paper II considers Bayesian inference in stochastic differential equation mixed-effects models (SDEMEM). Bayesian inference for SDEMEMs is challenging due to the intractable likelihood function of SDEMEMs. Paper II addresses this problem by designing a novel a Gibbs-blocking strategy in combination with correlated pseudo marginal methods. The paper also discusses how custom particle filters can be adapted to the inference procedure.Paper III introduces the novel inference method sequential neural posterior and likelihood approximation (SNPLA). SNPLA is a simulation-based inference algorithm that utilizes normalizing flows for learning both the posterior distribution and the likelihood function of an implicit model via a sequential scheme. By learning both the likelihood and the posterior, and by leveraging the reverse Kullback Leibler (KL) divergence, SNPLA avoids ad-hoc correction steps and Markov chain Monte Carlo (MCMC) sampling.Paper IV introduces the accelerated-delayed acceptance (ADA) algorithm. ADA can be viewed as an extension of the delayed-acceptance (DA) MCMC algorithm that leverages connections between the two likelihood ratios of DA to further accelerate MCMC sampling from the posterior distribution of interest, although our approach introduces an approximation. The main case study of paper IV is a double-well potential stochastic differential equation (DWPSDE) model for protein-folding data (reaction coordinate data)

Lund University Publications