4 research outputs found

    Bayesian neural networks via MCMC: a Python-based tutorial

    Bayesian inference provides a methodology for parameter estimation and uncertainty quantification in machine learning and deep learning methods. Variational inference and Markov Chain Monte Carlo (MCMC) sampling techniques are used to implement Bayesian inference. In the past three decades, MCMC methods have faced a number of challenges in being adapted to larger models (such as in deep learning) and big data problems. Advanced proposals that incorporate gradients, such as a Langevin proposal distribution, provide a means to address some of the limitations of MCMC sampling for Bayesian neural networks. Furthermore, MCMC methods have typically been constrained to use by statisticians and are still not prominent among deep learning researchers. We present a tutorial for MCMC methods that covers simple Bayesian linear and logistic models, and Bayesian neural networks. The aim of this tutorial is to bridge the gap between theory and implementation via coding, given the general scarcity of libraries and tutorials to this end. The tutorial provides Python code with data and instructions that enable its use and extension. We provide results for selected benchmark problems, showing the strengths and weaknesses of implementing the respective Bayesian models via MCMC. We highlight the challenges in sampling multi-modal posterior distributions, particularly for Bayesian neural networks, and the need for further improvement of convergence diagnosis methods.
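    For readers unfamiliar with how such samplers look in practice, the following is a minimal Python sketch of random-walk Metropolis-Hastings for a Bayesian linear model. It illustrates the general technique only; the function names, priors, step size, and toy data are assumptions for this sketch, not the tutorial's actual code.

    # Illustrative sketch: random-walk Metropolis-Hastings sampling of the
    # posterior over the weights of a Bayesian linear model.
    import numpy as np

    def log_posterior(w, X, y, sigma=1.0, prior_sigma=10.0):
        # Gaussian likelihood with assumed known noise sigma, Gaussian prior on w.
        resid = y - X @ w
        log_lik = -0.5 * np.sum(resid ** 2) / sigma ** 2
        log_prior = -0.5 * np.sum(w ** 2) / prior_sigma ** 2
        return log_lik + log_prior

    def metropolis(X, y, n_samples=5000, step=0.1, rng=None):
        rng = rng or np.random.default_rng(0)
        w = np.zeros(X.shape[1])
        logp = log_posterior(w, X, y)
        samples = []
        for _ in range(n_samples):
            proposal = w + step * rng.standard_normal(w.shape)  # symmetric random-walk proposal
            logp_prop = log_posterior(proposal, X, y)
            if np.log(rng.uniform()) < logp_prop - logp:        # Metropolis accept/reject
                w, logp = proposal, logp_prop
            samples.append(w.copy())
        return np.array(samples)

    # Toy data: the posterior mean of the sampled weights should approach [2.0, -1.0].
    rng = np.random.default_rng(1)
    X = rng.standard_normal((100, 2))
    y = X @ np.array([2.0, -1.0]) + 0.1 * rng.standard_normal(100)
    samples = metropolis(X, y)
    print(samples[1000:].mean(axis=0))  # discard burn-in, report posterior means

    The Langevin proposals mentioned in the abstract would replace the symmetric random-walk step with one shifted along the gradient of the log-posterior, together with the corresponding correction term in the acceptance ratio.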

    A Short Proof of the Posterior Probability Property of Classifier Neural Networks

    No full text
    Since the expected total error is the sum of the expected individual errors of each output, we can minimize the expected individual errors independently. This means that we need to consider only one output line and when it should produce a 1 or a 0. Assume that the input space is divided into a lattice of differential volumes of size dv, each one centered at the n-dimensional point v. If, at the output representing class A, the network computes the value y(v) ∈ [0, 1] for any point x in the differential volume V(v) centered at v, and denoting by p(v) the probability p(A | x ∈ V(v)), then the total expected quadratic error is E_A = Σ_V [ p(v)(y(v) − 1)² + (1 − p(v)) y(v)² ].
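    Completing the argument sketched in the excerpt: minimizing each term of E_A independently with respect to y(v) yields the posterior probability property named in the title. A minimal worked step in LaTeX, using the notation above (the expression for E_A is a reconstruction based on the standard derivation, not a quotation from the paper):

    \frac{\partial E_A}{\partial y(v)}
      = 2\,p(v)\bigl(y(v) - 1\bigr) + 2\,\bigl(1 - p(v)\bigr)\,y(v) = 0
    \quad\Longrightarrow\quad
    y(v) = p(v) = p\bigl(A \mid x \in V(v)\bigr)

    That is, the least-squares-optimal output of the class-A unit equals the posterior probability of class A over each differential volume.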