
    Bayesian Deep Net GLM and GLMM

    Deep feedforward neural networks (DFNNs) are a powerful tool for functional approximation. We describe flexible versions of generalized linear and generalized linear mixed models incorporating basis functions formed by a DFNN. Neural networks with random effects are not widely used in the literature, perhaps because of the computational challenges of incorporating subject-specific parameters into already complex models. Efficient computational methods for high-dimensional Bayesian inference are developed using Gaussian variational approximation, with a parsimonious but flexible factor parametrization of the covariance matrix. We implement natural gradient methods for the optimization, exploiting the factor structure of the variational covariance matrix in computation of the natural gradient. Our flexible DFNN models and Bayesian inference approach lead to a regression and classification method that has high prediction accuracy and is able to quantify the prediction uncertainty in a principled and convenient way. We also describe how to perform variable selection in our deep learning method. The proposed methods are illustrated in a wide range of simulated and real-data examples, and the results compare favourably to a state-of-the-art flexible regression and classification method in the statistical literature, the Bayesian additive regression trees (BART) method. User-friendly software packages in Matlab, R and Python implementing the proposed methods are available at https://github.com/VBayesLab. Comment: 35 pages, 7 figure, 10 table
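The factor parametrization mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common form Sigma = B B^T + diag(d^2), where B is a tall p-by-k loading matrix and d a vector of idiosyncratic scales; the function names and this exact form are illustrative assumptions, not taken from the authors' software. The point shown is that a draw from the variational Gaussian needs only B and d, never the full p-by-p covariance.

```python
import numpy as np

def sample_factor_gaussian(mu, B, d, n_samples, rng):
    """Draw from N(mu, B B^T + diag(d^2)) without forming the covariance.

    Assumed (hypothetical) parametrization: B has shape (p, k) with k << p,
    d has shape (p,). Each draw costs O(p k) instead of O(p^2).
    """
    p, k = B.shape
    eps = rng.standard_normal((n_samples, k))  # low-dimensional factors
    z = rng.standard_normal((n_samples, p))    # idiosyncratic noise
    return mu + eps @ B.T + z * d

def factor_covariance(B, d):
    """Dense covariance implied by the factor form (for checking only)."""
    return B @ B.T + np.diag(d ** 2)
```

With p parameters and k factors, the variational family stores p(k + 2) numbers rather than the p(p + 1)/2 + p needed for an unrestricted Gaussian, which is what makes the approach feasible for network-sized parameter vectors.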

    Neural Network Gradient Hamiltonian Monte Carlo

    Hamiltonian Monte Carlo is a widely used algorithm for sampling from posterior distributions of complex Bayesian models. It can efficiently explore high-dimensional parameter spaces guided by simulated Hamiltonian flows. However, the algorithm requires repeated gradient calculations, and these computations become increasingly burdensome as data sets scale. We present a method to substantially reduce the computational burden by using a neural network to approximate the gradient. First, we prove that the proposed method still maintains convergence to the true distribution even though the approximated gradient no longer comes from a Hamiltonian system. Second, we conduct experiments on synthetic examples and real data sets to validate the proposed method.
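The idea in this abstract can be sketched as a single HMC transition whose leapfrog integrator calls an approximate gradient while the accept/reject step uses the exact potential. This is a hedged illustration, not the authors' implementation: `grad_approx` is a hypothetical callable standing in for a trained network, and the identity mass matrix and the function name `hmc_step` are assumptions. Because each leapfrog update is still a volume-preserving, reversible shear regardless of which gradient drives it, the Metropolis correction against the true Hamiltonian keeps the target distribution invariant.

```python
import numpy as np

def hmc_step(q, U, grad_approx, step_size, n_leapfrog, rng):
    """One HMC transition using a surrogate gradient in the integrator.

    q           : current position, 1-D float array
    U           : exact potential energy (negative log target), q -> float
    grad_approx : approximate gradient of U (hypothetical surrogate)
    Returns (new position, accepted flag).
    """
    p0 = rng.standard_normal(q.shape)  # momentum, identity mass matrix
    q_new, p = q.copy(), p0.copy()

    # Leapfrog integration driven by the surrogate gradient.
    p -= 0.5 * step_size * grad_approx(q_new)
    for i in range(n_leapfrog):
        q_new += step_size * p
        if i < n_leapfrog - 1:
            p -= step_size * grad_approx(q_new)
    p -= 0.5 * step_size * grad_approx(q_new)

    # Metropolis correction with the exact Hamiltonian.
    h_old = U(q) + 0.5 * p0 @ p0
    h_new = U(q_new) + 0.5 * p @ p
    if np.log(rng.uniform()) < h_old - h_new:
        return q_new, True
    return q, False
```

The exact gradient is never evaluated inside the trajectory; only one evaluation of `U` per proposal is needed for the correction, which is where the saving would come from when the surrogate is cheap.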

    A Coverage Study of the CMSSM Based on ATLAS Sensitivity Using Fast Neural Networks Techniques

    We assess the coverage properties of confidence and credible intervals on the CMSSM parameter space inferred from a Bayesian posterior and the profile likelihood based on an ATLAS sensitivity study. In order to make those calculations feasible, we introduce a new method based on neural networks to approximate the mapping between CMSSM parameters and weak-scale particle masses. Our method reduces the computational effort needed to sample the CMSSM parameter space by a factor of ~ 10^4 with respect to conventional techniques. We find that both the Bayesian posterior and the profile likelihood intervals can significantly over-cover, and we trace the origin of this effect to physical boundaries in the parameter space. Finally, we point out that the effects intrinsic to the statistical procedure are conflated with simplifications to the likelihood functions from the experiments themselves. Comment: Further checks about accuracy of neural network approximation, fixed typos, added refs. Main results unchanged. Matches version accepted by JHE

    Network Plasticity as Bayesian Inference

    General results from statistical learning theory suggest that not only brain computations but also brain plasticity should be understood as probabilistic inference. But a model for that has been missing. We propose that inherently stochastic features of synaptic plasticity and spine motility enable cortical networks of neurons to carry out probabilistic inference by sampling from a posterior distribution of network configurations. This model provides a viable alternative to existing models that propose convergence of parameters to maximum likelihood values. It explains how priors on weight distributions and connection probabilities can be merged optimally with learned experience, how cortical networks can generalize learned information so well to novel experiences, and how they can compensate continuously for unforeseen disturbances of the network. The resulting new theory of network plasticity explains from a functional perspective a number of experimental data on stochastic aspects of synaptic plasticity that previously appeared to be quite puzzling. Comment: 33 pages, 5 figures, the supplement is available on the author's web page http://www.igi.tugraz.at/kappe

    BAMBI: blind accelerated multimodal Bayesian inference

    In this paper we present an algorithm for rapid Bayesian analysis that combines the benefits of nested sampling and artificial neural networks. The blind accelerated multimodal Bayesian inference (BAMBI) algorithm implements the MultiNest package for nested sampling as well as the training of an artificial neural network (NN) to learn the likelihood function. In the case of computationally expensive likelihoods, this allows the substitution of a much more rapid approximation in order to increase significantly the speed of the analysis. We begin by demonstrating, with a few toy examples, the ability of a NN to learn complicated likelihood surfaces. BAMBI's ability to decrease running time for Bayesian inference is then demonstrated in the context of estimating cosmological parameters from Wilkinson Microwave Anisotropy Probe and other observations. We show that valuable speed increases are achieved in addition to obtaining NNs trained on the likelihood functions for the different model and data combinations. These NNs can then be used for an even faster follow-up analysis using the same likelihood and different priors. This is a fully general algorithm that can be applied, without any pre-processing, to other problems with computationally expensive likelihood functions. Comment: 12 pages, 8 tables, 17 figures; accepted by MNRAS; v2 to reflect minor changes in published versio