140,107 research outputs found

    Bayesian Learning of Neural Networks for Signal/Background Discrimination in Particle Physics

    Full text link
    Neural networks are used extensively in classification problems in particle physics research. Since the training of neural networks can be viewed as a problem of inference, Bayesian learning of neural networks can provide results that are more robust, and closer to optimal, than those of conventional learning methods. As an example, we have investigated the use of Bayesian neural networks for signal/background discrimination in the search for second-generation leptoquarks at the Tevatron. We present a comparison of the results obtained from conventionally trained feedforward neural networks and networks trained with Bayesian methods.
    Comment: 3 pages, 4 figures, conference proceeding
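
    The central idea above, predicting by averaging the network output over weight samples drawn from the posterior rather than using a single trained weight vector, can be sketched as follows. This is a generic illustration, not the paper's setup; the network size, the stand-in posterior samples and the data are made up.

        import numpy as np

        def mlp_logit(x, w1, b1, w2, b2):
            # One-hidden-layer network returning a signal-vs-background logit.
            h = np.tanh(x @ w1 + b1)
            return h @ w2 + b2

        def bayesian_predict(x, posterior_samples):
            # Bayesian predictive probability: average the classifier output
            # over weight samples drawn from p(weights | training data).
            probs = [1.0 / (1.0 + np.exp(-mlp_logit(x, *w))) for w in posterior_samples]
            return np.mean(probs, axis=0)

        # Toy usage with stand-in "posterior" samples; in practice these would
        # come from MCMC or a variational approximation.
        rng = np.random.default_rng(0)
        samples = [(rng.normal(size=(5, 16)), rng.normal(size=16),
                    rng.normal(size=(16, 1)), rng.normal(size=1)) for _ in range(100)]
        events = rng.normal(size=(3, 5))          # three events, five input features
        print(bayesian_predict(events, samples))  # posterior-averaged signal probability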

    Bayesian Neural Networks

    Full text link
    This paper describes and discusses Bayesian neural networks (BNNs) and showcases a few applications of them to classification and regression problems. A BNN comprises a probabilistic model and a neural network; the intent of this design is to combine the strengths of both. Neural networks are flexible approximators of continuous functions, while stochastic models allow direct specification of a model, with known interactions between parameters, that generates the data. During prediction, the stochastic model yields a complete posterior distribution and hence probabilistic guarantees on the predictions. A BNN is thus a combination of a neural network and a stochastic model, with the stochastic model forming the core of the integration. BNNs can therefore produce probabilistic guarantees on their predictions and also provide the distributions of the parameters they have learnt from the observations; in other words, one can examine the nature and shape of the learnt parameters in parameter space. These two characteristics make them highly attractive to theoreticians as well as practitioners. There has recently been a great deal of activity in this area, with the advent of numerous probabilistic programming libraries such as PyMC3, Edward, and Stan, and the approach is rapidly gaining ground as a standard machine learning tool for numerous problems.
    Comment: arXiv admin note: text overlap with arXiv:1111.4246 by other authors
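
    As a concrete, hedged illustration of what such libraries provide, here is a minimal sketch of a small Bayesian neural network regression in PyMC3 (one of the libraries named above). The data, network width, priors and sampler settings are all made up for illustration.

        import numpy as np
        import pymc3 as pm

        # Toy 1-D regression data (made up for illustration).
        rng = np.random.default_rng(0)
        X = rng.uniform(-3, 3, size=(200, 1))
        y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

        with pm.Model() as bnn:
            # Gaussian priors over all weights and biases of a small network.
            w_in = pm.Normal("w_in", mu=0.0, sigma=1.0, shape=(1, 10))
            b_in = pm.Normal("b_in", mu=0.0, sigma=1.0, shape=10)
            w_out = pm.Normal("w_out", mu=0.0, sigma=1.0, shape=(10, 1))
            b_out = pm.Normal("b_out", mu=0.0, sigma=1.0, shape=1)

            hidden = pm.math.tanh(pm.math.dot(X, w_in) + b_in)
            mu = pm.math.dot(hidden, w_out)[:, 0] + b_out[0]

            sigma = pm.HalfNormal("sigma", sigma=0.5)      # observation noise
            pm.Normal("obs", mu=mu, sigma=sigma, observed=y)

            # Posterior over the weights; predictions average over these samples.
            trace = pm.sample(1000, tune=1000, target_accept=0.9)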

    Adversarial Phenomenon in the Eyes of Bayesian Deep Learning

    Full text link
    Deep learning models are vulnerable to adversarial examples, i.e., images obtained via deliberate, imperceptible perturbations that the model misclassifies with high confidence. Class confidence by itself, however, gives an incomplete picture of uncertainty. We therefore use principled Bayesian methods to capture model uncertainty in predictions and to observe adversarial misclassification. We provide an extensive study of different Bayesian neural networks attacked in both white-box and black-box setups, comparing the behaviour of the networks on noise, attacks, and clean test data. We observe that Bayesian neural networks are uncertain in their predictions for adversarial perturbations, a behaviour similar to that observed for random Gaussian perturbations. We therefore conclude that Bayesian neural networks can be considered for detecting adversarial examples.
    Comment: 13 pages, 7 figures
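
    The uncertainty measures typically computed in this setting can be sketched as follows: the predictive entropy (total uncertainty) and the mutual information between the prediction and the weights (epistemic uncertainty), both estimated from Monte Carlo forward passes. This is a generic sketch, not the paper's exact protocol; the thresholding step at the end is only suggested in a comment.

        import numpy as np

        def predictive_uncertainty(mc_probs, eps=1e-12):
            # mc_probs: array of shape (T, N, C) -- class probabilities from T
            # stochastic forward passes (posterior weight samples) on N inputs.
            mean_probs = mc_probs.mean(axis=0)                             # (N, C)
            # Total uncertainty: entropy of the averaged predictive distribution.
            pred_entropy = -np.sum(mean_probs * np.log(mean_probs + eps), axis=1)
            # Expected entropy of the individual samples (aleatoric part).
            exp_entropy = -np.sum(mc_probs * np.log(mc_probs + eps), axis=2).mean(axis=0)
            # Mutual information between prediction and weights (epistemic part).
            mutual_info = pred_entropy - exp_entropy
            return pred_entropy, mutual_info

        # Inputs whose mutual information exceeds a threshold chosen on clean
        # validation data would be flagged as potentially adversarial.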

    Variational Bayes: A report on approaches and applications

    Full text link
    Deep neural networks have achieved impressive results on a wide variety of tasks, yet quantifying the uncertainty in a network's output remains challenging. Bayesian models offer a mathematical framework for reasoning about model uncertainty, and variational methods have been used to approximate the intractable integrals that arise in Bayesian inference for neural networks. In this report, we review the major variational inference concepts pertinent to Bayesian neural networks and compare the various approximation methods used in the literature. We also discuss applications of variational Bayes in reinforcement learning and continual learning.
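
    A minimal sketch of the objective such variational methods optimise, the evidence lower bound (ELBO), for a toy Bayesian linear model with a mean-field Gaussian posterior and the reparameterisation trick. Written in PyTorch; the data, noise level and hyperparameters are made up.

        import torch
        from torch.distributions import Normal, kl_divergence

        # Toy data for a Bayesian linear model y = w * x + noise (made up).
        torch.manual_seed(0)
        x = torch.linspace(-2, 2, 100)
        y = 1.5 * x + 0.3 * torch.randn(100)

        # Mean-field Gaussian variational posterior q(w) = N(mu, softplus(rho)^2).
        mu = torch.zeros(1, requires_grad=True)
        rho = torch.full((1,), -3.0, requires_grad=True)
        prior = Normal(0.0, 1.0)
        opt = torch.optim.Adam([mu, rho], lr=0.05)

        for step in range(2000):
            opt.zero_grad()
            q = Normal(mu, torch.nn.functional.softplus(rho))
            w = q.rsample()                              # reparameterised sample
            log_lik = Normal(w * x, 0.3).log_prob(y).sum()
            kl = kl_divergence(q, prior).sum()
            loss = kl - log_lik                          # negative ELBO
            loss.backward()
            opt.step()

        print(mu.item())  # posterior mean of the slope, slightly shrunk from 1.5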

    Neural Networks Processing Mean Values of Random Variables

    Full text link
    We introduce a class of neural networks derived from probabilistic models in the form of Bayesian belief networks. By imposing additional assumptions about the nature of the probabilistic models represented in the belief networks, we derive neural networks with standard dynamics that require no training to determine the synaptic weights, that can pool multiple sources of evidence, and that deal cleanly and consistently with inconsistent or contradictory evidence. The presented neural networks capture many properties of Bayesian belief networks, providing distributed versions of probabilistic models.
    Comment: 7 pages, 3 figures, 1 table, submitted to Phys Rev

    Designing neural networks that process mean values of random variables

    Full text link
    We introduce a class of neural networks derived from probabilistic models in the form of Bayesian networks. By imposing additional assumptions about the nature of the probabilistic models represented in the networks, we derive neural networks with standard dynamics that require no training to determine the synaptic weights, that perform accurate calculation of the mean values of the random variables, that can pool multiple sources of evidence, and that deal cleanly and consistently with inconsistent or contradictory evidence. The presented neural networks capture many properties of Bayesian networks, providing distributed versions of probabilistic models.
    Comment: 13 pages, elsarticle
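
    The quantity these networks are designed to compute, the posterior mean of a random variable given evidence in a Bayesian network, can be illustrated by brute-force enumeration on a tiny made-up network. This shows only the target quantity, not the paper's neural derivation or dynamics.

        # A tiny made-up Bayesian network over binary variables: A -> B, A -> C.
        p_a = {0: 0.7, 1: 0.3}
        p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}   # P(B=b | A=a)
        p_c_given_a = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}   # P(C=c | A=a)

        def joint(a, b, c):
            return p_a[a] * p_b_given_a[a][b] * p_c_given_a[a][c]

        # Posterior mean E[A | B=1, C=1] by brute-force enumeration; the paper's
        # networks are constructed so that node activities settle to such means.
        num = sum(a * joint(a, 1, 1) for a in (0, 1))
        den = sum(joint(a, 1, 1) for a in (0, 1))
        print(num / den)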

    Deep Neural Networks as Gaussian Processes

    Full text link
    It has long been known that a single-layer fully-connected neural network with an i.i.d. prior over its parameters is equivalent to a Gaussian process (GP), in the limit of infinite network width. This correspondence enables exact Bayesian inference for infinite width neural networks on regression tasks by means of evaluating the corresponding GP. Recently, kernel functions which mimic multi-layer random neural networks have been developed, but only outside of a Bayesian framework. As such, previous work has not identified that these kernels can be used as covariance functions for GPs and allow fully Bayesian prediction with a deep neural network. In this work, we derive the exact equivalence between infinitely wide deep networks and GPs. We further develop a computationally efficient pipeline to compute the covariance function for these GPs. We then use the resulting GPs to perform Bayesian inference for wide deep neural networks on MNIST and CIFAR-10. We observe that trained neural network accuracy approaches that of the corresponding GP with increasing layer width, and that the GP uncertainty is strongly correlated with trained network prediction error. We further find that test performance increases as finite-width trained networks are made wider and more similar to a GP, and thus that GP predictions typically outperform those of finite-width networks. Finally we connect the performance of these GPs to the recent theory of signal propagation in random neural networks.
    Comment: Published version in ICLR 2018. 10 pages + appendix
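
    A sketch of the kind of layerwise kernel recursion described above, here for an infinitely wide ReLU network (using the arc-cosine form of the Gaussian expectation), followed by a standard GP posterior-mean prediction. The weight and bias variances, depth and data are made up; the paper's exact pipeline and benchmarks differ.

        import numpy as np

        def nngp_kernel(X1, X2, depth=3, sigma_w2=1.6, sigma_b2=0.1):
            # Kernel of an infinitely wide ReLU network via the layerwise recursion.
            d_in = X1.shape[1]
            K12 = sigma_b2 + sigma_w2 * (X1 @ X2.T) / d_in
            k1 = sigma_b2 + sigma_w2 * np.sum(X1 * X1, axis=1) / d_in   # diagonal for X1
            k2 = sigma_b2 + sigma_w2 * np.sum(X2 * X2, axis=1) / d_in   # diagonal for X2
            for _ in range(depth):
                norm = np.sqrt(k1)[:, None] * np.sqrt(k2)[None, :]
                theta = np.arccos(np.clip(K12 / norm, -1.0, 1.0))
                # E[relu(u) relu(v)] for jointly Gaussian (u, v): arc-cosine form.
                K12 = sigma_b2 + sigma_w2 / (2 * np.pi) * norm * (
                    np.sin(theta) + (np.pi - theta) * np.cos(theta))
                k1 = sigma_b2 + sigma_w2 / 2 * k1
                k2 = sigma_b2 + sigma_w2 / 2 * k2
            return K12

        # GP regression with the NNGP kernel on made-up data.
        rng = np.random.default_rng(0)
        Xtr, Xte = rng.normal(size=(50, 8)), rng.normal(size=(5, 8))
        ytr = np.sin(Xtr[:, 0])
        Ktt = nngp_kernel(Xtr, Xtr)
        Kst = nngp_kernel(Xte, Xtr)
        mean = Kst @ np.linalg.solve(Ktt + 1e-2 * np.eye(50), ytr)  # posterior mean
        print(mean)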

    Bayesian Recurrent Neural Networks

    Full text link
    In this work we explore a straightforward variational Bayes scheme for recurrent neural networks. Firstly, we show that a simple adaptation of truncated backpropagation through time can yield good-quality uncertainty estimates and superior regularisation at only a small extra computational cost during training, while also reducing the number of parameters by 80%. Secondly, we demonstrate how a novel kind of posterior approximation yields further improvements to the performance of Bayesian RNNs: we incorporate local gradient information into the approximate posterior to sharpen it around the current batch statistics. We show that this technique is not exclusive to recurrent neural networks and can be applied more widely to train Bayesian neural networks. We also empirically demonstrate that Bayesian RNNs are superior to traditional RNNs on a language modelling benchmark and an image captioning task, and show how each of these methods improves our model over a variety of other training schemes. Finally, we introduce a new benchmark for studying uncertainty in language models so that future methods can be easily compared.
    Comment: 12th Women in Machine Learning Workshop (WiML 2017), co-located with the 31st Conference on Neural Information Processing Systems (NeurIPS 2017), Long Beach, CA, US
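
    A hedged sketch of the training-step structure described above: a recurrent cell with a mean-field Gaussian posterior over its weights, a single weight sample shared across all steps of a truncated sequence, and the KL term divided by the number of minibatches. Written in PyTorch with made-up shapes and a plain Gaussian prior; it is not the paper's exact model or its posterior-sharpening scheme.

        import torch
        import torch.nn.functional as F
        from torch.distributions import Normal, kl_divergence

        class BayesianRNNCell(torch.nn.Module):
            # Vanilla RNN cell with a mean-field Gaussian posterior over its weights.
            def __init__(self, n_in, n_hid):
                super().__init__()
                shape = (n_in + n_hid + 1, n_hid)     # input, recurrent and bias rows
                self.mu = torch.nn.Parameter(0.05 * torch.randn(shape))
                self.rho = torch.nn.Parameter(torch.full(shape, -4.0))
                self.prior = Normal(0.0, 1.0)

            def sample_weights(self):
                # One weight sample is shared across all steps of a truncated sequence.
                q = Normal(self.mu, F.softplus(self.rho))
                return q.rsample(), kl_divergence(q, self.prior).sum()

            def forward(self, x, h, w):
                inp = torch.cat([x, h, torch.ones(x.shape[0], 1)], dim=1)
                return torch.tanh(inp @ w)

        # One training step on a truncated sequence (made-up shapes and data).
        cell = BayesianRNNCell(n_in=10, n_hid=32)
        readout = torch.nn.Linear(32, 10)
        opt = torch.optim.Adam(list(cell.parameters()) + list(readout.parameters()), lr=1e-3)
        num_minibatches = 500                          # truncated sequences per epoch

        xs = torch.randn(20, 16, 10)                   # (time, batch, features)
        targets = torch.randint(0, 10, (20, 16))

        opt.zero_grad()
        w, kl = cell.sample_weights()
        h = torch.zeros(16, 32)
        nll = 0.0
        for t in range(xs.shape[0]):
            h = cell(xs[t], h, w)
            nll = nll + F.cross_entropy(readout(h), targets[t])
        loss = nll + kl / num_minibatches              # KL spread over the truncations
        loss.backward()
        opt.step()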

    Bayesian Neural Networks: Essentials

    Full text link
    Bayesian neural networks utilize probabilistic layers that capture uncertainty over weights and activations, and are trained using Bayesian inference. Since these probabilistic layers are designed to be drop-in replacements for their deterministic counterparts, Bayesian neural networks provide a direct and natural way to extend conventional deep neural networks to support probabilistic deep learning. However, it is nontrivial to understand, design and train Bayesian neural networks due to their complexity. We discuss the essentials of Bayesian neural networks, including their duality (deep neural network, probabilistic model), approximate Bayesian inference, Bayesian priors, Bayesian posteriors, and deep variational learning, using TensorFlow Probability APIs and code examples for illustration. The main difficulty with Bayesian neural networks is that the architecture of deep neural networks makes it quite redundant, and costly, to account for uncertainty across a large number of successive layers. Hybrid Bayesian neural networks, which use a few probabilistic layers judiciously positioned in the network, provide a practical solution.
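
    A minimal sketch of the hybrid idea mentioned above, using TensorFlow Probability's Flipout layer as the single probabilistic layer on top of deterministic feature layers. The architecture, dataset size and hyperparameters are made up; this follows the common TFP/Keras pattern rather than the paper's specific code.

        import tensorflow as tf
        import tensorflow_probability as tfp

        N_TRAIN = 60000  # number of training examples; scales the per-example KL term

        # Hybrid Bayesian network: deterministic feature layers followed by a
        # single probabilistic (Flipout) output layer.
        model = tf.keras.Sequential([
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            tf.keras.layers.Dense(128, activation="relu"),
            tfp.layers.DenseFlipout(
                10,
                kernel_divergence_fn=lambda q, p, _: tfp.distributions.kl_divergence(q, p) / N_TRAIN),
        ])

        model.compile(
            optimizer="adam",
            loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
            metrics=["accuracy"])
        # model.fit(x_train, y_train, epochs=5)  # the layer's KL is added to the loss by Keras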

    Ensemble Model Patching: A Parameter-Efficient Variational Bayesian Neural Network

    Full text link
    Two main obstacles prevent the widespread adoption of variational Bayesian neural networks: the high parameter overhead that makes them infeasible on large networks, and the difficulty of implementation, which can be thought of as "programming overhead." MC dropout [Gal and Ghahramani, 2016] is popular because it sidesteps these obstacles; nevertheless, dropout is often harmful to model performance when used in networks with batch normalization layers [Li et al., 2018], which are an indispensable part of modern neural networks. We construct a general variational family for ensemble-based Bayesian neural networks that encompasses dropout as a special case. We further present two specific members of this family that work well with batch normalization layers while retaining the benefits of low parameter and programming overhead, comparable to non-Bayesian training. Our proposed methods improve predictive accuracy and achieve almost perfect calibration on a ResNet-18 trained on ImageNet.
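
    For reference, the MC dropout baseline cited above [Gal and Ghahramani, 2016] can be sketched as follows: dropout is kept active at test time and several stochastic forward passes are averaged. This illustrates only the baseline the abstract contrasts against, not the proposed ensemble-model-patching family; the architecture is made up.

        import torch
        import torch.nn.functional as F

        # A small network with dropout (architecture made up for illustration).
        model = torch.nn.Sequential(
            torch.nn.Linear(20, 64), torch.nn.ReLU(), torch.nn.Dropout(p=0.2),
            torch.nn.Linear(64, 3))

        def mc_dropout_predict(model, x, n_samples=50):
            # Keep dropout active at test time and average several stochastic passes;
            # each pass corresponds to a sample from the implied approximate posterior.
            model.train()                       # enables dropout
            with torch.no_grad():
                probs = torch.stack([F.softmax(model(x), dim=-1) for _ in range(n_samples)])
            return probs.mean(0), probs.std(0)  # predictive mean and per-class spread

        x = torch.randn(8, 20)
        mean, spread = mc_dropout_predict(model, x)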