Bayesian Learning of Neural Networks for Signal/Background Discrimination in Particle Physics
Neural networks are used extensively in classification problems in particle
physics research. Since the training of neural networks can be viewed as a
problem of inference, Bayesian learning of neural networks can provide more
optimal and robust results than conventional learning methods. We have
investigated the use of Bayesian neural networks for signal/background
discrimination in the search for second generation leptoquarks at the Tevatron,
as an example. We present a comparison of the results obtained from the
conventional training of feedforward neural networks and networks trained with
Bayesian methods.
Comment: 3 pages, 4 figures, conference proceeding
Bayesian Neural Networks
This paper describes and discusses Bayesian Neural Networks (BNNs). The paper
showcases a few different applications of them for classification and
regression problems. BNNs are comprised of a Probabilistic Model and a Neural
Network. The intent of such a design is to combine the strengths of Neural
Networks and Stochastic modeling. Neural Networks exhibit continuous function
approximator capabilities. Stochastic models allow direct specification of a
model with known interaction between parameters to generate data. During the
prediction phase, stochastic models generate a complete posterior distribution
and produce probabilistic guarantees on the predictions. Thus BNNs are a unique
combination of neural network and stochastic models with the stochastic model
forming the core of this integration. BNNs can then produce probabilistic
guarantees on their predictions and also generate the distribution of parameters
that they have learnt from the observations. That means, in the parameter space,
one can deduce the nature and shape of the neural network's learnt parameters.
These two characteristics make them highly attractive to theoreticians as well
as practitioners. Recently there has been a lot of activity in this area, with
the advent of numerous probabilistic programming libraries such as PyMC3,
Edward, and Stan. Further, this area is rapidly gaining ground as a standard
machine learning approach for numerous problems.
Comment: arXiv admin note: text overlap with arXiv:1111.4246 by other author
Adversarial Phenomenon in the Eyes of Bayesian Deep Learning
Deep Learning models are vulnerable to adversarial examples, i.e. images
obtained via deliberate imperceptible perturbations, such that the model
misclassifies them with high confidence. However, class confidence by itself is
an incomplete picture of uncertainty. We therefore use principled Bayesian
methods to capture model uncertainty in prediction for observing adversarial
misclassification. We provide an extensive study with different Bayesian neural
networks attacked in both white-box and black-box setups. The behaviour of the
networks for noise, attacks and clean test data is compared. We observe that
Bayesian neural networks are uncertain in their predictions for adversarial
perturbations, a behaviour similar to the one observed for random Gaussian
perturbations. Thus, we conclude that Bayesian neural networks can be
considered for detecting adversarial examples.
Comment: 13 pages, 7 figure
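The abstract above does not say which uncertainty measure the authors use. As a hedged illustration only, one common choice is the predictive entropy of the class probabilities averaged over several stochastic forward passes (or posterior samples). The function name `predictive_entropy` and the input layout are assumptions made for this sketch.

```python
import math

def predictive_entropy(prob_samples):
    """Entropy (in nats) of the class distribution averaged over samples.

    prob_samples: list of per-sample class-probability lists, e.g. the
    softmax outputs of several stochastic passes for one input.
    (Hypothetical helper, not taken from the paper.)
    """
    n_samples = len(prob_samples)
    n_classes = len(prob_samples[0])
    # Average the class probabilities across samples.
    mean_probs = [
        sum(sample[c] for sample in prob_samples) / n_samples
        for c in range(n_classes)
    ]
    # Skip zero-probability classes, whose contribution is 0 in the limit.
    return -sum(p * math.log(p) for p in mean_probs if p > 0.0)
```

When all samples agree on one class the entropy is zero; when the samples disagree the averaged distribution flattens and the entropy rises, which is the kind of signal a detector could threshold.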
Variational Bayes: A report on approaches and applications
Deep neural networks have achieved impressive results on a wide variety of
tasks. However, quantifying uncertainty in the network's output is a
challenging task. Bayesian models offer a mathematical framework to reason
about model uncertainty. Variational methods have been used for approximating
intractable integrals that arise in Bayesian inference for neural networks. In
this report, we review the major variational inference concepts pertinent to
Bayesian neural networks and compare various approximation methods used in
the literature. We also discuss the applications of variational Bayes in
reinforcement learning and continual learning.
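As a hedged aside (not taken from the report), two ingredients that many variational schemes for Bayesian neural networks share can be sketched briefly: the closed-form KL divergence between a factorised Gaussian approximate posterior and a zero-mean Gaussian prior, and a reparameterised weight sample. The function names, the zero-mean prior and the `prior_std` parameter are assumptions chosen for this example.

```python
import math
import random

def gaussian_kl(mu, sigma, prior_std=1.0):
    """KL( N(mu, sigma^2) || N(0, prior_std^2) ) for a single scalar weight."""
    return (math.log(prior_std / sigma)
            + (sigma ** 2 + mu ** 2) / (2.0 * prior_std ** 2)
            - 0.5)

def kl_term(mus, sigmas, prior_std=1.0):
    """For a factorised posterior the KL is the sum over individual weights."""
    return sum(gaussian_kl(m, s, prior_std) for m, s in zip(mus, sigmas))

def sample_weight(mu, sigma, rng):
    """Reparameterised sample: w = mu + sigma * eps with eps ~ N(0, 1)."""
    return mu + sigma * rng.gauss(0.0, 1.0)
```

In a typical setup the training objective would combine `kl_term` with an expected negative log-likelihood estimated from weights drawn by `sample_weight`; that combination is left out here to keep the sketch short.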
Neural Networks Processing Mean Values of Random Variables
We introduce a class of neural networks derived from probabilistic models in
the form of Bayesian belief networks. By imposing additional assumptions about
the nature of the probabilistic models represented in the belief networks, we
derive neural networks with standard dynamics that require no training to
determine the synaptic weights, that can pool multiple sources of evidence, and
that deal cleanly and consistently with inconsistent or contradictory evidence.
The presented neural networks capture many properties of Bayesian belief
networks, providing distributed versions of probabilistic models.
Comment: 7 pages, 3 figures, 1 table, submitted to Phys Rev
Designing neural networks that process mean values of random variables
We introduce a class of neural networks derived from probabilistic models in
the form of Bayesian networks. By imposing additional assumptions about the
nature of the probabilistic models represented in the networks, we derive
neural networks with standard dynamics that require no training to determine
the synaptic weights, that perform accurate calculation of the mean values of
the random variables, that can pool multiple sources of evidence, and that deal
cleanly and consistently with inconsistent or contradictory evidence. The
presented neural networks capture many properties of Bayesian networks,
providing distributed versions of probabilistic models.
Comment: 13 pages, elsarticl
Deep Neural Networks as Gaussian Processes
It has long been known that a single-layer fully-connected neural network
with an i.i.d. prior over its parameters is equivalent to a Gaussian process
(GP), in the limit of infinite network width. This correspondence enables exact
Bayesian inference for infinite width neural networks on regression tasks by
means of evaluating the corresponding GP. Recently, kernel functions which
mimic multi-layer random neural networks have been developed, but only outside
of a Bayesian framework. As such, previous work has not identified that these
kernels can be used as covariance functions for GPs and allow fully Bayesian
prediction with a deep neural network.
In this work, we derive the exact equivalence between infinitely wide deep
networks and GPs. We further develop a computationally efficient pipeline to
compute the covariance function for these GPs. We then use the resulting GPs to
perform Bayesian inference for wide deep neural networks on MNIST and CIFAR-10.
We observe that trained neural network accuracy approaches that of the
corresponding GP with increasing layer width, and that the GP uncertainty is
strongly correlated with trained network prediction error. We further find that
test performance increases as finite-width trained networks are made wider and
more similar to a GP, and thus that GP predictions typically outperform those
of finite-width networks. Finally we connect the performance of these GPs to
the recent theory of signal propagation in random neural networks.
Comment: Published version in ICLR 2018. 10 pages + appendi
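To make the layer-wise covariance idea concrete, here is a minimal sketch for the special case of ReLU activations, where the expectation over the previous layer's Gaussian has a known closed form (the arc-cosine kernel). It is not the paper's pipeline. The function name, the default variances `sigma_w2=2.0` and `sigma_b2=0.0`, and the restriction to nonzero inputs are assumptions for this example.

```python
import math

def relu_nngp_kernel(x1, x2, depth, sigma_w2=2.0, sigma_b2=0.0):
    """Covariance K(x1, x2) of an infinitely wide ReLU network with `depth`
    hidden layers, computed by the layer-wise recursion.
    Assumes x1 and x2 are nonzero vectors of equal length."""
    d = len(x1)

    def base(a, b):
        # Input-layer covariance: bias variance plus scaled inner product.
        return sigma_b2 + sigma_w2 * sum(p * q for p, q in zip(a, b)) / d

    k11, k22, k12 = base(x1, x1), base(x2, x2), base(x1, x2)
    for _ in range(depth):
        norm = math.sqrt(k11 * k22)
        # Clamp to guard against rounding slightly outside [-1, 1].
        cos_t = max(-1.0, min(1.0, k12 / norm))
        theta = math.acos(cos_t)
        # Closed form of E[relu(u) relu(v)] for jointly Gaussian (u, v).
        k12 = sigma_b2 + sigma_w2 / (2.0 * math.pi) * norm * (
            math.sin(theta) + (math.pi - theta) * cos_t)
        # Diagonal entries follow from the same formula with theta = 0.
        k11 = sigma_b2 + sigma_w2 * k11 / 2.0
        k22 = sigma_b2 + sigma_w2 * k22 / 2.0
    return k12
```

The resulting values could be assembled into a Gram matrix and used as the covariance function in standard GP regression.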
Bayesian Recurrent Neural Networks
In this work we explore a straightforward variational Bayes scheme for
Recurrent Neural Networks. Firstly, we show that a simple adaptation of
truncated backpropagation through time can yield good quality uncertainty
estimates and superior regularisation at only a small extra computational cost
during training, also reducing the number of parameters by 80%. Secondly, we
demonstrate how a novel kind of posterior approximation yields further
improvements to the performance of Bayesian RNNs. We incorporate local gradient
information into the approximate posterior to sharpen it around the current
batch statistics. We show how this technique is not exclusive to recurrent
neural networks and can be applied more widely to train Bayesian neural
networks. We also empirically demonstrate how Bayesian RNNs are superior to
traditional RNNs on a language modelling benchmark and an image captioning
task, as well as showing how each of these methods improves our model over a
variety of other schemes for training them. We also introduce a new benchmark
for studying uncertainty for language models so future methods can be easily
compared.
Comment: 12th Women in Machine Learning Workshop (WiML 2017), co-located with
the 31st Conference on Neural Information Processing Systems (NeurIPS 2017),
Long Beach, CA, US
Bayesian Neural Networks: Essentials
Bayesian neural networks utilize probabilistic layers that capture
uncertainty over weights and activations, and are trained using Bayesian
inference. Since these probabilistic layers are designed to be drop-in
replacements for their deterministic counterparts, Bayesian neural networks
provide a direct and natural way to extend conventional deep neural networks to
support probabilistic deep learning. However, it is nontrivial to understand,
design and train Bayesian neural networks due to their complexities. We discuss
the essentials of Bayesian neural networks including duality (deep neural
networks, probabilistic models), approximate Bayesian inference, Bayesian
priors, Bayesian posteriors, and deep variational learning. We use TensorFlow
Probability APIs and code examples for illustration. The main problem with
Bayesian neural networks is that the architecture of deep neural networks makes
it quite redundant, and costly, to account for uncertainty for a large number
of successive layers. Hybrid Bayesian neural networks, which use a few
probabilistic layers judiciously positioned in the networks, provide a practical
solution.
Ensemble Model Patching: A Parameter-Efficient Variational Bayesian Neural Network
Two main obstacles preventing the widespread adoption of variational Bayesian
neural networks are the high parameter overhead that makes them infeasible on
large networks, and the difficulty of implementation, which can be thought of
as "programming overhead." MC dropout [Gal and Ghahramani, 2016] is popular
because it sidesteps these obstacles. Nevertheless, dropout is often harmful to
model performance when used in networks with batch normalization layers [Li et
al., 2018], which are an indispensable part of modern neural networks. We
construct a general variational family for ensemble-based Bayesian neural
networks that encompasses dropout as a special case. We further present two
specific members of this family that work well with batch normalization layers,
while retaining the benefits of low parameter and programming overhead,
comparable to non-Bayesian training. Our proposed methods improve predictive
accuracy and achieve almost perfect calibration on a ResNet-18 trained with
ImageNet.
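For readers unfamiliar with the MC dropout baseline mentioned above, the general idea is to keep dropout active at prediction time and summarise several stochastic forward passes. The sketch below illustrates that idea on a tiny one-hidden-layer ReLU network in plain Python. It is not the ensemble model patching method, and the weights, function names and default settings are hypothetical.

```python
import random
import statistics

def forward_with_dropout(x, w_hidden, w_out, p_drop, rng):
    """One stochastic pass: ReLU hidden layer with inverted dropout."""
    hidden = []
    for row in w_hidden:
        act = max(0.0, sum(w * xi for w, xi in zip(row, x)))
        keep = rng.random() >= p_drop
        # Rescale kept units so the expected activation is unchanged.
        hidden.append(act / (1.0 - p_drop) if keep else 0.0)
    return sum(w * h for w, h in zip(w_out, hidden))

def mc_dropout_predict(x, w_hidden, w_out, p_drop=0.5, n_samples=100, seed=0):
    """Predictive mean and variance over n_samples dropout passes."""
    rng = random.Random(seed)
    outs = [forward_with_dropout(x, w_hidden, w_out, p_drop, rng)
            for _ in range(n_samples)]
    return statistics.mean(outs), statistics.pvariance(outs)
```

With `p_drop=0.0` every pass is identical and the variance collapses to zero, so the spread across passes comes entirely from the dropout masks.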