Variational Dropout and the Local Reparameterization Trick
We investigate a local reparameterization technique for greatly reducing the
variance of stochastic gradients for variational Bayesian inference (SGVB) of a
posterior over model parameters, while retaining parallelizability. This local
reparameterization translates uncertainty about global parameters into local
noise that is independent across datapoints in the minibatch. Such
parameterizations can be trivially parallelized and have variance that is
inversely proportional to the minibatch size, generally leading to much faster
convergence. Additionally, we explore a connection with dropout: Gaussian
dropout objectives correspond to SGVB with local reparameterization, a
scale-invariant prior and proportionally fixed posterior variance. Our method
allows inference of more flexibly parameterized posteriors; specifically, we
propose variational dropout, a generalization of Gaussian dropout where the
dropout rates are learned, often leading to better models. The method is
demonstrated through several experiments.
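As an illustration of the local reparameterization described in the abstract above, the following NumPy sketch samples the pre-activations of a single linear layer directly instead of sampling the weight matrix. It assumes a fully factorized Gaussian posterior over the weights. The function name `local_reparam_linear` and the arguments `w_mean` and `w_var` are hypothetical and are not taken from the paper.

```python
import numpy as np

def local_reparam_linear(a, w_mean, w_var, rng):
    """Sample pre-activations b = a @ W without sampling W itself.

    Assumption: each weight W[i, j] ~ N(w_mean[i, j], w_var[i, j]) independently.
    Then each b[m, j] is Gaussian with mean (a @ w_mean)[m, j] and
    variance ((a ** 2) @ w_var)[m, j].
    """
    b_mean = a @ w_mean              # shape: (batch, out)
    b_var = (a ** 2) @ w_var         # shape: (batch, out)
    # One noise draw per datapoint and output unit, so the noise is
    # independent across the rows (datapoints) of the minibatch.
    eps = rng.standard_normal(b_mean.shape)
    return b_mean + np.sqrt(b_var) * eps

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 3))        # minibatch of 4 inputs, 3 features
w_mean = rng.standard_normal((3, 2))   # posterior means
w_var = np.full((3, 2), 0.1)           # posterior variances
b = local_reparam_linear(a, w_mean, w_var, rng)
print(b.shape)
```

Sampling one weight matrix per minibatch would instead share the same noise across all datapoints, which is the coupling that the local noise avoids.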
Hybrid Models with Deep and Invertible Features
We propose a neural hybrid model consisting of a linear model defined on a
set of features computed by a deep, invertible transformation (i.e. a
normalizing flow). An attractive property of our model is that both
p(features), the density of the features, and p(targets | features), the
predictive distribution, can be computed exactly in a single feed-forward pass.
We show that our hybrid model, despite the invertibility constraints, achieves
similar accuracy to purely predictive models. Moreover, the generative component
remains a good model of the input features despite the hybrid optimization
objective. This offers additional capabilities such as detection of
out-of-distribution inputs and enabling semi-supervised learning. The
availability of the exact joint density p(targets, features) also allows us to
compute many quantities readily, making our hybrid model a useful building
block for downstream applications of probabilistic deep learning.
Comment: ICML 201
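To make the "single feed-forward pass" claim in the abstract above concrete, here is a minimal NumPy sketch of a hybrid model. It is not the authors' architecture: it assumes a single elementwise affine map in place of a deep normalizing flow, a standard normal base density, and a Bernoulli (logistic) linear model on the features. All names (`hybrid_log_joint`, `log_scale`, `shift`, `beta`, `bias`) are hypothetical.

```python
import numpy as np

def hybrid_log_joint(x, y, log_scale, shift, beta, bias):
    """Return (log p(y, x), log p(x), log p(y | x)) for each row of x.

    Assumptions: z = x * exp(log_scale) + shift is the invertible map,
    z has a standard normal base density, and y in {0, 1} follows a
    logistic linear model on z.
    """
    z = x * np.exp(log_scale) + shift          # invertible feature map
    log_det = np.sum(log_scale)                # log |det dz/dx|
    dim = x.shape[1]
    log_pz = -0.5 * np.sum(z ** 2, axis=1) - 0.5 * dim * np.log(2 * np.pi)
    log_px = log_pz + log_det                  # change of variables
    logits = z @ beta + bias
    # Bernoulli log-likelihood: y * logit - log(1 + exp(logit))
    log_py = y * logits - np.logaddexp(0.0, logits)
    return log_py + log_px, log_px, log_py

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 3))
y = np.array([0.0, 1.0, 1.0, 0.0, 1.0])
joint, log_px, log_py = hybrid_log_joint(
    x, y, log_scale=np.zeros(3), shift=np.zeros(3), beta=np.ones(3), bias=0.0
)
print(joint.shape)
```

Both terms are computed from the same features `z`, so one pass yields the exact joint density; a low `log_px` could then be used as a signal for out-of-distribution inputs.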
Bayesian Deep Net GLM and GLMM
Deep feedforward neural networks (DFNNs) are a powerful tool for functional
approximation. We describe flexible versions of generalized linear and
generalized linear mixed models incorporating basis functions formed by a DFNN.
Neural networks with random effects are not widely used in
the literature, perhaps because of the computational challenges of
incorporating subject-specific parameters into already complex models.
Efficient computational methods for high-dimensional Bayesian inference are
developed using Gaussian variational approximation, with a parsimonious but
flexible factor parametrization of the covariance matrix. We implement natural
gradient methods for the optimization, exploiting the factor structure of the
variational covariance matrix in computation of the natural gradient. Our
flexible DFNN models and Bayesian inference approach lead to a regression and
classification method that has a high prediction accuracy, and is able to
quantify the prediction uncertainty in a principled and convenient way. We also
describe how to perform variable selection in our deep learning method. The
proposed methods are illustrated in a wide range of simulated and real-data
examples, and the results compare favourably to a state-of-the-art flexible
regression and classification method in the statistical literature, the
Bayesian additive regression trees (BART) method. User-friendly software
packages in Matlab, R and Python implementing the proposed methods are
available at https://github.com/VBayesLab
Comment: 35 pages, 7 figures, 10 tables
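The factor parametrization mentioned in the abstract above can be sketched as follows. This assumes the common form Sigma = B B^T + D^2, with a tall p-by-k loading matrix B and a diagonal matrix D. The abstract does not spell out the exact form, so treat this as an assumption. The function names are hypothetical and are not taken from the VBayesLab packages.

```python
import numpy as np

def factor_covariance(B, d):
    """Covariance implied by the assumed factor form Sigma = B B^T + diag(d^2)."""
    return B @ B.T + np.diag(d ** 2)

def sample_factor_gaussian(mu, B, d, n_samples, rng):
    """Draw samples from N(mu, B B^T + diag(d^2)) without forming Sigma.

    Uses theta = mu + B @ eps + d * zeta with eps ~ N(0, I_k) and
    zeta ~ N(0, I_p), so the cost per sample is O(p * k), not O(p^2).
    """
    p, k = B.shape
    eps = rng.standard_normal((n_samples, k))
    zeta = rng.standard_normal((n_samples, p))
    return mu + eps @ B.T + zeta * d

rng = np.random.default_rng(0)
p, k = 6, 2                       # 6 parameters, 2 factors
mu = np.zeros(p)
B = rng.standard_normal((p, k))
d = np.full(p, 0.3)
theta = sample_factor_gaussian(mu, B, d, n_samples=1000, rng=rng)
print(theta.shape, factor_covariance(B, d).shape)
```

With k much smaller than p, the approximation stores p * k + p covariance parameters instead of p * (p + 1) / 2, which is what makes it parsimonious for high-dimensional network weights.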