126 research outputs found
Small-variance asymptotics for Bayesian neural networks
Bayesian neural networks (BNNs) are a rich and flexible class of models with several advantages over standard feedforward networks, but they are typically expensive to train on large-scale data. In this thesis, we explore the use of small-variance asymptotics, an approach for deriving fast algorithms from probabilistic models, on various Bayesian neural network models. We first demonstrate how small-variance asymptotics reveals precise connections between standard neural networks and BNNs; for example, particular sampling algorithms for BNNs reduce to standard backpropagation in the small-variance limit. We then explore a more complex BNN in which the number of hidden units is additionally treated as a random variable in the model. While standard sampling schemes would be too slow to be practical, our asymptotic approach yields a simple method for extending standard backpropagation to the case where the number of hidden units is not fixed. We show on several data sets that the resulting algorithm has benefits over backpropagation on networks with a fixed architecture.
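To make the limiting argument concrete, here is a minimal sketch (illustrative only, not the thesis's code) using a linear model with a Gaussian likelihood of variance sigma2 and a Gaussian prior: scaling the log-posterior by sigma2 damps both the prior term and the injected Langevin noise, so as sigma2 -> 0 a Langevin sampling step collapses to an ordinary gradient (backpropagation) step on the squared error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: y = X @ w with Gaussian likelihood N(y | Xw, sigma2) and
# Gaussian prior N(w | 0, tau2). All names here are illustrative.
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

def grad_scaled_log_post(w, sigma2, tau2=1.0):
    # Gradient of sigma2 * log p(w | X, y): the likelihood term survives
    # the sigma2 -> 0 limit, while the prior term is damped by sigma2.
    grad_lik = X.T @ (y - X @ w)      # from -||y - Xw||^2 / (2 sigma2)
    grad_prior = -w / tau2            # from the Gaussian prior
    return grad_lik + sigma2 * grad_prior

def langevin_step(w, step, sigma2):
    # Langevin step on the scaled posterior: the injected noise carries a
    # factor sigma2, so it vanishes in the small-variance limit and the
    # update reduces to plain gradient descent on the squared error.
    noise = np.sqrt(step * sigma2) * rng.normal(size=w.shape)
    return w + 0.5 * step * grad_scaled_log_post(w, sigma2) + noise

w = np.zeros(3)
for _ in range(200):
    w = langevin_step(w, step=0.01, sigma2=1e-8)
print(np.round(w, 3))  # approaches w_true, as deterministic backprop would
```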
Preconditioned Stochastic Gradient Langevin Dynamics for Deep Neural Networks
Effective training of deep neural networks suffers from two main issues. The
first is that the parameter spaces of these models exhibit pathological
curvature. Recent methods address this problem by using adaptive
preconditioning for Stochastic Gradient Descent (SGD). These methods improve
convergence by adapting to the local geometry of parameter space. A second
issue is overfitting, which is typically addressed by early stopping. However,
recent work has demonstrated that Bayesian model averaging mitigates this
problem. The posterior can be sampled by using Stochastic Gradient Langevin
Dynamics (SGLD). However, the rapidly changing curvature renders default SGLD
methods inefficient. Here, we propose combining adaptive preconditioners with
SGLD. In support of this idea, we establish theoretical results on asymptotic
convergence and predictive risk. We also provide empirical results for Logistic
Regression, Feedforward Neural Nets, and Convolutional Neural Nets,
demonstrating that our preconditioned SGLD method gives state-of-the-art
performance on these models.
Comment: AAAI 2016
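As a concrete illustration of the idea, the sketch below implements an RMSprop-style preconditioned Langevin step in the spirit of the paper; the small curvature-correction term from the full update is dropped, as is common in practice, and the toy target and all names are assumptions for this example rather than the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def psgld_step(theta, grad_log_post, v, step=1e-2, alpha=0.99, lam=1e-5):
    # A running average of squared gradients gives a diagonal, RMSprop-style
    # preconditioner that adapts to the local scale of the gradients.
    g = grad_log_post(theta)
    v = alpha * v + (1 - alpha) * g * g
    G = 1.0 / (lam + np.sqrt(v))                    # diagonal preconditioner
    noise = np.sqrt(step * G) * rng.normal(size=theta.shape)
    return theta + 0.5 * step * G * g + noise, v    # noise covariance step*G

# Toy usage: a badly scaled Gaussian posterior with variances (1, 100).
grad = lambda th: -th / np.array([1.0, 100.0])
theta, v = np.zeros(2), np.ones(2)   # v starts at 1 to avoid a huge first step
samples = []
for _ in range(20000):
    theta, v = psgld_step(theta, grad, v)
    samples.append(theta.copy())
print(np.std(np.array(samples)[5000:], axis=0))  # roughly (1, 10), up to noise
```

The preconditioner stretches the step size along the flat direction (variance 100), which plain SGLD with a single scalar step size would explore far more slowly.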
An automatic adaptive method to combine summary statistics in approximate Bayesian computation
To infer the parameters of mechanistic models with intractable likelihoods,
techniques such as approximate Bayesian computation (ABC) are increasingly
being adopted. One of the main disadvantages of ABC in practical situations,
however, is that parameter inference must generally rely on summary statistics
of the data. This is particularly the case for problems involving
high-dimensional data, such as biological imaging experiments. Yet some
summary statistics contain more information about parameters of interest than
others, and it is not always clear how to weight their contributions within the
ABC framework. We address this problem by developing an automatic, adaptive
algorithm that chooses weights for each summary statistic. Our algorithm aims
to maximize the distance between the prior and the approximate posterior by
automatically adapting the weights within the ABC distance function.
Computationally, we use a nearest neighbour estimator of the distance between
distributions. We justify the algorithm theoretically based on properties of
the nearest neighbour distance estimator. To demonstrate the effectiveness of
our algorithm, we apply it to a variety of test problems, including several
stochastic models of biochemical reaction networks, and a spatial model of
diffusion, and compare our results with existing algorithms.
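For concreteness, the sketch below shows the two ingredients in isolation: rejection ABC with a weighted distance over summary statistics, and a nearest-neighbour estimate of the divergence between two sample sets, which an adaptive scheme of this kind could use to score a candidate weight vector. The function names (prior_sample, simulate, summaries) are hypothetical placeholders, and the paper's actual weight-adaptation loop is not reproduced.

```python
import numpy as np
from scipy.spatial import cKDTree

def abc_rejection(prior_sample, simulate, summaries, s_obs, w, eps, n):
    # Rejection ABC with a weighted Euclidean distance between summary
    # statistics; w is the per-statistic weight vector that the adaptive
    # algorithm would tune (taken as given here).
    accepted = []
    for _ in range(n):
        theta = prior_sample()
        s = summaries(simulate(theta))
        if np.sqrt(np.sum(w * (s - s_obs) ** 2)) < eps:
            accepted.append(theta)
    return np.array(accepted)

def knn_kl(x, y, k=1):
    # Nearest-neighbour estimator of KL(p || q) from samples x ~ p, y ~ q
    # (Wang, Kulkarni & Verdu, 2009). Applied to (posterior, prior) samples,
    # a larger value means the approximate posterior sits further from the
    # prior, which is what the weight adaptation tries to maximize.
    x, y = x.reshape(len(x), -1), y.reshape(len(y), -1)
    n, d = x.shape
    m = len(y)
    r = cKDTree(x).query(x, k=k + 1)[0][:, -1]  # k-th neighbour, self excluded
    s = cKDTree(y).query(x, k=k)[0].reshape(n, -1)[:, -1]
    return d * np.mean(np.log(s / r)) + np.log(m / (n - 1.0))
```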
Reinforcing POD-based model reduction techniques in reaction-diffusion complex networks using stochastic filtering and pattern recognition
Complex networks are used to model many real-world systems. However, the
dimensionality of these systems can make them challenging to analyze.
Dimensionality reduction techniques such as proper orthogonal decomposition
(POD) can be used in such cases. Such reduced models, however, are sensitive to
perturbations in the input data. We
propose an algorithmic framework that combines techniques from pattern
recognition (PR) and stochastic filtering theory to enhance the output of such
models. The results of our study show that our method can improve the accuracy
of the surrogate model under perturbed inputs. Deep Neural Networks (DNNs) are
known to be susceptible to adversarial attacks, whereas recent research has
revealed that Neural Ordinary Differential Equations (neural ODEs) exhibit
robustness in specific applications; we therefore benchmark our algorithmic
framework against a neural ODE-based approach as a reference.
Comment: 19 pages, 6 figures
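For reference, here is a minimal sketch of the POD construction itself, built from a snapshot matrix via the SVD; this is an assumed standard formulation, and the paper's stochastic-filtering and pattern-recognition corrections are not sketched.

```python
import numpy as np

def pod_basis(snapshots, energy=0.99):
    # snapshots: (n_states, n_snapshots) matrix of full-order states.
    # The POD modes are the left singular vectors of the snapshot matrix;
    # keep the fewest modes capturing the requested fraction of the total
    # energy (sum of squared singular values).
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    cum = np.cumsum(s ** 2) / np.sum(s ** 2)
    r = int(np.searchsorted(cum, energy)) + 1
    return U[:, :r]

# Toy usage: snapshots with three latent spatial modes plus small noise.
rng = np.random.default_rng(0)
snaps = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 40))
snaps += 0.01 * rng.normal(size=(200, 40))
Phi = pod_basis(snaps)                      # recovers roughly three modes
x = snaps[:, 0]
x_hat = Phi @ (Phi.T @ x)                   # project and reconstruct
print(Phi.shape[1], np.linalg.norm(x - x_hat) / np.linalg.norm(x))
```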
- …