Variational Hamiltonian Monte Carlo via Score Matching
Traditionally, the field of computational Bayesian statistics has been
divided into two main subfields: variational methods and Markov chain Monte
Carlo (MCMC). In recent years, however, several methods have been proposed
based on combining variational Bayesian inference and MCMC simulation in order
to improve their overall accuracy and computational efficiency. This marriage
of fast evaluation and flexible approximation provides a promising means of
designing scalable Bayesian inference methods. In this paper, we explore the
possibility of incorporating variational approximation into a state-of-the-art
MCMC method, Hamiltonian Monte Carlo (HMC), to reduce the required gradient
computation in the simulation of Hamiltonian flow, which is the bottleneck for
many applications of HMC in big data problems. To this end, we use a
free-form approximation induced by a fast and flexible surrogate function
based on single-hidden-layer feedforward neural networks. The surrogate
provides sufficiently accurate approximation while allowing for fast
exploration of parameter space, resulting in an efficient approximate inference
algorithm. We demonstrate the advantages of our method on both synthetic and
real data problems.
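To make the surrogate idea concrete, here is a minimal numpy sketch (not the paper's implementation): an HMC transition whose leapfrog integrator is driven by the gradient of a single-hidden-layer tanh surrogate instead of the exact log-posterior gradient. The toy Gaussian target, the random surrogate weights, and the function names (surrogate_grad, leapfrog, hmc_step) are illustrative assumptions; in practice the surrogate would first be fitted to the posterior (e.g. by score matching).

```python
# Hedged sketch: HMC whose Hamiltonian flow uses a cheap surrogate gradient.
import numpy as np

rng = np.random.default_rng(0)

def surrogate_grad(theta, W, b, v):
    """Gradient of s(theta) = v^T tanh(W theta + b) with respect to theta."""
    h = np.tanh(W @ theta + b)           # hidden activations
    return W.T @ (v * (1.0 - h ** 2))    # chain rule through tanh

def leapfrog(theta, p, step, n_steps, grad_fn):
    """Standard leapfrog integrator, driven here by the surrogate gradient."""
    p = p + 0.5 * step * grad_fn(theta)
    for _ in range(n_steps - 1):
        theta = theta + step * p
        p = p + step * grad_fn(theta)
    theta = theta + step * p
    p = p + 0.5 * step * grad_fn(theta)
    return theta, p

def hmc_step(theta, log_post, grad_fn, step=0.05, n_steps=20):
    """One HMC transition; the Metropolis correction below still evaluates the
    exact log posterior (one possible choice), while the flow uses grad_fn."""
    p0 = rng.standard_normal(theta.shape)
    theta_new, p_new = leapfrog(theta, p0, step, n_steps, grad_fn)
    h_old = -log_post(theta) + 0.5 * p0 @ p0
    h_new = -log_post(theta_new) + 0.5 * p_new @ p_new
    return theta_new if np.log(rng.uniform()) < h_old - h_new else theta

# Toy target: standard 2-D Gaussian; the surrogate weights are random
# stand-ins for parameters that would normally be fitted to the log posterior.
log_post = lambda t: -0.5 * t @ t
W, b, v = rng.standard_normal((8, 2)), rng.standard_normal(8), rng.standard_normal(8)
grad_fn = lambda t: surrogate_grad(t, W, b, v)

theta, samples = np.zeros(2), []
for _ in range(1000):
    theta = hmc_step(theta, log_post, grad_fn)
    samples.append(theta)
```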
Recurrent Neural Filters: Learning Independent Bayesian Filtering Steps for Time Series Prediction
Despite the recent popularity of deep generative state space models, few
comparisons have been made between network architectures and the inference
steps of the Bayesian filtering framework -- with most models simultaneously
approximating both state transition and update steps with a single recurrent
neural network (RNN). In this paper, we introduce the Recurrent Neural Filter
(RNF), a novel recurrent autoencoder architecture that learns distinct
representations for each Bayesian filtering step, captured by a series of
encoders and decoders. Testing this on three real-world time series datasets,
we demonstrate that the learnt decoupled representations not only improve the
accuracy of one-step-ahead forecasts and provide realistic uncertainty
estimates, but also facilitate multistep prediction through the separation of
encoder stages.
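As a rough structural sketch of the idea (an assumption on my part, not the authors' architecture): separate recurrent modules play the roles of the Bayesian predict and update steps, and a decoder emits a Gaussian one-step-ahead forecast before each observation is folded in. The PyTorch layer choices, sizes, and names below are illustrative.

```python
# Hedged sketch of a filter with decoupled predict/update modules.
import torch
import torch.nn as nn

class RecurrentFilterSketch(nn.Module):
    def __init__(self, x_dim, y_dim, h_dim=32):
        super().__init__()
        self.transition = nn.LSTMCell(x_dim, h_dim)    # state propagation (predict step)
        self.update = nn.GRUCell(y_dim, h_dim)         # correction with the new observation
        self.decode_mu = nn.Linear(h_dim, y_dim)       # predictive mean
        self.decode_logvar = nn.Linear(h_dim, y_dim)   # predictive log-variance

    def forward(self, x_seq, y_seq):
        # x_seq: (T, B, x_dim) exogenous inputs, y_seq: (T, B, y_dim) observations
        T, B, _ = x_seq.shape
        h = x_seq.new_zeros(B, self.transition.hidden_size)
        c = torch.zeros_like(h)
        mus, logvars = [], []
        for t in range(T):
            h, c = self.transition(x_seq[t], (h, c))   # predict: propagate belief state
            mus.append(self.decode_mu(h))              # forecast made before seeing y_t
            logvars.append(self.decode_logvar(h))
            h = self.update(y_seq[t], h)               # update: fold in the observation
        return torch.stack(mus), torch.stack(logvars)

# Training would maximise the Gaussian log-likelihood of y_seq under (mu, logvar);
# multistep prediction can roll the transition module forward alone, skipping update.
```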
Orthogonally Decoupled Variational Gaussian Processes
Gaussian processes (GPs) provide a powerful non-parametric framework for
reasoning over functions. Despite their appealing theory, their superlinear
computational and memory complexities have presented a long-standing challenge.
State-of-the-art sparse variational inference methods trade modeling accuracy
against complexity. However, the complexities of these methods still scale
superlinearly in the number of basis functions, implying that sparse GP
methods are able to learn from large datasets only when a small model is used.
Recently, a decoupled approach was proposed that removes the unnecessary
coupling between the complexities of modeling the mean and the covariance
functions of a GP. It achieves a linear complexity in the number of mean
parameters, so an expressive posterior mean function can be modeled. While
promising, this approach suffers from optimization difficulties due to
ill-conditioning and non-convexity. In this work, we propose an alternative
decoupled parametrization. It adopts an orthogonal basis in the mean function
to model the residues that cannot be learned by the standard coupled approach.
Therefore, our method extends, rather than replaces, the coupled approach to
achieve strictly better performance. This construction admits a straightforward
natural gradient update rule, so the structure of the information manifold that
is lost during decoupling can be leveraged to speed up learning. Empirically,
our algorithm demonstrates significantly faster convergence in multiple
experiments.
Comment: Appearing in NIPS 2018
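The following numpy sketch illustrates one reading of the decoupled mean parametrization (an assumption on my part, not the authors' code): additional basis points gamma contribute through kernel features that are projected orthogonally off the span of the standard inducing basis beta, so they model only the residual mean, at a cost linear in the number of gamma points. The function names and the RBF kernel choice are illustrative.

```python
# Hedged sketch: posterior mean with an orthogonal residual basis.
import numpy as np

def rbf(A, B, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def decoupled_mean(x, beta, gamma, m_beta, m_gamma, ls=1.0, jitter=1e-6):
    K_bb = rbf(beta, beta, ls) + jitter * np.eye(len(beta))
    k_xb = rbf(x, beta, ls)
    k_xg = rbf(x, gamma, ls)
    K_gb = rbf(gamma, beta, ls)
    # Residual features: gamma-basis components orthogonal to the beta span.
    phi_g = k_xg - k_xb @ np.linalg.solve(K_bb, K_gb.T)
    # Coupled part (as in standard sparse variational GPs) plus the cheap
    # orthogonal correction, which is linear in the number of gamma points.
    return k_xb @ np.linalg.solve(K_bb, m_beta) + phi_g @ m_gamma

# Toy usage with random basis points and mean parameters.
rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=(5, 1))
beta, gamma = rng.uniform(-3, 3, (10, 1)), rng.uniform(-3, 3, (200, 1))
mu = decoupled_mean(x, beta, gamma, rng.standard_normal(10), rng.standard_normal(200))
```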