Provable Bayesian Inference via Particle Mirror Descent
Bayesian methods are appealing for their flexibility in modeling complex data
and their ability to capture uncertainty in parameters. However, when Bayes' rule
does not yield a tractable closed form, most approximate inference algorithms
lack either scalability or rigorous guarantees. To tackle this challenge, we
propose a simple yet provable algorithm, \emph{Particle Mirror Descent} (PMD),
to iteratively approximate the posterior density. PMD is inspired by stochastic
functional mirror descent where one descends in the density space using a small
batch of data points at each iteration, and by particle filtering where one
uses samples to approximate a function. We prove a result of the first kind:
with $m$ particles, PMD provides a posterior density estimator that converges
in terms of $KL$-divergence to the true posterior at rate $O(1/\sqrt{m})$. We
demonstrate competitive empirical performance of PMD compared to several
approximate inference algorithms in mixture models, logistic regression, sparse
Gaussian processes and latent Dirichlet allocation on large-scale datasets.
Comment: 38 pages, 26 figures
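The density-space mirror descent that PMD builds on reduces, in the finite-dimensional case, to the classical entropic mirror-descent (multiplicative-weights) update on the probability simplex. A minimal sketch of that building block, not the paper's algorithm (all names and the toy cost vector are our own):

```python
import numpy as np

def entropic_mirror_step(p, grad, eta):
    """One mirror-descent step on the simplex with the entropy mirror map:
    p_{t+1}(i) is proportional to p_t(i) * exp(-eta * grad[i])."""
    w = p * np.exp(-eta * grad)
    return w / w.sum()

# Minimize the linear loss <c, p> over the simplex; the mass should
# concentrate on the coordinate with the smallest cost.
c = np.array([3.0, 1.0, 2.0])
p = np.ones(3) / 3
for _ in range(200):
    p = entropic_mirror_step(p, c, eta=0.1)
# p is now nearly a point mass on index 1 (the argmin of c).
```

PMD replaces the finite gradient with a stochastic functional gradient of the negative log-posterior and represents the density by particles, but the multiplicative structure of the update is the same.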
Adaptive Variational Particle Filtering in Non-stationary Environments
Online convex optimization is a sequential prediction framework whose goal is
to track and adapt to the environment by evaluating proper convex loss
functions. We study efficient particle filtering methods from the perspective
of such a framework.
We formulate an efficient particle filtering method for non-stationary
environments by making a connection with the online mirror descent algorithm,
which is known to be a universal online convex optimization algorithm.
As a result of this connection, our proposed particle filtering algorithm is
proven to achieve optimal particle efficiency.
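The connection drawn here can be made concrete: the standard particle-filter reweighting step, multiplying weights by likelihoods and renormalizing, is exactly an entropic mirror-descent (multiplicative) update on the weight simplex. A self-contained sketch under that reading (our own illustration, not the paper's method; the Gaussian model is made up):

```python
import numpy as np

def reweight(w, loglik):
    """Multiplicative weight update: new w_i is proportional to
    w_i * exp(loglik_i), i.e. an entropic mirror-descent step on the simplex."""
    logw = np.log(w) + loglik
    logw -= logw.max()            # subtract the max for numerical stability
    w = np.exp(logw)
    return w / w.sum()

rng = np.random.default_rng(0)

# Prior N(0, 2^2) over a latent value; one observation y with unit noise.
particles = rng.normal(0.0, 2.0, size=500)
w = np.full(500, 1 / 500)
y = 1.0
loglik = -0.5 * (y - particles) ** 2     # Gaussian log-likelihood up to a constant
w = reweight(w, loglik)
estimate = float(np.sum(w * particles))  # approximates the analytic posterior mean 0.8
```

In a sequential filter this reweighting alternates with propagation and resampling; the mirror-descent view is what lets online-learning regret analysis apply.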
Scalable Training of Inference Networks for Gaussian-Process Models
Inference in Gaussian process (GP) models is computationally challenging for
large data, and often difficult to approximate with a small number of inducing
points. We explore an alternative approximation that employs stochastic
inference networks for flexible inference. Unfortunately, for such networks,
minibatch training makes it difficult to learn meaningful correlations
over function outputs for a large dataset. We propose an algorithm that enables
such training by tracking a stochastic, functional mirror-descent algorithm. At
each iteration, this only requires considering a finite number of input
locations, resulting in a scalable and easy-to-implement algorithm. Empirical
results show comparable and, sometimes, superior performance to existing sparse
variational GP methods.
Comment: ICML 2019. Updated results added in the camera-ready version.
Distributed Learning for Cooperative Inference
We study the problem of cooperative inference where a group of agents
interact over a network and seek to estimate a joint parameter that best
explains a set of observations. Agents do not know the network topology or the
observations of other agents. We explore a variational interpretation of the
Bayesian posterior density, and its relation to the stochastic mirror descent
algorithm, to propose a new distributed learning algorithm. We show that, under
appropriate assumptions, the beliefs generated by the proposed algorithm
concentrate around the true parameter exponentially fast. We provide explicit
non-asymptotic bounds for the convergence rate. Moreover, we develop explicit
and computationally efficient algorithms for observation models belonging to
exponential families.
Particle Flow Bayes' Rule
We present a particle flow realization of Bayes' rule, where an ODE-based
neural operator is used to transport particles from a prior to its posterior
after a new observation. We prove that such an ODE operator exists. Its neural
parameterization can be trained in a meta-learning framework, allowing this
operator to reason about the effect of an individual observation on the
posterior, and thus generalize across different priors, observations and to
sequential Bayesian inference. We demonstrate the generalization ability of
our particle flow Bayes operator in several canonical and high-dimensional
examples.
Mirror Descent Search and its Acceleration
In recent years, attention has been focused on the relationship between
black-box optimization problems and reinforcement learning problems. In this
research, we propose the Mirror Descent Search (MDS) algorithm, which is
applicable to both black-box optimization problems and reinforcement
learning problems. Our method is based on the mirror descent method, which is a
general optimization algorithm. The contribution of this research is roughly
twofold. We propose two essential algorithms, called MDS and Accelerated Mirror
Descent Search (AMDS), and two more approximate algorithms: Gaussian Mirror
Descent Search (G-MDS) and Gaussian Accelerated Mirror Descent Search (G-AMDS).
This research shows that the advanced methods developed in the context of
mirror descent research can be applied to reinforcement learning problems. We
also clarify the relationship between an existing reinforcement learning
algorithm and our method. With two evaluation experiments, we show that our
proposed algorithms converge faster than some state-of-the-art methods.
Comment: Gold open access in Journal of Robotics and Autonomous Systems:
https://www.sciencedirect.com/science/article/pii/S092188901730754
Guaranteed inference in topic models
One of the core problems in statistical models is the estimation of a
posterior distribution. For topic models, the problem of posterior inference
for individual texts is particularly important, especially when dealing with
data streams, but is often intractable in the worst case. As a consequence,
existing methods for posterior inference are approximate and offer no
guarantees on either quality or convergence rate. In this paper, we introduce
a provably fast algorithm, namely Online Maximum a Posteriori Estimation (OPE),
for posterior inference in topic models. OPE has more attractive properties
than existing inference approaches, including theoretical guarantees on quality
and fast rate of convergence to a local maximal/stationary point of the
inference problem. The analysis of OPE is quite general and hence can be
easily adapted to a wide range of contexts. Finally, we employ OPE to design
three methods for learning Latent Dirichlet Allocation from text streams or
large corpora. Extensive experiments demonstrate the superior behavior of OPE
and of our new learning methods.
Wasserstein variational gradient descent: From semi-discrete optimal transport to ensemble variational inference
Particle-based variational inference offers a flexible way of approximating
complex posterior distributions with a set of particles. In this paper we
introduce a new particle-based variational inference method based on the theory
of semi-discrete optimal transport. Instead of minimizing the KL divergence
between the posterior and the variational approximation, we minimize a
semi-discrete optimal transport divergence. The solution of the resulting
optimal transport problem provides both a particle approximation and a set of
optimal transportation densities that map each particle to a segment of the
posterior distribution. We approximate these transportation densities by
minimizing the KL divergence between a truncated distribution and the optimal
transport solution. The resulting algorithm can be interpreted as a form of
ensemble variational inference where each particle is associated with a local
variational approximation.
Kernel Implicit Variational Inference
Recent progress in variational inference has paid much attention to the
flexibility of variational posteriors. One promising direction is to use
implicit distributions, i.e., distributions without tractable densities, as the
variational posterior. However, existing methods for implicit posteriors still
face challenges of noisy estimation and computational infeasibility when
applied to models with high-dimensional latent variables. In this paper, we
present a new approach named Kernel Implicit Variational Inference that
addresses these challenges. As far as we know, for the first time implicit
variational inference is successfully applied to Bayesian neural networks,
which shows promising results on both regression and classification tasks.Comment: Published as a conference paper at ICLR 201
A stochastic version of Stein Variational Gradient Descent for efficient sampling
We propose in this work RBM-SVGD, a stochastic version of Stein Variational
Gradient Descent (SVGD) method for efficiently sampling from a given
probability measure and thus useful for Bayesian inference. The method is to
apply the Random Batch Method (RBM) for interacting particle systems, proposed
by Jin et al., to the interacting particle systems in SVGD. While preserving
the behavior of SVGD, it reduces the computational cost, especially when the
interacting kernel has a long range. Numerical examples verify the efficiency
of this new version of SVGD.
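For reference, a plain (full-batch) SVGD step with an RBF kernel looks as follows; the random-batch variant described above replaces the sums over all particles with sums over small random batches. This is our own minimal sketch, not the authors' code, and the target and parameters are made up:

```python
import numpy as np

def svgd_step(x, grad_logp, eta=0.05, h=1.0):
    """One SVGD update with RBF kernel k(a, b) = exp(-|a - b|^2 / (2 h^2)):
    phi(x_i) = (1/n) sum_j [ k(x_j, x_i) grad_logp(x_j) + grad_{x_j} k(x_j, x_i) ]."""
    diff = x[:, None, :] - x[None, :, :]                    # diff[i, j] = x_i - x_j
    K = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * h ** 2))  # n x n kernel matrix
    drive = K @ grad_logp(x)                                # pulls particles toward high density
    repulse = np.sum(diff / h ** 2 * K[..., None], axis=1)  # keeps particles spread out
    return x + eta * (drive + repulse) / x.shape[0]

# Sample a standard normal target, for which grad log p(x) = -x.
x = np.random.default_rng(1).normal(-2.0, 0.5, size=(30, 1))
for _ in range(1000):
    x = svgd_step(x, lambda z: -z)
# Particles end up roughly distributed as N(0, 1).
```

The cost of each step is dominated by the n x n kernel sums, which is what motivates subsampling the interactions with a random batch.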