Convergence of Langevin MCMC in KL-divergence
Langevin diffusion is a commonly used tool for sampling from a given
distribution. In this work, we establish that when the negative log-density of
the target is smooth and strongly convex, discrete Langevin diffusion produces
a distribution within $\epsilon$ of the target in KL-divergence in
$\tilde{O}(d/\epsilon)$ steps, where $d$ is the dimension of the sample space.
We also study the convergence rate when the strong-convexity assumption is
absent. By considering the Langevin diffusion as a gradient flow in the space
of probability distributions, we obtain an elegant analysis that applies to the
stronger property of convergence in KL-divergence and gives a conceptually
simpler proof of the best-known convergence results in weaker metrics.
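The discrete Langevin diffusion discussed above can be sketched as the unadjusted Langevin algorithm (ULA): a gradient step on the negative log-density plus Gaussian noise. This is a generic illustration, not code from the paper; the step size, iteration counts, and standard-Gaussian example are illustrative choices.

```python
import numpy as np

def ula_sample(grad_f, x0, step, n_steps, rng):
    """Unadjusted Langevin algorithm: Euler discretization of
    dX = -grad f(X) dt + sqrt(2) dW, where f is the negative log-density."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - step * grad_f(x) + np.sqrt(2.0 * step) * noise
    return x

# Illustrative target N(0, 1): f(x) = x^2 / 2, so grad_f(x) = x.
rng = np.random.default_rng(0)
samples = np.array([ula_sample(lambda x: x, np.zeros(1), 0.1, 500, rng)
                    for _ in range(2000)])
```

For a fixed step size the chain's stationary distribution is slightly biased away from the target; the $\tilde{O}(d/\epsilon)$ bound above quantifies how small the step (and how many iterations) must be to bring the KL-divergence below $\epsilon$.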
Implicit Langevin Algorithms for Sampling From Log-concave Densities
For sampling from a log-concave density, we study implicit integrators
resulting from $\theta$-method discretization of the overdamped Langevin
diffusion stochastic differential equation. Theoretical and algorithmic
properties of the resulting sampling methods for $\theta \in [1/2, 1]$ and a
range of step sizes are established. Our results generalize and extend prior
works in several directions. In particular, for $\theta \ge 1/2$, we prove
geometric ergodicity and stability of the resulting methods for all step sizes.
We show that obtaining subsequent samples amounts to solving a strongly-convex
optimization problem, which is readily achievable using one of numerous
existing methods. Numerical examples supporting our theoretical analysis are
also presented.
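The sample-as-optimization observation can be illustrated for the fully implicit case ($\theta = 1$): each step solves a strongly convex proximal subproblem. This is a sketch, not the authors' code; the inner gradient-descent solver, step size, and Gaussian target are illustrative assumptions.

```python
import numpy as np

def implicit_langevin_step(grad_f, x, step, rng, inner_iters=100):
    """One fully implicit (theta = 1) Langevin step.

    The update x_new = y - step * grad_f(x_new), with y = x + sqrt(2*step)*xi,
    is the minimizer of the strongly convex subproblem
        min_z  f(z) + ||z - y||^2 / (2 * step),
    solved here by plain gradient descent on that subproblem.
    """
    y = x + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
    inner_lr = step / (1.0 + step)
    z = y.copy()
    for _ in range(inner_iters):
        z = z - inner_lr * (grad_f(z) + (z - y) / step)
    return z

# Illustrative target N(0, 1): f(z) = z^2 / 2, grad_f(z) = z.
rng = np.random.default_rng(1)
x = np.zeros(1)
draws = []
for k in range(3000):
    x = implicit_langevin_step(lambda z: z, x, 0.5, rng)
    if k >= 500:  # discard burn-in
        draws.append(x.copy())
draws = np.array(draws)
```

Because the implicit step is a proximal map of a strongly convex function, it remains stable even for large step sizes, which is the mechanism behind the all-step-size ergodicity result described above.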
Neural Sampling in Hierarchical Exponential-family Energy-based Models
Bayesian brain theory suggests that the brain employs generative models to
understand the external world. The sampling-based perspective posits that the
brain infers the posterior distribution through samples of stochastic neuronal
responses. Additionally, the brain continually updates its generative model to
approach the true distribution of the external world. In this study, we
introduce the Hierarchical Exponential-family Energy-based (HEE) model, which
captures the dynamics of inference and learning. In the HEE model, we decompose
the partition function into individual layers and leverage a group of neurons
with shorter time constants to sample the gradient of the decomposed
normalization term. This allows our model to estimate the partition function
and perform inference simultaneously, circumventing the negative phase
encountered in conventional energy-based models (EBMs). As a result, the
learning process is localized in both time and space, and the model converges
easily. To match the brain's rapid computation, we demonstrate that neural
adaptation can serve as a momentum term, significantly accelerating the
inference process. On natural image datasets, our model exhibits
representations akin to those observed in the biological visual system.
Furthermore, for the machine learning community, our model can generate
observations through joint or marginal generation. We show that marginal
generation outperforms joint generation and achieves performance on par with
other EBMs.
Comment: NeurIPS 202
A Particle-Based Algorithm for Distributional Optimization on Constrained Domains via Variational Transport and Mirror Descent
We consider the optimization problem of minimizing an objective functional
that admits a variational form and is defined over probability distributions
on a constrained domain, a setting that poses challenges to both theoretical
analysis and algorithmic design. Inspired by the mirror descent algorithm for
constrained optimization, we propose an iterative particle-based algorithm,
named Mirrored Variational Transport (mirrorVT), extended from the Variational
Transport framework [7] for dealing with the constrained domain. In particular,
for each iteration, mirrorVT maps particles to an unconstrained dual domain
induced by a mirror map and then approximately performs Wasserstein gradient
descent on the manifold of distributions defined over the dual space by pushing
particles. At the end of each iteration, particles are mapped back to the original
constrained domain. Through simulated experiments, we demonstrate the
effectiveness of mirrorVT for minimizing the functionals over probability
distributions on the simplex- and Euclidean ball-constrained domains. We also
analyze its theoretical properties and characterize its convergence to the
global minimum of the objective functional.
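The map-to-dual, step, map-back loop can be sketched for simplex-constrained particles with the entropic mirror map, whose dual map is the elementwise log and whose inverse is the softmax. This is not the authors' implementation; the quadratic potential, learning rate, and particle counts are illustrative assumptions.

```python
import numpy as np

def mirror_step(particles, grad_V, lr):
    """One mirrored gradient step for particles on the probability simplex.

    Mirror map: negative entropy. Particles are mapped to the unconstrained
    dual space by log, a gradient step is taken there, and the softmax maps
    them back onto the simplex.
    """
    dual = np.log(np.clip(particles, 1e-12, None))   # map to dual space
    dual = dual - lr * grad_V(particles)             # gradient step in dual
    expd = np.exp(dual - dual.max(axis=1, keepdims=True))
    return expd / expd.sum(axis=1, keepdims=True)    # map back via softmax

# Illustrative potential V(x) = ||x - target||^2 / 2, minimized at `target`,
# which lies inside the simplex; every particle should converge to it.
rng = np.random.default_rng(2)
target = np.array([0.5, 0.3, 0.2])
p = rng.dirichlet(np.ones(3), size=100)  # random start on the 2-simplex
for _ in range(200):
    p = mirror_step(p, lambda x: x - target, 0.5)
```

The key point mirrored here is that the gradient step itself never has to respect the constraint: feasibility is restored for free by the inverse mirror map at the end of each iteration.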