89 research outputs found

    Convergence of Langevin MCMC in KL-divergence

    Full text link
    Langevin diffusion is a commonly used tool for sampling from a given distribution. In this work, we establish that when the target density pp^* is such that logp\log p^* is LL smooth and mm strongly convex, discrete Langevin diffusion produces a distribution pp with KL(pp)ϵKL(p||p^*)\leq \epsilon in O~(dϵ)\tilde{O}(\frac{d}{\epsilon}) steps, where dd is the dimension of the sample space. We also study the convergence rate when the strong-convexity assumption is absent. By considering the Langevin diffusion as a gradient flow in the space of probability distributions, we obtain an elegant analysis that applies to the stronger property of convergence in KL-divergence and gives a conceptually simpler proof of the best-known convergence results in weaker metrics

    Implicit Langevin Algorithms for Sampling From Log-concave Densities

    Full text link
    For sampling from a log-concave density, we study implicit integrators resulting from θ\theta-method discretization of the overdamped Langevin diffusion stochastic differential equation. Theoretical and algorithmic properties of the resulting sampling methods for θ[0,1] \theta \in [0,1] and a range of step sizes are established. Our results generalize and extend prior works in several directions. In particular, for θ1/2\theta\ge1/2, we prove geometric ergodicity and stability of the resulting methods for all step sizes. We show that obtaining subsequent samples amounts to solving a strongly-convex optimization problem, which is readily achievable using one of numerous existing methods. Numerical examples supporting our theoretical analysis are also presented

    Neural Sampling in Hierarchical Exponential-family Energy-based Models

    Full text link
    Bayesian brain theory suggests that the brain employs generative models to understand the external world. The sampling-based perspective posits that the brain infers the posterior distribution through samples of stochastic neuronal responses. Additionally, the brain continually updates its generative model to approach the true distribution of the external world. In this study, we introduce the Hierarchical Exponential-family Energy-based (HEE) model, which captures the dynamics of inference and learning. In the HEE model, we decompose the partition function into individual layers and leverage a group of neurons with shorter time constants to sample the gradient of the decomposed normalization term. This allows our model to estimate the partition function and perform inference simultaneously, circumventing the negative phase encountered in conventional energy-based models (EBMs). As a result, the learning process is localized both in time and space, and the model is easy to converge. To match the brain's rapid computation, we demonstrate that neural adaptation can serve as a momentum term, significantly accelerating the inference process. On natural image datasets, our model exhibits representations akin to those observed in the biological visual system. Furthermore, for the machine learning community, our model can generate observations through joint or marginal generation. We show that marginal generation outperforms joint generation and achieves performance on par with other EBMs.Comment: NeurIPS 202

    A Particle-Based Algorithm for Distributional Optimization on \textit{Constrained Domains} via Variational Transport and Mirror Descent

    Full text link
    We consider the optimization problem of minimizing an objective functional, which admits a variational form and is defined over probability distributions on the constrained domain, which poses challenges to both theoretical analysis and algorithmic design. Inspired by the mirror descent algorithm for constrained optimization, we propose an iterative particle-based algorithm, named Mirrored Variational Transport (mirrorVT), extended from the Variational Transport framework [7] for dealing with the constrained domain. In particular, for each iteration, mirrorVT maps particles to an unconstrained dual domain induced by a mirror map and then approximately perform Wasserstein gradient descent on the manifold of distributions defined over the dual space by pushing particles. At the end of iteration, particles are mapped back to the original constrained domain. Through simulated experiments, we demonstrate the effectiveness of mirrorVT for minimizing the functionals over probability distributions on the simplex- and Euclidean ball-constrained domains. We also analyze its theoretical properties and characterize its convergence to the global minimum of the objective functional
    corecore