
    Quasi-symplectic Langevin Variational Autoencoder

    Variational autoencoders (VAEs) are popular and well-studied generative models widely used in neural learning research. Applying VAEs to practical tasks with massive, high-dimensional datasets requires constructing a low-variance evidence lower bound (ELBO). Markov chain Monte Carlo (MCMC) is an effective approach for tightening the ELBO when approximating the posterior distribution. The Hamiltonian Variational Autoencoder (HVAE) is an effective MCMC-inspired approach for constructing a low-variance ELBO that is also amenable to the reparameterization trick. In this work, we propose a Quasi-symplectic Langevin Variational Autoencoder (Langevin-VAE) that incorporates gradient information into the inference process through Langevin dynamics. We show the effectiveness of the proposed approach on toy and real-world examples.
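
    For concreteness, below is a minimal sketch of an unadjusted Langevin update, the kind of gradient-driven dynamics the abstract refers to. It is illustrative only, not the paper's quasi-symplectic integrator, and all names (langevin_step, grad_log_p, eps) are hypothetical.

        import numpy as np

        # Unadjusted Langevin dynamics: z' = z + (eps/2) * grad_log_p(z) + sqrt(eps) * xi,
        # with xi ~ N(0, I). The drift term uses the score (gradient of the log density),
        # which is the "gradient information" the abstract mentions.
        def langevin_step(z, grad_log_p, eps, rng):
            noise = rng.standard_normal(z.shape)
            return z + 0.5 * eps * grad_log_p(z) + np.sqrt(eps) * noise

        # Toy usage: sample from a standard Gaussian, whose score is grad log p(z) = -z.
        rng = np.random.default_rng(0)
        z = rng.standard_normal(2)
        for _ in range(1000):
            z = langevin_step(z, lambda x: -x, eps=0.05, rng=rng)
        print(z)  # an approximate sample from N(0, I)

    Because the noise enters additively, each step is a deterministic, differentiable function of z and the injected Gaussian, which is what makes such updates compatible with the reparameterization trick mentioned above.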

    Dynamical Sampling with Langevin Normalization Flows

    In Bayesian machine learning, sampling methods provide asymptotically unbiased estimates for inference over complex probability distributions, and Markov chain Monte Carlo (MCMC) is one of the most popular sampling methods. However, MCMC can produce highly autocorrelated samples or perform poorly on some complex distributions. In this paper, we introduce Langevin diffusions into normalization flows to construct a new dynamical sampling method. We propose a modified Kullback-Leibler divergence as the loss function for training the sampler, which ensures that samples generated by the proposed method converge to the target distribution. Because the gradient of the target distribution appears in the modified Kullback-Leibler divergence, its integral is intractable; we therefore approximate it with a Monte Carlo estimator. We also discuss the case where the target distribution is unnormalized. We illustrate the properties and performance of the proposed method on a variety of complex distributions and real datasets. The experiments indicate that the proposed method not only takes advantage of the flexibility of neural networks but also exploits the rapid convergence of the dynamical system to the target distribution, demonstrating performance superior to dynamics-based MCMC samplers.
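
    As an illustration of the Monte Carlo step described above, the following sketch estimates a KL-type loss E_q[log q(z) - log p(z)] against an unnormalized target. It is a generic estimator under assumed interfaces, not the paper's exact modified divergence; all names (mc_kl_estimate, sample_q, log_p_unnorm) are hypothetical.

        import numpy as np

        # KL(q || p) = E_q[log q(z) - log p(z)]. With an unnormalized target p_unnorm,
        # the unknown log normalizer contributes only an additive constant, so the
        # estimator below can still serve as a training loss for the sampler.
        def mc_kl_estimate(sample_q, log_q, log_p_unnorm, n=4096):
            z = sample_q(n)  # z ~ q, drawn from the sampler/flow
            return np.mean(log_q(z) - log_p_unnorm(z))

        # Toy usage: q = N(0, 1), unnormalized target proportional to exp(-(z - 1)^2 / 2).
        rng = np.random.default_rng(0)
        est = mc_kl_estimate(
            sample_q=lambda n: rng.standard_normal(n),
            log_q=lambda z: -0.5 * z**2 - 0.5 * np.log(2 * np.pi),
            log_p_unnorm=lambda z: -0.5 * (z - 1.0) ** 2,
        )
        # True KL(q || p) here is 0.5; the estimate is shifted by the constant
        # -log sqrt(2*pi) ~= -0.92, giving roughly -0.42. Constant offsets do not
        # affect the gradient used to train the sampler.
        print(est)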