3 research outputs found
Quasi-symplectic Langevin Variational Autoencoder
The variational autoencoder (VAE) is a popular and well-studied
generative model widely used in neural learning research. Applying VAEs to
practical tasks that involve massive, high-dimensional datasets requires
overcoming the difficulty of building low-variance evidence lower
bounds (ELBO). Markov chain Monte Carlo (MCMC) is an effective
approach to tightening the ELBO when approximating the posterior distribution.
The Hamiltonian Variational Autoencoder (HVAE) is an effective MCMC-inspired
approach for constructing a low-variance ELBO that is also amenable to the
reparameterization trick. In this work, we propose a Quasi-symplectic Langevin
Variational Autoencoder (Langevin-VAE) that incorporates gradient
information into the inference process through Langevin dynamics. We show the
effectiveness of the proposed approach on toy and real-world examples.
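To illustrate what "incorporating gradient information through Langevin dynamics" means in its simplest form, the following is a minimal sketch of an unadjusted Langevin update. It is not the quasi-symplectic integrator proposed in the paper; the standard Gaussian target, the step size, and the names `grad_log_p` and `langevin_step` are assumptions chosen only for illustration.

```python
import numpy as np

def grad_log_p(z):
    # Gradient of log N(0, I); a stand-in for a model's log-density gradient
    # (assumed target, not taken from the paper).
    return -z

def langevin_step(z, step, rng):
    # One unadjusted Langevin update: drift along the gradient plus Gaussian noise.
    noise = rng.standard_normal(z.shape)
    return z + 0.5 * step * grad_log_p(z) + np.sqrt(step) * noise

rng = np.random.default_rng(0)
# Start from an over-dispersed initial distribution (standard deviation 3).
z = 3.0 * rng.standard_normal((5000, 2))
for _ in range(200):
    z = langevin_step(z, 0.05, rng)

sample_mean = z.mean(axis=0)
sample_var = z.var(axis=0)
```

After enough steps, the sample mean and variance should be close to those of the target, up to a small discretization bias that grows with the step size.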
Dynamical Sampling with Langevin Normalization Flows
In Bayesian machine learning, sampling methods provide asymptotically unbiased estimates for inference over complex probability distributions, and Markov chain Monte Carlo (MCMC) is one of the most popular such methods. However, MCMC can produce highly autocorrelated samples or perform poorly on some complex distributions. In this paper, we introduce Langevin diffusions into normalization flows to construct a new dynamical sampling method. We propose a modified Kullback-Leibler divergence as the loss function to train the sampler, which ensures that samples generated by the proposed method converge to the target distribution. Because the gradient of the target distribution is used when calculating the modified Kullback-Leibler divergence, its integral is intractable, so we use a Monte Carlo estimator to approximate it. We also discuss the case where the target distribution is unnormalized. We illustrate the properties and performance of the proposed method on a variety of complex distributions and real datasets. The experiments indicate that the proposed method not only takes advantage of the flexibility of neural networks but also exploits the rapid convergence of the dynamical system to the target distribution, and that it performs better than dynamics-based MCMC samplers.
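As a rough illustration of approximating a divergence integral with a Monte Carlo estimator, the sketch below estimates the ordinary Kullback-Leibler divergence KL(q || p) between two one-dimensional Gaussians by averaging log q(x) - log p(x) over samples from q. It does not implement the modified divergence from the paper; the distributions, the sample size, and the names `log_q` and `log_p` are assumptions made for the example.

```python
import numpy as np

# Assumed sampler distribution q = N(m, s^2) and target p = N(0, 1).
m, s = 1.0, 0.5

def log_q(x):
    # Log-density of N(m, s^2).
    return -0.5 * ((x - m) / s) ** 2 - np.log(s) - 0.5 * np.log(2.0 * np.pi)

def log_p(x):
    # Log-density of N(0, 1). If only an unnormalized log-density were
    # available, the estimate would be shifted by the constant log Z,
    # which does not change the gradient with respect to q's parameters.
    return -0.5 * x ** 2 - 0.5 * np.log(2.0 * np.pi)

rng = np.random.default_rng(0)
x = m + s * rng.standard_normal(200_000)  # samples from q

kl_mc = np.mean(log_q(x) - log_p(x))                    # Monte Carlo estimate
kl_exact = 0.5 * (s ** 2 + m ** 2 - 1.0 - np.log(s ** 2))  # closed form
```

For Gaussians the closed form is available, so `kl_exact` serves only as a check on the estimator; in the intractable setting the abstract describes, only the sample average would be computable.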