Stochastic Approximate Gradient Descent via the Langevin Algorithm
We introduce a novel and efficient algorithm called the stochastic
approximate gradient descent (SAGD), as an alternative to the stochastic
gradient descent for cases where unbiased stochastic gradients cannot be
trivially obtained. Traditional methods for such problems rely on
general-purpose sampling techniques such as Markov chain Monte Carlo, which
typically require manual parameter tuning and are often inefficient in
practice. Instead, SAGD makes use of the Langevin algorithm to
construct stochastic gradients that are biased in finite steps but accurate
asymptotically, enabling us to theoretically establish the convergence
guarantee for SAGD. Inspired by our theoretical analysis, we also provide
useful guidelines for its practical implementation. Finally, we show that SAGD
performs well experimentally in popular statistical and machine learning
problems such as the expectation-maximization algorithm and variational
autoencoders.
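As a rough illustration of the idea behind SAGD (a toy sketch, not the authors' algorithm): when the gradient is an expectation under a distribution we can only sample approximately, a short unadjusted Langevin run can supply the sample. The target `pi = N(1, 1)`, the objective `F(theta) = E_pi[(theta - z)^2 / 2]` with minimizer `theta* = 1`, and all step sizes below are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_log_pi(z):
    # Score of the assumed target pi(z) = N(1, 1), i.e. pi(z) ∝ exp(-(z - 1)^2 / 2).
    return -(z - 1.0)

def langevin_run(z, n_steps, step):
    # Unadjusted Langevin algorithm: biased at any finite step size, but
    # increasingly accurate as the step shrinks and the chain lengthens.
    for _ in range(n_steps):
        z = z + 0.5 * step * grad_log_pi(z) + np.sqrt(step) * rng.standard_normal()
    return z

# SAGD-style loop: each iteration refreshes z with a short Langevin run,
# then takes an SGD step using the (slightly biased) stochastic gradient
# theta - z, whose mean under pi is theta - 1.
theta, z, lr = 0.0, 0.0, 0.1
trace = []
for _ in range(1000):
    z = langevin_run(z, n_steps=10, step=0.05)
    theta -= lr * (theta - z)
    trace.append(theta)

theta_hat = float(np.mean(trace[-500:]))  # iterate averaging smooths the noise
```

With these settings `theta_hat` lands close to the minimizer at 1, despite the finite-step bias of the Langevin samples.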
CoinEM: Tuning-Free Particle-Based Variational Inference for Latent Variable Models
We introduce two new particle-based algorithms for learning latent variable
models via marginal maximum likelihood estimation, including one which is
entirely tuning-free. Our methods are based on the perspective of marginal
maximum likelihood estimation as an optimization problem: namely, as the
minimization of a free energy functional. One way to solve this problem is to
consider the discretization of a gradient flow associated with the free energy.
We study one such approach, which resembles an extension of the popular Stein
variational gradient descent algorithm. In particular, we establish a descent
lemma for this algorithm, which guarantees that the free energy decreases at
each iteration. This method, and any other obtained as the discretization of
the gradient flow, will necessarily depend on a learning rate which must be
carefully tuned by the practitioner in order to ensure convergence at a
suitable rate. With this in mind, we also propose another algorithm for
optimizing the free energy which is entirely learning rate free, based on coin
betting techniques from convex optimization. We validate the performance of our
algorithms across a broad range of numerical experiments, including several
high-dimensional settings. Our results are competitive with existing
particle-based methods, without the need for any hyperparameter tuning.
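The coin-betting principle the abstract refers to can be sketched in one dimension (a hedged illustration of Krichevsky–Trofimov coin betting on a toy convex objective, not the CoinEM algorithm itself, which operates on particles): each iterate is a fixed fraction of accumulated "wealth", so no learning rate appears anywhere.

```python
import numpy as np

def kt_coin_betting(subgrad, T=10000, eps=1.0):
    """KT coin-betting optimizer (1-D, subgradients assumed in [-1, 1]).

    Learning-rate free: the bet x_t is the KT fraction of current wealth,
    so there is no step size for the practitioner to tune.
    """
    wealth = eps        # initial endowment
    coin_sum = 0.0      # running sum of coin outcomes c_t = -g_t
    iterates = []
    for t in range(1, T + 1):
        x = (coin_sum / t) * wealth  # KT betting fraction times wealth
        g = subgrad(x)
        c = -g
        wealth += c * x              # win or lose the bet
        coin_sum += c
        iterates.append(x)
    return float(np.mean(iterates))  # average iterate carries the guarantee

# Toy objective f(x) = |x - 3| with subgradient sign(x - 3); minimizer x* = 3.
x_hat = kt_coin_betting(lambda x: np.sign(x - 3.0))
```

The average iterate converges toward the minimizer at roughly an O(1/sqrt(T)) rate, with no tuning.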
Scalable particle-based alternatives to EM
Neal and Hinton (1998) recast the problem tackled by EM as the minimization
of a free energy functional on an infinite-dimensional space, and EM itself
as coordinate descent applied to this functional. Here, we explore
alternative ways to optimize the functional. In particular, we identify
various gradient flows associated with it and show that their limits
coincide with its stationary points. By discretizing the flows, we obtain
three practical particle-based
algorithms for maximum likelihood estimation in broad classes of latent
variable models. The novel algorithms scale well to high-dimensional settings
and outperform existing state-of-the-art methods in experiments.
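The general recipe behind such discretized flows can be sketched on a toy model (a hedged illustration, not necessarily any of the paper's three algorithms): pair a gradient ascent step in the parameter, averaged over particles, with a Langevin step for each particle. The model `z ~ N(theta, 1)`, `x | z ~ N(z, 1)` and all step sizes are illustrative assumptions; its marginal MLE for an observation `x` is `theta* = x`.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed toy latent variable model: z ~ N(theta, 1), x | z ~ N(z, 1).
# For a single observation x, the marginal MLE is theta* = x.
x_obs = 2.0

def grad_theta(theta, z):
    return z - theta                    # d/dtheta log p_theta(x, z)

def grad_z(theta, z):
    return (theta - z) + (x_obs - z)    # d/dz log p_theta(x, z)

# Particle-based scheme: a joint Euler discretization of a gradient flow,
# coupling a parameter step (averaged over particles) with a noisy
# Langevin step per particle.
n_particles, h, n_iters = 100, 0.1, 2000
theta = 0.0
z = rng.standard_normal(n_particles)
trace = []
for _ in range(n_iters):
    theta = theta + h * grad_theta(theta, z).mean()
    z = z + h * grad_z(theta, z) + np.sqrt(2 * h) * rng.standard_normal(n_particles)
    trace.append(theta)

theta_hat = float(np.mean(trace[-500:]))
```

At stationarity the particles approximate the posterior over z and the parameter settles near the marginal MLE, here `theta* = 2`.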