5 research outputs found

    Stochastic Approximate Gradient Descent via the Langevin Algorithm

    Full text link
    We introduce a novel and efficient algorithm called the stochastic approximate gradient descent (SAGD), as an alternative to the stochastic gradient descent for cases where unbiased stochastic gradients cannot be trivially obtained. Traditional methods for such problems rely on general-purpose sampling techniques such as Markov chain Monte Carlo, which typically requires manual intervention for tuning parameters and does not work efficiently in practice. Instead, SAGD makes use of the Langevin algorithm to construct stochastic gradients that are biased in finite steps but accurate asymptotically, enabling us to theoretically establish the convergence guarantee for SAGD. Inspired by our theoretical analysis, we also provide useful guidelines for its practical implementation. Finally, we show that SAGD performs well experimentally in popular statistical and machine learning problems such as the expectation-maximization algorithm and the variational autoencoders

    CoinEM: Tuning-Free Particle-Based Variational Inference for Latent Variable Models

    Full text link
    We introduce two new particle-based algorithms for learning latent variable models via marginal maximum likelihood estimation, including one which is entirely tuning-free. Our methods are based on the perspective of marginal maximum likelihood estimation as an optimization problem: namely, as the minimization of a free energy functional. One way to solve this problem is to consider the discretization of a gradient flow associated with the free energy. We study one such approach, which resembles an extension of the popular Stein variational gradient descent algorithm. In particular, we establish a descent lemma for this algorithm, which guarantees that the free energy decreases at each iteration. This method, and any other obtained as the discretization of the gradient flow, will necessarily depend on a learning rate which must be carefully tuned by the practitioner in order to ensure convergence at a suitable rate. With this in mind, we also propose another algorithm for optimizing the free energy which is entirely learning rate free, based on coin betting techniques from convex optimization. We validate the performance of our algorithms across a broad range of numerical experiments, including several high-dimensional settings. Our results are competitive with existing particle-based methods, without the need for any hyperparameter tuning

    Scalable particle-based alternatives to EM

    Get PDF
    (Neal and Hinton, 1998) recast the problem tackled by EM as the minimization of a free energy functional FF on an infinite-dimensional space and EM itself as coordinate descent applied to FF. Here, we explore alternative ways to optimize the functional. In particular, we identify various gradient flows associated with FF and show that their limits coincide with FF's stationary points. By discretizing the flows, we obtain three practical particle-based algorithms for maximum likelihood estimation in broad classes of latent variable models. The novel algorithms scale well to high-dimensional settings and outperform existing state-of-the-art methods in experiments

    CoinEM:Tuning-Free Particle-Based Variational Inference for Latent Variable Models

    Get PDF
    We introduce two new particle-based algorithms for learning latent variable models via marginal maximum likelihood estimation, including one which is entirely tuning-free. Our methods are based on the perspective of marginal maximum likelihood estimation as an optimization problem: namely, as the minimization of a free energy functional. One way to solve this problem is to consider the discretization of a gradient flow associated with the free energy. We study one such approach, which resembles an extension of the popular Stein variational gradient descent algorithm. In particular, we establish a descent lemma for this algorithm, which guarantees that the free energy decreases at each iteration. This method, and any other obtained as the discretization of the gradient flow, will necessarily depend on a learning rate which must be carefully tuned by the practitioner in order to ensure convergence at a suitable rate. With this in mind, we also propose another algorithm for optimizing the free energy which is entirely learning rate free, based on coin betting techniques from convex optimization. We validate the performance of our algorithms across a broad range of numerical experiments, including several high-dimensional settings. Our results are competitive with existing particle-based methods, without the need for any hyperparameter tuning

    Stochastic Approximate Gradient Descent via the Langevin Algorithm

    No full text
    corecore