20 research outputs found

    Coin Sampling: Gradient-Based Bayesian Inference without Learning Rates

    In recent years, particle-based variational inference (ParVI) methods such as Stein variational gradient descent (SVGD) have grown in popularity as scalable methods for Bayesian inference. Unfortunately, the properties of such methods invariably depend on hyperparameters such as the learning rate, which must be carefully tuned by the practitioner in order to ensure convergence to the target measure at a suitable rate. In this paper, we introduce a suite of new particle-based methods for scalable Bayesian inference based on coin betting, which are entirely learning-rate free. We illustrate the performance of our approach on a range of numerical examples, including several high-dimensional models and datasets, demonstrating comparable performance to other ParVI algorithms with no need to tune a learning rate. Comment: ICML 2023.
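
    The coin-betting idea can be sketched concretely. Below is a minimal numpy sketch, assuming the standard SVGD drift as the per-particle "betting outcome" and a simple Krichevsky-Trofimov-style wealth update; the function names, the RBF bandwidth, and the exact normalisation are illustrative choices, not the authors' implementation.

```python
import numpy as np

def svgd_drift(x, grad_logp, h=1.0):
    """Standard SVGD direction: kernel-smoothed score plus a repulsive term."""
    n = x.shape[0]
    diffs = x[:, None, :] - x[None, :, :]                # (n, n, d): x_i - x_j
    k = np.exp(-np.sum(diffs ** 2, -1) / (2 * h ** 2))   # RBF kernel matrix
    repulsion = (k[..., None] * diffs).sum(axis=1) / h ** 2
    return (k @ grad_logp(x) + repulsion) / n

def coin_svgd(x0, grad_logp, n_steps=500):
    """Learning-rate-free SVGD sketch: each particle acts as a coin bettor."""
    x = x0.copy()
    c_sum = np.zeros_like(x0)             # running sum of betting outcomes
    reward = np.zeros(x0.shape[0])        # accumulated "wealth" per particle
    scale = np.full(x0.shape[0], 1e-8)    # running bound on the outcome norms
    for t in range(1, n_steps + 1):
        c = svgd_drift(x, grad_logp)                        # betting outcome c_t
        scale = np.maximum(scale, np.linalg.norm(c, axis=1))
        reward += np.sum(c * (x - x0), axis=1)              # <c_t, x_t - x_1>
        c_sum += c
        bet = (scale + np.maximum(reward, 0.0)) / (scale * t)
        x = x0 + bet[:, None] * c_sum                       # no learning rate anywhere
    return x

# Usage: drive 100 particles towards a standard 2-d Gaussian target.
rng = np.random.default_rng(0)
particles = coin_svgd(3.0 * rng.standard_normal((100, 2)), grad_logp=lambda x: -x)
```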

    Learning Rate Free Bayesian Inference in Constrained Domains

    We introduce a suite of new particle-based algorithms for sampling on constrained domains which are entirely learning rate free. Our approach leverages coin betting ideas from convex optimisation, and the viewpoint of constrained sampling as a mirrored optimisation problem on the space of probability measures. Based on this viewpoint, we also introduce a unifying framework for several existing constrained sampling algorithms, including mirrored Langevin dynamics and mirrored Stein variational gradient descent. We demonstrate the performance of our algorithms on a range of numerical examples, including sampling from targets on the simplex, sampling with fairness constraints, and constrained sampling problems in post-selection inference. Our results indicate that our algorithms achieve competitive performance with existing constrained sampling methods, without the need to tune any hyperparameters.
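
    The "mirrored optimisation" viewpoint is easiest to see on the simplex. The sketch below assumes the entropic mirror map (so the inverse map is a softmax): it takes one exponentiated-gradient-style step in the dual space and maps back, so the iterate never leaves the constraint set. The fixed step size shown here is exactly what the coin-betting algorithms in the paper dispense with, and the function names are illustrative.

```python
import numpy as np

def to_dual(x):
    """Entropic mirror map on the simplex: x -> log x."""
    return np.log(x)

def to_primal(y):
    """Inverse map (softmax): dual iterates always land back on the simplex."""
    e = np.exp(y - y.max())
    return e / e.sum()

def mirror_step(x, grad, step=0.2):
    """One mirrored gradient step: move in the dual space, then map back."""
    return to_primal(to_dual(x) - step * grad)

# Usage: minimise f(x) = ||x - target||^2 over the probability simplex.
target = np.array([0.7, 0.2, 0.1])
x = np.ones(3) / 3
for _ in range(500):
    x = mirror_step(x, 2.0 * (x - target))
print(x)   # remains on the simplex and approaches the target
```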

    CoinEM: Tuning-Free Particle-Based Variational Inference for Latent Variable Models

    We introduce two new particle-based algorithms for learning latent variable models via marginal maximum likelihood estimation, including one which is entirely tuning-free. Our methods are based on the perspective of marginal maximum likelihood estimation as an optimization problem: namely, as the minimization of a free energy functional. One way to solve this problem is to consider the discretization of a gradient flow associated with the free energy. We study one such approach, which resembles an extension of the popular Stein variational gradient descent algorithm. In particular, we establish a descent lemma for this algorithm, which guarantees that the free energy decreases at each iteration. This method, and any other obtained as the discretization of the gradient flow, will necessarily depend on a learning rate which must be carefully tuned by the practitioner in order to ensure convergence at a suitable rate. With this in mind, we also propose another algorithm for optimizing the free energy which is entirely learning rate free, based on coin betting techniques from convex optimization. We validate the performance of our algorithms across a broad range of numerical experiments, including several high-dimensional settings. Our results are competitive with existing particle-based methods, without the need for any hyperparameter tuning.
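
    As a concrete (if simplified) instance of the free-energy viewpoint, the sketch below fits a toy latent variable model z ~ N(theta, 1), x_j | z ~ N(z, 1) by coupling a particle approximation of the posterior over z with a gradient ascent step on theta. It uses a plain Langevin-type particle update rather than the paper's SVGD-style interaction, and it keeps the learning rate that the coin-betting variant (CoinEM) is designed to remove; all names and step sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
z_true = rng.normal(2.0)                     # latent variable of the toy model
data = rng.normal(z_true, 1.0, size=50)      # observations x_1, ..., x_M
M = len(data)

def grad_z(z, theta):       # d/dz log p_theta(x, z), vectorised over particles
    return np.sum(data) - M * z - (z - theta)

def grad_theta(z, theta):   # d/dtheta log p_theta(x, z), vectorised over particles
    return z - theta

# Particle discretisation of the free-energy gradient flow: the particle cloud
# tracks the posterior p_theta(z | x) while theta follows the averaged gradient.
z = rng.normal(size=100)    # particle cloud for the latent variable
theta, lr = 0.0, 1e-3       # the learning rate that CoinEM removes
for _ in range(5000):
    theta += lr * np.mean(grad_theta(z, theta))
    z += lr * grad_z(z, theta) + np.sqrt(2 * lr) * rng.normal(size=z.shape)

print(theta, data.mean())   # theta approaches the marginal MLE (the sample mean)
```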

    Learning Rate Free Sampling in Constrained Domains

    We introduce a suite of new particle-based algorithms for sampling in constrained domains which are entirely learning rate free. Our approach leverages coin betting ideas from convex optimisation, and the viewpoint of constrained sampling as a mirrored optimisation problem on the space of probability measures. Based on this viewpoint, we also introduce a unifying framework for several existing constrained sampling algorithms, including mirrored Langevin dynamics and mirrored Stein variational gradient descent. We demonstrate the performance of our algorithms on a range of numerical examples, including sampling from targets on the simplex, sampling with fairness constraints, and constrained sampling problems in post-selection inference. Our results indicate that our algorithms achieve competitive performance with existing constrained sampling methods, without the need to tune any hyperparameters.

    Two-Timescale Stochastic Approximation for Bilevel Optimisation Problems in Continuous-Time Models

    We analyse the asymptotic properties of a continuous-time, two-timescale stochastic approximation algorithm designed for stochastic bilevel optimisation problems in continuous-time models. We obtain the weak convergence rate of this algorithm in the form of a central limit theorem. We also demonstrate how this algorithm can be applied to several continuous-time bilevel optimisation problems. Comment: Accepted at the ICML 2022 Workshop on Continuous Time Methods in Machine Learning.
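
    A discrete-time sketch may help fix ideas. The scheme below is a standard two-timescale stochastic approximation (an Euler-style analogue of the continuous-time algorithm analysed in the paper, which is not reproduced here): the fast recursion tracks an unknown mean from noisy observations, and the slow recursion uses that running estimate to minimise a quadratic objective. The step-size exponents are illustrative choices satisfying the usual condition a_k / b_k -> 0.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 1.5                                   # unknown quantity (illustrative)
x, y = 0.0, 0.0                            # slow and fast iterates
for k in range(1, 100001):
    a_k = 1.0 / k ** 0.9                   # slow step size
    b_k = 1.0 / k ** 0.6                   # fast step size, a_k / b_k -> 0
    xi = mu + rng.standard_normal()        # noisy observation of mu
    y += b_k * (xi - y)                    # fast: running estimate of mu
    x -= a_k * (x - y)                     # slow: gradient step on 0.5*(x - mu)^2
print(x, y)                                # both approach mu = 1.5
```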

    On the theory and applications of stochastic gradient descent in continuous time

    Stochastic optimisation problems are ubiquitous across machine learning, engineering, the natural sciences, economics, and operational research. One of the most popular and widely used methods for solving such problems is stochastic gradient descent. In this thesis, we study the theoretical properties and the applications of stochastic gradient descent in continuous time. We begin by analysing the asymptotic properties of two-timescale stochastic gradient descent in continuous time, extending well-known results in discrete time. The proposed algorithm, which arises naturally in the context of stochastic bilevel optimisation problems, consists of two coupled stochastic recursions which evolve on different timescales. Under weak and classical assumptions, we establish the almost sure convergence of this algorithm, and obtain an asymptotic convergence rate. We next illustrate how the proposed algorithm can be applied to an important problem arising in continuous-time state-space models: joint online parameter estimation and optimal sensor placement. Under suitable conditions on the process consisting of the latent signal process, the filter, and the filter derivatives, we establish almost sure convergence of the online parameter estimates and optimal sensor placements generated by our algorithm to the stationary points of the asymptotic log-likelihood of the observations, and the asymptotic covariance of the state estimate, respectively. We also provide extensive numerical results illustrating the performance of our approach in the case that the hidden signal is governed by the two-dimensional stochastic advection-diffusion equation, a model arising in many meteorological and environmental monitoring applications. In the final part of this thesis, we introduce a continuous-time stochastic gradient descent algorithm for recursive estimation of the parameters of a stochastic McKean-Vlasov equation, and the associated system of interacting particles. Such models arise in a variety of applications, including statistical physics, mathematical biology, and the social sciences. We prove that our estimator converges in L1 to the stationary points of the asymptotic log-likelihood of the McKean-Vlasov SDE in the joint limit as t and the number of particles N go to infinity, under suitable conditions which guarantee ergodicity and uniform-in-time propagation of chaos. Assuming, in addition, strong concavity of the asymptotic log-likelihood, we also establish an L2 convergence rate to the unique maximiser of this function. Our theoretical results are demonstrated via a range of numerical examples, including a stochastic Kuramoto model and a stochastic opinion dynamics model.
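
    The last contribution can be illustrated with the stochastic Kuramoto model mentioned in the abstract. The sketch below simulates N interacting phase oscillators and updates an estimate of the coupling strength online, using a Girsanov-type gradient of the path log-likelihood evaluated increment by increment; the discretisation, step-size schedule, and constants are illustrative choices and not the thesis's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)
N, dt, sigma = 50, 0.01, 0.5
K_true, K_hat = 1.0, 0.0                   # true and estimated coupling strength
x = rng.uniform(0.0, 2.0 * np.pi, size=N)  # oscillator phases

for k in range(1, 20001):
    # Mean-field drift direction for each particle: (1/N) sum_j sin(x_j - x_i).
    s = np.mean(np.sin(x[None, :] - x[:, None]), axis=1)
    dx = K_true * s * dt + sigma * np.sqrt(dt) * rng.standard_normal(N)
    # Girsanov-type gradient of the path log-likelihood in K, per increment.
    grad = np.sum(s * (dx - K_hat * s * dt)) / sigma ** 2
    K_hat += (0.5 / k ** 0.7) * grad       # decaying step size (illustrative)
    x = np.mod(x + dx, 2.0 * np.pi)

print(K_hat)   # approaches K_true = 1.0 as the oscillators synchronise
```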

    Joint Online Parameter Estimation and Optimal Sensor Placement for the Partially Observed Stochastic Advection-Diffusion Equation

    In this paper, we consider the problem of jointly performing online parameter estimation and optimal sensor placement for a partially observed infinite-dimensional linear diffusion process. We present a novel solution to this problem in the form of a continuous-time, two-timescale stochastic gradient descent algorithm, which recursively seeks to maximise the log-likelihood with respect to the unknown model parameters, and to minimise the expected mean squared error of the hidden state estimate with respect to the sensor locations. We also provide extensive numerical results illustrating the performance of the proposed approach in the case that the hidden signal is governed by the two-dimensional stochastic advection-diffusion equation.
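
    Structurally, the algorithm interleaves two stochastic gradient recursions on different timescales: one ascends the log-likelihood in the model parameter, the other descends the filter's mean squared error in the sensor location. The sketch below shows only this structure; the quadratic surrogate gradients, and which recursion is the faster one, are assumptions made purely for illustration, whereas in the paper both gradients are computed from the filter of the partially observed system.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical surrogate oracles standing in for the filter-based gradients:
# log-likelihood gradient in the parameter theta, filter-MSE gradient in the
# sensor location o. Purely illustrative stand-ins.
def grad_log_lik(theta, o):
    return -(theta - 2.0) + 0.05 * rng.standard_normal()

def grad_mse(theta, o):
    return (o - theta) + 0.05 * rng.standard_normal()

theta, o = 0.0, 5.0
for k in range(1, 50001):
    a_k, b_k = 1.0 / k ** 0.9, 1.0 / k ** 0.6   # two timescales (illustrative)
    theta += a_k * grad_log_lik(theta, o)       # ascend the log-likelihood
    o -= b_k * grad_mse(theta, o)               # descend the filter MSE
print(theta, o)                                  # both approach 2.0 in this toy setup
```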
