11 research outputs found

    Kinetic energy choice in Hamiltonian/hybrid Monte Carlo

    We consider how different choices of kinetic energy in Hamiltonian Monte Carlo affect algorithm performance. To this end, we introduce two quantities which can be easily evaluated, the composite gradient and the implicit noise. Results are established on integrator stability and geometric convergence, and we show that choices of kinetic energy that result in heavy-tailed momentum distributions can exhibit an undesirable negligible moves property, which we define. A general efficiency-robustness trade-off is outlined, and implementations which rely on approximate gradients are also discussed. Two numerical studies illustrate our theoretical findings, showing that the standard choice which results in a Gaussian momentum distribution is not always optimal in terms of either robustness or efficiency. Comment: 15 pages (+7 page supplement, included here as an appendix), 2 figures (+1 in supplement)
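To make the role of the kinetic energy concrete, here is a minimal sketch of a leapfrog integrator for a separable Hamiltonian H(x, p) = U(x) + K(p), in which the position update uses the gradient of K. The function names, the standard Gaussian target, and the relativistic kinetic energy K(p) = sqrt(1 + |p|^2) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def leapfrog(x, p, grad_U, grad_K, step, n_steps):
    """Leapfrog integration of H(x, p) = U(x) + K(p) (hypothetical helper)."""
    x = np.array(x, dtype=float)
    p = np.array(p, dtype=float)
    p = p - 0.5 * step * grad_U(x)          # initial half step in momentum
    for i in range(n_steps):
        x = x + step * grad_K(p)            # position moves along grad K(p)
        if i < n_steps - 1:
            p = p - step * grad_U(x)        # full momentum step between position steps
    p = p - 0.5 * step * grad_U(x)          # final half step in momentum
    return x, p

# Assumed target: standard Gaussian, U(x) = |x|^2 / 2.
U = lambda x: 0.5 * np.dot(x, x)
grad_U = lambda x: x

# Two illustrative kinetic energies and their gradients.
kinetic = {
    "gaussian": (lambda p: 0.5 * np.dot(p, p), lambda p: p),
    "relativistic": (lambda p: np.sqrt(1.0 + np.dot(p, p)),
                     lambda p: p / np.sqrt(1.0 + np.dot(p, p))),
}

x0, p0 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
drifts = {}
for name, (K, grad_K) in kinetic.items():
    x1, p1 = leapfrog(x0, p0, grad_U, grad_K, step=0.01, n_steps=100)
    # Energy drift along the trajectory; small for a stable step size.
    drifts[name] = abs((U(x1) + K(p1)) - (U(x0) + K(p0)))
    print(name, drifts[name])
```

With a Gaussian kinetic energy the position increment grows linearly in |p|, whereas the relativistic choice bounds it by the step size, which is one way the choice of K changes how far a single step can move.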

    On explicit $L^2$-convergence rate estimate for underdamped Langevin dynamics

    We provide a new explicit estimate of exponential decay rate of underdamped Langevin dynamics in $L^2$ distance. To achieve this, we first prove a Poincaré-type inequality with Gibbs measure in space and Gaussian measure in momentum. Our new estimate provides a more explicit and simpler expression of decay rate; moreover, when the potential is convex with Poincaré constant $m \ll 1$, our new estimate offers the decay rate of $\mathcal{O}(\sqrt{m})$ after optimizing the choice of friction coefficient, which is much faster compared to $\mathcal{O}(m)$ for the overdamped Langevin dynamics. Comment: We have fixed the bug
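For orientation, the two dynamics being compared are commonly written as below. The notation (potential $U$, friction $\gamma$, Brownian motion $W_t$, unit mass and unit temperature) is an assumption made here for illustration and may differ from the paper's conventions.

```latex
% Underdamped (kinetic) Langevin dynamics with friction \gamma > 0;
% its invariant measure is proportional to \exp(-U(q) - |p|^2/2).
\begin{aligned}
  dq_t &= p_t \, dt, \\
  dp_t &= -\nabla U(q_t) \, dt - \gamma \, p_t \, dt + \sqrt{2\gamma} \, dW_t .
\end{aligned}

% Overdamped Langevin dynamics;
% its invariant measure is proportional to \exp(-U(q)).
dq_t = -\nabla U(q_t) \, dt + \sqrt{2} \, dW_t .
```

The friction $\gamma$ is the free parameter that the abstract refers to when it speaks of optimizing the friction coefficient.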

    Friction-adaptive descent: a family of dynamics-based optimization methods

    We describe a family of descent algorithms which generalizes common existing schemes used in applications such as neural network training and more broadly for optimization of smooth functions, potentially for global optimization or as a local optimization method to be deployed within global optimization schemes like basin hopping. By introducing an auxiliary degree of freedom we create a dynamical system with improved stability, reducing oscillatory modes and accelerating convergence to minima. The resulting algorithms are simple to implement and control, and convergence can be shown directly by Lyapunov's second method. Although this framework, which we refer to as friction-adaptive descent (FAD), is fairly general, we focus most of our attention here on a specific variant: kinetic energy stabilization (which can be viewed as a zero-temperature Nosé-Hoover scheme but with added dissipation in both physical and auxiliary variables), termed KFAD (kinetic FAD). To illustrate the flexibility of the FAD framework we consider several other methods. In certain asymptotic limits, these methods can be viewed as introducing cubic damping in various forms; they can be more efficient than linearly dissipated Hamiltonian dynamics in common optimization settings. We present details of the numerical methods and show convergence for both the continuous and discretized dynamics in the convex setting by constructing Lyapunov functions. The methods are tested using a toy model (the Rosenbrock function). We also demonstrate the methods for structural optimization for atomic clusters in Lennard-Jones and Morse potentials. The experiments show the relative efficiency and robustness of FAD in comparison to linearly dissipated Hamiltonian dynamics.

    Optimal friction matrix for underdamped Langevin sampling

    A systematic procedure for optimising the friction coefficient in underdamped Langevin dynamics as a sampling tool is given by taking the gradient of the associated asymptotic variance with respect to friction. We give an expression for this gradient in terms of the solution to an appropriate Poisson equation and show that it can be approximated by short simulations of the associated first variation/tangent process under concavity assumptions on the log density. Our algorithm is applied to the estimation of posterior means in Bayesian inference problems, and reduced variance is demonstrated when compared to the original underdamped and overdamped Langevin dynamics in both full and stochastic gradient cases.
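As a rough illustration of where the friction enters such a sampler, the sketch below simulates underdamped Langevin dynamics with a scalar friction `gamma` and averages the positions to estimate a mean. The BAOAB splitting, the scalar (rather than matrix) friction, the standard Gaussian target, and all names are assumptions chosen for brevity; this is not the friction-optimisation algorithm described in the abstract.

```python
import numpy as np

def baoab(grad_U, q0, gamma, step, n_steps, rng):
    """Sample with underdamped Langevin dynamics via BAOAB splitting (sketch)."""
    q = np.array(q0, dtype=float)
    p = np.zeros_like(q)
    c = np.exp(-gamma * step)        # exact decay factor of the friction part
    s = np.sqrt(1.0 - c * c)         # matching noise amplitude at unit temperature
    traj = np.empty((n_steps,) + q.shape)
    g = grad_U(q)
    for k in range(n_steps):
        p -= 0.5 * step * g                           # B: half kick
        q += 0.5 * step * p                           # A: half drift
        p = c * p + s * rng.standard_normal(q.shape)  # O: friction and noise
        q += 0.5 * step * p                           # A: half drift
        g = grad_U(q)
        p -= 0.5 * step * g                           # B: half kick
        traj[k] = q
    return traj

# Assumed target: standard Gaussian, so grad U(q) = q.
rng = np.random.default_rng(0)
samples = baoab(lambda q: q, [0.0], gamma=1.0, step=0.1, n_steps=20000, rng=rng)
print("estimated mean:", samples.mean(), "estimated variance:", samples.var())
```

Rerunning with different values of `gamma` changes how strongly successive samples are correlated, and hence the variance of the resulting estimate, which is the quantity the abstract proposes to minimise over the friction.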

    Unbiasing Hamiltonian Monte Carlo algorithms for a general Hamiltonian function

    Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo method for sampling high-dimensional probability measures. It relies on the integration of the Hamiltonian dynamics to propose a move, which is then accepted or rejected by a Metropolis procedure. Unbiased sampling is guaranteed when the numerical integrator preserves two key properties of the Hamiltonian dynamics: volume preservation and reversibility up to momentum reversal. For separable Hamiltonian functions, some standard explicit numerical schemes, such as the Störmer-Verlet integrator, satisfy these properties. However, for numerical or physical reasons, one may consider a Hamiltonian function which is nonseparable, in which case the standard numerical schemes which preserve the volume and satisfy reversibility up to momentum reversal are implicit. When implemented in practice, such implicit schemes may admit many solutions or none, especially when the timestep is too large. We show here how to enforce the numerical reversibility, and thus unbiasedness, of HMC schemes in this context. Numerical results illustrate the relevance of this correction on simple problems. Comment: 62 pages, 8 figures
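The separable case described above can be sketched as follows: a Störmer-Verlet integrator, one HMC step with a Metropolis accept/reject, and a numerical check of reversibility up to momentum reversal. The quartic potential, the unit-mass Gaussian kinetic energy, and the function names are assumptions for illustration; the nonseparable, implicit setting that the paper addresses is not covered by this sketch.

```python
import numpy as np

def verlet(q, p, grad_U, step, n_steps):
    """Stormer-Verlet for the separable Hamiltonian H = U(q) + |p|^2 / 2."""
    q = np.array(q, dtype=float)
    p = np.array(p, dtype=float)
    for _ in range(n_steps):
        p = p - 0.5 * step * grad_U(q)
        q = q + step * p
        p = p - 0.5 * step * grad_U(q)
    return q, p

def hmc_step(q, U, grad_U, step, n_steps, rng):
    """One HMC step: resample momentum, integrate, Metropolis accept/reject."""
    q = np.array(q, dtype=float)
    p = rng.standard_normal(q.shape)
    q_new, p_new = verlet(q, p, grad_U, step, n_steps)
    h_old = U(q) + 0.5 * np.sum(p ** 2)
    h_new = U(q_new) + 0.5 * np.sum(p_new ** 2)
    # Accept with probability min(1, exp(h_old - h_new)).
    if np.log(rng.uniform()) < h_old - h_new:
        return q_new, True
    return q, False

# Assumed example potential: U(q) = sum(q^4) / 4.
U = lambda q: 0.25 * np.sum(q ** 4)
grad_U = lambda q: q ** 3

# Reversibility check: integrate, flip the momentum, integrate again.
q0, p0 = np.array([1.0, -0.5]), np.array([0.3, 0.7])
q1, p1 = verlet(q0, p0, grad_U, step=0.1, n_steps=20)
q2, p2 = verlet(q1, -p1, grad_U, step=0.1, n_steps=20)
print("returned to start:", np.allclose(q2, q0) and np.allclose(-p2, p0))

rng = np.random.default_rng(1)
q, accepted = hmc_step(q0, U, grad_U, step=0.1, n_steps=20, rng=rng)
```

For an explicit scheme this round trip holds up to floating-point rounding. For an implicit scheme it depends on the nonlinear solver returning the right solution, which is the failure mode the abstract points to.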

    Convergence and variance reduction for stochastic differential equations in sampling and optimisation

    Three problems that are linked by way of motivation are addressed in this work. In the first part of the thesis, we study the generalised Langevin equation for simulated annealing with the underlying goal of improving continuous-time dynamics for the problem of global optimisation of nonconvex functions. The main result in this part is on the convergence to the global optimum, which is shown using techniques from hypocoercivity given suitable assumptions on the nonconvex function. Alongside, we investigate numerically the problem of parameter tuning in the continuous-time equation. In the second part of the thesis, this last problem is addressed rigorously for the underdamped Langevin dynamics. In particular, a systematic procedure for finding the optimal friction matrix in the sampling problem is presented. We give an expression for the gradient of the asymptotic variance in terms of solutions to Poisson equations and present a working algorithm for approximating its value. Lastly, regularity of an associated semigroup, twice differentiable-in-space solutions to the Kolmogorov equation and weak numerical convergence rates of order one are shown for a class of stochastic differential equations with superlinearly growing, non-globally monotone coefficients. In relation to the previous part, the results allow the use of Poisson equations for variations of Langevin dynamics not permissible before.