Kinetic energy choice in Hamiltonian/hybrid Monte Carlo
We consider how different choices of kinetic energy in Hamiltonian Monte
Carlo affect algorithm performance. To this end, we introduce two quantities
which can be easily evaluated, the composite gradient and the implicit noise.
Results are established on integrator stability and geometric convergence, and
we show that choices of kinetic energy that result in heavy-tailed momentum
distributions can exhibit an undesirable negligible moves property, which we
define. A general efficiency-robustness trade-off is outlined, and
implementations which rely on approximate gradients are also discussed. Two
numerical studies illustrate our theoretical findings, showing that the
standard choice which results in a Gaussian momentum distribution is not always
optimal in terms of either robustness or efficiency.
Comment: 15 pages (+7 page supplement, included here as an appendix), 2
figures (+1 in supplement)
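As a rough illustration of where the kinetic energy enters an HMC integrator, here is a minimal leapfrog sketch for a separable Hamiltonian H(q, p) = U(q) + K(p). The function names, the quadratic potential, and the "relativistic-type" kinetic energy K(p) = sqrt(1 + p^2) are assumptions chosen for illustration; they are not taken from the paper.

```python
import math

def leapfrog(q, p, grad_U, grad_K, eps, n_steps):
    """Leapfrog integration for H(q, p) = U(q) + K(p) in one dimension."""
    for _ in range(n_steps):
        p = p - 0.5 * eps * grad_U(q)   # half step in momentum
        q = q + eps * grad_K(p)         # full step in position uses grad K, not p itself
        p = p - 0.5 * eps * grad_U(q)   # second half step in momentum
    return q, p

def grad_U(q):
    # Hypothetical target: standard Gaussian, U(q) = q^2 / 2
    return q

def grad_K_gaussian(p):
    # Standard choice K(p) = p^2 / 2 (Gaussian momentum)
    return p

def grad_K_relativistic(p):
    # Illustrative alternative K(p) = sqrt(1 + p^2); its gradient is bounded
    return p / math.sqrt(1.0 + p * p)

q1, p1 = leapfrog(1.0, 0.5, grad_U, grad_K_relativistic, eps=0.1, n_steps=20)
# Reversibility up to momentum flip: integrating again from (q1, -p1) returns to the start
q2, p2 = leapfrog(q1, -p1, grad_U, grad_K_relativistic, eps=0.1, n_steps=20)
```

Only `grad_K` changes between kinetic energies, so swapping in a different momentum distribution leaves the rest of the integrator untouched.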
On explicit $L^2$-convergence rate estimate for underdamped Langevin dynamics
We provide a new explicit estimate of the exponential decay rate of underdamped
Langevin dynamics in $L^2$ distance. To achieve this, we first prove a
Poincaré-type inequality with Gibbs measure in space and Gaussian measure
in momentum. Our new estimate provides a more explicit and simpler expression
of decay rate; moreover, when the potential is convex with Poincaré
constant $m \ll 1$, our new estimate offers the decay rate of
$\mathcal{O}(\sqrt{m})$ after optimizing the choice of friction coefficient,
which is much faster compared to $\mathcal{O}(m)$ for the overdamped Langevin
dynamics.
Comment: We have fixed the bug
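For orientation, underdamped Langevin dynamics is commonly written as below. This is the standard textbook form with unit mass and unit temperature; the symbols $U$ (potential), $\gamma$ (friction coefficient) and $W_t$ (Brownian motion) are notation assumed here rather than taken from the abstract.

```latex
\begin{aligned}
dq_t &= p_t\,dt,\\
dp_t &= -\nabla U(q_t)\,dt - \gamma\,p_t\,dt + \sqrt{2\gamma}\,dW_t,
\end{aligned}
\qquad
\mu(dq\,dp) \propto \exp\!\Big(-U(q) - \tfrac{1}{2}\lvert p\rvert^{2}\Big)\,dq\,dp .
```

Under these conventions the invariant measure $\mu$ is the product of a Gibbs measure in position and a Gaussian measure in momentum, which is the setting of the Poincaré-type inequality mentioned in the abstract.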
Friction-adaptive descent: a family of dynamics-based optimization methods
We describe a family of descent algorithms which generalizes common existing
schemes used in applications such as neural network training and more broadly
for optimization of smooth functions--potentially for global optimization, or
as a local optimization method to be deployed within global optimization
schemes like basin hopping. By introducing an auxiliary degree of freedom we
create a dynamical system with improved stability, reducing oscillatory modes
and accelerating convergence to minima. The resulting algorithms are simple to
implement and control, and convergence can be shown directly by Lyapunov's
second method.
Although this framework, which we refer to as friction-adaptive descent
(FAD), is fairly general, we focus most of our attention here on a specific
variant: kinetic energy stabilization (which can be viewed as a
zero-temperature Nosé–Hoover scheme but with added dissipation in both
physical and auxiliary variables), termed KFAD (kinetic FAD). To illustrate the
flexibility of the FAD framework we consider several other methods. In certain
asymptotic limits, these methods can be viewed as introducing cubic damping in
various forms; they can be more efficient than linearly dissipated Hamiltonian
dynamics in common optimization settings.
We present details of the numerical methods and show convergence for both the
continuous and discretized dynamics in the convex setting by constructing
Lyapunov functions. The methods are tested using a toy model (the Rosenbrock
function). We also demonstrate the methods for structural optimization for
atomic clusters in Lennard–Jones and Morse potentials. The experiments show
the relative efficiency and robustness of FAD in comparison to linearly
dissipated Hamiltonian dynamics.
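The abstract does not give the FAD equations, so the sketch below shows only the comparator it names: linearly dissipated Hamiltonian dynamics (a heavy-ball-type scheme) applied to the Rosenbrock function. The discretization, the friction value `gamma`, the step size `h`, and the starting point are all assumptions made for illustration.

```python
def rosenbrock(x, y):
    return (1.0 - x) ** 2 + 100.0 * (y - x * x) ** 2

def rosenbrock_grad(x, y):
    gx = -2.0 * (1.0 - x) - 400.0 * x * (y - x * x)
    gy = 200.0 * (y - x * x)
    return gx, gy

def damped_hamiltonian_descent(x, y, gamma=1.0, h=1e-3, n_steps=20000):
    """Discretize dq = p dt, dp = -(grad f(q) + gamma * p) dt with a simple
    symplectic-Euler-style update (illustrative choice, fixed friction gamma)."""
    px, py = 0.0, 0.0
    for _ in range(n_steps):
        gx, gy = rosenbrock_grad(x, y)
        px += h * (-gx - gamma * px)   # force plus linear friction
        py += h * (-gy - gamma * py)
        x += h * px                    # position update with the new momentum
        y += h * py
    return x, y

x_end, y_end = damped_hamiltonian_descent(-1.0, 1.0)
```

In this baseline the friction `gamma` is a fixed constant; by the abstract's description, FAD instead introduces an auxiliary degree of freedom, which is not reproduced here.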
Optimal friction matrix for underdamped Langevin sampling
A systematic procedure for optimising the friction coefficient in underdamped Langevin dynamics as a sampling tool is given by taking the gradient of the associated asymptotic variance with respect to friction. We give an expression for this gradient in terms of the solution to an appropriate Poisson equation and show that it can be approximated by short simulations of the associated first variation/tangent process under concavity assumptions on the log density. Our algorithm is applied to the estimation of posterior means in Bayesian inference problems and reduced variance is demonstrated when compared to the original underdamped and overdamped Langevin dynamics in both full and stochastic gradient cases.
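To make the role of the friction concrete, the following is a minimal sketch of an underdamped Langevin sampler with a scalar friction `gamma`, using the BAOAB splitting. It is not the friction-optimisation algorithm of the paper: the one-dimensional Gaussian target, the scalar (rather than matrix) friction, the step size, and all names are assumptions for illustration.

```python
import math
import random

def baoab_trajectory(grad_U, gamma, h, n_steps, q=0.0, p=0.0, seed=0):
    """Sample positions from underdamped Langevin dynamics (unit mass and
    temperature) with the BAOAB splitting; gamma is the friction coefficient."""
    rng = random.Random(seed)
    c = math.exp(-gamma * h)        # exact decay of momentum over one step
    s = math.sqrt(1.0 - c * c)      # matching noise amplitude
    qs = []
    for _ in range(n_steps):
        p -= 0.5 * h * grad_U(q)              # B: half kick
        q += 0.5 * h * p                      # A: half drift
        p = c * p + s * rng.gauss(0.0, 1.0)   # O: friction and noise
        q += 0.5 * h * p                      # A: half drift
        p -= 0.5 * h * grad_U(q)              # B: half kick
        qs.append(q)
    return qs

# Hypothetical target: standard Gaussian, U(q) = q^2 / 2
samples = baoab_trajectory(lambda q: q, gamma=1.0, h=0.2, n_steps=50000)
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
```

Rerunning with different values of `gamma` changes how quickly the running averages settle, which is the quantity a friction-tuning procedure would aim to improve.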
Unbiasing Hamiltonian Monte Carlo algorithms for a general Hamiltonian function
Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo method that
allows one to sample high-dimensional probability measures. It relies on the
integration of the Hamiltonian dynamics to propose a move which is then
accepted or rejected according to a Metropolis procedure. Unbiased sampling is
guaranteed by the preservation by the numerical integrators of two key
properties of the Hamiltonian dynamics: volume-preservation and reversibility
up to momentum reversal. For separable Hamiltonian functions, some standard
explicit numerical schemes, such as the Störmer–Verlet integrator, satisfy
these properties. However, for numerical or physical reasons, one may consider
a Hamiltonian function which is nonseparable, in which case the standard
numerical schemes which preserve the volume and satisfy reversibility up to
momentum reversal are implicit. When implemented in practice, such
implicit schemes may admit many solutions or none, especially when the timestep
is too large. We show here how to enforce the numerical reversibility, and thus
unbiasedness, of HMC schemes in this context. Numerical results illustrate the
relevance of this correction on simple problems.
Comment: 62 pages, 8 figures
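One way to picture a reversibility check for an implicit scheme is sketched below. It is an illustration under stated assumptions, not the algorithm of the paper: the nonseparable Hamiltonian H(q, p) = p^2 / (2(1 + q^2)) + q^2 / 2, the implicit midpoint rule solved by fixed-point iteration, the tolerances, and all function names are hypothetical choices. A step is kept only if the solver converges in both directions and the flipped-momentum step returns to the starting point.

```python
def grad_H(q, p):
    # Hypothetical nonseparable Hamiltonian: H = p^2 / (2 (1 + q^2)) + q^2 / 2
    m = 1.0 + q * q
    dH_dq = q - p * p * q / (m * m)
    dH_dp = p / m
    return dH_dq, dH_dp

def implicit_midpoint(q, p, h, tol=1e-13, max_iter=100):
    """One implicit midpoint step solved by fixed-point iteration.
    Returns None if the iteration does not converge."""
    q_new, p_new = q, p
    for _ in range(max_iter):
        dH_dq, dH_dp = grad_H(0.5 * (q + q_new), 0.5 * (p + p_new))
        q_next = q + h * dH_dp
        p_next = p - h * dH_dq
        if abs(q_next - q_new) + abs(p_next - p_new) < tol:
            return q_next, p_next
        q_new, p_new = q_next, p_next
    return None

def checked_step(q, p, h, rev_tol=1e-8):
    """Return the proposed state only if the step is numerically reversible
    up to momentum reversal; otherwise return None (treat as a rejection)."""
    forward = implicit_midpoint(q, p, h)
    if forward is None:
        return None
    q1, p1 = forward
    backward = implicit_midpoint(q1, -p1, h)   # step back with flipped momentum
    if backward is None:
        return None
    q0, p0 = backward
    if abs(q0 - q) + abs(p0 + p) > rev_tol:    # should land on (q, -p)
        return None
    return q1, p1

proposal = checked_step(1.0, 0.5, h=0.1)
```

In a full sampler a `None` result would be handled as a rejected proposal, so steps for which the implicit solve fails or is not reversible are never accepted.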
Convergence and variance reduction for stochastic differential equations in sampling and optimisation
Three problems that are linked by way of motivation are addressed in this work.
In the first part of the thesis, we study the generalised Langevin equation for simulated
annealing with the underlying goal of improving continuous-time dynamics for the problem of global optimisation of nonconvex functions. The main result in this part is on
the convergence to the global optimum, which is shown using techniques from hypocoercivity given suitable assumptions on the nonconvex function. Alongside, we investigate
numerically the problem of parameter tuning in the continuous-time equation.
In the second part of the thesis, this last problem is addressed rigorously for the underdamped Langevin dynamics. In particular, a systematic procedure for finding the optimal
friction matrix in the sampling problem is presented. We give an expression for the gradient of the asymptotic variance in terms of solutions to Poisson equations and present a
working algorithm for approximating its value.
Lastly, regularity of an associated semigroup, twice differentiable-in-space solutions to
the Kolmogorov equation and weak numerical convergence rates of order one are shown
for a class of stochastic differential equations with superlinearly growing, non-globally
monotone coefficients. In relation to the previous part, the results allow the use of
Poisson equations for variations of Langevin dynamics not permissible before.
Open Access