Conjugate Directions for Stochastic Gradient Descent

Authors: B. A. Pearlmutter
D. Marquardt
G.B. Orr
K. Levenberg
M. R. Hestenes
N. N. Schraudolph
T. Graepel
Publication date: 01/01/2002

The method of conjugate gradients provides a very effective way to optimize large, deterministic systems by gradient descent. In its standard form, however, it is not amenable to stochastic approximation of the gradient. Here we explore ideas from conjugate gradient in the stochastic (online) setting, using fast Hessian-gradient products to set up low-dimensional Krylov subspaces within individual mini-batches. In our benchmark experiments the resulting online learning algorithms converge orders of magnitude faster than ordinary stochastic gradient descent.
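
As a rough illustration of the idea in the abstract, the sketch below runs a few conjugate-gradient iterations inside each mini-batch of a linear least-squares problem, using Hessian-vector products that never form the Hessian. It is not the algorithm from the paper: the quadratic loss, the function names (`hessian_vector_product`, `minibatch_cg_step`, `train`), and the parameters `k`, `batch_size`, and `epochs` are all assumptions chosen for illustration.

```python
import numpy as np


def hessian_vector_product(X, v):
    """Return H @ v for H = X.T @ X / n without ever forming H.

    Assumes the loss 0.5/n * ||X w - y||^2, whose Hessian is X.T @ X / n.
    """
    return X.T @ (X @ v) / X.shape[0]


def minibatch_cg_step(w, X, y, k=3):
    """Run up to k conjugate-gradient iterations on one mini-batch.

    The k search directions span a k-dimensional Krylov subspace built
    from this mini-batch's Hessian and gradient only.
    """
    n = X.shape[0]
    r = X.T @ (y - X @ w) / n  # negative gradient on this mini-batch
    d = r.copy()               # first direction: steepest descent
    for _ in range(k):
        rr = r @ r
        if rr < 1e-12:         # already at this batch's minimum
            break
        Hd = hessian_vector_product(X, d)
        curvature = d @ Hd
        if curvature <= 0.0:   # degenerate direction, stop early
            break
        alpha = rr / curvature
        w = w + alpha * d
        r = r - alpha * Hd
        d = r + (r @ r) / rr * d  # next conjugate direction
    return w


def train(X, y, batch_size=32, k=3, epochs=5, seed=0):
    """Hypothetical training loop: fresh CG run on every mini-batch."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        order = rng.permutation(X.shape[0])
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            w = minibatch_cg_step(w, X[idx], y[idx], k)
    return w


# Small synthetic problem (assumed data, noiseless targets).
rng = np.random.default_rng(1)
X_demo = rng.standard_normal((256, 5))
w_true = rng.standard_normal(5)
y_demo = X_demo @ w_true
w_fit = train(X_demo, y_demo)
```

Restarting conjugate gradient on every mini-batch keeps each Krylov subspace tied to a single, fixed Hessian, which is the simplest way to sidestep the noise problem the abstract mentions; the paper itself may carry information across batches differently.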

CiteSeerX

Crossref

UCL Discovery