LSOS: Line-search Second-Order Stochastic optimization methods for nonconvex finite sums
We develop a line-search second-order algorithmic framework for minimizing
finite sums. We do not make any convexity assumptions, but require the terms of
the sum to be continuously differentiable and have Lipschitz-continuous
gradients. The methods fitting into this framework combine line searches and
suitably decaying step lengths. A key issue is a two-step sampling at each
iteration, which allows us to control the error present in the line-search
procedure. Stationarity of limit points is proved in the almost-sure sense,
while almost-sure convergence of the sequence of approximations to the solution
holds with the additional hypothesis that the functions are strongly convex.
Numerical experiments, including comparisons with state-of-the-art stochastic
optimization methods, show the efficiency of our approach.
Comment: 22 pages, 4 figures
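The abstract does not spell out the iteration, so the following is only a minimal Python sketch of the two-step sampling idea under stated assumptions: the direction is a plain negative mini-batch gradient standing in for the framework's second-order direction, the decaying step cap is an invented schedule, and f_i / grad_i are hypothetical callables returning the i-th term's value and gradient.

import numpy as np

def lsos_step_sketch(f_i, grad_i, x, n, k, rng, batch=32, c=1e-4, rho=0.5):
    # Sample 1: build the search direction from one mini-batch.
    S1 = rng.choice(n, size=batch, replace=False)
    g = np.mean([grad_i(i, x) for i in S1], axis=0)
    d = -g  # stand-in for a second-order direction such as -H_k^{-1} g
    # Sample 2: an independent mini-batch for the line search; drawing it
    # separately is what allows the error in the search to be controlled.
    S2 = rng.choice(n, size=batch, replace=False)
    phi = lambda z: np.mean([f_i(i, z) for i in S2])
    alpha = 1.0 / (1.0 + 0.1 * k)  # suitably decaying initial step (assumed schedule)
    f0, slope = phi(x), float(g @ d)
    while phi(x + alpha * d) > f0 + c * alpha * slope and alpha > 1e-8:
        alpha *= rho  # backtrack until the Armijo condition holds on sample 2
    return x + alpha * d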
Bolstering Stochastic Gradient Descent with Model Building
The stochastic gradient descent method and its variants constitute the core
optimization algorithms that achieve good convergence rates for machine
learning problems. These rates are obtained especially when the algorithms
are fine-tuned for the application at hand. Although this tuning process can
require large computational costs, recent work has shown that these costs can
be reduced by line search methods that iteratively adjust the stepsize. We
propose an alternative approach to stochastic line search by using a new
algorithm based on forward step model building. This model building step
incorporates second-order information that allows adjusting not only the
stepsize but also the search direction. Noting that deep learning model
parameters come in groups (layers of tensors), our method builds its model and
calculates a new step for each parameter group. This novel diagonalization
approach makes the selected step lengths adaptive. We provide convergence rate
analysis, and experimentally show that the proposed algorithm achieves faster
convergence and better generalization on well-known test problems. More
precisely, SMB requires less tuning and shows performance comparable to that
of other adaptive methods.
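The exact SMB update is given in the paper; the sketch below only illustrates the forward-step, model-building flavor per parameter group: take a trial SGD step, measure each group's curvature from the gradient displacement, and rescale that group's step with a secant (BB-type) scalar standing in for SMB's quadratic model. The grads_at helper and the group layout are assumptions, not the paper's API.

import numpy as np

def smb_like_step(params, grads_at, lr=0.1, eps=1e-12):
    # params:   dict of group name -> ndarray (e.g. one entry per layer)
    # grads_at: assumed callable mapping a params dict to a same-shaped
    #           dict of stochastic gradients
    g = grads_at(params)
    # Forward (trial) step: plain SGD on every group.
    trial = {k: params[k] - lr * g[k] for k in params}
    g_trial = grads_at(trial)
    new = {}
    for k in params:
        s = trial[k] - params[k]   # displacement of this group
        y = g_trial[k] - g[k]      # change in this group's gradient
        curv = float(s.ravel() @ y.ravel())
        if curv > eps:
            # Secant scalar per group: a crude stand-in for the quadratic
            # model SMB builds; it adjusts the step length group by group.
            alpha = float(s.ravel() @ s.ravel()) / curv
            new[k] = params[k] - alpha * g[k]
        else:
            new[k] = trial[k]      # no usable curvature: keep the trial step
    return new

Computing one scalar per group is what makes the step lengths adaptive across layers, mirroring the diagonalization idea described in the abstract.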
Stochastic quasi-Newton molecular simulations
Article / Letter to editor, Leiden Institute of Chemistry