LSOS: Line-search Second-Order Stochastic optimization methods for nonconvex finite sums
We develop a line-search second-order algorithmic framework for minimizing
finite sums. We do not make any convexity assumptions, but require the terms of
the sum to be continuously differentiable and have Lipschitz-continuous
gradients. The methods fitting into this framework combine line searches and
suitably decaying step lengths. A key issue is a two-step sampling at each
iteration, which allows us to control the error present in the line-search
procedure. Stationarity of limit points is proved in the almost-sure sense,
while almost-sure convergence of the sequence of approximations to the solution
holds with the additional hypothesis that the functions are strongly convex.
Numerical experiments, including comparisons with state-of-the-art stochastic
optimization methods, show the efficiency of our approach.
Comment: 22 pages, 4 figures
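The abstract does not spell out the iteration, so the following is only a minimal Python sketch of the two-step sampling idea under stated assumptions: the direction is a plain negative mini-batch gradient standing in for the framework's second-order direction, the decaying step cap is an invented schedule, and f_i / grad_i are hypothetical callables returning the i-th term's value and gradient.

import numpy as np

def lsos_step_sketch(f_i, grad_i, x, n, k, rng, batch=32, c=1e-4, rho=0.5):
    # Sample 1: build the search direction from one mini-batch.
    S1 = rng.choice(n, size=batch, replace=False)
    g = np.mean([grad_i(i, x) for i in S1], axis=0)
    d = -g  # stand-in for a second-order direction such as -H_k^{-1} g
    # Sample 2: an independent mini-batch for the line search; drawing it
    # separately is what allows the error in the search to be controlled.
    S2 = rng.choice(n, size=batch, replace=False)
    phi = lambda z: np.mean([f_i(i, z) for i in S2])
    alpha = 1.0 / (1.0 + 0.1 * k)  # suitably decaying initial step (assumed schedule)
    f0, slope = phi(x), float(g @ d)
    while phi(x + alpha * d) > f0 + c * alpha * slope and alpha > 1e-8:
        alpha *= rho  # backtrack until the Armijo condition holds on sample 2
    return x + alpha * d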
Bolstering Stochastic Gradient Descent with Model Building
The stochastic gradient descent method and its variants constitute the core
optimization algorithms that achieve good convergence rates for machine
learning problems. These rates are obtained especially when the algorithms
are fine-tuned for the application at hand. Although this tuning process can
require large computational costs, recent work has shown that these costs can
be reduced by line search methods that iteratively adjust the stepsize. We
propose an alternative approach to stochastic line search by using a new
algorithm based on forward step model building. This model building step
incorporates second-order information that allows adjusting not only the
stepsize but also the search direction. Noting that deep learning model
parameters come in groups (layers of tensors), our method builds its model and
calculates a new step for each parameter group. This novel diagonalization
approach makes the selected step lengths adaptive. We provide convergence rate
analysis, and experimentally show that the proposed algorithm achieves faster
convergence and better generalization on well-known test problems. More
precisely, SMB requires less tuning and shows performance comparable to that
of other adaptive methods.
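The exact SMB update is given in the paper; the sketch below only illustrates the forward-step, model-building flavor per parameter group: take a trial SGD step, measure each group's curvature from the gradient displacement, and rescale that group's step with a secant (BB-type) scalar standing in for SMB's quadratic model. The grads_at helper and the group layout are assumptions, not the paper's API.

import numpy as np

def smb_like_step(params, grads_at, lr=0.1, eps=1e-12):
    # params:   dict of group name -> ndarray (e.g. one entry per layer)
    # grads_at: assumed callable mapping a params dict to a same-shaped
    #           dict of stochastic gradients
    g = grads_at(params)
    # Forward (trial) step: plain SGD on every group.
    trial = {k: params[k] - lr * g[k] for k in params}
    g_trial = grads_at(trial)
    new = {}
    for k in params:
        s = trial[k] - params[k]   # displacement of this group
        y = g_trial[k] - g[k]      # change in this group's gradient
        curv = float(s.ravel() @ y.ravel())
        if curv > eps:
            # Secant scalar per group: a crude stand-in for the quadratic
            # model SMB builds; it adjusts the step length group by group.
            alpha = float(s.ravel() @ s.ravel()) / curv
            new[k] = params[k] - alpha * g[k]
        else:
            new[k] = trial[k]      # no usable curvature: keep the trial step
    return new

Computing one scalar per group is what makes the step lengths adaptive across layers, mirroring the diagonalization idea described in the abstract.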
Stochastic quasi-Newton molecular simulations
Article / Letter to editor, Leiden Institute of Chemistry