A stochastic first-order trust-region method with inexact restoration for finite-sum minimization
We propose a stochastic first-order trust-region method with inexact function
and gradient evaluations for solving finite-sum minimization problems. At each
iteration, the function and the gradient are approximated by sampling. The
sample size used in gradient approximations is smaller than the one used in
function approximations, and the latter is determined by a deterministic rule
inspired by the inexact restoration method, which allows the sample size to
decrease at some iterations. The trust-region step is then either accepted
or rejected using a suitable merit function, which combines the function
estimate with a measure of accuracy in the evaluation. We show that the
proposed method eventually reaches full precision in evaluating the objective
function and we provide a worst-case complexity result on the number of
iterations required to achieve full precision. We validate the proposed
algorithm on nonconvex binary classification problems, where it shows good
performance in terms of cost and accuracy, with the important practical feature
that burdensome tuning of the parameters involved is not required.
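The loop described above can be sketched in a minimal form. The sample-size update and acceptance test below are simplified placeholders, not the paper's inexact-restoration rule or merit function, and all names (`stochastic_trust_region`, `f_i`, `g_i`) are hypothetical:

```python
import numpy as np

def stochastic_trust_region(f_i, g_i, x0, n, iters=50, delta=1.0, eta=0.1):
    """Illustrative stochastic first-order trust-region loop.

    f_i(x, idx) / g_i(x, idx) return the sample-average loss / gradient
    over the index set idx, drawn from n total terms of the finite sum.
    """
    rng = np.random.default_rng(0)
    x = x0.copy()
    # gradient sample smaller than function sample, as in the abstract
    Nf, Ng = max(4, n // 8), max(2, n // 16)
    for _ in range(iters):
        idx_g = rng.choice(n, size=min(Ng, n), replace=False)
        g = g_i(x, idx_g)
        # first-order (Cauchy-like) step: along -g to the trust-region boundary
        step = -delta * g / (np.linalg.norm(g) + 1e-12)
        idx_f = rng.choice(n, size=min(Nf, n), replace=False)
        ared = f_i(x, idx_f) - f_i(x + step, idx_f)   # actual decrease (sampled)
        pred = delta * np.linalg.norm(g)               # predicted decrease
        rho = ared / (pred + 1e-12)
        if rho >= eta:                 # accept: enlarge region, grow sample
            x = x + step
            delta = min(2.0 * delta, 10.0)
            Nf = min(n, int(1.1 * Nf) + 1)  # drift toward full precision
        else:                          # reject: shrink region, demand accuracy
            delta *= 0.5
            Nf = min(n, 2 * Nf)
        Ng = max(2, Nf // 2)
    return x
```

Note how the function-sample size `Nf` grows monotonically toward `n`, mirroring the result that the method eventually evaluates the objective at full precision.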
Stochastic trust region inexact Newton method for large-scale machine learning
Stochastic approximation methods are nowadays one of the major research directions for dealing with large-scale machine learning problems. The focus is shifting from stochastic first-order methods to stochastic second-order methods, due to their faster convergence and the availability of computing resources. In this paper, we propose a novel Stochastic Trust RegiOn Inexact Newton method, called STRON, to solve large-scale learning problems, which uses conjugate gradient (CG) to inexactly solve the trust-region subproblem. The method uses progressive subsampling in the calculation of gradient and Hessian values to take advantage of both the stochastic and full-batch regimes. We extend STRON using existing variance-reduction techniques to deal with noisy gradients and using preconditioned conjugate gradient (PCG) as the subproblem solver, and empirically show that these extensions do not work as expected for large-scale learning problems. Finally, our empirical results demonstrate the efficacy of the proposed method against existing methods on benchmark datasets.
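Using CG to inexactly solve the trust-region subproblem can be illustrated with the standard Steihaug-Toint truncated CG method, which approximately minimizes the quadratic model inside the trust region using only Hessian-vector products. This is a textbook sketch under those standard conventions, not STRON's implementation:

```python
import numpy as np

def steihaug_cg(Hv, g, delta, tol=1e-8, max_iter=100):
    """Approximately minimize m(p) = g'p + 0.5 p'Hp s.t. ||p|| <= delta.

    Hv(v) returns the Hessian-vector product H @ v, so the Hessian never
    needs to be formed explicitly.
    """
    p = np.zeros_like(g)
    r = g.copy()          # residual of the model gradient: H p + g
    d = -r                # first search direction: steepest descent
    if np.linalg.norm(r) < tol:
        return p
    for _ in range(max_iter):
        Hd = Hv(d)
        dHd = d @ Hd
        if dHd <= 0:      # negative curvature: follow d to the boundary
            return _to_boundary(p, d, delta)
        alpha = (r @ r) / dHd
        p_next = p + alpha * d
        if np.linalg.norm(p_next) >= delta:  # step leaves the region
            return _to_boundary(p, d, delta)
        r_next = r + alpha * Hd
        if np.linalg.norm(r_next) < tol:     # inexact stop: small residual
            return p_next
        beta = (r_next @ r_next) / (r @ r)
        d = -r_next + beta * d
        p, r = p_next, r_next
    return p

def _to_boundary(p, d, delta):
    # solve ||p + tau d|| = delta for the positive root tau
    a, b, c = d @ d, 2.0 * (p @ d), p @ p - delta ** 2
    tau = (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return p + tau * d
```

The early exits on negative curvature and on hitting the boundary are what make the solve "inexact": the subproblem is never solved to optimality, only far enough to guarantee sufficient model decrease.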