An accelerated first-order method with complexity analysis for solving cubic regularization subproblems
We propose a first-order method to solve the cubic regularization subproblem
(CRS) based on a novel reformulation. The reformulation is a constrained convex
optimization problem whose feasible region admits an easily computable
projection. Our reformulation requires computing the minimum eigenvalue of the
Hessian. To avoid the expensive computation of the exact minimum eigenvalue, we
develop a surrogate problem to the reformulation where the exact minimum
eigenvalue is replaced with an approximate one. We then apply first-order
methods such as Nesterov's accelerated projected gradient method (APG) and the
projected Barzilai-Borwein method to solve the surrogate problem. As our main
theoretical contribution, we show that when an $\epsilon$-approximate minimum
eigenvalue is computed by the Lanczos method and the surrogate problem is
approximately solved by APG, our approach returns an $\epsilon$-approximate
solution to CRS in $\tilde{O}(\epsilon^{-1/2})$ matrix-vector multiplications
(where $\tilde{O}$ hides the logarithmic factors). Numerical experiments
show that our methods are comparable to and outperform the Krylov subspace
method in the easy and hard cases, respectively. We further implement our
methods as subproblem solvers of adaptive cubic regularization methods, and
numerical results show that our algorithms are comparable to the
state-of-the-art algorithms.
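The eigenvalue step in the abstract needs only Hessian-vector products, which is what makes the Lanczos method attractive here. The following is a minimal illustrative sketch of estimating the minimum eigenvalue of a symmetric operator from matrix-vector products alone; it is not the authors' implementation, and all names and parameters are hypothetical:

```python
import numpy as np

def lanczos_min_eig(hess_vec, n, m=30, seed=0):
    """Estimate the minimum eigenvalue of an n x n symmetric operator,
    given only its matrix-vector product hess_vec, via m Lanczos steps.
    Illustrative sketch only (no reorthogonalization)."""
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    q_prev = np.zeros(n)
    beta = 0.0
    alphas, betas = [], []
    for _ in range(m):
        w = hess_vec(q) - beta * q_prev
        alpha = q @ w
        w -= alpha * q
        beta = np.linalg.norm(w)
        alphas.append(alpha)
        if beta < 1e-12:  # invariant subspace found; stop early
            break
        betas.append(beta)
        q_prev, q = q, w / beta
    # Extreme eigenvalues of the small tridiagonal matrix T approximate
    # the extreme eigenvalues of the full operator.
    k = len(alphas)
    T = (np.diag(alphas)
         + np.diag(betas[:k - 1], 1)
         + np.diag(betas[:k - 1], -1))
    return np.linalg.eigvalsh(T)[0]

# Toy operator: H = diag(-1, 0, 1, ..., 48), whose minimum eigenvalue is -1.
d = np.arange(50) - 1.0
est = lanczos_min_eig(lambda v: d * v, 50)
```

Because Lanczos converges fastest to extreme eigenvalues, a modest number of iterations already gives a usable approximation, which is exactly the trade-off the surrogate-problem construction exploits.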
A Stochastic Tensor Method for Non-convex Optimization
We present a stochastic optimization method that uses a fourth-order
regularized model to find local minima of smooth and potentially non-convex
objective functions with a finite-sum structure. This algorithm uses
sub-sampled derivatives instead of exact quantities. The proposed approach is
shown to find an $(\epsilon_1, \epsilon_2, \epsilon_3)$-third-order critical
point in at most $\mathcal{O}\left(\max\left(\epsilon_1^{-4/3}, \epsilon_2^{-2},
\epsilon_3^{-4}\right)\right)$ iterations, thereby matching the rate of
deterministic approaches. In order to prove this result, we derive a novel
tensor concentration inequality for sums of tensors of any order that makes
explicit use of the finite-sum structure of the objective function.
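The sub-sampling idea can be illustrated for a finite-sum objective $f(x) = \frac{1}{n}\sum_{i=1}^{n} f_i(x)$: instead of the exact gradient, average the component gradients over a random batch. A hedged sketch under assumed toy components (the quadratics and batch size below are illustrative, not from the paper):

```python
import numpy as np

def subsampled_gradient(grads, x, batch, rng):
    """Unbiased gradient estimate for f(x) = (1/n) * sum_i f_i(x):
    average per-component gradients over a uniform random sub-sample."""
    n = len(grads)
    idx = rng.choice(n, size=batch, replace=False)
    return np.mean([grads[i](x) for i in idx], axis=0)

# Toy finite sum: f_i(x) = 0.5 * a_i * ||x||^2, so grad f_i(x) = a_i * x
# and the exact gradient is mean(a) * x.
rng = np.random.default_rng(0)
a = rng.uniform(0.5, 1.5, size=1000)
grads = [lambda x, ai=ai: ai * x for ai in a]

x = np.ones(3)
g_hat = subsampled_gradient(grads, x, batch=200, rng=rng)
g_true = a.mean() * x
```

The estimate is unbiased, and its variance shrinks as the batch grows; the paper's contribution is controlling the analogous error for sub-sampled higher-order tensors via its concentration inequality.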