An Accelerated Proximal Coordinate Gradient Method and its Application to Regularized Empirical Risk Minimization
We consider the problem of minimizing the sum of two convex functions: one is
smooth and given by a gradient oracle, and the other is separable over blocks
of coordinates and has a simple known structure over each block. We develop an
accelerated randomized proximal coordinate gradient (APCG) method for
minimizing such convex composite functions. For strongly convex functions, our
method achieves faster linear convergence rates than existing randomized
proximal coordinate gradient methods. Without strong convexity, our method
enjoys accelerated sublinear convergence rates. We show how to apply the APCG
method to solve the regularized empirical risk minimization (ERM) problem, and
devise efficient implementations that avoid full-dimensional vector operations.
For ill-conditioned ERM problems, our method obtains better convergence rates
than the state-of-the-art stochastic dual coordinate ascent (SDCA) method.
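As a rough illustration of the setting, here is a minimal sketch of a plain (non-accelerated) randomized proximal coordinate gradient step, the building block that APCG accelerates. The interface is assumed for illustration only: `partial_grad(x, i)` returns the i-th partial derivative of the smooth part, `prox(i, v, step)` evaluates the proximal operator of the i-th separable term, and `L[i]` is a coordinate-wise Lipschitz constant; none of these names come from the paper.

```python
import numpy as np

def prox_coord_gradient(x0, partial_grad, prox, L, n_iters=1000, seed=0):
    """Non-accelerated randomized proximal coordinate gradient sketch.

    Minimizes f(x) + sum_i psi_i(x_i), where f is smooth with coordinate-wise
    Lipschitz constants L[i] and each psi_i has an easy proximal operator.
    """
    rng = np.random.default_rng(seed)
    x = x0.astype(float).copy()
    for _ in range(n_iters):
        i = rng.integers(x.size)          # sample one coordinate uniformly
        g = partial_grad(x, i)            # i-th partial derivative of f
        # prox-gradient step on coordinate i with step size 1/L[i]
        x[i] = prox(i, x[i] - g / L[i], 1.0 / L[i])
    return x

# Example (synthetic data): lasso, f(x) = 0.5*||Ax - b||^2 with
# psi_i(t) = lam*|t|, whose proximal operator is soft-thresholding.
A = np.random.randn(50, 20); b = np.random.randn(50); lam = 0.1
partial_grad = lambda x, i: A[:, i] @ (A @ x - b)
L = (A * A).sum(axis=0)                   # L[i] = ||A[:, i]||^2
prox = lambda i, v, s: np.sign(v) * max(abs(v) - lam * s, 0.0)
x_hat = prox_coord_gradient(np.zeros(20), partial_grad, prox, L)
```

APCG adds an extrapolation sequence on top of this basic update to obtain the accelerated rates; the sketch above converges, but only at the slower, unaccelerated rate.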
Catalyst Acceleration for Gradient-Based Non-Convex Optimization
We introduce a generic scheme to solve nonconvex optimization problems using
gradient-based algorithms originally designed for minimizing convex functions.
Even though these methods may require convexity to operate, the proposed
approach allows one to use them on weakly convex objectives, which cover a
large class of non-convex functions typically appearing in machine learning
and signal processing. In general, the scheme is guaranteed to produce
a stationary point with a worst-case efficiency typical of first-order methods,
and when the objective turns out to be convex, it automatically accelerates in
the sense of Nesterov and achieves a near-optimal convergence rate in function
values. These properties are achieved without assuming any knowledge about the
convexity of the objective, by automatically adapting to the unknown weak
convexity constant. We conclude the paper by showing promising experimental
results obtained by applying our approach to incremental algorithms such as
SVRG and SAGA for sparse matrix factorization and for learning neural networks.
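To make the scheme concrete, here is a minimal sketch of a generic Catalyst-style outer loop, not the paper's exact algorithm: each outer step approximately minimizes f plus a quadratic proximal term (kappa/2)*||x - y||^2 and then extrapolates. Plain gradient descent stands in for the inner solver (the paper applies the scheme to solvers such as SVRG and SAGA), and `kappa`, `beta`, and the inner budget are illustrative constants rather than the paper's adaptive choices.

```python
import numpy as np

def catalyst_outer(f_grad, x0, kappa=1.0, beta=0.5, n_outer=50,
                   n_inner=100, inner_lr=0.01):
    """Sketch of a Catalyst-style outer loop.

    Each outer step approximately minimizes the proximal model
        h_k(x) = f(x) + (kappa/2) * ||x - y||^2,
    whose added quadratic improves conditioning and restores convexity when
    f is weakly convex with modulus below kappa. f_grad(x) returns the
    gradient of f; gradient descent serves as a stand-in inner solver.
    """
    x_prev = x0.astype(float).copy()
    y = x_prev.copy()
    for _ in range(n_outer):
        # Inner loop: approximately solve min_x f(x) + kappa/2 * ||x - y||^2.
        x = y.copy()
        for _ in range(n_inner):
            x -= inner_lr * (f_grad(x) + kappa * (x - y))
        # Extrapolation (Nesterov-style momentum on the outer sequence).
        y = x + beta * (x - x_prev)
        x_prev = x
    return x_prev
```

In the actual method the extrapolation parameter follows a Nesterov-type schedule and each inner subproblem is solved to a prescribed accuracy; the fixed `beta` and inner iteration count above are placeholders for those choices.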