Stochastic Parallel Block Coordinate Descent for Large-scale Saddle Point Problems
We consider convex-concave saddle point problems with a separable structure
and non-strongly convex functions. We propose an efficient stochastic block
coordinate descent method using adaptive primal-dual updates, which enables
flexible parallel optimization for large-scale problems. Our method combines the
efficiency and flexibility of block coordinate descent methods with the
simplicity of primal-dual methods, while exploiting the structure of the
separable convex-concave saddle point problem. It is capable of solving a wide
range of machine learning applications, including robust principal component
analysis, Lasso, and feature selection by group Lasso. Theoretically and
empirically, we demonstrate significantly better performance than
state-of-the-art methods in all these applications.
Comment: Accepted by AAAI 201
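To ground the setup, here is a minimal NumPy sketch of a stochastic block coordinate primal-dual iteration on one separable instance the abstract mentions, the Lasso saddle point min_x max_y <Ax - b, y> - ||y||^2/2 + lam*||x||_1. The block partition, step sizes, and uniform sampling below are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

# Lasso in saddle-point form: min_x max_y <Ax - b, y> - ||y||^2/2 + lam*||x||_1.
# Each iteration: full dual ascent step, then a prox-gradient step on one
# randomly chosen primal block, with extrapolation of the updated block.
rng = np.random.default_rng(0)
n, d, lam = 200, 50, 0.1
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

blocks = np.array_split(np.arange(d), 10)   # partition of primal coordinates
tau = 0.5 / np.linalg.norm(A, 2) ** 2       # conservative primal step
sigma = 0.5                                 # dual step

x = np.zeros(d); x_bar = x.copy(); y = np.zeros(n)
for k in range(3000):
    # dual prox step: closed form for the conjugate of ||.||^2/2 shifted by b
    y = (y + sigma * (A @ x_bar - b)) / (1.0 + sigma)
    # primal step on one random block: gradient + soft-thresholding (prox of l1)
    J = blocks[rng.integers(len(blocks))]
    x_old = x[J].copy()
    z = x[J] - tau * (A[:, J].T @ y)
    x[J] = np.sign(z) * np.maximum(np.abs(z) - tau * lam, 0.0)
    x_bar = x.copy(); x_bar[J] = 2.0 * x[J] - x_old  # extrapolate updated block

obj = 0.5 * np.linalg.norm(A @ x - b) ** 2 + lam * np.abs(x).sum()
print(f"Lasso objective after 3000 iterations: {obj:.4f}")
```

A fixed point of these updates satisfies y = Ax - b together with the blockwise Lasso optimality conditions, so the iteration targets the right solution; parallelism comes from updating several blocks at once.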
Adaptive Stochastic Primal-Dual Coordinate Descent for Separable Saddle Point Problems
We consider a generic convex-concave saddle point problem with separable
structure, a form that covers a wide range of machine learning applications.
Under this problem structure, we follow the framework of primal-dual updates
for saddle point problems and incorporate stochastic block coordinate descent
with an adaptive stepsize into this framework. We show theoretically that the
proposed adaptive stepsize can achieve a sharper linear convergence rate than
existing methods. Additionally, since we can select a "mini-batch" of block
coordinates to update, our method is also amenable to parallel processing for
large-scale data. We apply the proposed method to regularized empirical risk
minimization and show that it performs comparably or, more often, better than
state-of-the-art methods on both synthetic and real-world data sets.
Comment: Accepted by ECML/PKDD201
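The two ingredients the abstract emphasizes, an adaptive stepsize and mini-batches of coordinates, can be sketched on ridge-regularized least squares written as a saddle point. The row-norm-based stepsize rule below is a hypothetical stand-in for the paper's adaptive choice, and all constants are illustrative.

```python
import numpy as np

# Regularized ERM in saddle form:
#   min_x max_y <Ax, y> - sum_i (y_i^2/2 + b_i*y_i) + (lam/2)*||x||^2,
# whose primal is min_x 0.5*||Ax - b||^2 + (lam/2)*||x||^2.
rng = np.random.default_rng(1)
n, d, lam, batch = 400, 30, 0.1, 20
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.05 * rng.standard_normal(n)

row = np.linalg.norm(A, axis=1)
sigma = 0.5 / row              # per-coordinate dual steps: cheap rows move more
tau = 0.5 / row.max()          # shared primal step (conservative)

x = np.zeros(d); x_bar = x.copy(); y = np.zeros(n)
u = A.T @ y                    # maintain u = A^T y incrementally
for k in range(4000):
    S = rng.choice(n, size=batch, replace=False)  # mini-batch of dual coords
    y_new = (y[S] + sigma[S] * (A[S] @ x_bar - b[S])) / (1.0 + sigma[S])
    u += A[S].T @ (y_new - y[S])
    y[S] = y_new
    x_old = x
    x = (x - tau * u) / (1.0 + tau * lam)         # prox of (lam/2)*||x||^2
    x_bar = 2.0 * x - x_old                       # extrapolation

obj = 0.5 * np.linalg.norm(A @ x - b) ** 2 + 0.5 * lam * x @ x
print(f"ridge objective after 4000 iterations: {obj:.4f}")
```

Because the mini-batch coordinates touch disjoint rows of A, the dual updates within a batch are independent, which is what makes the method amenable to parallel processing.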
Block-proximal methods with spatially adapted acceleration
We study and develop (stochastic) primal-dual block-coordinate descent
methods for convex problems based on the method due to Chambolle and Pock. Our
methods have known convergence rates for the iterates and the ergodic gap:
O(1/N^2) if each block is strongly convex, O(1/N) if no strong convexity is
present, and more generally a mixed rate O(1/N^2) + O(1/N) for strongly convex
blocks, if only some blocks are strongly convex. Additional novelties of our
methods include blockwise-adapted step lengths and acceleration, as well as the
ability to update both the primal and dual variables randomly in blocks under a
very light compatibility condition. In other words, these variants of our
methods are doubly-stochastic. We test the proposed methods on various image
processing problems, where we employ pixelwise-adapted acceleration.
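For reference, here is the non-adapted baseline the paper builds on: accelerated Chambolle-Pock for TV denoising, with one global acceleration parameter driven by the strong convexity of the data term. The paper's contribution, blockwise/pixelwise-adapted step lengths and acceleration, is not reproduced in this sketch; the test problem and constants are illustrative.

```python
import numpy as np

# TV denoising:  min_u (lam/2)*||u - f||^2 + ||D u||_{2,1},
# solved with accelerated Chambolle-Pock (their Algorithm 2).
def grad(u):                     # forward differences, Neumann boundary
    gx = np.zeros_like(u); gy = np.zeros_like(u)
    gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy

def div(px, py):                 # -adjoint of grad (px[:,-1] = py[-1,:] = 0 here)
    d = np.zeros_like(px)
    d[:, 0] += px[:, 0]; d[:, 1:] += px[:, 1:] - px[:, :-1]
    d[0, :] += py[0, :]; d[1:, :] += py[1:, :] - py[:-1, :]
    return d

rng = np.random.default_rng(2)
f = np.zeros((64, 64)); f[16:48, 16:48] = 1.0
f += 0.1 * rng.standard_normal(f.shape)      # noisy piecewise-constant image
lam = 8.0

tau = sigma = 1.0 / np.sqrt(8.0)             # ||D||^2 <= 8 for these stencils
u = f.copy(); u_bar = u.copy()
px = np.zeros_like(f); py = np.zeros_like(f)
for k in range(200):
    gx, gy = grad(u_bar)                     # dual ascent + projection
    px, py = px + sigma * gx, py + sigma * gy
    nrm = np.maximum(1.0, np.sqrt(px ** 2 + py ** 2))
    px, py = px / nrm, py / nrm
    u_old = u                                # primal prox step
    u = (u + tau * div(px, py) + tau * lam * f) / (1.0 + tau * lam)
    theta = 1.0 / np.sqrt(1.0 + 2.0 * lam * tau)  # global acceleration
    tau, sigma = tau * theta, sigma / theta
    u_bar = u + theta * (u - u_old)

gx, gy = grad(u)
energy = 0.5 * lam * np.sum((u - f) ** 2) + np.sum(np.sqrt(gx ** 2 + gy ** 2))
print(f"TV energy after 200 iterations: {energy:.2f}")
```

Spatially adapted variants replace the scalars tau, sigma, and theta with per-pixel quantities, so flat regions can take much larger steps than edges.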
Stochastic Variance Reduction Methods for Saddle-Point Problems
We consider convex-concave saddle-point problems where the objective
functions may be split in many components, and extend recent stochastic
variance reduction methods (such as SVRG or SAGA) to provide the first
large-scale linearly convergent algorithms for this class of problems, which is
common in machine learning. While the algorithmic extension is straightforward,
it comes with challenges and opportunities: (a) the convex minimization
analysis does not apply and we use the notion of monotone operators to prove
convergence, showing in particular that the same algorithm applies to a larger
class of problems, such as variational inequalities, (b) there are two notions
of splits, in terms of functions, or in terms of partial derivatives, (c) the
split does need to be done with convex-concave terms, (d) non-uniform sampling
is key to an efficient algorithm, both in theory and practice, and (e) these
incremental algorithms can be easily accelerated using a simple extension of
the "catalyst" framework, leading to an algorithm which is always superior to
accelerated batch algorithms.
Comment: Neural Information Processing Systems (NIPS), 2016, Barcelona, Spain
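A minimal sketch of the recipe, assuming a strongly convex-strongly concave model problem and uniform sampling (the abstract stresses that non-uniform sampling is key, which this sketch omits): an SVRG-style estimate of the monotone operator of the bilinear coupling, combined with a resolvent (backward) step on the strongly monotone part.

```python
import numpy as np

# Saddle point:  min_x max_y (mu/2)||x||^2 + y^T (Ax - b) - (mu/2)||y||^2,
# with the coupling split over the n rows of A.
rng = np.random.default_rng(3)
n, d, mu = 100, 20, 1.0
A = rng.standard_normal((n, d)) / np.sqrt(n)
b = rng.standard_normal(n)

# closed-form saddle point, used only to report progress
x_star = np.linalg.solve(mu ** 2 * np.eye(d) + A.T @ A, A.T @ b)
y_star = (A @ x_star - b) / mu

eta = 0.5 / (n * (np.linalg.norm(A, axis=1) ** 2).max())  # illustrative step
x, y = np.zeros(d), np.zeros(n)
for epoch in range(30):
    x0, y0 = x.copy(), y.copy()
    Gx0, Gy0 = A.T @ y0, -(A @ x0 - b)   # full operator at the snapshot
    for _ in range(n):
        i = rng.integers(n)
        # variance-reduced estimate of the coupling operator (A^T y, -(Ax - b))
        vx = Gx0 + n * A[i] * (y[i] - y0[i])
        vy = Gy0.copy()
        vy[i] -= n * (A[i] @ (x - x0))
        # forward step on the coupling, backward (resolvent) step on mu-terms
        x = (x - eta * vx) / (1.0 + eta * mu)
        y = (y - eta * vy) / (1.0 + eta * mu)

err = np.linalg.norm(x - x_star) + np.linalg.norm(y - y_star)
print(f"distance to the saddle point after 30 epochs: {err:.2e}")
```

Viewed through monotone operators, this illustrates the abstract's point (a): nothing here uses a primal objective value, only the operator (A^T y, -(Ax - b)) and the resolvent of the strongly monotone mu-part.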
Iteration Complexity of Randomized Primal-Dual Methods for Convex-Concave Saddle Point Problems
In this paper we propose a class of randomized primal-dual methods to contend
with large-scale saddle point problems defined by a convex-concave function
L(x, y) = f(x) + Phi(x, y) - h(y). We analyze the convergence rate of the
proposed method under the settings of mere convexity and strong convexity in
the x-variable. In particular, assuming grad_y Phi(., .) is Lipschitz and
grad_x Phi(., y) is coordinate-wise Lipschitz for any fixed y, the ergodic
sequence generated by the algorithm achieves the convergence rate of O(m/k)
in a suitable error metric, where m denotes the number of coordinates for the
primal variable. Furthermore, assuming that L(., y) is uniformly strongly
convex for any y, and that Phi(., y) is linear in x, the scheme displays a
convergence rate of O(m/k^2). We implemented the proposed algorithmic
framework to solve a kernel matrix learning problem and tested it against
other state-of-the-art solvers.
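To illustrate the strongly convex regime the abstract describes (L(., y) uniformly strongly convex, coupling linear in x), here is a simplified randomized-block sketch on a ridge-type saddle point. The step sizes and the m-fold block extrapolation are heuristic choices, and the sketch does not implement the paper's accelerated O(m/k^2) scheme.

```python
import numpy as np

# Saddle point:  min_x max_y (mu/2)||x||^2 + y^T (Ax - b) - ||y||^2/2,
# strongly convex in x with coupling linear in x; one of m primal blocks
# is updated per iteration.
rng = np.random.default_rng(4)
n, d, mu, m = 150, 30, 0.5, 6
A = rng.standard_normal((n, d)) / np.sqrt(n)
b = rng.standard_normal(n)
blocks = np.array_split(np.arange(d), m)

# reference solution of (mu*I + A^T A) x = A^T b, for error reporting only
x_star = np.linalg.solve(mu * np.eye(d) + A.T @ A, A.T @ b)

tau = 1.0 / np.linalg.norm(A, 2)          # primal step (illustrative)
sigma = 1.0 / (m * np.linalg.norm(A, 2))  # dual step, shrunk by the block count
x = np.zeros(d); y = np.zeros(n); Ax = A @ x
for k in range(5000):
    J = blocks[rng.integers(m)]
    # prox-gradient step on one block of the mu-strongly convex part
    xJ_new = (x[J] - tau * (A[:, J].T @ y)) / (1.0 + tau * mu)
    # dual step with the updated block extrapolated m-fold (sampling compensation)
    Ax_new = Ax + A[:, J] @ (xJ_new - x[J])
    y = (y + sigma * (Ax_new + m * (Ax_new - Ax) - b)) / (1.0 + sigma)
    x[J], Ax = xJ_new, Ax_new

print(f"||x - x_star|| after 5000 iterations: {np.linalg.norm(x - x_star):.2e}")
```

Here m enters the dual step and the extrapolation, mirroring how the abstract's O(m/k) and O(m/k^2) rates degrade with the number of primal blocks.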