SCOPE: Scalable Composite Optimization for Learning on Spark
Many machine learning models, such as logistic regression (LR) and support
vector machines (SVM), can be formulated as composite optimization problems.
Recently, many distributed stochastic optimization (DSO) methods have been
proposed to solve large-scale composite optimization problems, and they have
shown better performance than traditional batch methods. However, most of these
DSO methods are not scalable enough. In this paper, we propose a novel DSO
method, called scalable composite optimization for learning (SCOPE), and
implement it on the fault-tolerant distributed platform Spark. SCOPE is both
computation-efficient and communication-efficient. Theoretical analysis shows
that SCOPE converges at a linear rate when the objective
function is convex. Furthermore, empirical results on real datasets show that
SCOPE can outperform other state-of-the-art distributed learning methods on
Spark, including both batch learning methods and DSO methods.
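To make the phrase "composite optimization problem" concrete, the following is a minimal single-machine sketch of L1-regularized logistic regression written as a smooth loss plus a nonsmooth regularizer, with one proximal gradient step. It is not the SCOPE algorithm and involves no Spark code; the function names, the L1 regularizer, and the step size are assumptions made for illustration only.

```python
import math

def soft_threshold(v, tau):
    """Proximal map of tau * |.| applied to a scalar."""
    return math.copysign(max(abs(v) - tau, 0.0), v)

def logistic_loss(w, X, y):
    """Smooth part: average logistic loss, labels y_i in {-1, +1}."""
    total = 0.0
    for xi, yi in zip(X, y):
        margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
        total += math.log1p(math.exp(-margin))
    return total / len(X)

def logistic_grad(w, X, y):
    """Gradient of the average logistic loss."""
    grad = [0.0] * len(w)
    for xi, yi in zip(X, y):
        margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
        coeff = -yi / (1.0 + math.exp(margin))
        for j, xj in enumerate(xi):
            grad[j] += coeff * xj / len(X)
    return grad

def composite_objective(w, X, y, lam):
    """Smooth loss plus nonsmooth L1 regularizer (hypothetical choice)."""
    return logistic_loss(w, X, y) + lam * sum(abs(wj) for wj in w)

def prox_grad_step(w, X, y, lam, step):
    """Gradient step on the smooth part, then the prox of the L1 term."""
    g = logistic_grad(w, X, y)
    return [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, g)]
```

Keeping the regularizer out of the gradient and handling it through its proximal map is what distinguishes the composite formulation from plain gradient descent on the summed objective.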
Forward-backward truncated Newton methods for convex composite optimization
This paper proposes two proximal Newton-CG methods for convex nonsmooth
optimization problems in composite form. The algorithms are based on a
reformulation of the original nonsmooth problem as the unconstrained
minimization of a continuously differentiable function, namely the
forward-backward envelope (FBE). The first algorithm is based on a standard
line search strategy, whereas the second one combines the global efficiency
estimates of the corresponding first-order methods, while achieving fast
asymptotic convergence rates. Furthermore, they are computationally attractive
since each Newton iteration requires the approximate solution of a linear
system of usually small dimension.
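As a small illustration of the forward-backward envelope, the sketch below evaluates it for a one-dimensional toy problem using the standard definition: the FBE at x with parameter gamma is the minimum over z of f(x) + f'(x)(z - x) + g(z) + (z - x)^2 / (2 gamma), attained at the forward-backward step z. The toy functions f(x) = 0.5 (x - 3)^2 and g(x) = |x|, the constants, and all names are assumptions chosen for illustration; the Newton-CG iterations of the paper are not shown.

```python
import math

A = 3.0    # hypothetical data point for the smooth term
LAM = 1.0  # hypothetical weight of the nonsmooth term

def soft_threshold(v, tau):
    """Proximal map of tau * |.| applied to a scalar."""
    return math.copysign(max(abs(v) - tau, 0.0), v)

def f(x):
    """Smooth term."""
    return 0.5 * (x - A) ** 2

def grad_f(x):
    return x - A

def g(x):
    """Nonsmooth term."""
    return LAM * abs(x)

def phi(x):
    """Original composite objective f + g."""
    return f(x) + g(x)

def fbe(x, gamma):
    """Forward-backward envelope of phi at x with parameter gamma."""
    # Forward (gradient) step followed by backward (proximal) step.
    z = soft_threshold(x - gamma * grad_f(x), gamma * LAM)
    d = z - x
    # Value of the linearized model at its minimizer z.
    return f(x) + grad_f(x) * d + g(z) + d * d / (2.0 * gamma)
```

In this toy case the minimizer of phi is x = 2, and the envelope never exceeds phi while agreeing with it at that point, which is the property that lets a smooth unconstrained method be applied to the envelope.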
Global convergence of splitting methods for nonconvex composite optimization
We consider the problem of minimizing the sum of a smooth function $h$ with a
bounded Hessian, and a nonsmooth function. We assume that the latter function
is a composition of a proper closed function $P$ and a surjective linear map
$\mathcal{M}$, with the proximal mappings of $\tau P$, $\tau > 0$, simple to
compute. This problem is nonconvex in general and encompasses many important
applications in engineering and machine learning. In this paper, we examine
two types of splitting methods for solving this nonconvex optimization problem:
the alternating direction method of multipliers and the proximal gradient algorithm.
For the direct adaptation of the alternating direction method of multipliers,
we show that, if the penalty parameter is chosen sufficiently large and the
sequence generated has a cluster point, then it gives a stationary point of the
nonconvex problem. We also establish convergence of the whole sequence under an
additional assumption that the functions $h$ and $P$ are semi-algebraic.
Furthermore, we give simple sufficient conditions to guarantee boundedness of
the sequence generated. These conditions can be satisfied for a wide range of
applications including the least squares problem with the $\ell_{1/2}$
regularization. Finally, when $\mathcal{M}$ is the identity so that the proximal
gradient algorithm can be efficiently applied, we show that any cluster point
is stationary under a slightly more flexible constant step-size rule than what
is known in the literature for a nonconvex $h$.
Comment: To appear in SIOP
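For orientation, the problem described in this abstract and the usual splitting used by the alternating direction method of multipliers can be written as follows. The symbol names ($h$ for the smooth function, $P$ for the proper closed function, $\mathcal{M}$ for the linear map, $\beta$ for the penalty parameter, $z$ for the multiplier) and the sign convention of the multiplier term are assumptions made here for illustration, not taken from the paper.

```latex
\begin{align*}
  &\min_{x}\; h(x) + P(\mathcal{M}x)
    && \text{(composite problem)} \\
  &\min_{x,\,y}\; h(x) + P(y) \quad \text{s.t. } y = \mathcal{M}x
    && \text{(equivalent split form)} \\
  &L_\beta(x, y, z) = h(x) + P(y) - \langle z,\, \mathcal{M}x - y \rangle
    + \frac{\beta}{2}\,\|\mathcal{M}x - y\|^2
    && \text{(augmented Lagrangian)}
\end{align*}
```

The method alternates minimization of $L_\beta$ over $y$ and $x$ with a multiplier update; the $y$-step reduces to a proximal mapping of a positive multiple of $P$, which is why that mapping is assumed simple to compute.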
The Extended Regularized Dual Averaging Method for Composite Optimization
We present a new algorithm, extended regularized dual averaging (XRDA), for
solving composite optimization problems. XRDA is a generalization of the
regularized dual averaging (RDA) method. The main novelty of the method is that
it allows more flexible control of the backward step size. For instance, the
backward step size for RDA grows without bound, while for XRDA the backward step
size can be kept bounded.
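The growth of the RDA backward step size can be seen in the well-known closed-form L1-regularized RDA update with a squared Euclidean auxiliary function and the weight sequence beta_t = gamma * sqrt(t). The sketch below implements only that standard RDA step, not XRDA; the function names and parameter choices are assumptions for illustration.

```python
import math

def soft_threshold(v, tau):
    """Proximal map of tau * |.| applied to a scalar."""
    return math.copysign(max(abs(v) - tau, 0.0), v)

def rda_l1_step(avg_grad, t, lam, gamma):
    """One L1-regularized RDA update from the running average gradient.

    Solves argmin_x <avg_grad, x> + lam * ||x||_1 + (gamma / sqrt(t)) * 0.5 * ||x||^2,
    which is a proximal (backward) step on lam * ||.||_1 with step size
    sqrt(t) / gamma. Returns the new iterate and that backward step size.
    """
    step = math.sqrt(t) / gamma
    x_next = [-step * soft_threshold(g, lam) for g in avg_grad]
    return x_next, step
```

Because the returned step equals sqrt(t) / gamma, it increases with the iteration counter t without bound, which is the behavior the abstract contrasts with XRDA.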