A stochastic proximal alternating method for non-smooth non-convex optimization
We introduce SPRING, a novel stochastic proximal alternating linearized
minimization algorithm for solving a class of non-smooth and non-convex
optimization problems. Large-scale imaging problems are becoming increasingly
prevalent due to advances in data acquisition and computational capabilities.
Motivated by the success of stochastic optimization methods, we propose a
stochastic variant of proximal alternating linearized minimization (PALM)
algorithm \cite{bolte2014proximal}. We provide global convergence guarantees,
demonstrating that our proposed method with variance-reduced stochastic
gradient estimators, such as SAGA \cite{SAGA} and SARAH \cite{sarah}, achieves
state-of-the-art oracle complexities. We also demonstrate the efficacy of our
algorithm via several numerical examples including sparse non-negative matrix
factorization, sparse principal component analysis, and blind image
deconvolution. Comment: 28 pages, 11-page appendix.
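To make the alternating proximal structure concrete, here is a minimal deterministic PALM-style sketch on a toy sparse rank-one factorization problem. The objective, step sizes, and `soft_threshold` helper are illustrative assumptions, not SPRING itself; a stochastic variant would replace the full gradients below with SAGA- or SARAH-style estimates.

```python
import numpy as np

def soft_threshold(v, t):
    """Prox of t*||.||_1: elementwise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def palm_step(A, x, y, lam, step_x, step_y):
    """One deterministic PALM iteration for
    min 0.5*||A - x y^T||_F^2 + lam*(||x||_1 + ||y||_1):
    a proximal gradient step on x, then one on y."""
    grad_x = (np.outer(x, y) - A) @ y          # d/dx of the smooth term
    x = soft_threshold(x - step_x * grad_x, step_x * lam)
    grad_y = (np.outer(x, y) - A).T @ x        # d/dy, using the updated x
    y = soft_threshold(y - step_y * grad_y, step_y * lam)
    return x, y

rng = np.random.default_rng(0)
x_true, y_true = rng.random(20), rng.random(15)
A = np.outer(x_true, y_true)                    # noiseless rank-1 data
x, y = rng.random(20), rng.random(15)
for _ in range(500):
    x, y = palm_step(A, x, y, lam=1e-4, step_x=1e-2, step_y=1e-2)
err = np.linalg.norm(A - np.outer(x, y)) / np.linalg.norm(A)
```

On this toy instance the relative reconstruction error drops to a small value after a few hundred alternating steps.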
Uniform exponential convergence of sample average random functions under general sampling with applications in stochastic programming
Sample average approximation (SAA) is one of the most popular methods for solving stochastic optimization and equilibrium problems. Research on SAA has mostly focused on the case when sampling is independent and identically distributed (iid), with exceptions (Dai et al. (2000) [9], Homem-de-Mello (2008) [16]). In this paper we study SAA with general sampling (including iid and non-iid sampling) for solving nonsmooth stochastic optimization problems, stochastic Nash equilibrium problems and stochastic generalized equations. To this end, we first derive the uniform exponential convergence of the sample average of a class of lower semicontinuous random functions and then apply it to a nonsmooth stochastic minimization problem. Exponential convergence of estimators of both optimal solutions and M-stationary points (characterized by Mordukhovich limiting subgradients (Mordukhovich (2006) [23], Rockafellar and Wets (1998) [32])) is established under mild conditions. We also use the uniform convergence result to establish the exponential rate of convergence of statistical estimators of a stochastic Nash equilibrium problem and of estimators of the solutions to a stochastic generalized equation problem.
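The SAA idea itself can be illustrated with a toy problem where the sample-average minimizer has a closed form; the distribution and sample sizes below are assumptions for illustration only, not the paper's general (non-iid) setting.

```python
import numpy as np

rng = np.random.default_rng(1)

# True problem: min_x E[(x - xi)^2] with xi ~ Uniform(0, 2),
# whose minimizer is the true mean x* = E[xi] = 1.
def saa_solution(n_samples):
    """Solve the sample average approximation
    min_x (1/N) * sum_i (x - xi_i)^2.
    Its closed-form minimizer is the sample mean,
    which converges to x* as N grows."""
    xi = rng.uniform(0.0, 2.0, size=n_samples)
    return xi.mean()

small, large = saa_solution(10), saa_solution(100_000)
```

With iid samples the SAA solution concentrates around the true minimizer; the paper's contribution is exponential convergence guarantees under far more general sampling schemes.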
Stochastic Majorization-Minimization Algorithms for Large-Scale Optimization
Majorization-minimization algorithms consist of iteratively minimizing a
majorizing surrogate of an objective function. Because of its simplicity and
its wide applicability, this principle has been very popular in statistics and
in signal processing. In this paper, we intend to make this principle scalable.
We introduce a stochastic majorization-minimization scheme which is able to
deal with large-scale or possibly infinite data sets. When applied to convex
optimization problems under suitable assumptions, we establish an expected
convergence rate after $n$ iterations, with a faster rate for strongly convex
functions. Equally important, our scheme almost
surely converges to stationary points for a large class of non-convex problems.
We develop several efficient algorithms based on our framework. First, we
propose a new stochastic proximal gradient method, which experimentally matches
state-of-the-art solvers for large-scale logistic regression. Second,
we develop an online DC programming algorithm for non-convex sparse estimation.
Finally, we demonstrate the effectiveness of our approach for solving
large-scale structured matrix factorization problems. Comment: accepted for
publication at Neural Information Processing Systems (NIPS) 2013. This is the
9-page version followed by 16 pages of appendices. The title has changed
compared to the first technical report.
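The stochastic MM principle can be sketched on a toy quadratic loss, where each term majorizes itself and the aggregated surrogate has a closed-form minimizer. The data, the averaging weights 1/t, and the one-dimensional problem are illustrative assumptions, not the paper's general scheme.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(loc=3.0, scale=1.0, size=5000)

# Stochastic MM sketch for min_x (1/n) sum_i 0.5*(x - a_i)^2.
# Each quadratic loss is its own majorizer, so the aggregated
# surrogate after t samples is a running average of quadratics,
# minimized in closed form.
x = 0.0
s0, s1 = 0.0, 0.0              # surrogate coefficients (mass and mean terms)
for t, a in enumerate(data, start=1):
    w = 1.0 / t                # averaging weight for the new surrogate
    s0 = (1 - w) * s0 + w * 1.0
    s1 = (1 - w) * s1 + w * a
    x = s1 / s0                # minimizer of the aggregated surrogate
```

Here the iterate is just the running mean of the samples, so it converges to the true minimizer (the population mean, 3.0) as data streams in.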
Communication-Efficient Gradient Descent-Ascent Methods for Distributed Variational Inequalities: Unified Analysis and Local Updates
Distributed and federated learning algorithms and techniques have been associated
primarily with minimization problems. However, with the growing prominence of
minimax optimization and variational inequality problems in machine learning, the
necessity of designing efficient distributed/federated learning approaches for
these problems is becoming more apparent. In this paper, we provide a unified
convergence analysis of communication-efficient local training methods for
distributed variational inequality problems (VIPs). Our approach is based on a
general key assumption on the stochastic estimates that allows us to propose
and analyze several novel local training algorithms under a single framework
for solving a class of structured non-monotone VIPs. We present the first local
gradient descent-ascent algorithms with provable improved communication
complexity for solving distributed variational inequalities on heterogeneous
data. The general algorithmic framework recovers state-of-the-art algorithms
and their sharp convergence guarantees when the setting is specialized to
minimization or minimax optimization problems. Finally, we demonstrate the
strong performance of the proposed algorithms compared to state-of-the-art
methods when solving federated minimax optimization problems.
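As an illustrative sketch only (not the algorithms analyzed in the paper), local gradient descent-ascent with periodic averaging on a toy strongly-convex-strongly-concave federated minimax problem might look like the following; the client objectives, step size, and communication schedule are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy federated minimax: min_x max_y mean_m [0.5*a_m*x^2 + b_m*x*y - 0.5*c_m*y^2],
# strongly convex in x and strongly concave in y, with saddle point (0, 0).
M = 4
a = rng.uniform(1.0, 2.0, M)
b = rng.uniform(-1.0, 1.0, M)
c = rng.uniform(1.0, 2.0, M)

def local_gda(rounds=50, local_steps=5, lr=0.05):
    """Local gradient descent-ascent with periodic averaging:
    each client runs `local_steps` GDA steps on its own objective,
    then the server averages the iterates (one communication)."""
    x, y = 1.0, 1.0
    for _ in range(rounds):
        xs, ys = [], []
        for m in range(M):
            xm, ym = x, y
            for _ in range(local_steps):
                gx = a[m] * xm + b[m] * ym      # d f_m / dx
                gy = b[m] * xm - c[m] * ym      # d f_m / dy
                xm, ym = xm - lr * gx, ym + lr * gy
            xs.append(xm)
            ys.append(ym)
        x, y = float(np.mean(xs)), float(np.mean(ys))
    return x, y

x_fin, y_fin = local_gda()
```

On this toy instance the averaged iterates contract toward the shared saddle point, illustrating why local updates can cut communication: each round costs one synchronization for several optimization steps.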
Stochastic Frank-Wolfe for Composite Convex Minimization
A broad class of convex optimization problems can be formulated as a
semidefinite program (SDP), minimization of a convex function over the
positive-semidefinite cone subject to some affine constraints. The majority of
classical SDP solvers are designed for the deterministic setting where problem
data is readily available. In this setting, generalized conditional gradient
methods (aka Frank-Wolfe-type methods) provide scalable solutions by leveraging
the so-called linear minimization oracle instead of the projection onto the
semidefinite cone. Most problems in machine learning and modern engineering
applications, however, contain some degree of stochasticity. In this work, we
propose the first conditional-gradient-type method for solving stochastic
optimization problems under affine constraints. Our method guarantees
convergence rates in expectation on both the objective residual and the
feasibility gap.
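The role of the linear minimization oracle can be sketched with a deterministic Frank-Wolfe iteration on a toy problem over the spectrahedron; the objective and step-size schedule are illustrative assumptions, and the paper's stochastic method would replace the exact gradient with an estimate.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy SDP-style problem: min ||X - C||_F^2 over the spectrahedron
# {X >= 0, tr(X) = 1}. Frank-Wolfe only needs a linear minimization
# oracle (LMO) -- here a single extreme eigenvector -- and never
# projects onto the PSD cone.
n = 8
C = rng.normal(size=(n, n))
C = 0.5 * (C + C.T)                      # symmetrize

def lmo(G):
    """argmin over {S >= 0, tr(S) = 1} of <G, S>, attained at v v^T
    for the eigenvector v of G with the smallest eigenvalue."""
    _, V = np.linalg.eigh(G)             # eigenvalues in ascending order
    v = V[:, 0]
    return np.outer(v, v)

X = np.eye(n) / n                        # feasible starting point
obj0 = np.linalg.norm(X - C) ** 2
for k in range(200):
    G = 2.0 * (X - C)                    # exact gradient (a stochastic
    S = lmo(G)                           # variant would estimate it)
    gamma = 2.0 / (k + 2)                # classic Frank-Wolfe step size
    X = (1 - gamma) * X + gamma * S      # convex combination stays feasible

obj = np.linalg.norm(X - C) ** 2
```

Because every iterate is a convex combination of feasible points, feasibility (PSD, unit trace) is maintained for free, which is exactly what makes conditional-gradient methods attractive for SDPs.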
Accelerated Primal-dual Scheme for a Class of Stochastic Nonconvex-concave Saddle Point Problems
Stochastic nonconvex-concave min-max saddle point problems appear in many
machine learning and control problems including distributionally robust
optimization, generative adversarial networks, and adversarial learning. In
this paper, we consider a class of nonconvex saddle point problems where the
objective function satisfies the Polyak-{\L}ojasiewicz condition with respect
to the minimization variable and it is concave with respect to the maximization
variable. The existing methods for solving nonconvex-concave saddle point
problems often suffer from slow convergence and/or contain multiple loops. Our
main contribution lies in proposing a novel single-loop accelerated primal-dual
algorithm with new convergence rate results appearing for the first time in the
literature, to the best of our knowledge. In particular, in the stochastic
regime, we establish a convergence rate for computing an $\epsilon$-gap
solution, which improves to a faster rate in the deterministic setting.
A stochastic two-step inertial Bregman proximal alternating linearized minimization algorithm for nonconvex and nonsmooth problems
In this paper, for solving a broad class of large-scale nonconvex and
nonsmooth optimization problems, we propose a stochastic two-step inertial
Bregman proximal alternating linearized minimization (STiBPALM) algorithm with
variance-reduced stochastic gradient estimators; in particular, SAGA and SARAH
qualify as such estimators. Under expectation conditions with the
Kurdyka-Łojasiewicz property and suitable conditions on the parameters, we show
that the sequence generated by the proposed algorithm converges to a critical
point, and we also provide a general convergence rate. Numerical experiments on
sparse nonnegative matrix factorization and
blind image-deblurring are presented to demonstrate the performance of the
proposed algorithm. Comment: arXiv admin note: text overlap with arXiv:2002.12266 by other authors.