Practical Precoding via Asynchronous Stochastic Successive Convex Approximation
We consider stochastic optimization of a smooth non-convex loss function with
a convex non-smooth regularizer. In the online setting, where a single sample
of the stochastic gradient of the loss is available at every iteration, the
problem can be solved using the proximal stochastic gradient descent (SGD)
algorithm and its variants. However, in many problems, especially those arising
in communications and signal processing, information beyond the stochastic
gradient may be available thanks to the structure of the loss function. Such
extra-gradient information is not used by SGD, but has been shown to be useful,
for instance in the context of stochastic expectation-maximization, stochastic
majorization-minimization, and stochastic successive convex approximation (SCA)
approaches. By constructing a stochastic strongly convex surrogate of the loss
function at every iteration, the stochastic SCA algorithms can exploit the
structural properties of the loss function and achieve superior empirical
performance compared to SGD.
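
To make the surrogate idea concrete, the following minimal sketch (Python with
NumPy) instantiates stochastic SCA with quadratic surrogates and gradient
averaging on an assumed sparse non-convex regression problem; the toy loss,
the soft-thresholding proximal step, and all constants are illustrative
assumptions rather than the exact algorithm analyzed here.

import numpy as np

rng = np.random.default_rng(0)
dim, lam, rho = 20, 0.01, 1.0            # problem size, L1 weight, surrogate curvature (assumed)
x_true = np.zeros(dim); x_true[:3] = 1.0 # sparse ground truth for the toy problem
x = np.zeros(dim)
d_avg = np.zeros(dim)                    # running average of sampled gradients

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_threshold(v, tau):              # proximal operator of tau * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

for t in range(1, 2001):
    a = rng.normal(size=dim)             # one fresh sample per iteration
    b = sigmoid(a @ x_true) + 0.05 * rng.normal()
    p = sigmoid(a @ x)
    grad = 2.0 * (p - b) * p * (1.0 - p) * a   # stochastic gradient of the non-convex loss (sigmoid(a'x) - b)^2
    beta = gamma = 2.0 / (t + 1)
    d_avg = (1.0 - beta) * d_avg + beta * grad
    # minimize the averaged strongly convex surrogate plus the L1 regularizer:
    # argmin_x  d_avg'x + (rho/2)||x - x_t||^2 + lam*||x||_1
    x_hat = soft_threshold(x - d_avg / rho, lam / rho)
    x = (1.0 - gamma) * x + gamma * x_hat      # convex-combination (averaged) update

print("estimation error:", np.linalg.norm(x - x_true))
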
In this work, we take a closer look at the stochastic SCA algorithm and
develop its asynchronous variant which can be used for resource allocation in
wireless networks. While the stochastic SCA algorithm is known to converge
asymptotically, its iteration complexity has not been well-studied, and is the
focus of the current work. The insights obtained from the non-asymptotic
analysis allow us to develop a more practical asynchronous variant of the
stochastic SCA algorithm that permits the use of surrogates calculated in
earlier iterations. We characterize a precise bound on the maximum delay that the
algorithm can tolerate, while still achieving the same convergence rate. We
apply the algorithm to the problem of linear precoding in wireless sensor
networks, where it can be implemented with low complexity and is shown to perform
well in practice.
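
As a rough illustration of the delay tolerance discussed above, the sketch
below reuses stochastic gradients that may be up to a bounded number of
iterations stale; the quadratic toy objective, the delay model, and the
constants are assumptions for illustration only, not the precoding application.

from collections import deque
import numpy as np

rng = np.random.default_rng(1)
dim, rho, MAX_DELAY = 10, 1.0, 3         # dimension, surrogate curvature, assumed delay bound
x = np.ones(dim)
d_avg = np.zeros(dim)
buffer = deque(maxlen=MAX_DELAY + 1)     # buffer of past stochastic gradients

for t in range(1, 1001):
    noise = 0.1 * rng.normal(size=dim)
    buffer.append(x + noise)             # stochastic gradient of the toy loss 0.5*||x||^2
    delay = rng.integers(0, len(buffer)) # random staleness, bounded by MAX_DELAY
    grad = buffer[-1 - delay]            # possibly stale gradient used in the surrogate
    beta = gamma = 2.0 / (t + 1)
    d_avg = (1.0 - beta) * d_avg + beta * grad
    x_hat = x - d_avg / rho              # closed-form minimizer of the quadratic surrogate
    x = (1.0 - gamma) * x + gamma * x_hat

print("norm of the iterate after delayed updates:", np.linalg.norm(x))
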
Distributed Inexact Successive Convex Approximation ADMM: Analysis-Part I
In this two-part work, we propose an algorithmic framework for solving
non-convex problems whose objective function is the sum of a number of smooth
component functions plus a convex (possibly non-smooth) and/or smooth (possibly
non-convex) regularization function. The proposed algorithm incorporates ideas
from several existing approaches such as the alternating direction method of
multipliers (ADMM), successive convex approximation (SCA), distributed and
asynchronous algorithms, and inexact gradient methods. Different from a number
of existing approaches, however, the proposed framework is flexible enough to
incorporate a class of non-convex objective functions, allow distributed
operation with and without a fusion center, and include variance-reduced
methods as special cases. Remarkably, the proposed algorithms are robust to
uncertainties arising from random, deterministic, and adversarial sources. Part
I of the paper develops two variants of the algorithm under very mild
assumptions and establishes first-order convergence rate guarantees. The proof
developed here allows for generic errors and delays, paving the way for
different variance-reduced, asynchronous, and stochastic implementations,
outlined and evaluated in Part II.
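
For intuition, the following sketch combines consensus ADMM with an SCA-style
convexified local step on an assumed toy problem with smooth non-convex local
losses; the losses, penalty parameter, and closed-form surrogate minimizer are
illustrative assumptions, not the exact updates developed in the paper.

import numpy as np

rng = np.random.default_rng(0)
N, dim, c, alpha = 5, 8, 10.0, 0.1       # agents, dimension, ADMM penalty, non-convex weight (assumed)
A = [rng.normal(size=(12, dim)) / np.sqrt(12) for _ in range(N)]
b = [Ai @ rng.normal(size=dim) for Ai in A]

def grad_f(i, x):
    # gradient of the local loss 0.5*||A_i x - b_i||^2 + alpha * sum(x^2 / (1 + x^2))
    return A[i].T @ (A[i] @ x - b[i]) + alpha * 2.0 * x / (1.0 + x**2) ** 2

x = [np.zeros(dim) for _ in range(N)]    # local primal variables
y = [np.zeros(dim) for _ in range(N)]    # local dual variables (multipliers)
z = np.zeros(dim)                        # consensus variable (fusion step)

for t in range(300):
    for i in range(N):
        # SCA-style surrogate of the local augmented-Lagrangian term: linearize
        # f_i at z and keep the quadratic coupling, giving a closed-form update.
        x[i] = z - (grad_f(i, z) + y[i]) / c
    z = np.mean([x[i] + y[i] / c for i in range(N)], axis=0)
    for i in range(N):
        y[i] += c * (x[i] - z)           # dual ascent on the consensus constraint

print("consensus residual:", max(np.linalg.norm(x[i] - z) for i in range(N)))
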
Conservative Stochastic Optimization with Expectation Constraints
This paper considers stochastic convex optimization problems where the
objective and constraint functions involve expectations with respect to the
data indices or environmental variables, in addition to deterministic convex
constraints on the domain of the variables. Although the setting is generic and
arises in different machine learning applications, online and efficient
approaches for solving such problems have not been widely studied. Since the
underlying data distribution is unknown a priori, a closed-form solution is
generally not available, and classical deterministic optimization paradigms are
not applicable. State-of-the-art approaches, such as those using the saddle
point framework, can ensure that the optimality gap as well as the constraint
violation decay as $\mathcal{O}(T^{-1/2})$, where $T$ is the number of
stochastic gradients. The domain constraints are assumed simple and handled via
projection at every iteration. In this work, we propose a novel conservative
stochastic optimization algorithm (CSOA) that achieves zero constraint
violation and an $\mathcal{O}(T^{-1/2})$ optimality gap.
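
One way to picture the conservative idea is to run a stochastic primal-dual
method on a slightly tightened constraint, as in the sketch below; the toy
problem, step sizes, and slack are assumptions for illustration and do not
reproduce the exact CSOA updates or constants.

import numpy as np

rng = np.random.default_rng(0)
T = 20000
eta = 1.0 / np.sqrt(T)                   # primal and dual step size (assumed)
eps = 1.0 / np.sqrt(T)                   # conservative tightening of the constraint (assumed)
target = np.array([1.0, 1.0])            # unconstrained minimizer; violates the constraint

x = np.zeros(2)
lam = 0.0                                # dual variable for the expectation constraint
x_avg = np.zeros(2)

for t in range(1, T + 1):
    xi_f = target + 0.1 * rng.normal(size=2)   # sample for the objective E||x - xi||^2
    xi_g = 0.1 * rng.normal()                  # sample for the constraint E[x1 + x2 + xi] <= 1
    grad_f = 2.0 * (x - xi_f)
    g_val = x.sum() + xi_g - 1.0               # noisy constraint evaluation
    grad_g = np.ones(2)
    x = np.clip(x - eta * (grad_f + lam * grad_g), -2.0, 2.0)   # projected primal step on the box domain
    lam = max(0.0, lam + eta * (g_val + eps))                   # dual ascent on the tightened constraint
    x_avg += (x - x_avg) / t                   # running average of the iterates

print("averaged iterate:", x_avg, " constraint value:", x_avg.sum() - 1.0)
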
Further, for scenarios in which calculating the projection is expensive, the
projection operation in the proposed algorithm can be avoided by considering a
conditional gradient or Frank-Wolfe (FW) variant of the algorithm. The
state-of-the-art stochastic FW variants achieve an optimality gap of
$\mathcal{O}(T^{-1/3})$ after $T$ iterations, though these algorithms
have not been applied to problems with functional expectation constraints. In
this work, we propose the FW-CSOA algorithm that is not only projection-free
but also achieves zero constraint violation with
an $\mathcal{O}(T^{-1/4})$ decay of the optimality gap. The efficacy of
the proposed algorithms is tested on two relevant problems: fair classification
and structured matrix completion.
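
The projection-free ingredient can be illustrated by a stochastic Frank-Wolfe
step in which a linear minimization oracle over the feasible set replaces the
projection, as in the following sketch over an assumed L1-ball domain; the toy
objective, the gradient-averaging weights, and the step sizes are illustrative
assumptions rather than the FW-CSOA updates.

import numpy as np

rng = np.random.default_rng(0)
dim, R, T = 10, 1.0, 5000                # dimension, L1-ball radius, iterations (assumed)
target = rng.normal(size=dim)            # minimizer of the toy objective ||x - target||^2
x = np.zeros(dim)
d_avg = np.zeros(dim)                    # averaged stochastic gradient

def lmo_l1(grad, radius):
    # linear minimization oracle: argmin over ||s||_1 <= radius of <grad, s>
    s = np.zeros_like(grad)
    i = int(np.argmax(np.abs(grad)))
    s[i] = -radius * np.sign(grad[i])
    return s

for t in range(1, T + 1):
    grad = 2.0 * (x - target) + 0.1 * rng.normal(size=dim)   # stochastic gradient
    beta = 2.0 / (t + 1)
    d_avg = (1.0 - beta) * d_avg + beta * grad               # gradient averaging tames the noise
    s = lmo_l1(d_avg, R)                 # LMO call replaces the projection
    gamma = 2.0 / (t + 1)
    x = (1.0 - gamma) * x + gamma * s    # convex combination keeps x in the L1 ball

print("final objective value:", float(np.sum((x - target) ** 2)))
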