An Asynchronous Parallel Approach to Sparse Recovery
Asynchronous parallel computing and sparse recovery are two areas that have
received recent interest. Asynchronous algorithms are often studied to solve
optimization problems where the cost function takes the form $\sum_i f_i(x)$, with a common assumption that each $f_i$ is sparse; that is, each $f_i$
acts only on a small number of components of $x \in \mathbb{R}^n$. Sparse
recovery problems, such as compressed sensing, can be formulated as
optimization problems; however, the cost functions $f_i$ are dense with respect
to the components of $x$, and instead the signal $x$ is assumed to be sparse,
meaning that it has only $s$ non-zeros where $s \ll n$. Here we address how one
may use an asynchronous parallel architecture when the cost functions $f_i$ are
not sparse in $x$, but rather the signal $x$ is sparse. We propose an
asynchronous parallel approach to sparse recovery via a stochastic greedy
algorithm, where multiple processors asynchronously update a vector in shared
memory containing information on the estimated signal support. We include
numerical simulations that illustrate the potential benefits of our proposed
asynchronous method. Comment: 5 pages, 2 figures
Asynchronous Distributed Semi-Stochastic Gradient Optimization
With the recent proliferation of large-scale learning problems, there has
been a lot of interest in distributed machine learning algorithms, particularly
those that are based on stochastic gradient descent (SGD) and its variants.
However, existing algorithms either suffer from slow convergence due to the
inherent variance of stochastic gradients, or have a fast linear convergence
rate but at the expense of poorer solution quality. In this paper, we combine
their merits by proposing a fast distributed asynchronous SGD-based algorithm
with variance reduction. A constant learning rate can be used, and it is also
guaranteed to converge linearly to the optimal solution. Experiments on the
Google Cloud Computing Platform demonstrate that the proposed algorithm
outperforms state-of-the-art distributed asynchronous algorithms in terms of
both wall clock time and solution quality.
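To illustrate why variance reduction permits a constant learning rate, the following is a minimal single-process sketch of an SVRG-style update on a scalar least-squares problem. It is not the distributed asynchronous algorithm of the paper above; the function name `svrg_1d`, the toy data, and the values of `step`, `epochs`, and `inner_steps` are all assumptions chosen for illustration.

```python
import random


def svrg_1d(a, b, step=0.01, epochs=30, inner_steps=50, seed=0):
    """Minimise (1/2n) * sum_i (a[i]*w - b[i])**2 over a scalar w.

    Hypothetical helper: plain sequential SVRG, no parallelism.
    """
    rng = random.Random(seed)
    n = len(a)
    w = 0.0

    def grad(i, w):
        # Gradient of the i-th term 0.5 * (a[i]*w - b[i])**2.
        return a[i] * (a[i] * w - b[i])

    for _ in range(epochs):
        # Take a snapshot and compute the full gradient there once per epoch.
        snapshot = w
        full_grad = sum(grad(i, snapshot) for i in range(n)) / n
        for _ in range(inner_steps):
            i = rng.randrange(n)
            # Variance-reduced direction: unbiased, and its variance
            # shrinks as both w and the snapshot approach the optimum.
            w -= step * (grad(i, w) - grad(i, snapshot) + full_grad)
    return w


# Toy data (assumed). The residuals at the optimum are non-zero, so the
# individual stochastic gradients do not vanish there.
a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 1.0, 4.0, 3.0]
w_star = sum(x * y for x, y in zip(a, b)) / sum(x * x for x in a)
w_hat = svrg_1d(a, b)
```

Because the correction term `grad(i, snapshot) - full_grad` cancels most of the sampling noise near the optimum, the iterate can settle on `w_star` without a decaying step size, which plain SGD with a constant step would not do on this data.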
Robust Fully-Asynchronous Methods for Distributed Training over General Architecture
Perfect synchronization in distributed machine learning problems is
inefficient and even impossible due to the existence of latency, packet losses
and stragglers. We propose a Robust Fully-Asynchronous Stochastic Gradient
Tracking method (R-FAST), where each device performs local computation and
communication at its own pace without any form of synchronization. Different
from existing asynchronous distributed algorithms, R-FAST can eliminate the
impact of data heterogeneity across devices and allow for packet losses by
employing a robust gradient tracking strategy that relies on properly designed
auxiliary variables for tracking and buffering the overall gradient vector.
More importantly, the proposed method utilizes two spanning-tree graphs for
communication so long as both share at least one common root, enabling flexible
designs in communication architectures. We show that R-FAST converges in
expectation to a neighborhood of the optimum with a geometric rate for smooth
and strongly convex objectives; and to a stationary point with a sublinear rate
for general non-convex settings. Extensive experiments demonstrate that R-FAST
runs 1.5-2 times faster than synchronous benchmark algorithms, such as
Ring-AllReduce and D-PSGD, while still achieving comparable accuracy, and
outperforms existing asynchronous SOTA algorithms, such as AD-PSGD and OSGP,
especially in the presence of stragglers.
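The role of gradient tracking in removing the effect of data heterogeneity can be seen in a small synchronous sketch. This is the textbook gradient tracking recursion on a fixed ring, not R-FAST itself: there is no asynchrony, packet loss, or spanning-tree communication here. The function name `gradient_tracking`, the quadratic local objectives, the mixing matrix `W`, and the step size are assumptions made for illustration.

```python
def gradient_tracking(c, t, W, step=0.02, iters=1000):
    """Synchronous gradient tracking for local objectives
    f_i(x) = 0.5 * c[i] * (x - t[i])**2 (hypothetical toy setup).

    W is a doubly stochastic mixing matrix over the agents.
    """
    n = len(c)

    def grad(i, x):
        return c[i] * (x - t[i])

    x = [0.0] * n
    # Each tracker y[i] starts at the agent's own local gradient.
    y = [grad(i, x[i]) for i in range(n)]

    for _ in range(iters):
        # Mix with neighbours, then step along the tracked gradient.
        x_new = [sum(W[i][j] * x[j] for j in range(n)) - step * y[i]
                 for i in range(n)]
        # Mix the trackers and add the change in the local gradient, so
        # the average of y always equals the average local gradient.
        y = [sum(W[i][j] * y[j] for j in range(n))
             + grad(i, x_new[i]) - grad(i, x[i])
             for i in range(n)]
        x = x_new
    return x


# Four agents on a ring with heterogeneous data (assumed values).
c = [1.0, 2.0, 3.0, 4.0]
t = [4.0, -1.0, 2.0, 0.5]
W = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]
x_star = sum(ci * ti for ci, ti in zip(c, t)) / sum(c)  # global minimiser
x_hat = gradient_tracking(c, t, W)
```

Although every agent's local minimiser `t[i]` differs, all entries of `x_hat` agree on the minimiser of the summed objective, because each `y[i]` tracks the network-wide average gradient rather than the agent's own.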