An Asynchronous Parallel Approach to Sparse Recovery
Asynchronous parallel computing and sparse recovery are two areas that have
received recent interest. Asynchronous algorithms are often studied to solve
optimization problems where the cost function takes the form $f(x) = \sum_i f_i(x)$, with a common assumption that each $f_i$ is sparse; that is, each $f_i$
acts only on a small number of components of $x$. Sparse
recovery problems, such as compressed sensing, can be formulated as
optimization problems; however, the cost functions are dense with respect
to the components of $x$, and instead the signal $x$ is assumed to be sparse,
meaning that it has only $s$ non-zeros, where $s \ll n$. Here we address how one
may use an asynchronous parallel architecture when the cost functions are
not sparse in $x$, but rather the signal is sparse. We propose an
asynchronous parallel approach to sparse recovery via a stochastic greedy
algorithm, where multiple processors asynchronously update a vector in shared
memory containing information on the estimated signal support. We include
numerical simulations that illustrate the potential benefits of our proposed
asynchronous method.
Comment: 5 pages, 2 figures
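The shared-memory pattern described above can be illustrated with a toy sketch. This is not the authors' algorithm: the correlation-based scoring rule, the slicing of the measurements, and all names below are illustrative assumptions. Several threads each process a slice of the measurements and add scores to a shared vector, whose largest entries give the estimated support.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

rng = np.random.default_rng(0)
n, m, s = 100, 40, 5                     # signal length, measurements, sparsity
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = 3.0
y = A @ x_true

scores = np.zeros(n)                     # shared vector: support evidence per index

def worker(k, workers=4):
    # each "processor" handles a slice of the measurements and adds its
    # correlation scores to the shared vector without locking
    idx = np.arange(k, m, workers)
    np.add(scores, np.abs(A[idx].T @ y[idx]), out=scores)

with ThreadPoolExecutor(max_workers=4) as ex:
    list(ex.map(worker, range(4)))

support_est = np.argsort(scores)[-s:]    # top-s scores = estimated support
print(sorted(support_est))
```

A real asynchronous greedy method would iterate this while refining a residual; a single pass is enough here to show the shared-vector update pattern.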
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization
Due to their simplicity and excellent performance, parallel asynchronous
variants of stochastic gradient descent have become popular methods to solve a
wide range of large-scale optimization problems on multi-core architectures.
Yet, despite their practical success, support for nonsmooth objectives is still
lacking, making them unsuitable for many problems of interest in machine
learning, such as the Lasso, group Lasso or empirical risk minimization with
convex constraints.
In this work, we propose and analyze ProxASAGA, a fully asynchronous sparse
method inspired by SAGA, a variance reduced incremental gradient algorithm. The
proposed method is easy to implement and significantly outperforms the state of
the art on several nonsmooth, large-scale problems. We prove that our method
achieves a theoretical linear speedup with respect to the sequential version
under assumptions on the sparsity of gradients and block-separability of the
proximal term. Empirical benchmarks on a multi-core architecture illustrate
practical speedups of up to 12x on a 20-core machine.
Comment: Appears in Advances in Neural Information Processing Systems 30 (NIPS 2017), 28 pages
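For reference, the variance-reduced proximal step that ProxASAGA builds on can be sketched in its sequential form for the Lasso. This is a minimal serial sketch, not the paper's asynchronous sparse implementation; the step size, problem sizes, and function names are assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    # proximal operator of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_saga_lasso(A, b, lam, step, iters, seed=0):
    """Sequential proximal SAGA sketch for
       min_x (1/n) * sum_i 0.5*(a_i^T x - b_i)^2 + lam*||x||_1."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    table = np.zeros((n, d))            # alpha_i: last gradient stored for sample i
    avg = table.mean(axis=0)            # running mean of the gradient table
    for _ in range(iters):
        i = rng.integers(n)
        g = A[i] * (A[i] @ x - b[i])    # gradient of sample i's loss at x
        v = g - table[i] + avg          # SAGA variance-reduced gradient estimate
        x = soft_threshold(x - step * v, step * lam)   # proximal step
        avg += (g - table[i]) / n       # keep the running mean consistent
        table[i] = g
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 10))
x_star = np.zeros(10)
x_star[:2] = [2.0, -1.0]
b = A @ x_star
step = 1.0 / (3 * max(np.sum(A**2, axis=1)))   # classical SAGA step 1/(3*L_max)
x_hat = prox_saga_lasso(A, b, lam=0.05, step=step, iters=3000)
```

The asynchronous version in the paper lets multiple cores run this loop concurrently on shared iterates, exploiting gradient sparsity and block-separability of the proximal term to avoid conflicts.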
Parallel coordinate descent for the Adaboost problem
We design a randomised parallel version of Adaboost based on previous studies
on parallel coordinate descent. The algorithm uses the fact that the logarithm
of the exponential loss is a function with coordinate-wise Lipschitz continuous
gradient, in order to define the step lengths. We provide the proof of
convergence for this randomised Adaboost algorithm and a theoretical
parallelisation speedup factor. We finally provide numerical examples on
learning problems of various sizes that show that the algorithm is competitive
with concurrent approaches, especially for large-scale problems.
Comment: 7 pages, 3 figures, extended version of the paper presented to ICMLA'1
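A serial version of the coordinate step the paper parallelises might look as follows. This is a sketch under the assumption of weak-learner outputs in {-1, +1}, for which each coordinate-wise Lipschitz constant of the log-exponential-loss is at most 1; all names and sizes are illustrative.

```python
import numpy as np

def adaboost_coord_descent(H, y, iters, seed=0):
    """Randomised coordinate descent on F(lam) = log(sum_i exp(-y_i*(H@lam)_i)).
    With H entries in {-1,+1}, d^2F/dlam_j^2 <= sum_i w_i * H_ij^2 <= 1,
    so a step length of 1/L_j = 1 is valid for every coordinate j."""
    rng = np.random.default_rng(seed)
    m, n = H.shape
    lam = np.zeros(n)
    margins = np.zeros(m)                 # -y_i * (H @ lam)_i, kept incrementally
    for _ in range(iters):
        j = rng.integers(n)               # pick a coordinate uniformly at random
        w = np.exp(margins - margins.max())
        w /= w.sum()                      # softmax weights over examples
        grad_j = -(w * y * H[:, j]).sum()
        lam[j] -= grad_j                  # gradient step of length 1/L_j = 1
        margins += y * H[:, j] * grad_j   # update margins to match new lam
    return lam

rng = np.random.default_rng(2)
m, n = 40, 8
y = rng.choice([-1.0, 1.0], m)
H = rng.choice([-1.0, 1.0], (m, n))       # weak-learner predictions per example
H[:, 0] = y                               # include one perfect weak learner
lam = adaboost_coord_descent(H, y, iters=400)
```

Since each coordinate step decreases the objective by at least half the squared coordinate gradient, updating several randomly chosen coordinates in parallel yields the speedup factor analysed in the paper.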