Search CORE

8,000 research outputs found

An Asynchronous Parallel Approach to Sparse Recovery

Author: Needell Deanna
Woolf Tina
Publication venue
Publication date: 12/01/2017
Field of study

Asynchronous parallel computing and sparse recovery are two areas that have received recent interest. Asynchronous algorithms are often studied to solve optimization problems where the cost function takes the form

\sum_{i=1}^M f_i(x)

, with a common assumption that each

f_i

is sparse; that is, each

f_i

acts only on a small number of components of

x\in\mathbb{R}^n

. Sparse recovery problems, such as compressed sensing, can be formulated as optimization problems, however, the cost functions

f_i

are dense with respect to the components of

x

, and instead the signal

x

is assumed to be sparse, meaning that it has only

s

non-zeros where

s\ll n

. Here we address how one may use an asynchronous parallel architecture when the cost functions

f_i

are not sparse in

x

, but rather the signal

x

is sparse. We propose an asynchronous parallel approach to sparse recovery via a stochastic greedy algorithm, where multiple processors asynchronously update a vector in shared memory containing information on the estimated signal support. We include numerical simulations that illustrate the potential benefits of our proposed asynchronous method.Comment: 5 pages, 2 figure

arXiv.org e-Print Archive

Crossref

CRAFT: A library for easier application-level Checkpoint/Restart and Automatic Fault Tolerance

Author: Hager Georg
Kreutzer Moritz
Shahzad Faisal
Thies Jonas
Wellein Gerhard
Zeiser Thomas
Publication venue
Publication date: 07/08/2017
Field of study

In order to efficiently use the future generations of supercomputers, fault tolerance and power consumption are two of the prime challenges anticipated by the High Performance Computing (HPC) community. Checkpoint/Restart (CR) has been and still is the most widely used technique to deal with hard failures. Application-level CR is the most effective CR technique in terms of overhead efficiency but it takes a lot of implementation effort. This work presents the implementation of our C++ based library CRAFT (Checkpoint-Restart and Automatic Fault Tolerance), which serves two purposes. First, it provides an extendable library that significantly eases the implementation of application-level checkpointing. The most basic and frequently used checkpoint data types are already part of CRAFT and can be directly used out of the box. The library can be easily extended to add more data types. As means of overhead reduction, the library offers a build-in asynchronous checkpointing mechanism and also supports the Scalable Checkpoint/Restart (SCR) library for node level checkpointing. Second, CRAFT provides an easier interface for User-Level Failure Mitigation (ULFM) based dynamic process recovery, which significantly reduces the complexity and effort of failure detection and communication recovery mechanism. By utilizing both functionalities together, applications can write application-level checkpoints and recover dynamically from process failures with very limited programming effort. This work presents the design and use of our library in detail. The associated overheads are thoroughly analyzed using several benchmarks

arXiv.org e-Print Archive

Institute of Transport Research:Publications

Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization

Author: Lacoste-Julien Simon
Leblond Rémi
Pedregosa Fabian
Publication venue
Publication date: 05/11/2017
Field of study

Due to their simplicity and excellent performance, parallel asynchronous variants of stochastic gradient descent have become popular methods to solve a wide range of large-scale optimization problems on multi-core architectures. Yet, despite their practical success, support for nonsmooth objectives is still lacking, making them unsuitable for many problems of interest in machine learning, such as the Lasso, group Lasso or empirical risk minimization with convex constraints. In this work, we propose and analyze ProxASAGA, a fully asynchronous sparse method inspired by SAGA, a variance reduced incremental gradient algorithm. The proposed method is easy to implement and significantly outperforms the state of the art on several nonsmooth, large-scale problems. We prove that our method achieves a theoretical linear speedup with respect to the sequential version under assumptions on the sparsity of gradients and block-separability of the proximal term. Empirical benchmarks on a multi-core architecture illustrate practical speedups of up to 12x on a 20-core machine.Comment: Appears in Advances in Neural Information Processing Systems 30 (NIPS 2017), 28 page

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server