1,047 research outputs found

    An Asynchronous Parallel Approach to Sparse Recovery

    Full text link
    Asynchronous parallel computing and sparse recovery are two areas that have received recent interest. Asynchronous algorithms are often studied to solve optimization problems where the cost function takes the form βˆ‘i=1Mfi(x)\sum_{i=1}^M f_i(x), with a common assumption that each fif_i is sparse; that is, each fif_i acts only on a small number of components of x∈Rnx\in\mathbb{R}^n. Sparse recovery problems, such as compressed sensing, can be formulated as optimization problems, however, the cost functions fif_i are dense with respect to the components of xx, and instead the signal xx is assumed to be sparse, meaning that it has only ss non-zeros where sβ‰ͺns\ll n. Here we address how one may use an asynchronous parallel architecture when the cost functions fif_i are not sparse in xx, but rather the signal xx is sparse. We propose an asynchronous parallel approach to sparse recovery via a stochastic greedy algorithm, where multiple processors asynchronously update a vector in shared memory containing information on the estimated signal support. We include numerical simulations that illustrate the potential benefits of our proposed asynchronous method.Comment: 5 pages, 2 figure

    Asynchronous Distributed Semi-Stochastic Gradient Optimization

    Full text link
    With the recent proliferation of large-scale learning problems,there have been a lot of interest on distributed machine learning algorithms, particularly those that are based on stochastic gradient descent (SGD) and its variants. However, existing algorithms either suffer from slow convergence due to the inherent variance of stochastic gradients, or have a fast linear convergence rate but at the expense of poorer solution quality. In this paper, we combine their merits by proposing a fast distributed asynchronous SGD-based algorithm with variance reduction. A constant learning rate can be used, and it is also guaranteed to converge linearly to the optimal solution. Experiments on the Google Cloud Computing Platform demonstrate that the proposed algorithm outperforms state-of-the-art distributed asynchronous algorithms in terms of both wall clock time and solution quality

    Robust Fully-Asynchronous Methods for Distributed Training over General Architecture

    Full text link
    Perfect synchronization in distributed machine learning problems is inefficient and even impossible due to the existence of latency, package losses and stragglers. We propose a Robust Fully-Asynchronous Stochastic Gradient Tracking method (R-FAST), where each device performs local computation and communication at its own pace without any form of synchronization. Different from existing asynchronous distributed algorithms, R-FAST can eliminate the impact of data heterogeneity across devices and allow for packet losses by employing a robust gradient tracking strategy that relies on properly designed auxiliary variables for tracking and buffering the overall gradient vector. More importantly, the proposed method utilizes two spanning-tree graphs for communication so long as both share at least one common root, enabling flexible designs in communication architectures. We show that R-FAST converges in expectation to a neighborhood of the optimum with a geometric rate for smooth and strongly convex objectives; and to a stationary point with a sublinear rate for general non-convex settings. Extensive experiments demonstrate that R-FAST runs 1.5-2 times faster than synchronous benchmark algorithms, such as Ring-AllReduce and D-PSGD, while still achieving comparable accuracy, and outperforms existing asynchronous SOTA algorithms, such as AD-PSGD and OSGP, especially in the presence of stragglers
    • …
    corecore