Distributed soft thresholding for sparse signal recovery
In this paper, we address the problem of distributed sparse recovery of
signals acquired via compressed measurements in a sensor network. We propose a
new class of distributed algorithms to solve Lasso regression problems when
communication to a fusion center is not possible, e.g., due to
communication cost or privacy reasons. More precisely, we introduce a
distributed iterative soft thresholding algorithm (DISTA) that consists of
three steps: an averaging step, a gradient step, and a soft thresholding
operation. We prove the convergence of DISTA in networks represented by regular
graphs, and we compare it with existing methods in terms of performance,
memory, and complexity.
Comment: Revised version. Main improvements: extension of the convergence theorem to regular graphs; new numerical results and comparisons with other algorithms.
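The three steps map directly onto a few lines of NumPy. Below is a minimal sketch of one synchronous DISTA-style round, assuming a doubly stochastic mixing matrix P for the averaging step; the variable names and step-size choices here are assumptions for illustration, not the paper's exact update.

```python
import numpy as np

def soft_threshold(x, thresh):
    """Elementwise soft thresholding: sign(x) * max(|x| - thresh, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - thresh, 0.0)

def dista_step(X, A, y, P, tau, lam):
    """One synchronous round over all sensor nodes (illustrative sketch).

    X   : (n_nodes, n) current estimates, one row per node
    A   : list of (m_v, n) local measurement matrices
    y   : list of (m_v,) local compressed measurements
    P   : (n_nodes, n_nodes) doubly stochastic mixing matrix of the graph
    tau : gradient step size (assumed); lam : Lasso regularization weight
    """
    X_avg = P @ X                                 # averaging step over neighbors
    X_new = np.empty_like(X)
    for v in range(X.shape[0]):
        grad = A[v].T @ (y[v] - A[v] @ X[v])      # local gradient step
        X_new[v] = soft_threshold(X_avg[v] + tau * grad, tau * lam)  # thresholding
    return X_new
```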
Parallel Newton Method for High-Speed Viscous Separated Flowfields. G.U. Aero Report 9210
This paper presents a new technique for parallelizing Newton's method to compute
locally conical approximate laminar Navier-Stokes solutions on a distributed-memory
parallel computer. The method uses Newton's method for nonlinear systems of equations
to find steady-state solutions. The parallelization is based on a parallel iterative
solver for large sparse non-symmetric linear systems. Distributed storage of the
matrix data induces a corresponding geometric domain decomposition, and the large
sparse Jacobian matrix is then generated in a distributed fashion within each
subdomain. Since the numerical algorithms on the global domain are unchanged, the
convergence and accuracy of the original sequential scheme are maintained, and no
inner boundary conditions are needed.
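The outer Newton iteration itself is compact; what the paper distributes is the Jacobian storage and the iterative linear solve. Here is a serial sketch of that outer loop, using SciPy's GMRES as a stand-in for the parallel iterative solver; residual() and jacobian() are hypothetical placeholders for the discretized flow equations.

```python
import numpy as np
from scipy.sparse.linalg import gmres

def newton_solve(x0, residual, jacobian, tol=1e-8, max_iter=50):
    """Newton's method for F(x) = 0 with an iterative inner solve (sketch)."""
    x = x0.copy()
    for _ in range(max_iter):
        F = residual(x)                  # nonlinear residual F(x)
        if np.linalg.norm(F) < tol:
            break
        J = jacobian(x)                  # large sparse non-symmetric Jacobian
        dx, info = gmres(J, -F)          # the solve parallelized in the paper
        if info != 0:
            raise RuntimeError("inner linear solver did not converge")
        x += dx                          # Newton update
    return x
```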
SASG: Sparsification with Adaptive Stochastic Gradients for Communication-efficient Distributed Learning
Stochastic optimization algorithms implemented on distributed computing
architectures are increasingly used to tackle large-scale machine learning
applications. A key bottleneck in such distributed systems is the communication
overhead for exchanging information such as stochastic gradients between
different workers. Sparse communication with memory and the adaptive
aggregation methodology are two successful frameworks among the various
techniques proposed to address this issue. In this paper, we exploit the
advantages of Sparse communication and Adaptive aggregated Stochastic
Gradients to design a communication-efficient distributed algorithm named SASG.
Specifically, we first determine which workers need to communicate based on
the adaptive aggregation rule and then sparsify the transmitted information.
Therefore, our algorithm reduces both the overhead of communication rounds and
the number of communication bits in the distributed system. We define an
auxiliary sequence and give convergence results of the algorithm with the help
of Lyapunov function analysis. Experiments on training deep neural networks
show that our algorithm can significantly reduce the number of communication
rounds and bits compared to the previous methods, with little or no impact on
training and testing accuracy.
Comment: 12 pages, 5 figures
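A hypothetical single-worker sketch of the two ingredients named above, assuming a LAG-style trigger for the adaptive aggregation rule and top-k sparsification with error-feedback memory; the paper's exact communication condition may differ.

```python
import numpy as np

def top_k_sparsify(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

class SASGWorker:
    """Illustrative worker: skip communication when the gradient changed little."""
    def __init__(self, dim, k, trigger_tol):
        self.memory = np.zeros(dim)      # accumulated untransmitted residual
        self.last_sent = np.zeros(dim)   # last gradient actually transmitted
        self.k = k
        self.trigger_tol = trigger_tol   # assumed threshold for the trigger

    def step(self, grad):
        """Return a sparse update to send, or None to skip this round."""
        corrected = grad + self.memory   # add back past sparsification error
        # adaptive aggregation rule (assumed form): communicate only if the
        # gradient changed enough since the last transmission
        if np.linalg.norm(grad - self.last_sent) ** 2 <= self.trigger_tol:
            self.memory = corrected      # nothing sent; keep it all in memory
            return None
        sparse = top_k_sparsify(corrected, self.k)
        self.memory = corrected - sparse # remember the untransmitted part
        self.last_sent = grad
        return sparse
```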
Exact Distributed Stochastic Block Partitioning
Stochastic block partitioning (SBP) is a community detection algorithm that
is highly accurate even on graphs with a complex community structure, but its
inherently serial nature hinders its widespread adoption by the wider
scientific community. To make it practical to analyze large real-world graphs
with SBP, there is a growing need to parallelize and distribute the algorithm.
The current state-of-the-art distributed SBP algorithm is a divide-and-conquer
approach that limits communication between compute nodes until the end of
inference. This breaks computational dependencies, causing convergence issues
as the number of compute nodes increases and when the graph is sufficiently
sparse. In this paper, we introduce EDiSt, an exact
distributed stochastic block partitioning algorithm. Under EDiSt, compute nodes
periodically share community assignments during inference. Due to this
additional communication, EDiSt improves upon the divide-and-conquer algorithm
by allowing it to scale out to a larger number of compute nodes without
suffering from convergence issues, even on sparse graphs. We show that EDiSt
provides speedups of up to 23.8X over the divide-and-conquer approach, and of
up to 38.0X over shared-memory parallel SBP, when scaled out to 64 compute
nodes.
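A schematic of EDiSt's communication pattern only (not the SBP inference itself), assuming an MPI-style allgather for the periodic exchange; refine_locally() and merge_assignments() are hypothetical placeholders for the local inference moves and the reconciliation step.

```python
from mpi4py import MPI

def refine_locally(graph, assignments):
    # placeholder for local SBP refinement (block merge/move proposals)
    return assignments

def merge_assignments(views):
    # placeholder reconciliation; EDiSt would combine all ranks' views
    return views[0]

def edist_loop(local_graph, assignments, n_iters, share_every=10):
    """Each rank refines locally and periodically shares assignments,
    instead of communicating only at the end as divide-and-conquer does."""
    comm = MPI.COMM_WORLD
    for it in range(n_iters):
        assignments = refine_locally(local_graph, assignments)
        if (it + 1) % share_every == 0:
            # periodic sharing of community assignments during inference
            all_views = comm.allgather(assignments)
            assignments = merge_assignments(all_views)
    return assignments
```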
Preparing sparse solvers for exascale computing.
Sparse solvers provide essential functionality for a wide variety of scientific applications. Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms. This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing Project towards providing sparse solvers for exascale computing platforms. We address the demands of systems with thousands of high-performance node devices, where exposing concurrency, hiding latency and creating alternative algorithms become essential. The efforts described here are works in progress, highlighting current successes and upcoming challenges. This article is part of the discussion meeting issue 'Numerical algorithms for high-performance computational science'.