Distributed soft thresholding for sparse signal recovery
In this paper, we address the problem of distributed sparse recovery of
signals acquired via compressed measurements in a sensor network. We propose a
new class of distributed algorithms to solve Lasso regression problems when
communication to a fusion center is not possible, e.g., due to
communication cost or privacy reasons. More precisely, we introduce a
distributed iterative soft thresholding algorithm (DISTA) that consists of
three steps: an averaging step, a gradient step, and a soft thresholding
operation. We prove the convergence of DISTA in networks represented by regular
graphs, and we compare it with existing methods in terms of performance,
memory, and complexity.
Comment: Revised version. Main improvements: extension of the convergence theorem to regular graphs; new numerical results and comparisons with other algorithms.
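The three steps map directly onto a few lines of NumPy. Below is a minimal sketch of one synchronous DISTA-style round, assuming a doubly stochastic mixing matrix P for the averaging step; the variable names and step-size choices here are assumptions for illustration, not the paper's exact update.

```python
import numpy as np

def soft_threshold(x, thresh):
    """Elementwise soft thresholding: sign(x) * max(|x| - thresh, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - thresh, 0.0)

def dista_step(X, A, y, P, tau, lam):
    """One synchronous round over all sensor nodes (illustrative sketch).

    X   : (n_nodes, n) current estimates, one row per node
    A   : list of (m_v, n) local measurement matrices
    y   : list of (m_v,) local compressed measurements
    P   : (n_nodes, n_nodes) doubly stochastic mixing matrix of the graph
    tau : gradient step size (assumed); lam : Lasso regularization weight
    """
    X_avg = P @ X                                 # averaging step over neighbors
    X_new = np.empty_like(X)
    for v in range(X.shape[0]):
        grad = A[v].T @ (y[v] - A[v] @ X[v])      # local gradient step
        X_new[v] = soft_threshold(X_avg[v] + tau * grad, tau * lam)  # thresholding
    return X_new
```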
Parallel Newton Method for High-Speed Viscous Separated Flowfields. G.U. Aero Report 9210
This paper presents a new technique for parallelizing Newton's method to compute
locally conical approximate laminar Navier-Stokes solutions on a distributed-memory
parallel computer. The method uses Newton's method for nonlinear systems of equations
to find steady-state solutions. The parallelization is based on a parallel iterative
solver for large sparse non-symmetric linear systems. Distributed storage of the
matrix data induces a corresponding geometric domain decomposition, and the large
sparse Jacobian matrix is then generated in a distributed fashion within each
subdomain. Since the numerical algorithms on the global domain are unchanged, the
convergence and accuracy of the original sequential scheme are maintained, and no
inner boundary conditions are needed.
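The outer Newton iteration itself is compact; what the paper distributes is the Jacobian storage and the iterative linear solve. Here is a serial sketch of that outer loop, using SciPy's GMRES as a stand-in for the parallel iterative solver; residual() and jacobian() are hypothetical placeholders for the discretized flow equations.

```python
import numpy as np
from scipy.sparse.linalg import gmres

def newton_solve(x0, residual, jacobian, tol=1e-8, max_iter=50):
    """Newton's method for F(x) = 0 with an iterative inner solve (sketch)."""
    x = x0.copy()
    for _ in range(max_iter):
        F = residual(x)                  # nonlinear residual F(x)
        if np.linalg.norm(F) < tol:
            break
        J = jacobian(x)                  # large sparse non-symmetric Jacobian
        dx, info = gmres(J, -F)          # the solve parallelized in the paper
        if info != 0:
            raise RuntimeError("inner linear solver did not converge")
        x += dx                          # Newton update
    return x
```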
SASG: Sparsification with Adaptive Stochastic Gradients for Communication-efficient Distributed Learning
Stochastic optimization algorithms implemented on distributed computing
architectures are increasingly used to tackle large-scale machine learning
applications. A key bottleneck in such distributed systems is the communication
overhead for exchanging information such as stochastic gradients between
different workers. Sparse communication with memory and the adaptive
aggregation methodology are two successful frameworks among the various
techniques proposed to address this issue. In this paper, we exploit the
advantages of Sparse communication and Adaptive aggregated Stochastic
Gradients to design a communication-efficient distributed algorithm named SASG.
Specifically, we first determine which workers need to communicate based on
the adaptive aggregation rule and then sparsify the transmitted information.
Therefore, our algorithm reduces both the overhead of communication rounds and
the number of communication bits in the distributed system. We define an
auxiliary sequence and give convergence results of the algorithm with the help
of Lyapunov function analysis. Experiments on training deep neural networks
show that our algorithm can significantly reduce the number of communication
rounds and bits compared to the previous methods, with little or no impact on
training and testing accuracy.
Comment: 12 pages, 5 figures
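A hypothetical single-worker sketch of the two ingredients named above, assuming a LAG-style trigger for the adaptive aggregation rule and top-k sparsification with error-feedback memory; the paper's exact communication condition may differ.

```python
import numpy as np

def top_k_sparsify(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

class SASGWorker:
    """Illustrative worker: skip communication when the gradient changed little."""
    def __init__(self, dim, k, trigger_tol):
        self.memory = np.zeros(dim)      # accumulated untransmitted residual
        self.last_sent = np.zeros(dim)   # last gradient actually transmitted
        self.k = k
        self.trigger_tol = trigger_tol   # assumed threshold for the trigger

    def step(self, grad):
        """Return a sparse update to send, or None to skip this round."""
        corrected = grad + self.memory   # add back past sparsification error
        # adaptive aggregation rule (assumed form): communicate only if the
        # gradient changed enough since the last transmission
        if np.linalg.norm(grad - self.last_sent) ** 2 <= self.trigger_tol:
            self.memory = corrected      # nothing sent; keep it all in memory
            return None
        sparse = top_k_sparsify(corrected, self.k)
        self.memory = corrected - sparse # remember the untransmitted part
        self.last_sent = grad
        return sparse
```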
Exact Distributed Stochastic Block Partitioning
Stochastic block partitioning (SBP) is a community detection algorithm that
is highly accurate even on graphs with a complex community structure, but its
inherently serial nature hinders its widespread adoption by the wider
scientific community. To make it practical to analyze large real-world graphs
with SBP, there is a growing need to parallelize and distribute the algorithm.
The current state-of-the-art distributed SBP algorithm is a divide-and-conquer
approach that limits communication between compute nodes until the end of
inference. This breaks computational dependencies, causing convergence issues
as the number of compute nodes increases and when the graph is sufficiently
sparse. In this paper, we introduce EDiSt, an exact
distributed stochastic block partitioning algorithm. Under EDiSt, compute nodes
periodically share community assignments during inference. Due to this
additional communication, EDiSt improves upon the divide-and-conquer algorithm
by allowing it to scale out to a larger number of compute nodes without
suffering from convergence issues, even on sparse graphs. We show that EDiSt
provides speedups of up to 23.8X over the divide-and-conquer approach, and of
up to 38.0X over shared-memory parallel SBP, when scaled out to 64 compute
nodes.
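A schematic of EDiSt's communication pattern only (not the SBP inference itself), assuming an MPI-style allgather for the periodic exchange; refine_locally() and merge_assignments() are hypothetical placeholders for the local inference moves and the reconciliation step.

```python
from mpi4py import MPI

def refine_locally(graph, assignments):
    # placeholder for local SBP refinement (block merge/move proposals)
    return assignments

def merge_assignments(views):
    # placeholder reconciliation; EDiSt would combine all ranks' views
    return views[0]

def edist_loop(local_graph, assignments, n_iters, share_every=10):
    """Each rank refines locally and periodically shares assignments,
    instead of communicating only at the end as divide-and-conquer does."""
    comm = MPI.COMM_WORLD
    for it in range(n_iters):
        assignments = refine_locally(local_graph, assignments)
        if (it + 1) % share_every == 0:
            # periodic sharing of community assignments during inference
            all_views = comm.allgather(assignments)
            assignments = merge_assignments(all_views)
    return assignments
```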
Preparing sparse solvers for exascale computing.
Sparse solvers provide essential functionality for a wide variety of scientific applications. Highly parallel sparse solvers are essential for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms. This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing Project towards providing sparse solvers for exascale computing platforms. We address the demands of systems with thousands of high-performance node devices, where exposing concurrency, hiding latency and creating alternative algorithms become essential. The efforts described here are works in progress, highlighting current successes and upcoming challenges. This article is part of the discussion meeting issue 'Numerical algorithms for high-performance computational science'.