An Asynchronous Parallel Randomized Kaczmarz Algorithm
We describe an asynchronous parallel variant of the randomized Kaczmarz (RK)
algorithm for solving the linear system Ax = b. The analysis shows linear
convergence and indicates that nearly linear speedup can be expected if the
number of processors is bounded by a multiple of the number of rows in A.
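For reference, the sequential RK update that the asynchronous variant
parallelizes is a single row projection per iteration. Below is a minimal
NumPy sketch of that sequential baseline (function name and defaults are ours,
not the paper's implementation); the asynchronous algorithm applies the same
update from multiple processors against a shared iterate.

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=10_000, seed=0):
    """Sequential RK sketch: sample row i with probability
    ||a_i||^2 / ||A||_F^2, then project the iterate onto the
    hyperplane a_i . x = b_i."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms_sq = np.einsum("ij,ij->i", A, A)    # ||a_i||^2 for each row
    probs = row_norms_sq / row_norms_sq.sum()     # standard RK row sampling
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        residual = b[i] - A[i] @ x                # violation of equation i
        x += (residual / row_norms_sq[i]) * A[i]  # orthogonal projection step
    return x
```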
Interconnection network with a shared whiteboard: Impact of (a)synchronicity on computing power
In this work we study the computational power of graph-based models of
distributed computing in which each node additionally has access to a global
whiteboard. A node can read the contents of the whiteboard and, when activated,
can write one message of O(log n) bits on it. When the protocol terminates,
each node computes the output based on the final contents of the whiteboard. We
consider several scheduling schemes for nodes, providing a strict ordering of
their power in terms of the problems which can be solved with exactly one
activation per node. The problems used to separate the models are related to
Maximal Independent Set, detection of cycles of length 4, and BFS spanning tree
constructions.
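The model itself is easy to prototype. The sketch below simulates
one-activation-per-node whiteboard protocols under a pluggable activation
schedule, which is where the (a)synchronicity assumptions enter;
run_whiteboard_protocol, write_rule, and output_rule are hypothetical names,
and the O(log n) bound on message size is not enforced.

```python
import random

def run_whiteboard_protocol(nodes, write_rule, output_rule, schedule=None, seed=0):
    """Toy simulation: each node is activated exactly once, reads the shared
    whiteboard, appends one message, and finally computes its output from the
    whiteboard's final contents."""
    rng = random.Random(seed)
    order = schedule if schedule is not None else rng.sample(nodes, len(nodes))
    whiteboard = []                                   # globally readable log
    for v in order:
        whiteboard.append((v, write_rule(v, tuple(whiteboard))))
    return {v: output_rule(v, tuple(whiteboard)) for v in nodes}
```

Different scheduling schemes correspond to different choices of `schedule`
(e.g., an adversarial order versus a uniformly random one), which is the axis
along which the models are separated.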
Optimal Dynamic Distributed MIS
Finding a maximal independent set (MIS) in a graph is a cornerstone task in
distributed computing. The local nature of an MIS allows for fast solutions in
a static distributed setting, which are logarithmic in the number of nodes or
in their degrees. These results trivially apply to the dynamic distributed
model, in which edges or nodes may be inserted or deleted. In this paper, we
take a different approach which exploits locality to the extreme, and show how
to update an MIS in a dynamic distributed setting, either \emph{synchronous} or
\emph{asynchronous}, with only \emph{a single adjustment} and in a single
round, in expectation. These strong guarantees hold for the \emph{complete
fully dynamic} setting: Insertions and deletions, of edges as well as nodes,
gracefully and abruptly. This strongly separates the static and dynamic
distributed models, as super-constant lower bounds exist for computing an MIS
in the former.
Our results are obtained by a novel analysis of the surprisingly simple
solution of carefully simulating the greedy \emph{sequential} MIS algorithm
with a random ordering of the nodes. As such, our algorithm has a direct
application as a 3-approximation algorithm for correlation clustering. This
adds to the important toolbox of distributed graph decompositions, which are
widely used as crucial building blocks in distributed computing.
Finally, our algorithm enjoys a useful \emph{history-independence} property,
meaning the output is independent of the history of topology changes that
constructed that graph. This means the output cannot be chosen, or even biased,
by the adversary in case its goal is to prevent us from optimizing some
objective function.
Comment: 19 pages including appendix and references
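The simulated sequential algorithm is itself one-line simple: scan the nodes
in a uniformly random order and add each node none of whose earlier neighbors
joined. A minimal sketch (naming ours), useful as a reference point for the
output the dynamic algorithm maintains:

```python
import random

def greedy_mis(adjacency, seed=0):
    """Greedy MIS over a uniformly random node ordering: a node joins the MIS
    iff none of its earlier neighbors joined. adjacency maps each node to an
    iterable of its neighbors."""
    rng = random.Random(seed)
    order = list(adjacency)
    rng.shuffle(order)                    # random priorities over the nodes
    in_mis = set()
    for v in order:
        if all(u not in in_mis for u in adjacency[v]):
            in_mis.add(v)
    return in_mis
```

The paper's contribution is maintaining exactly this output under topology
changes with a single expected adjustment per update, not the sequential scan
itself.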
Byzantine Approximate Agreement on Graphs
Consider a distributed system with n processors out of which f can be Byzantine faulty. In the approximate agreement task, each processor i receives an input value x_i and has to decide on an output value y_i such that
1) the output values are in the convex hull of the non-faulty processors' input values,
2) the output values are within distance d of each other.
Classically, the values are assumed to be from an m-dimensional Euclidean space, where m >= 1.
In this work, we study the task in a discrete setting, where the input values have some structure expressible as a graph. Namely, the input values are vertices of a finite graph G and the goal is to output vertices that are within distance d of each other in G, but still remain in the graph-induced convex hull of the input values. For d=0, the task reduces to consensus and cannot be solved with a deterministic algorithm in an asynchronous system even with a single crash fault. For any d >= 1, we show that the task is solvable in asynchronous systems when G is chordal and n > (omega+1)f, where omega is the clique number of G. In addition, we give the first Byzantine-tolerant algorithm for a variant of lattice agreement. For synchronous systems, we show tight resilience bounds for the exact variants of these and related tasks over a large class of combinatorial structures.
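The graph-induced convex hull used above is the geodesic one: the smallest
superset of the inputs closed under taking vertices that lie on shortest paths
between members. A minimal sketch, assuming an unweighted, connected graph
given as an adjacency dict (function names are ours):

```python
from collections import deque

def bfs_dist(adj, src):
    """Single-source shortest-path distances by BFS (unweighted graph)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def graph_convex_hull(adj, inputs):
    """Geodesic convex hull: repeatedly add every vertex lying on a shortest
    path between two current members, until the set stabilizes. Assumes a
    connected graph."""
    dist = {v: bfs_dist(adj, v) for v in adj}       # all-pairs distances
    hull = set(inputs)
    changed = True
    while changed:
        changed = False
        for v in adj:
            if v not in hull and any(
                dist[a][v] + dist[v][b] == dist[a][b]
                for a in hull for b in hull
            ):
                hull.add(v)
                changed = True
    return hull
```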
Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable
There has been significant recent interest in parallel graph processing due
to the need to quickly analyze the large graphs available today. Many graph
codes have been designed for distributed memory or external memory. However,
today even the largest publicly-available real-world graph (the Hyperlink Web
graph with over 3.5 billion vertices and 128 billion edges) can fit in the
memory of a single commodity multicore server. Nevertheless, most experimental
work in the literature reports results on much smaller graphs, and the ones for
the Hyperlink graph use distributed or external memory. Therefore, it is
natural to ask whether we can efficiently solve a broad class of graph problems
on this graph in memory.
This paper shows that theoretically-efficient parallel graph algorithms can
scale to the largest publicly-available graphs using a single machine with a
terabyte of RAM, processing them in minutes. We give implementations of
theoretically-efficient parallel algorithms for 20 important graph problems. We
also present the optimizations and techniques that we used in our
implementations, which were crucial in enabling us to process these large
graphs quickly. We show that the running times of our implementations
outperform existing state-of-the-art implementations on the largest real-world
graphs. For many of the problems that we consider, this is the first time they
have been solved on graphs at this scale. We have made the implementations
developed in this work publicly-available as the Graph-Based Benchmark Suite
(GBBS).
Comment: This is the full version of the paper appearing in the ACM Symposium
on Parallelism in Algorithms and Architectures (SPAA), 2018
Relaxed Schedulers Can Efficiently Parallelize Iterative Algorithms
There has been significant progress in understanding the parallelism inherent
to iterative sequential algorithms: for many classic algorithms, the depth of
the dependence structure is now well understood, and scheduling techniques have
been developed to exploit this shallow dependence structure for efficient
parallel implementations. A related, applied research strand has studied
methods by which certain iterative task-based algorithms can be efficiently
parallelized via relaxed concurrent priority schedulers. These allow for high
concurrency when inserting and removing tasks, at the cost of executing
superfluous work due to the relaxed semantics of the scheduler.
In this work, we take a step towards unifying these two research directions,
by showing that there exists a family of relaxed priority schedulers that can
efficiently and deterministically execute classic iterative algorithms such as
greedy maximal independent set (MIS) and matching. Our primary result shows
that, given a randomized scheduler with an expected relaxation factor of k in
terms of the maximum allowed priority inversions on a task, and any graph on n
vertices, the scheduler is able to execute greedy MIS with only an additive
factor of poly(k) expected additional iterations compared to an exact (but
not scalable) scheduler. This counter-intuitive result demonstrates that the
overhead of relaxation when computing MIS is not dependent on the input size or
structure of the input graph. Experimental results show that this overhead can
be clearly offset by the gain in performance due to the highly scalable
scheduler. In sum, we present an efficient method to deterministically
parallelize iterative sequential algorithms, with provable runtime guarantees
in terms of the number of executed tasks to completion.
Comment: PODC 2018, pages 377-386 in proceedings
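As a toy model of this setup, the sketch below drives greedy MIS through a
k-relaxed priority queue that pops uniformly among the (at most) k
highest-priority pending tasks; a task popped before its higher-priority
neighbors are decided is reinserted, which is the superfluous work the
relaxation pays for. Names and queue mechanics are illustrative, not the
schedulers studied in the paper.

```python
import random

def greedy_mis_relaxed(adjacency, k, seed=0):
    """Greedy MIS under a toy k-relaxed priority queue. The MIS produced is
    identical to exact greedy MIS for the same priorities; relaxation only
    changes how much wasted (reinserted) work is performed."""
    rng = random.Random(seed)
    nodes = list(adjacency)
    priority = {v: r for r, v in enumerate(rng.sample(nodes, len(nodes)))}
    pending = sorted(nodes, key=priority.get)        # undecided, by priority
    in_mis, decided, wasted = set(), set(), 0
    while pending:
        v = pending.pop(rng.randrange(min(k, len(pending))))  # relaxed pop
        if any(u not in decided and priority[u] < priority[v]
               for u in adjacency[v]):
            wasted += 1                   # premature pop: a priority inversion
            pending.append(v)
            pending.sort(key=priority.get)
            continue
        if all(u not in in_mis for u in adjacency[v]):
            in_mis.add(v)
        decided.add(v)
    return in_mis, wasted
```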