Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable
There has been significant recent interest in parallel graph processing due
to the need to quickly analyze the large graphs available today. Many graph
codes have been designed for distributed memory or external memory. However,
today even the largest publicly-available real-world graph (the Hyperlink Web
graph with over 3.5 billion vertices and 128 billion edges) can fit in the
memory of a single commodity multicore server. Nevertheless, most experimental
work in the literature reports results on much smaller graphs, and the ones for
the Hyperlink graph use distributed or external memory. Therefore, it is
natural to ask whether we can efficiently solve a broad class of graph problems
on this graph in memory.
This paper shows that theoretically-efficient parallel graph algorithms can
scale to the largest publicly-available graphs using a single machine with a
terabyte of RAM, processing them in minutes. We give implementations of
theoretically-efficient parallel algorithms for 20 important graph problems. We
also present the optimizations and techniques that we used in our
implementations, which were crucial in enabling us to process these large
graphs quickly. We show that the running times of our implementations
outperform existing state-of-the-art implementations on the largest real-world
graphs. For many of the problems that we consider, this is the first time they
have been solved on graphs at this scale. We have made the implementations
developed in this work publicly-available as the Graph-Based Benchmark Suite
(GBBS).
Comment: This is the full version of the paper appearing in the ACM Symposium
on Parallelism in Algorithms and Architectures (SPAA), 201
Fast Local Computation Algorithms
For input $x$, let $F(x)$ denote the set of outputs that are the "legal"
answers for a computational problem $F$. Suppose $x$ and members of $F(x)$ are
so large that there is not time to read them in their entirety. We propose a
model of local computation algorithms which, for a given input $x$,
support queries by a user to values of specified locations $y_i$ in a legal
output $y \in F(x)$. When more than one legal output $y$ exists for a given
$x$, the local computation algorithm should output in a way that is consistent
with at least one such $y$. Local computation algorithms are intended to
distill the common features of several concepts that have appeared in various
algorithmic subfields, including local distributed computation, local
algorithms, locally decodable codes, and local reconstruction.
We develop a technique, based on known constructions of small sample spaces
of $k$-wise independent random variables and Beck's analysis in his algorithmic
approach to the Lovász Local Lemma, which under certain conditions can be
applied to construct local computation algorithms that run in polylogarithmic
time and space. We apply this technique to maximal independent
set computations, scheduling radio network broadcasts, hypergraph coloring and
satisfying $k$-SAT formulas.
Comment: A preliminary version of this paper appeared in ICS 2011, pp. 223-23
Best of Two Local Models: Local Centralized and Local Distributed Algorithms
We consider two models of computation: centralized local algorithms and local
distributed algorithms. Algorithms in one model are adapted to the other model
to obtain improved algorithms.
Distributed vertex coloring is employed to design improved centralized local
algorithms for: maximal independent set, maximal matching, and an approximation
scheme for maximum (weighted) matching over bounded degree graphs. The
improvement is threefold: the algorithms are deterministic, stateless, and the
number of probes grows polynomially in $\log^* n$, where $n$ is the number of
vertices of the input graph.
The recursive centralized local improvement technique by Nguyen and
Onak (2008) is employed to obtain an improved distributed
approximation scheme for maximum (weighted) matching. The improvement is
twofold: we reduce the number of rounds from $O(\log n)$ to $O(\log^* n)$ for a
wide range of instances, and our algorithms are deterministic rather than
randomized.
Distributed Maximum Matching in Bounded Degree Graphs
We present deterministic distributed algorithms for computing approximate
maximum cardinality matchings and approximate maximum weight matchings. Our
algorithm for the unweighted case computes a matching whose size is at least
$(1-\varepsilon)$ times the optimal in $\Delta^{O(1/\varepsilon)} +
O\left(\frac{1}{\varepsilon^2}\right) \cdot \log^*(n)$ rounds, where $n$ is the number
of vertices in the graph and $\Delta$ is the maximum degree. Our algorithm for
the edge-weighted case computes a matching whose weight is at least $(1-\varepsilon)$
times the optimal in
$\log(\min\{1/w_{\min}, n/\varepsilon\})^{O(1/\varepsilon)} \cdot (\Delta^{O(1/\varepsilon)} + \log^*(n))$
rounds for edge-weights in $[w_{\min}, 1]$.
The best previous algorithms for both the unweighted case and the weighted
case are by Lotker, Patt-Shamir, and Pettie (SPAA 2008). For the unweighted
case they give a randomized $(1-\varepsilon)$-approximation algorithm that runs in
$O(\log(n)/\varepsilon^3)$ rounds. For the weighted case they give a randomized
$(1/2-\varepsilon)$-approximation algorithm that runs in $O(\log(\varepsilon^{-1}) \cdot
\log(n))$ rounds. Hence, our results improve on the previous ones when the
parameters $\Delta$, $\varepsilon$ and $w_{\min}$ are constants (where we reduce the
number of rounds from $O(\log(n))$ to $O(\log^*(n))$), and more generally when
$\Delta$, $1/\varepsilon$ and $1/w_{\min}$ are sufficiently slowly increasing functions
of $n$. Moreover, our algorithms are deterministic rather than randomized.
Comment: arXiv admin note: substantial text overlap with arXiv:1402.379
Fast Distributed Approximation for Max-Cut
Finding a maximum cut is a fundamental task in many computational settings.
Surprisingly, it has been insufficiently studied in the classic distributed
settings, where vertices communicate by synchronously sending messages to their
neighbors according to the underlying graph, known as the CONGEST or
LOCAL models. We amend this by obtaining almost optimal
algorithms for Max-Cut on a wide class of graphs in these models. In
particular, for any $\varepsilon > 0$, we develop randomized approximation
algorithms achieving a ratio of $(1-\varepsilon)$ to the optimum for Max-Cut on
bipartite graphs in the CONGEST model, and on general graphs in the LOCAL
model.
We further present efficient deterministic algorithms, including a
$1/3$-approximation for Max-Dicut in our models, thus improving the best known
(randomized) ratio of $1/4$. Our algorithms make non-trivial use of the greedy
approach of Buchbinder et al. (SIAM Journal on Computing, 2015) for maximizing
an unconstrained (non-monotone) submodular function, which may be of
independent interest.
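The greedy approach of Buchbinder et al. referred to in this abstract can be illustrated in its sequential, deterministic form. The sketch below is not the distributed algorithm of the paper. It is a minimal centralized version applied to the undirected cut function, and the names `cut_value` and `double_greedy_max_cut` are hypothetical. The deterministic variant is known to return a set whose value is at least one third of the optimum for any non-negative submodular function; vertices are assumed to be comparable (for example, integers).

```python
def cut_value(adj, side):
    """Count edges with exactly one endpoint in `side` (each edge counted once)."""
    return sum(
        1
        for u in adj
        for v in adj[u]
        if u < v and ((u in side) != (v in side))
    )


def double_greedy_max_cut(adj):
    """Deterministic double greedy on the cut function (sequential sketch).

    `x` grows from the empty set, `y` shrinks from the full vertex set.
    For each vertex we compare the gain of adding it to `x` with the gain
    of removing it from `y` and take the better option.
    """
    x = set()
    y = set(adj)
    for u in sorted(adj):
        gain_add = cut_value(adj, x | {u}) - cut_value(adj, x)
        gain_remove = cut_value(adj, y - {u}) - cut_value(adj, y)
        if gain_add >= gain_remove:
            x.add(u)
        else:
            y.remove(u)
    # At this point x == y; either set defines one side of the cut.
    return x


# Usage: a 4-cycle 0-1-2-3-0, given as an adjacency dictionary.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
side = double_greedy_max_cut(cycle)
```

Recomputing `cut_value` from scratch at every step keeps the sketch short at the cost of efficiency; an implementation meant for large graphs would maintain the marginal gains incrementally.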
Distributed Approximation of Maximum Independent Set and Maximum Matching
We present a simple distributed $\Delta$-approximation algorithm for maximum
weight independent set (MaxIS) in the CONGEST model which completes
in $O(\texttt{MIS}(G) \cdot \log W)$ rounds, where $\Delta$ is the maximum
degree, $\texttt{MIS}(G)$ is the number of rounds needed to compute a maximal
independent set (MIS) on $G$, and $W$ is the maximum weight of a node. Whether
our algorithm is randomized or deterministic depends on the MIS
algorithm used as a black box.
Plugging in the best known algorithm for MIS gives a randomized solution in
$O(\log n \log W)$ rounds, where $n$ is the number of nodes.
We also present a deterministic $O(\Delta + \log^* n)$-round algorithm based
on coloring.
We then show how to use our MaxIS approximation algorithms to compute a
$2$-approximation for maximum weight matching without incurring any additional
round penalty in the CONGEST model. We use a known reduction for
simulating algorithms on the line graph while incurring congestion, but we show
our algorithm is part of a broad family of local aggregation algorithms
for which we describe a mechanism that allows the simulation to run in the
CONGEST model without an additional overhead.
Next, we show that for maximum weight matching, relaxing the approximation
factor to $(2+\varepsilon)$ allows us to devise a distributed algorithm
requiring $O\left(\frac{\log \Delta}{\log\log \Delta}\right)$ rounds for any constant
$\varepsilon > 0$. For the unweighted case, we can even obtain a
$(1+\varepsilon)$-approximation in this number of rounds. These algorithms are
the first to achieve the provably optimal round complexity with respect to
dependency on $\Delta$.
Optimal Dynamic Distributed MIS
Finding a maximal independent set (MIS) in a graph is a cornerstone task in
distributed computing. The local nature of an MIS allows for fast solutions in
a static distributed setting, which are logarithmic in the number of nodes or
in their degrees. The result trivially applies for the dynamic distributed
model, in which edges or nodes may be inserted or deleted. In this paper, we
take a different approach which exploits locality to the extreme, and show how
to update an MIS in a dynamic distributed setting, either synchronous or
asynchronous, with only a single adjustment and in a single
round, in expectation. These strong guarantees hold for the complete
fully dynamic setting: insertions and deletions, of edges as well as nodes,
gracefully and abruptly. This strongly separates the static and dynamic
distributed models, as super-constant lower bounds exist for computing an MIS
in the former.
Our results are obtained by a novel analysis of the surprisingly simple
solution of carefully simulating the greedy sequential MIS algorithm
with a random ordering of the nodes. As such, our algorithm has a direct
application as a $3$-approximation algorithm for correlation clustering. This
adds to the important toolbox of distributed graph decompositions, which are
widely used as crucial building blocks in distributed computing.
Finally, our algorithm enjoys a useful history-independence property,
meaning the output is independent of the history of topology changes that
constructed that graph. This means the output cannot be chosen, or even biased,
by the adversary in case its goal is to prevent us from optimizing some
objective function.
Comment: 19 pages including appendix and reference
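The greedy sequential MIS algorithm with a random ordering, which this abstract says the dynamic algorithm simulates, can be sketched in its static, centralized form. This is only the sequential baseline, not the dynamic distributed algorithm itself; the function names and the `seed` parameter are hypothetical choices for illustration.

```python
import random


def random_greedy_mis(adj, seed=None):
    """Greedy MIS over a uniformly random vertex order (sequential sketch).

    A vertex joins the set if none of its neighbors joined earlier.
    """
    rng = random.Random(seed)
    order = list(adj)
    rng.shuffle(order)
    mis = set()
    for v in order:
        if not any(u in mis for u in adj[v]):
            mis.add(v)
    return mis


def is_maximal_independent_set(adj, s):
    """Check independence (no edge inside s) and maximality (every other vertex has a neighbor in s)."""
    independent = all(u not in s for v in s for u in adj[v])
    maximal = all(v in s or any(u in s for u in adj[v]) for v in adj)
    return independent and maximal


# Usage: a path 0-1-2-3-4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
result = random_greedy_mis(path, seed=7)
```

For a fixed random order the output depends only on the current graph, which is consistent with the history-independence property described above.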
Parallel Maximum Clique Algorithms with Applications to Network Analysis and Storage
We propose a fast, parallel maximum clique algorithm for large sparse graphs
that is designed to exploit characteristics of social and information networks.
The method exhibits a roughly linear runtime scaling over real-world networks
ranging from 1000 to 100 million nodes. In a test on a social network with 1.8
billion edges, the algorithm finds the largest clique in about 20 minutes. Our
method employs a branch and bound strategy with novel and aggressive pruning
techniques. For instance, we use the core number of a vertex in combination
with a good heuristic clique finder to efficiently remove the vast majority of
the search space. In addition, we parallelize the exploration of the search
tree. During the search, processes immediately communicate changes to upper and
lower bounds on the size of maximum clique, which occasionally results in a
super-linear speedup because vertices with large search spaces can be pruned by
other processes. We apply the algorithm to two problems: to compute temporal
strong components and to compress graphs.
Comment: 11 page
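The pruning idea described in this abstract, combining core numbers with a heuristic clique, can be sketched as follows. A clique on k vertices requires every member to have core number at least k - 1, so once a heuristic finds a clique of size `lower_bound`, any vertex with core number below `lower_bound` cannot belong to a larger clique. The code is a simplified sequential illustration under that observation, not the authors' implementation; all function names are hypothetical, and the quadratic peeling loop is chosen for brevity.

```python
def core_numbers(adj):
    """Compute core numbers by repeatedly peeling a minimum-degree vertex."""
    degree = {v: len(adj[v]) for v in adj}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def greedy_clique(adj, core):
    """Heuristic clique: scan vertices by descending core number and keep
    each one that is adjacent to everything chosen so far."""
    clique = []
    for v in sorted(adj, key=core.__getitem__, reverse=True):
        if all(v in adj[u] for u in clique):
            clique.append(v)
    return clique


def prune_for_larger_clique(adj, lower_bound, core):
    """Vertices that could still lie in a clique with more than `lower_bound` vertices."""
    return {v for v in adj if core[v] >= lower_bound}


# Usage: a 4-clique on {0, 1, 2, 3} with a tail 3-4-5.
graph = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2, 4},
    4: {3, 5},
    5: {4},
}
core = core_numbers(graph)
clique = greedy_clique(graph, core)
candidates = prune_for_larger_clique(graph, len(clique), core)
```

When `candidates` comes back empty, as it does for this small graph, no branch-and-bound search is needed because no clique larger than the heuristic one can exist.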
The Bounded Edge Coloring Problem and Offline Crossbar Scheduling
This paper introduces a variant of the classical edge coloring problem in
graphs that can be applied to an offline scheduling problem for crossbar
switches. We show that the problem is NP-complete, develop three lower
bounds on the optimal solution value and evaluate the performance of several
approximation algorithms, both analytically and experimentally. We show how to
approximate an optimal solution with a worst-case performance ratio of
$3/2$, and our experimental results demonstrate that the best algorithms produce
results that very closely track a lower bound.