89 research outputs found
Comparing MapReduce and pipeline implementations for counting triangles
A generalized method to define the Divide & Conquer paradigm in order to have processors acting on its own data and scheduled in a
parallel fashion. MapReduce is a programming model that follows this paradigm, and allows for the definition of efficient solutions by both decomposing a problem into steps on subsets of the input data
and combining the results of each step to produce final results. Albeit used for the implementation of a wide variety of computational problems, MapReduce performance can be negatively affected
whenever the replication factor grows or the size of the input is larger than the resources available at each processor. In this paper we show an alternative approach to implement the Divide & Conquer
paradigm, named pipeline. The main features of pipeline are illustrated on a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting either when the input graph does not fit in memory or is dynamically generated. To evaluate the properties of pipeline, a dynamic pipeline of processes and an ad-hoc version of MapReduce are implemented in the language Go, exploiting its ability to deal with channels and spawned processes.
An empirical evaluation is conducted on graphs of different sizes and densities. Observed results suggest that pipeline allows for the implementation of an efficient solution of the problem of counting
triangles in a graph, particularly, in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation.Peer ReviewedPostprint (published version
Distribution Policies for Datalog
Modern data management systems extensively use parallelism to speed up query processing over massive volumes of data. This trend has inspired a rich line of research on how to formally reason about the parallel complexity of join computation. In this paper, we go beyond joins and study the parallel evaluation of recursive queries. We introduce a novel framework to reason about multi-round evaluation of Datalog programs, which combines implicit predicate restriction with distribution policies to allow expressing a combination of data-parallel and query-parallel evaluation strategies. Using our framework, we reason about key properties of distributed Datalog evaluation, including parallel-correctness of the evaluation strategy, disjointness of the computation effort, and bounds on the number of communication rounds
MapReduce vs. pipelining counting triangles
In this paper we follow an alternative approach named pipeline, to implement a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting either when the input graph does not fit in memory or is dynamically generated. To be concrete, we implement a dynamic pipeline of processes and an ad-hoc MapReduce version using the language Go. We explote the ability of Go language to deal with channels and spawned processes. An empirical evaluation is conducted on graphs of different size and density. Observed results suggest that pipeline allows for the implementation of an efficient solution of the problem of counting triangles in a graph, particularly, in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation.Peer ReviewedPostprint (published version
Comparing MapReduce and pipeline implementations for counting triangles
A common method to define a parallel solution for a computational problem consists in finding a way to use the Divide and Conquer paradigm in order to have processors acting on its own data and scheduled in a parallel fashion. MapReduce is a programming model that follows this paradigm, and allows for the definition of efficient solutions by both decomposing a problem into steps on subsets of the input data and combining the results of each step to produce final results. Albeit used for the implementation of a wide variety of computational problems, MapReduce performance can be negatively affected whenever the replication factor grows or the size of the input is larger than the resources available at each processor. In this paper we show an alternative approach to implement the Divide and Conquer paradigm, named dynamic pipeline. The main features of dynamic pipelines are illustrated on a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting either when the input graph does not fit in memory or is dynamically generated. To evaluate the properties of pipeline, a dynamic pipeline of processes and an ad-hoc version of MapReduce are implemented in the language Go, exploiting its ability to deal with channels and spawned processes. An empirical evaluation is conducted on graphs of different topologies, sizes, and densities. Observed results suggest that dynamic pipelines allows for an efficient implementation of the problem of counting triangles in a graph, particularly, in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation.Peer ReviewedPostprint (published version
Tight Bounds on the Round Complexity of the Distributed Maximum Coverage Problem
We study the maximum -set coverage problem in the following distributed
setting. A collection of sets over a universe is
partitioned across machines and the goal is to find sets whose union
covers the most number of elements. The computation proceeds in synchronous
rounds. In each round, all machines simultaneously send a message to a central
coordinator who then communicates back to all machines a summary to guide the
computation for the next round. At the end, the coordinator outputs the answer.
The main measures of efficiency in this setting are the approximation ratio of
the returned solution, the communication cost of each machine, and the number
of rounds of computation.
Our main result is an asymptotically tight bound on the tradeoff between
these measures for the distributed maximum coverage problem. We first show that
any -round protocol for this problem either incurs a communication cost of or only achieves an approximation factor of
. This implies that any protocol that simultaneously achieves
good approximation ratio ( approximation) and good communication cost
( communication per machine), essentially requires
logarithmic (in ) number of rounds. We complement our lower bound result by
showing that there exist an -round protocol that achieves an
-approximation (essentially best possible) with a communication
cost of as well as an -round protocol that achieves a
-approximation with only communication per each
machine (essentially best possible).
We further use our results in this distributed setting to obtain new bounds
for the maximum coverage problem in two other main models of computation for
massive datasets, namely, the dynamic streaming model and the MapReduce model
Answering Conjunctive Queries under Updates
We consider the task of enumerating and counting answers to -ary
conjunctive queries against relational databases that may be updated by
inserting or deleting tuples. We exhibit a new notion of q-hierarchical
conjunctive queries and show that these can be maintained efficiently in the
following sense. During a linear time preprocessing phase, we can build a data
structure that enables constant delay enumeration of the query results; and
when the database is updated, we can update the data structure and restart the
enumeration phase within constant time. For the special case of self-join free
conjunctive queries we obtain a dichotomy: if a query is not q-hierarchical,
then query enumeration with sublinear delay and sublinear update time
(and arbitrary preprocessing time) is impossible.
For answering Boolean conjunctive queries and for the more general problem of
counting the number of solutions of k-ary queries we obtain complete
dichotomies: if the query's homomorphic core is q-hierarchical, then size of
the the query result can be computed in linear time and maintained with
constant update time. Otherwise, the size of the query result cannot be
maintained with sublinear update time. All our lower bounds rely on the
OMv-conjecture, a conjecture on the hardness of online matrix-vector
multiplication that has recently emerged in the field of fine-grained
complexity to characterise the hardness of dynamic problems. The lower bound
for the counting problem additionally relies on the orthogonal vectors
conjecture, which in turn is implied by the strong exponential time hypothesis.
By sublinear we mean for some
, where is the size of the active domain of the current
database
Streaming Complexity of Spanning Tree Computation
The semi-streaming model is a variant of the streaming model frequently used for the computation of graph problems. It allows the edges of an n-node input graph to be read sequentially in p passes using OÌ(n) space. If the list of edges includes deletions, then the model is called the turnstile model; otherwise it is called the insertion-only model. In both models, some graph problems, such as spanning trees, k-connectivity, densest subgraph, degeneracy, cut-sparsifier, and (Î+1)-coloring, can be exactly solved or (1+Δ)-approximated in a single pass; while other graph problems, such as triangle detection and unweighted all-pairs shortest paths, are known to require ΩÌ(n) passes to compute. For many fundamental graph problems, the tractability in these models is open. In this paper, we study the tractability of computing some standard spanning trees, including BFS, DFS, and maximum-leaf spanning trees. Our results, in both the insertion-only and the turnstile models, are as follows.
Maximum-Leaf Spanning Trees: This problem is known to be APX-complete with inapproximability constant Ï â [245/244, 2). By constructing an Δ-MLST sparsifier, we show that for every constant Δ > 0, MLST can be approximated in a single pass to within a factor of 1+Δ w.h.p. (albeit in super-polynomial time for Δ †Ï-1 assuming P â NP) and can be approximated in polynomial time in a single pass to within a factor of Ï_n+Δ w.h.p., where Ï_n is the supremum constant that MLST cannot be approximated to within using polynomial time and OÌ(n) space. In the insertion-only model, these algorithms can be deterministic.
BFS Trees: It is known that BFS trees require Ï(1) passes to compute, but the naĂŻve approach needs O(n) passes. We devise a new randomized algorithm that reduces the pass complexity to O(ân), and it offers a smooth tradeoff between pass complexity and space usage. This gives a polynomial separation between single-source and all-pairs shortest paths for unweighted graphs.
DFS Trees: It is unknown whether DFS trees require more than one pass. The current best algorithm by Khan and Mehta [STACS 2019] takes OÌ(h) passes, where h is the height of computed DFS trees. Note that h can be as large as Ω(m/n) for n-node m-edge graphs. Our contribution is twofold. First, we provide a simple alternative proof of this result, via a new connection to sparse certificates for k-node-connectivity. Second, we present a randomized algorithm that reduces the pass complexity to O(ân), and it also offers a smooth tradeoff between pass complexity and space usage.ISSN:1868-896
Worst-Case Optimal Algorithms for Parallel Query Processing
In this paper, we study the communication complexity for the problem of
computing a conjunctive query on a large database in a parallel setting with
servers. In contrast to previous work, where upper and lower bounds on the
communication were specified for particular structures of data (either data
without skew, or data with specific types of skew), in this work we focus on
worst-case analysis of the communication cost. The goal is to find worst-case
optimal parallel algorithms, similar to the work of [18] for sequential
algorithms.
We first show that for a single round we can obtain an optimal worst-case
algorithm. The optimal load for a conjunctive query when all relations have
size equal to is , where is a new query-related
quantity called the edge quasi-packing number, which is different from both the
edge packing number and edge cover number of the query hypergraph. For multiple
rounds, we present algorithms that are optimal for several classes of queries.
Finally, we show a surprising connection to the external memory model, which
allows us to translate parallel algorithms to external memory algorithms. This
technique allows us to recover (within a polylogarithmic factor) several recent
results on the I/O complexity for computing join queries, and also obtain
optimal algorithms for other classes of queries
- âŠ