77 research outputs found
Worst-Case Optimal Algorithms for Parallel Query Processing
In this paper, we study the communication complexity for the problem of
computing a conjunctive query on a large database in a parallel setting with
$p$ servers. In contrast to previous work, where upper and lower bounds on the
communication were specified for particular structures of data (either data
without skew, or data with specific types of skew), in this work we focus on
worst-case analysis of the communication cost. The goal is to find worst-case
optimal parallel algorithms, similar to the work of [18] for sequential
algorithms.
We first show that for a single round we can obtain an optimal worst-case
algorithm. The optimal load for a conjunctive query $q$ when all relations have
size equal to $M$ is $O(M/p^{1/\psi^*})$, where $\psi^*$ is a new query-related
quantity called the edge quasi-packing number, which is different from both the
edge packing number and edge cover number of the query hypergraph. For multiple
rounds, we present algorithms that are optimal for several classes of queries.
Finally, we show a surprising connection to the external memory model, which
allows us to translate parallel algorithms to external memory algorithms. This
technique allows us to recover (within a polylogarithmic factor) several recent
results on the I/O complexity for computing join queries, and also obtain
optimal algorithms for other classes of queries.
Comparing MapReduce and pipeline implementations for counting triangles
A general method for defining a parallel solution to a computational problem is to apply the Divide & Conquer paradigm so that each processor acts on its own data and all processors are scheduled in parallel. MapReduce is a programming model that follows this paradigm, and allows for the definition of efficient solutions by both decomposing a problem into steps on subsets of the input data and combining the results of each step to produce final results. Although it is used to implement a wide variety of computational problems, MapReduce performance can be negatively affected whenever the replication factor grows or the size of the input is larger than the resources available at each processor. In this paper we show an alternative approach to implementing the Divide & Conquer paradigm, named pipeline. The main features of pipeline are illustrated on a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting when the input graph either does not fit in memory or is dynamically generated. To evaluate the properties of pipeline, a dynamic pipeline of processes and an ad-hoc version of MapReduce are implemented in the language Go, exploiting its ability to deal with channels and spawned processes. An empirical evaluation is conducted on graphs of different sizes and densities. The observed results suggest that pipeline allows for an efficient solution to the problem of counting triangles in a graph, particularly in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation. Peer reviewed. Postprint (published version)
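For orientation, the triangle-counting problem itself can be sketched sequentially. The snippet below is only a baseline illustration in Python, not the pipeline or MapReduce implementation described in the abstract (those are written in Go). The function name `count_triangles` and the edge-list input format are assumptions made for this sketch, and vertex labels are assumed to be mutually comparable.

```python
from collections import defaultdict


def count_triangles(edges):
    """Count triangles in an undirected graph given as (u, v) pairs."""
    # Build an adjacency-set representation, ignoring self-loops.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    count = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                # Only accept a third vertex w > v, so that each triangle
                # u < v < w is counted exactly once.
                count += sum(1 for w in adj[u] & adj[v] if w > v)
    return count
```

A sequential baseline like this keeps the whole graph in memory, which is exactly the limitation the abstract cites as motivation for a streaming, pipelined design.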
MapReduce vs. pipelining counting triangles
In this paper we follow an alternative approach, named pipeline, to build a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting when the input graph either does not fit in memory or is dynamically generated. Concretely, we implement a dynamic pipeline of processes and an ad-hoc MapReduce version in the language Go, exploiting Go's ability to deal with channels and spawned processes. An empirical evaluation is conducted on graphs of different sizes and densities. The observed results suggest that pipeline allows for an efficient solution to the problem of counting triangles in a graph, particularly in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation. Peer reviewed. Postprint (published version)
Enumerating Subgraphs of Constant Sizes in External Memory
We present an indivisible I/O-efficient algorithm for subgraph enumeration, where the objective is to list all the subgraphs of a massive graph G := (V, E) that are isomorphic to a pattern graph Q having k = O(1) vertices. Our algorithm performs O((|E|^{k/2})/(M^{k/2-1} B) log_{M/B}(|E|/B) + (|E|^ρ)/(M^{ρ-1} B)) I/Os with high probability, where ρ is the fractional edge covering number of Q (it always holds ρ ≥ k/2, regardless of Q), M is the number of words in (internal) memory, and B is the number of words in a disk block. Our solution is optimal in the class of indivisible algorithms for all pattern graphs with ρ > k/2. When ρ = k/2, our algorithm is still optimal as long as M/B ≥ (|E|/B)^ε for any constant ε > 0.
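To get a feel for how the two terms of the stated I/O bound compare, they can be evaluated numerically. The helper below is a hypothetical illustration, not part of the paper: `io_bound_terms` and its parameter names are assumptions, `rho` stands for the fractional edge covering number of the pattern, and the constant hidden by the O-notation is ignored.

```python
import math


def io_bound_terms(num_edges, k, rho, M, B):
    """Return the two terms of the I/O bound, without hidden constants.

    first  = |E|^{k/2} / (M^{k/2 - 1} * B) * log_{M/B}(|E| / B)
    second = |E|^rho   / (M^{rho - 1} * B)
    """
    E = float(num_edges)
    log_term = math.log(E / B) / math.log(M / B)  # log base M/B of |E|/B
    first = E ** (k / 2) / (M ** (k / 2 - 1) * B) * log_term
    second = E ** rho / (M ** (rho - 1) * B)
    return first, second


# Triangle pattern: k = 3 vertices, fractional edge covering number 3/2.
first, second = io_bound_terms(2**20, k=3, rho=1.5, M=2**10, B=2**4)
```

For the triangle, rho equals k/2, so the two terms differ only by the logarithmic factor. For patterns with rho > k/2 the second term grows with a larger exponent of |E|.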
Fully dynamic evaluation for conjunctive queries with free access patterns
We study the problem of answering conjunctive queries with free access patterns under updates. A free access pattern is a partition of the free variables of the query into input and output. The query returns tuples over the output variables given a tuple of values over the input variables.
We introduce a fully dynamic evaluation approach for such queries. It is fully dynamic in the sense that it supports both inserts and deletes of tuples to the input relations. Our approach computes a data structure that supports the enumeration of the output tuples and maintains it under single-tuple updates to the input data. We also give a syntactic characterization of those queries that admit constant time per single-tuple update and whose output tuples can be enumerated with constant delay given an input tuple. Finally, for triangle and hierarchical queries with free access patterns, we chart the complexity trade-offs between the preprocessing time, update time and enumeration delay for such queries. The trade-offs are strongly or weakly Pareto optimal for triangle and a class of hierarchical queries. Their optimality is predicated on the Online Boolean Matrix-Vector Multiplication conjecture
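The notion of a free access pattern can be illustrated on the simplest possible case. The sketch below assumes a single-relation query Q(B | A) = R(A, B), with A as the input variable and B as the output variable. It is not the data structure from the paper, and the class and method names are hypothetical. It only shows what "maintain under single-tuple updates, then enumerate outputs for a given input tuple" means operationally.

```python
from collections import defaultdict


class AccessPatternIndex:
    """Toy index for Q(B | A) = R(A, B): input variable A, output variable B."""

    def __init__(self):
        # Maps each input value a to the set of output values b with R(a, b).
        self._by_input = defaultdict(set)

    def insert(self, a, b):
        """Single-tuple insert into R."""
        self._by_input[a].add(b)

    def delete(self, a, b):
        """Single-tuple delete from R; a no-op if the tuple is absent."""
        bucket = self._by_input.get(a)
        if bucket is None:
            return
        bucket.discard(b)
        if not bucket:
            del self._by_input[a]

    def enumerate(self, a):
        """Yield the output values for the given input value, one at a time."""
        yield from self._by_input.get(a, ())
```

With hash-based sets, each update in this toy case takes expected constant time and enumeration yields one output per step. Queries with joins need considerably more machinery, which is what the trade-offs in the abstract address.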
A Near-Optimal Parallel Algorithm for Joining Binary Relations
We present a constant-round algorithm in the massively parallel computation
(MPC) model for evaluating a natural join where every input relation has two
attributes. Our algorithm achieves a load of $\tilde{O}(m/p^{1/\rho})$ where
$m$ is the total size of the input relations, $p$ is the number of machines,
$\rho$ is the join's fractional edge covering number, and $\tilde{O}(.)$ hides
a polylogarithmic factor. The load matches a known lower bound up to a
polylogarithmic factor. At the core of the proposed algorithm is a new theorem
(which we name the "isolated cartesian product theorem") that provides
fresh insight into the problem's mathematical structure. Our result implies
that the subgraph enumeration problem, where the goal is to report all
the occurrences of a constant-sized subgraph pattern, can be settled optimally
(up to a polylogarithmic factor) in the MPC model. Comment: Short versions of this article appeared in PODS'17 and ICDT'20. The
article is under submission to a journal. The red sentences are highlighted
for the journal's reviewer
Comparing MapReduce and pipeline implementations for counting triangles
A common method to define a parallel solution for a computational problem consists in finding a way to use the Divide and Conquer paradigm so that each processor acts on its own data and all processors are scheduled in a parallel fashion. MapReduce is a programming model that follows this paradigm, and allows for the definition of efficient solutions by both decomposing a problem into steps on subsets of the input data and combining the results of each step to produce final results. Although it is used to implement a wide variety of computational problems, MapReduce performance can be negatively affected whenever the replication factor grows or the size of the input is larger than the resources available at each processor. In this paper we show an alternative approach to implementing the Divide and Conquer paradigm, named dynamic pipeline. The main features of dynamic pipelines are illustrated on a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting when the input graph either does not fit in memory or is dynamically generated. To evaluate the properties of pipeline, a dynamic pipeline of processes and an ad-hoc version of MapReduce are implemented in the language Go, exploiting its ability to deal with channels and spawned processes. An empirical evaluation is conducted on graphs of different topologies, sizes, and densities. The observed results suggest that dynamic pipelines allow for an efficient implementation of the problem of counting triangles in a graph, particularly in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation. Peer reviewed. Postprint (published version)
A Near-Optimal Parallel Algorithm for Joining Binary Relations
We present a constant-round algorithm in the massively parallel computation
(MPC) model for evaluating a natural join where every input relation has two
attributes. Our algorithm achieves a load of $\tilde{O}(m/p^{1/\rho})$ where
$m$ is the total size of the input relations, $p$ is the number of machines,
$\rho$ is the join's fractional edge covering number, and $\tilde{O}(.)$ hides
a polylogarithmic factor. The load matches a known lower bound up to a
polylogarithmic factor. At the core of the proposed algorithm is a new theorem
(which we name the "isolated cartesian product theorem") that provides fresh
insight into the problem's mathematical structure. Our result implies that the
subgraph enumeration problem, where the goal is to report all the occurrences
of a constant-sized subgraph pattern, can be settled optimally (up to a
polylogarithmic factor) in the MPC model
Tight Distributed Listing of Cliques
Much progress has recently been made in understanding the complexity
landscape of subgraph finding problems in the CONGEST model of distributed
computing. However, so far, very few tight bounds are known in this area. For
triangle (i.e., 3-clique) listing, an optimal $\tilde{O}(n^{1/3})$-round
distributed algorithm has been constructed by Chang et al. [SODA 2019, PODC
2019]. Recent works of Eden et al. [DISC 2019] and of Censor-Hillel et
al. [PODC 2020] have shown sublinear algorithms for $K_p$-listing, for each $p \geq 4$, but still leaving a significant gap between the upper bounds and the
known lower bounds of the problem.
In this paper, we completely close this gap. We show that for each $p \geq 4$, there is an $\tilde{O}(n^{1-2/p})$-round distributed algorithm that lists
all $p$-cliques $K_p$ in the communication network. Our algorithm is
optimal up to a polylogarithmic factor, due to the $\tilde{\Omega}(n^{1-2/p})$-round lower bound of Fischer et al. [SPAA 2018], which holds even in
the CONGESTED CLIQUE model. Together with the triangle-listing algorithm by
Chang et al. [SODA 2019, PODC 2019], our result thus shows that the round
complexity of $K_p$-listing, for all $p$, is the same in both the CONGEST and
CONGESTED CLIQUE models, at $\tilde{\Theta}(n^{1-2/p})$ rounds.
For $p=4$, our result additionally matches the $\tilde{\Omega}(n^{1/2})$
lower bound for $K_4$-detection by Czumaj and Konrad [DISC 2018],
implying that the round complexities for detection and listing of $K_4$ are
equivalent in the CONGEST model. Comment: 21 pages. To appear in SODA 202