77 research outputs found

    Worst-Case Optimal Algorithms for Parallel Query Processing

    In this paper, we study the communication complexity for the problem of computing a conjunctive query on a large database in a parallel setting with $p$ servers. In contrast to previous work, where upper and lower bounds on the communication were specified for particular structures of data (either data without skew, or data with specific types of skew), in this work we focus on worst-case analysis of the communication cost. The goal is to find worst-case optimal parallel algorithms, similar to the work of [18] for sequential algorithms. We first show that for a single round we can obtain an optimal worst-case algorithm. The optimal load for a conjunctive query $q$ when all relations have size equal to $M$ is $O(M/p^{1/\psi^*})$, where $\psi^*$ is a new query-related quantity called the edge quasi-packing number, which is different from both the edge packing number and edge cover number of the query hypergraph. For multiple rounds, we present algorithms that are optimal for several classes of queries. Finally, we show a surprising connection to the external memory model, which allows us to translate parallel algorithms to external memory algorithms. This technique allows us to recover (within a polylogarithmic factor) several recent results on the I/O complexity for computing join queries, and also obtain optimal algorithms for other classes of queries.

    Comparing MapReduce and pipeline implementations for counting triangles

    A generalized method for defining a parallel solution is to apply the Divide & Conquer paradigm so that processors act on their own data and are scheduled in a parallel fashion. MapReduce is a programming model that follows this paradigm, and allows for the definition of efficient solutions by both decomposing a problem into steps on subsets of the input data and combining the results of each step to produce final results. Albeit used for the implementation of a wide variety of computational problems, MapReduce performance can be negatively affected whenever the replication factor grows or the size of the input is larger than the resources available at each processor. In this paper we show an alternative approach to implement the Divide & Conquer paradigm, named pipeline. The main features of pipeline are illustrated on a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting either when the input graph does not fit in memory or is dynamically generated. To evaluate the properties of pipeline, a dynamic pipeline of processes and an ad-hoc version of MapReduce are implemented in the language Go, exploiting its ability to deal with channels and spawned processes. An empirical evaluation is conducted on graphs of different sizes and densities. Observed results suggest that pipeline allows for the implementation of an efficient solution of the problem of counting triangles in a graph, particularly in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation. Peer Reviewed. Postprint (published version).

    MapReduce vs. pipelining counting triangles

    In this paper we follow an alternative approach, named pipeline, to build a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting either when the input graph does not fit in memory or is dynamically generated. To be concrete, we implement a dynamic pipeline of processes and an ad-hoc MapReduce version using the language Go. We exploit the ability of the Go language to deal with channels and spawned processes. An empirical evaluation is conducted on graphs of different sizes and densities. Observed results suggest that pipeline allows for the implementation of an efficient solution of the problem of counting triangles in a graph, particularly in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation. Peer Reviewed. Postprint (published version).
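
    To make the channel-based setting concrete, the following is a minimal Go sketch of parallel triangle counting with goroutines and channels. It is not the paper's dynamic pipeline nor its MapReduce variant: it is a plain worker pool, and the names `Edge`, `CountTriangles`, and the `workers` parameter are hypothetical and chosen only for illustration. It also assumes the whole graph fits in memory, which the abstract explicitly does not require.

```go
package main

import (
	"fmt"
	"sync"
)

// Edge is an undirected edge between two vertex ids (hypothetical type).
type Edge struct{ U, V int }

// CountTriangles counts the triangles of an undirected graph using a fixed
// pool of goroutines that receive work and return partial counts on channels.
func CountTriangles(edges []Edge, workers int) int {
	if workers < 1 {
		workers = 1
	}
	// Orient each edge from the smaller to the larger id, so that a triangle
	// a < b < c is found exactly once, starting from its smallest vertex a.
	adj := make(map[int]map[int]bool)
	for _, e := range edges {
		u, v := e.U, e.V
		if u == v {
			continue // ignore self-loops
		}
		if u > v {
			u, v = v, u
		}
		if adj[u] == nil {
			adj[u] = make(map[int]bool)
		}
		adj[u][v] = true
	}

	jobs := make(chan int)    // vertices to process
	partial := make(chan int) // one partial count per worker
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			local := 0
			for a := range jobs {
				for b := range adj[a] {
					for c := range adj[b] {
						if adj[a][c] { // closing edge a-c present
							local++
						}
					}
				}
			}
			partial <- local
		}()
	}

	// Feed vertices to the workers, then signal that no more work is coming.
	go func() {
		for a := range adj {
			jobs <- a
		}
		close(jobs)
	}()
	// Close the result channel once every worker has reported.
	go func() {
		wg.Wait()
		close(partial)
	}()

	total := 0
	for n := range partial {
		total += n
	}
	return total
}

func main() {
	// The complete graph on 4 vertices contains 4 triangles.
	k4 := []Edge{{0, 1}, {0, 2}, {0, 3}, {1, 2}, {1, 3}, {2, 3}}
	fmt.Println(CountTriangles(k4, 3)) // prints 4
}
```

    The adjacency map is only read after it is built, so the workers can share it without locking. Orienting edges by vertex id is what prevents each triangle from being counted more than once.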

    Enumerating Subgraphs of Constant Sizes in External Memory

    We present an indivisible I/O-efficient algorithm for subgraph enumeration, where the objective is to list all the subgraphs of a massive graph $G := (V, E)$ that are isomorphic to a pattern graph $Q$ having $k = O(1)$ vertices. Our algorithm performs $O\left(\frac{|E|^{k/2}}{M^{k/2-1} B} \log_{M/B} \frac{|E|}{B} + \frac{|E|^{\rho}}{M^{\rho-1} B}\right)$ I/Os with high probability, where $\rho$ is the fractional edge covering number of $Q$ (it always holds that $\rho \geq k/2$, regardless of $Q$), $M$ is the number of words in (internal) memory, and $B$ is the number of words in a disk block. Our solution is optimal in the class of indivisible algorithms for all pattern graphs with $\rho > k/2$. When $\rho = k/2$, our algorithm is still optimal as long as $M/B \geq (|E|/B)^{\epsilon}$ for any constant $\epsilon > 0$.
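
    As a worked illustration of the I/O bound (not taken from the paper), the two patterns below are chosen as assumptions, and $\rho$ denotes the fractional edge covering number of the pattern. A triangle has $k = 3$ and $\rho = 3/2 = k/2$. A star with three leaves has $k = 4$ and $\rho = 3 > k/2$, since each leaf lies on exactly one edge and that edge must therefore receive weight 1.

```latex
% Triangle: k = 3, rho = 3/2, so both terms share the same powers of |E| and M.
\[
  O\!\left(\frac{|E|^{3/2}}{M^{1/2} B}\,\log_{M/B}\frac{|E|}{B}\right)
\]
% Star with three leaves: k = 4, rho = 3, so the second term has the higher power of |E|.
\[
  O\!\left(\frac{|E|^{2}}{M B}\,\log_{M/B}\frac{|E|}{B} + \frac{|E|^{3}}{M^{2} B}\right)
\]
```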

    A Simple Parallel Algorithm for Natural Joins on Binary Relations

    Fully dynamic evaluation for conjunctive queries with free access patterns

    We study the problem of answering conjunctive queries with free access patterns under updates. A free access pattern is a partition of the free variables of the query into input and output. The query returns tuples over the output variables given a tuple of values over the input variables. We introduce a fully dynamic evaluation approach for such queries. It is fully dynamic in the sense that it supports both inserts and deletes of tuples to the input relations. Our approach computes a data structure that supports the enumeration of the output tuples and maintains it under single-tuple updates to the input data. We also give a syntactic characterization of those queries that admit constant time per single-tuple update and whose output tuples can be enumerated with constant delay given an input tuple. Finally, for triangle and hierarchical queries with free access patterns, we chart the complexity trade-offs between the preprocessing time, update time, and enumeration delay. The trade-offs are strongly or weakly Pareto optimal for triangle and a class of hierarchical queries. Their optimality is predicated on the Online Boolean Matrix-Vector Multiplication conjecture.

    A Near-Optimal Parallel Algorithm for Joining Binary Relations

    We present a constant-round algorithm in the massively parallel computation (MPC) model for evaluating a natural join where every input relation has two attributes. Our algorithm achieves a load of $\tilde{O}(m/p^{1/\rho})$ where $m$ is the total size of the input relations, $p$ is the number of machines, $\rho$ is the join's fractional edge covering number, and $\tilde{O}(\cdot)$ hides a polylogarithmic factor. The load matches a known lower bound up to a polylogarithmic factor. At the core of the proposed algorithm is a new theorem (which we name the "isolated cartesian product theorem") that provides fresh insight into the problem's mathematical structure. Our result implies that the subgraph enumeration problem, where the goal is to report all the occurrences of a constant-sized subgraph pattern, can be settled optimally (up to a polylogarithmic factor) in the MPC model. Comment: Short versions of this article appeared in PODS'17 and ICDT'20. The article is under submission to a journal. The red sentences are highlighted for the journal's reviewer
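
    As a worked instance of the load bound (an illustration, not taken from the abstract), consider the triangle join over three binary relations. The relation names $R$, $S$, $T$ and the choice $p = 64$ are assumptions made for the example. Giving each relation weight $1/2$ covers every attribute, so the fractional edge covering number is $\rho = 3/2$.

```latex
% Triangle join over binary relations, with fractional edge covering number rho = 3/2.
\[
  R(A,B) \bowtie S(B,C) \bowtie T(A,C), \qquad \rho = \tfrac{3}{2}
  \;\Longrightarrow\;
  \text{load} = \tilde{O}\!\left(\frac{m}{p^{1/\rho}}\right) = \tilde{O}\!\left(\frac{m}{p^{2/3}}\right).
\]
% With p = 64 machines, p^{2/3} = 16, so each machine receives about m/16 tuples
% up to a polylogarithmic factor.
```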

    Comparing MapReduce and pipeline implementations for counting triangles

    A common method to define a parallel solution for a computational problem consists in finding a way to use the Divide and Conquer paradigm in order to have processors acting on their own data and scheduled in a parallel fashion. MapReduce is a programming model that follows this paradigm, and allows for the definition of efficient solutions by both decomposing a problem into steps on subsets of the input data and combining the results of each step to produce final results. Albeit used for the implementation of a wide variety of computational problems, MapReduce performance can be negatively affected whenever the replication factor grows or the size of the input is larger than the resources available at each processor. In this paper we show an alternative approach to implement the Divide and Conquer paradigm, named dynamic pipeline. The main features of dynamic pipelines are illustrated on a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting either when the input graph does not fit in memory or is dynamically generated. To evaluate the properties of pipeline, a dynamic pipeline of processes and an ad-hoc version of MapReduce are implemented in the language Go, exploiting its ability to deal with channels and spawned processes. An empirical evaluation is conducted on graphs of different topologies, sizes, and densities. Observed results suggest that dynamic pipelines allow for an efficient implementation of the problem of counting triangles in a graph, particularly in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation. Peer Reviewed. Postprint (published version).

    A Near-Optimal Parallel Algorithm for Joining Binary Relations

    We present a constant-round algorithm in the massively parallel computation (MPC) model for evaluating a natural join where every input relation has two attributes. Our algorithm achieves a load of $\tilde{O}(m/p^{1/\rho})$ where $m$ is the total size of the input relations, $p$ is the number of machines, $\rho$ is the join's fractional edge covering number, and $\tilde{O}(\cdot)$ hides a polylogarithmic factor. The load matches a known lower bound up to a polylogarithmic factor. At the core of the proposed algorithm is a new theorem (which we name the "isolated cartesian product theorem") that provides fresh insight into the problem's mathematical structure. Our result implies that the subgraph enumeration problem, where the goal is to report all the occurrences of a constant-sized subgraph pattern, can be settled optimally (up to a polylogarithmic factor) in the MPC model.

    Tight Distributed Listing of Cliques

    Much progress has recently been made in understanding the complexity landscape of subgraph finding problems in the CONGEST model of distributed computing. However, so far, very few tight bounds are known in this area. For triangle (i.e., 3-clique) listing, an optimal $\tilde{O}(n^{1/3})$-round distributed algorithm has been constructed by Chang et al. [SODA 2019, PODC 2019]. Recent works of Eden et al. [DISC 2019] and of Censor-Hillel et al. [PODC 2020] have shown sublinear algorithms for $K_p$-listing, for each $p \geq 4$, but still leaving a significant gap between the upper bounds and the known lower bounds of the problem. In this paper, we completely close this gap. We show that for each $p \geq 4$, there is an $\tilde{O}(n^{1 - 2/p})$-round distributed algorithm that lists all $p$-cliques $K_p$ in the communication network. Our algorithm is optimal up to a polylogarithmic factor, due to the $\tilde{\Omega}(n^{1 - 2/p})$-round lower bound of Fischer et al. [SPAA 2018], which holds even in the CONGESTED CLIQUE model. Together with the triangle-listing algorithm by Chang et al. [SODA 2019, PODC 2019], our result thus shows that the round complexity of $K_p$-listing, for all $p$, is the same in both the CONGEST and CONGESTED CLIQUE models, at $\tilde{\Theta}(n^{1 - 2/p})$ rounds. For $p=4$, our result additionally matches the $\tilde{\Omega}(n^{1/2})$ lower bound for $K_4$-detection by Czumaj and Konrad [DISC 2018], implying that the round complexities for detection and listing of $K_4$ are equivalent in the CONGEST model. Comment: 21 pages. To appear in SODA 202
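
    As a quick consistency check on the stated round complexity, the exponent $1 - 2/p$ can be evaluated for small clique sizes. The values below follow directly from the formula in the abstract, and the choice of which $p$ to tabulate is arbitrary.

```latex
% Round complexity \tilde{\Theta}(n^{1 - 2/p}) of K_p-listing for small p.
\[
  p = 3:\ \tilde{\Theta}(n^{1/3}), \qquad
  p = 4:\ \tilde{\Theta}(n^{1/2}), \qquad
  p = 5:\ \tilde{\Theta}(n^{3/5}), \qquad
  p = 6:\ \tilde{\Theta}(n^{2/3}).
\]
% p = 3 recovers the triangle-listing bound, and p = 4 matches the
% \tilde{\Omega}(n^{1/2}) lower bound for K_4-detection quoted in the abstract.
```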