946 research outputs found

    Communication Steps for Parallel Query Processing

    Full text link
    We consider the problem of computing a relational query qq on a large input database of size nn, using a large number pp of servers. The computation is performed in rounds, and each server can receive only O(n/p1ε)O(n/p^{1-\varepsilon}) bits of data, where ε[0,1]\varepsilon \in [0,1] is a parameter that controls replication. We examine how many global communication steps are needed to compute qq. We establish both lower and upper bounds, in two settings. For a single round of communication, we give lower bounds in the strongest possible model, where arbitrary bits may be exchanged; we show that any algorithm requires ε11/τ\varepsilon \geq 1-1/\tau^*, where τ\tau^* is the fractional vertex cover of the hypergraph of qq. We also give an algorithm that matches the lower bound for a specific class of databases. For multiple rounds of communication, we present lower bounds in a model where routing decisions for a tuple are tuple-based. We show that for the class of tree-like queries there exists a tradeoff between the number of rounds and the space exponent ε\varepsilon. The lower bounds for multiple rounds are the first of their kind. Our results also imply that transitive closure cannot be computed in O(1) rounds of communication

    Approximation Algorithms for the Joint Replenishment Problem with Deadlines

    Get PDF
    The Joint Replenishment Problem (JRP) is a fundamental optimization problem in supply-chain management, concerned with optimizing the flow of goods from a supplier to retailers. Over time, in response to demands at the retailers, the supplier ships orders, via a warehouse, to the retailers. The objective is to schedule these orders to minimize the sum of ordering costs and retailers' waiting costs. We study the approximability of JRP-D, the version of JRP with deadlines, where instead of waiting costs the retailers impose strict deadlines. We study the integrality gap of the standard linear-program (LP) relaxation, giving a lower bound of 1.207, a stronger, computer-assisted lower bound of 1.245, as well as an upper bound and approximation ratio of 1.574. The best previous upper bound and approximation ratio was 1.667; no lower bound was previously published. For the special case when all demand periods are of equal length we give an upper bound of 1.5, a lower bound of 1.2, and show APX-hardness

    Locally Optimal Load Balancing

    Full text link
    This work studies distributed algorithms for locally optimal load-balancing: We are given a graph of maximum degree Δ\Delta, and each node has up to LL units of load. The task is to distribute the load more evenly so that the loads of adjacent nodes differ by at most 11. If the graph is a path (Δ=2\Delta = 2), it is easy to solve the fractional version of the problem in O(L)O(L) communication rounds, independently of the number of nodes. We show that this is tight, and we show that it is possible to solve also the discrete version of the problem in O(L)O(L) rounds in paths. For the general case (Δ>2\Delta > 2), we show that fractional load balancing can be solved in poly(L,Δ)\operatorname{poly}(L,\Delta) rounds and discrete load balancing in f(L,Δ)f(L,\Delta) rounds for some function ff, independently of the number of nodes.Comment: 19 pages, 11 figure

    Distributed Connectivity Decomposition

    Full text link
    We present time-efficient distributed algorithms for decomposing graphs with large edge or vertex connectivity into multiple spanning or dominating trees, respectively. As their primary applications, these decompositions allow us to achieve information flow with size close to the connectivity by parallelizing it along the trees. More specifically, our distributed decomposition algorithms are as follows: (I) A decomposition of each undirected graph with vertex-connectivity kk into (fractionally) vertex-disjoint weighted dominating trees with total weight Ω(klogn)\Omega(\frac{k}{\log n}), in O~(D+n)\widetilde{O}(D+\sqrt{n}) rounds. (II) A decomposition of each undirected graph with edge-connectivity λ\lambda into (fractionally) edge-disjoint weighted spanning trees with total weight λ12(1ε)\lceil\frac{\lambda-1}{2}\rceil(1-\varepsilon), in O~(D+nλ)\widetilde{O}(D+\sqrt{n\lambda}) rounds. We also show round complexity lower bounds of Ω~(D+nk)\tilde{\Omega}(D+\sqrt{\frac{n}{k}}) and Ω~(D+nλ)\tilde{\Omega}(D+\sqrt{\frac{n}{\lambda}}) for the above two decompositions, using techniques of [Das Sarma et al., STOC'11]. Moreover, our vertex-connectivity decomposition extends to centralized algorithms and improves the time complexity of [Censor-Hillel et al., SODA'14] from O(n3)O(n^3) to near-optimal O~(m)\tilde{O}(m). As corollaries, we also get distributed oblivious routing broadcast with O(1)O(1)-competitive edge-congestion and O(logn)O(\log n)-competitive vertex-congestion. Furthermore, the vertex connectivity decomposition leads to near-time-optimal O(logn)O(\log n)-approximation of vertex connectivity: centralized O~(m)\widetilde{O}(m) and distributed O~(D+n)\tilde{O}(D+\sqrt{n}). The former moves toward the 1974 conjecture of Aho, Hopcroft, and Ullman postulating an O(m)O(m) centralized exact algorithm while the latter is the first distributed vertex connectivity approximation

    Compressed Representations of Conjunctive Query Results

    Full text link
    Relational queries, and in particular join queries, often generate large output results when executed over a huge dataset. In such cases, it is often infeasible to store the whole materialized output if we plan to reuse it further down a data processing pipeline. Motivated by this problem, we study the construction of space-efficient compressed representations of the output of conjunctive queries, with the goal of supporting the efficient access of the intermediate compressed result for a given access pattern. In particular, we initiate the study of an important tradeoff: minimizing the space necessary to store the compressed result, versus minimizing the answer time and delay for an access request over the result. Our main contribution is a novel parameterized data structure, which can be tuned to trade off space for answer time. The tradeoff allows us to control the space requirement of the data structure precisely, and depends both on the structure of the query and the access pattern. We show how we can use the data structure in conjunction with query decomposition techniques, in order to efficiently represent the outputs for several classes of conjunctive queries.Comment: To appear in PODS'18; 35 pages; comments welcom
    corecore