Communication Steps for Parallel Query Processing
We consider the problem of computing a relational query $q$ on a large input
database of size $n$, using a large number $p$ of servers. The computation is
performed in rounds, and each server can receive only $O(n/p^{1-\varepsilon})$
bits of data, where $\varepsilon \in [0,1]$ is a parameter that controls
replication. We examine how many global communication steps are needed to
compute $q$. We establish both lower and upper bounds, in two settings. For a
single round of communication, we give lower bounds in the strongest possible
model, where arbitrary bits may be exchanged; we show that any algorithm
requires $\varepsilon \geq 1 - 1/\tau^*$, where $\tau^*$ is the fractional vertex
cover of the hypergraph of $q$. We also give an algorithm that matches the
lower bound for a specific class of databases. For multiple rounds of
communication, we present lower bounds in a model where routing decisions for a
tuple are tuple-based. We show that for the class of tree-like queries there
exists a tradeoff between the number of rounds and the space exponent
$\varepsilon$. The lower bounds for multiple rounds are the first of their
kind. Our results also imply that transitive closure cannot be computed in O(1)
rounds of communication.
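As an illustration of the single-round bound, the sketch below works through the triangle query, a hypothetical example that does not appear in the abstract. It assumes the bound takes the form $\varepsilon \geq 1 - 1/\tau^*$, with $\tau^*$ the value of an optimal fractional vertex cover of the query hypergraph. Rather than solving a linear program, it checks a hand-picked cover against a hand-picked fractional edge packing of the same value, which certifies optimality by LP duality. The function and variable names are illustrative only.

```python
from fractions import Fraction


def is_fractional_vertex_cover(edges, weight):
    """Each hyperedge (query atom) must carry total variable weight >= 1."""
    return all(w >= 0 for w in weight.values()) and all(
        sum(weight[v] for v in edge) >= 1 for edge in edges
    )


def is_fractional_edge_packing(edges, variables, load):
    """Each variable may be touched by total atom weight <= 1."""
    return all(u >= 0 for u in load) and all(
        sum(u for edge, u in zip(edges, load) if v in edge) <= 1
        for v in variables
    )


# Assumed example: triangle query q(x, y, z) :- R(x, y), S(y, z), T(z, x).
variables = ["x", "y", "z"]
edges = [("x", "y"), ("y", "z"), ("z", "x")]

half = Fraction(1, 2)
cover = {v: half for v in variables}   # candidate fractional vertex cover
packing = [half] * len(edges)          # candidate fractional edge packing

assert is_fractional_vertex_cover(edges, cover)
assert is_fractional_edge_packing(edges, variables, packing)

# A feasible cover and a feasible packing of equal value are both optimal.
tau_star = sum(cover.values())
assert tau_star == sum(packing)

# Assumed form of the bound: epsilon >= 1 - 1/tau*.
min_space_exponent = 1 - 1 / tau_star
print(tau_star, min_space_exponent)  # prints "3/2 1/3"
```

Under these assumptions, a one-round algorithm for the triangle query would need a space exponent of at least 1/3.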
Approximation Algorithms for the Joint Replenishment Problem with Deadlines
The Joint Replenishment Problem (JRP) is a fundamental optimization problem
in supply-chain management, concerned with optimizing the flow of goods from a
supplier to retailers. Over time, in response to demands at the retailers, the
supplier ships orders, via a warehouse, to the retailers. The objective is to
schedule these orders to minimize the sum of ordering costs and retailers'
waiting costs.
We study the approximability of JRP-D, the version of JRP with deadlines,
where instead of waiting costs the retailers impose strict deadlines. We study
the integrality gap of the standard linear-program (LP) relaxation, giving a
lower bound of 1.207, a stronger, computer-assisted lower bound of 1.245, as
well as an upper bound and approximation ratio of 1.574. The best previous
upper bound and approximation ratio was 1.667; no lower bound was previously
published. For the special case when all demand periods are of equal length we
give an upper bound of 1.5, a lower bound of 1.2, and show APX-hardness.
Locally Optimal Load Balancing
This work studies distributed algorithms for locally optimal load-balancing:
We are given a graph of maximum degree $\Delta$, and each node has up to $L$
units of load. The task is to distribute the load more evenly so that the loads
of adjacent nodes differ by at most $1$.
If the graph is a path ($\Delta = 2$), it is easy to solve the fractional
version of the problem in $O(L)$ communication rounds, independently of the
number of nodes. We show that this is tight, and we show that it is possible to
solve also the discrete version of the problem in $O(L)$ rounds in paths.
For the general case ($\Delta > 2$), we show that fractional load balancing
can be solved in $\operatorname{poly}(L, \Delta)$ rounds and discrete load
balancing in $f(L, \Delta)$ rounds for some function $f$, independently of the
number of nodes. Comment: 19 pages, 11 figures
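To make the target condition concrete, here is a minimal sketch of a checker for local balance, together with a naive sequential token-moving procedure on a path. This is not the distributed algorithm from the paper and makes no claim about round complexity. The allowed difference of 1 between neighbours is an assumption exposed as the `slack` parameter, and all names and the sample loads are hypothetical.

```python
def is_locally_balanced(load, edges, slack=1):
    """True if the loads of every adjacent pair differ by at most `slack`."""
    return all(abs(load[u] - load[v]) <= slack for u, v in edges)


def naive_balance(load, edges):
    """Move single units across edges until no adjacent pair differs by 2 or more.

    Each move strictly lowers the sum of squared loads, so the loop terminates.
    This is a sequential illustration, not a distributed algorithm.
    """
    load = list(load)
    changed = True
    while changed:
        changed = False
        for u, v in edges:
            if load[u] - load[v] >= 2:
                load[u] -= 1
                load[v] += 1
                changed = True
            elif load[v] - load[u] >= 2:
                load[v] -= 1
                load[u] += 1
                changed = True
    return load


# Hypothetical instance: a path on five nodes with integer loads.
path_edges = [(i, i + 1) for i in range(4)]
initial = [4, 0, 0, 3, 1]
final = naive_balance(initial, path_edges)

assert sum(final) == sum(initial)               # load is conserved
assert is_locally_balanced(final, path_edges)   # neighbours differ by at most 1
print(final)
```

The result is only locally optimal: a long path may still end with a gradual slope of loads from one end to the other, which is what distinguishes this goal from global load balancing.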
Distributed Connectivity Decomposition
We present time-efficient distributed algorithms for decomposing graphs with
large edge or vertex connectivity into multiple spanning or dominating trees,
respectively. As their primary applications, these decompositions allow us to
achieve information flow with size close to the connectivity by parallelizing
it along the trees. More specifically, our distributed decomposition algorithms
are as follows:
(I) A decomposition of each undirected graph with vertex-connectivity $k$
into (fractionally) vertex-disjoint weighted dominating trees with total weight
$\Omega(\frac{k}{\log n})$, in $\widetilde{O}(D+\sqrt{n})$ rounds.
(II) A decomposition of each undirected graph with edge-connectivity $\lambda$
into (fractionally) edge-disjoint weighted spanning trees with total
weight $\lceil\frac{\lambda-1}{2}\rceil(1-\varepsilon)$, in
$\widetilde{O}(D+\sqrt{n\lambda})$ rounds.
We also show round complexity lower bounds of
$\widetilde{\Omega}(D+\sqrt{n/k})$ and
$\widetilde{\Omega}(D+\sqrt{n/\lambda})$ for the above two decompositions,
using techniques of [Das Sarma et al., STOC'11]. Moreover, our
vertex-connectivity decomposition extends to centralized algorithms and
improves the time complexity of [Censor-Hillel et al., SODA'14] from
$O(n^3)$ to near-optimal $\widetilde{O}(m)$.
As corollaries, we also get distributed oblivious routing broadcast with
$O(1)$-competitive edge-congestion and $O(\log n)$-competitive
vertex-congestion. Furthermore, the vertex connectivity decomposition leads to
near-time-optimal $O(\log n)$-approximation of vertex connectivity: centralized
$\widetilde{O}(m)$ and distributed $\widetilde{O}(D+\sqrt{n})$. The former moves
toward the 1974 conjecture of Aho, Hopcroft, and Ullman postulating an $O(m)$
centralized exact algorithm while the latter is the first distributed vertex
connectivity approximation.
Compressed Representations of Conjunctive Query Results
Relational queries, and in particular join queries, often generate large
output results when executed over a huge dataset. In such cases, it is often
infeasible to store the whole materialized output if we plan to reuse it
further down a data processing pipeline. Motivated by this problem, we study
the construction of space-efficient compressed representations of the output of
conjunctive queries, with the goal of supporting the efficient access of the
intermediate compressed result for a given access pattern. In particular, we
initiate the study of an important tradeoff: minimizing the space necessary to
store the compressed result, versus minimizing the answer time and delay for an
access request over the result. Our main contribution is a novel parameterized
data structure, which can be tuned to trade off space for answer time. The
tradeoff allows us to control the space requirement of the data structure
precisely, and depends both on the structure of the query and the access
pattern. We show how we can use the data structure in conjunction with query
decomposition techniques, in order to efficiently represent the outputs for
several classes of conjunctive queries. Comment: To appear in PODS'18; 35 pages; comments welcome