Topology Dependent Bounds For FAQs
In this paper, we prove topology dependent bounds on the number of rounds
needed to compute Functional Aggregate Queries (FAQs) studied by Abo Khamis et
al. [PODS 2016] in a synchronous distributed network under the model considered
by Chattopadhyay et al. [FOCS 2014, SODA 2017]. Unlike the recent work on
computing database queries in the Massively Parallel Computation model, in the
model of Chattopadhyay et al., nodes can communicate only via private
point-to-point channels and we are interested in bounds that work over an {\em
arbitrary} communication topology. This is the first work to consider more
practically motivated problems in this distributed model. For the sake of
exposition, we focus on two special problems in this paper: Boolean Conjunctive
Query (BCQ) and computing variable/factor marginals in Probabilistic Graphical
Models (PGMs). We obtain tight bounds on the number of rounds needed to compute
such queries as long as the underlying hypergraph of the query is
$O(1)$-degenerate and has $O(1)$-arity. In particular, the $O(1)$-degeneracy
condition covers most well-studied queries that are efficiently computable in
the centralized computation model like queries with constant treewidth. These
tight bounds depend on a new notion of `width' (namely internal-node-width) for
Generalized Hypertree Decompositions (GHDs) of acyclic hypergraphs, which
minimizes the number of internal nodes in a sub-class of GHDs. To the best of
our knowledge, this width has not been studied explicitly in the theoretical
database literature. Finally, we consider the problem of computing the product
of a vector with a chain of matrices and prove tight bounds on its round
complexity (over the finite field of two elements) using a novel min-entropy
based argument. Comment: A conference version was presented at PODS 201
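The matrix-chain problem mentioned at the end of this abstract can be illustrated with a small centralized sketch. It only shows what is being computed (a vector multiplied by a sequence of 0/1 matrices, with arithmetic over the field of two elements), not the distributed protocol or the round bounds from the paper. The function names `matvec_gf2` and `chain_product_gf2` are hypothetical.

```python
def matvec_gf2(matrix, vector):
    """Multiply a 0/1 matrix by a 0/1 vector, with arithmetic modulo 2."""
    return [sum(a & b for a, b in zip(row, vector)) % 2 for row in matrix]


def chain_product_gf2(matrices, vector):
    """Apply the matrices to the vector one after another.

    `matrices` is listed in the order of application, so the result is
    matrices[-1] * ... * matrices[0] * vector over GF(2).
    """
    result = list(vector)
    for matrix in matrices:
        result = matvec_gf2(matrix, result)
    return result


if __name__ == "__main__":
    chain = [[[1, 1], [0, 1]], [[0, 1], [1, 1]]]
    print(chain_product_gf2(chain, [1, 1]))
```

In the distributed setting studied in the abstract, the matrices and the vector would be held by different nodes of the network; this sketch assumes all inputs are available in one place.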
Tight Bounds for Asymptotic and Approximate Consensus
We study the performance of asymptotic and approximate consensus algorithms
under harsh environmental conditions. The asymptotic consensus problem requires
a set of agents to repeatedly set their outputs such that the outputs converge
to a common value within the convex hull of initial values. This problem, and
the related approximate consensus problem, are fundamental building blocks in
distributed systems where exact consensus among agents is not required or
possible, e.g., man-made distributed control systems, and have applications in
the analysis of natural distributed systems, such as flocking and opinion
dynamics. We prove tight lower bounds on the contraction rates of asymptotic
consensus algorithms in dynamic networks, from which we deduce bounds on the
time complexity of approximate consensus algorithms. In particular, the
obtained bounds show optimality of asymptotic and approximate consensus
algorithms presented in [Charron-Bost et al., ICALP'16] for certain dynamic
networks, including the weakest dynamic network model in which asymptotic and
approximate consensus are solvable. As a corollary we also obtain
asymptotically tight bounds for asymptotic consensus in the classical
asynchronous model with crashes.
Central to our lower bound proofs is an extended notion of valency, the set
of reachable limits of an asymptotic consensus algorithm starting from a given
configuration. We further relate topological properties of valencies to the
solvability of exact consensus, shedding some light on the relation of these
three fundamental problems in dynamic networks.
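To make the asymptotic consensus problem concrete, here is a minimal sketch of one simple update rule: in every round each agent moves to the midpoint of the smallest and largest value it has heard, its own value included. This is an illustrative assumption, not necessarily the exact algorithm analysed in the cited work. The names `midpoint_round` and `in_neighbors` are hypothetical, and the communication graph of a round is passed explicitly because it may change from round to round in a dynamic network.

```python
def midpoint_round(values, in_neighbors):
    """Run one round of a midpoint update.

    values[i] is the current output of agent i.
    in_neighbors[i] lists the agents whose values agent i receives this round.
    Each new value is a midpoint of received values, so it stays inside
    the convex hull (here: the interval) of the current values.
    """
    new_values = []
    for i, own in enumerate(values):
        heard = [values[j] for j in in_neighbors[i]] + [own]
        new_values.append((min(heard) + max(heard)) / 2)
    return new_values


if __name__ == "__main__":
    values = [0.0, 1.0, 4.0]
    # Hypothetical line topology for this round: 0 <-> 1 <-> 2.
    line = [[1], [0, 2], [1]]
    for _ in range(5):
        values = midpoint_round(values, line)
        print(values)
```

The contraction rate discussed in the abstract corresponds to how quickly the spread `max(values) - min(values)` shrinks from round to round.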
On the Distributed Complexity of Large-Scale Graph Computations
Motivated by the increasing need to understand the distributed algorithmic
foundations of large-scale graph computations, we study some fundamental graph
problems in a message-passing model for distributed computing where $k \geq 2$
machines jointly perform computations on graphs with $n$ nodes (typically, $n \gg k$). The input graph is assumed to be initially randomly partitioned among
the machines, a common implementation in many real-world systems.
Communication is point-to-point, and the goal is to minimize the number of
communication {\em rounds} of the computation.
Our main contribution is the {\em General Lower Bound Theorem}, a theorem
that can be used to show non-trivial lower bounds on the round complexity of
distributed large-scale data computations. The General Lower Bound Theorem is
established via an information-theoretic approach that relates the round
complexity to the minimal amount of information required by machines to solve
the problem. Our approach is generic and this theorem can be used in a
"cookbook" fashion to show distributed lower bounds in the context of several
problems, including non-graph problems. We present two applications by showing
(almost) tight lower bounds for the round complexity of two fundamental graph
problems, namely {\em PageRank computation} and {\em triangle enumeration}. Our
approach, as demonstrated in the case of PageRank, can yield tight lower bounds
for problems (including, and especially, under a stochastic partition of the
input) where communication complexity techniques are not obvious.
Our approach, as demonstrated in the case of triangle enumeration, can yield
stronger round lower bounds as well as message-round tradeoffs compared to
approaches that use communication complexity techniques.
Towards Optimal Moment Estimation in Streaming and Distributed Models
One of the oldest problems in the data stream model is to approximate the p-th moment ||X||_p^p = sum_{i=1}^n X_i^p of an underlying non-negative vector X in R^n, which is presented as a sequence of poly(n) updates to its coordinates. Of particular interest is when p in (0,2]. Although a tight space bound of Theta(epsilon^-2 log n) bits is known for this problem when both positive and negative updates are allowed, surprisingly there is still a gap in the space complexity of this problem when all updates are positive. Specifically, the upper bound is O(epsilon^-2 log n) bits, while the lower bound is only Omega(epsilon^-2 + log n) bits. Recently, an upper bound of O~(epsilon^-2 + log n) bits was obtained under the assumption that the updates arrive in a random order.
We show that for p in (0, 1], the random order assumption is not needed. Namely, we give an upper bound for worst-case streams of O~(epsilon^-2 + log n) bits for estimating ||X||_p^p. Our techniques also give new upper bounds for estimating the empirical entropy in a stream. On the other hand, we show that for p in (1,2], in the natural coordinator and blackboard distributed communication topologies, there is an O~(epsilon^-2) bit max-communication upper bound based on a randomized rounding scheme. Our protocols also give rise to protocols for heavy hitters and approximate matrix product. We generalize our results to arbitrary communication topologies G, obtaining an O~(epsilon^-2 log d) max-communication upper bound, where d is the diameter of G. Interestingly, our upper bound rules out natural communication complexity-based approaches for proving an Omega(epsilon^-2 log n) bit lower bound for p in (1,2] for streaming algorithms. In particular, any such lower bound must come from a topology with large diameter.
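As a point of reference for the quantity being estimated, the sketch below computes the p-th moment exactly from a stream of coordinate updates. It stores the whole vector, so it uses space linear in the number of touched coordinates; it is the naive baseline, not one of the small-space algorithms from the abstract. The function name `exact_moment` and the `(index, delta)` update format are assumptions made for illustration.

```python
from collections import defaultdict


def exact_moment(updates, p):
    """Return sum_i X_i^p, where X is built from (index, delta) updates.

    Assumes every coordinate of the final vector X is non-negative,
    as in the problem statement.
    """
    x = defaultdict(int)
    for index, delta in updates:
        x[index] += delta
    return sum(value ** p for value in x.values())


if __name__ == "__main__":
    stream = [(0, 1), (1, 2), (0, 2)]  # final vector: X_0 = 3, X_1 = 2
    print(exact_moment(stream, 2))
    print(exact_moment(stream, 1))
```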
Towards a complexity theory for the congested clique
The congested clique model of distributed computing has been receiving
attention as a model for densely connected distributed systems. While there has
been significant progress on the side of upper bounds, we have very little in
terms of lower bounds for the congested clique; indeed, it is now known that
proving explicit congested clique lower bounds is as difficult as proving
circuit lower bounds.
In this work, we use various more traditional complexity-theoretic tools to
build a clearer picture of the complexity landscape of the congested clique:
-- Nondeterminism and beyond: We introduce the nondeterministic congested
clique model (analogous to NP) and show that there is a natural canonical
problem family that captures all problems solvable in constant time with
nondeterministic algorithms. We further generalise these notions by introducing
the constant-round decision hierarchy (analogous to the polynomial hierarchy).
-- Non-constructive lower bounds: We lift the prior non-uniform counting
arguments to a general technique for proving non-constructive uniform lower
bounds for the congested clique. In particular, we prove a time hierarchy
theorem for the congested clique, showing that there are decision problems of
essentially all complexities, both in the deterministic and nondeterministic
settings.
-- Fine-grained complexity: We map out relationships between various natural
problems in the congested clique model, arguing that a reduction-based
complexity theory currently gives us a fairly good picture of the complexity
landscape of the congested clique.
Optimal Dynamic Distributed MIS
Finding a maximal independent set (MIS) in a graph is a cornerstone task in
distributed computing. The local nature of an MIS allows for fast solutions in
a static distributed setting, which are logarithmic in the number of nodes or
in their degrees. The result trivially applies for the dynamic distributed
model, in which edges or nodes may be inserted or deleted. In this paper, we
take a different approach which exploits locality to the extreme, and show how
to update an MIS in a dynamic distributed setting, either \emph{synchronous} or
\emph{asynchronous}, with only \emph{a single adjustment} and in a single
round, in expectation. These strong guarantees hold for the \emph{complete
fully dynamic} setting: Insertions and deletions, of edges as well as nodes,
gracefully and abruptly. This strongly separates the static and dynamic
distributed models, as super-constant lower bounds exist for computing an MIS
in the former.
Our results are obtained by a novel analysis of the surprisingly simple
solution of carefully simulating the greedy \emph{sequential} MIS algorithm
with a random ordering of the nodes. As such, our algorithm has a direct
application as a 3-approximation algorithm for correlation clustering. This
adds to the important toolbox of distributed graph decompositions, which are
widely used as crucial building blocks in distributed computing.
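The greedy sequential MIS algorithm with a random node ordering, which the abstract says is being simulated, can be sketched in its static, centralized form as follows. This is only the sequential baseline under stated assumptions; the dynamic distributed simulation and its analysis are the paper's contribution and are not reproduced here. The names `greedy_mis` and `random_greedy_mis`, and the adjacency-dictionary input format, are hypothetical.

```python
import random


def greedy_mis(adjacency, order):
    """Scan nodes in `order`; add a node if none of its neighbours was added."""
    mis = set()
    for node in order:
        if not any(neighbour in mis for neighbour in adjacency[node]):
            mis.add(node)
    return mis


def random_greedy_mis(adjacency, seed=None):
    """Run the greedy scan on a uniformly random ordering of the nodes."""
    rng = random.Random(seed)
    order = list(adjacency)
    rng.shuffle(order)
    return greedy_mis(adjacency, order)


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 - 3.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(greedy_mis(path, [1, 0, 3, 2]))
    print(random_greedy_mis(path, seed=0))
```

For a fixed ordering, the output depends only on the graph and that ordering, not on the sequence of changes that produced the graph.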
Finally, our algorithm enjoys a useful \emph{history-independence} property,
meaning the output is independent of the history of topology changes that
constructed that graph. This means the output cannot be chosen, or even biased,
by the adversary in case its goal is to prevent us from optimizing some
objective function. Comment: 19 pages including appendix and reference