Optimal lower bounds for universal relation, and for samplers and finding duplicates in streams
In the communication problem $\mathsf{UR}$ (universal relation) [KRW95],
Alice and Bob respectively receive $x, y \in \{0,1\}^n$ with the promise that
$x \neq y$. The last player to receive a message must output an index $i$ such
that $x_i \neq y_i$. We prove that the randomized one-way communication
complexity of this problem in the public coin model is exactly
$\Theta(\min\{n, \log(1/\delta)\log^2(\frac{n}{\log(1/\delta)})\})$ for failure
probability $\delta$. Our lower bound holds even if promised
$\mathrm{supp}(y) \subset \mathrm{supp}(x)$. As a corollary, we obtain
optimal lower bounds for $\ell_p$-sampling in strict turnstile streams for
$0 \le p < 2$, as well as for the problem of finding duplicates in a stream. Our
lower bounds do not need to use large weights, and hold even if promised
$x \in \{0,1\}^n$ at all points in the stream.
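As background (the paper itself proves lower bounds, not this construction), a standard building block for $\ell_0$-samplers in strict turnstile streams is a 1-sparse recovery sketch: maintain $a = \sum_i x_i$ and $b = \sum_i i \cdot x_i$; if the vector ends up exactly 1-sparse, its support index is $b/a$. A minimal sketch of that primitive, with names of our own choosing:

```python
# Toy 1-sparse recovery sketch -- a common building block of l0-samplers.
# Illustrative only; full samplers add fingerprints and subsampling levels
# to detect the not-1-sparse case and to handle general support sizes.

class OneSparseSketch:
    """Linear sketch of a strict-turnstile vector x that recovers the
    support index, provided x is exactly 1-sparse at query time."""

    def __init__(self):
        self.a = 0  # running value of sum_i x_i
        self.b = 0  # running value of sum_i i * x_i

    def update(self, i, delta):
        # Process a turnstile update x_i += delta.
        self.a += delta
        self.b += i * delta

    def recover(self):
        # If x is exactly 1-sparse with x_j = a != 0, then b = j * a.
        if self.a != 0 and self.b % self.a == 0:
            return self.b // self.a  # the unique index in the support
        return None  # zero vector, or detectably not 1-sparse

s = OneSparseSketch()
s.update(7, 3)   # x_7 += 3
s.update(2, 5)   # x_2 += 5
s.update(2, -5)  # x_2 -= 5, leaving x 1-sparse at index 7
print(s.recover())  # -> 7
```

Note that `recover` can report a wrong index when the vector is not 1-sparse; real $\ell_0$-samplers rule this out with random fingerprints.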
We give two different proofs of our main result. The first proof demonstrates
that any algorithm $\mathcal{A}$ solving sampling problems in turnstile streams
in low memory can be used to encode subsets of $[n]$ of certain sizes into a
number of bits below the information-theoretic minimum. Our encoder makes
adaptive queries to $\mathcal{A}$ throughout its execution, but does so carefully
so as to not violate correctness. This is accomplished by injecting random
noise into the encoder's interactions with $\mathcal{A}$, which is loosely
motivated by techniques in differential privacy. Our second proof is via a
novel randomized reduction from Augmented Indexing [MNSW98] which needs to
interact with $\mathcal{A}$ adaptively. To handle the adaptivity we identify
certain likely interaction patterns and union bound over them to guarantee
correct interaction on all of them. To guarantee correctness, it is important
that the interaction hides some of its randomness from $\mathcal{A}$ in the
reduction.
Comment: merge of arXiv:1703.08139 and of work of Kapralov, Woodruff, and
Yahyazadeh
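For reference, the Augmented Indexing problem [MNSW98] that the second proof reduces from can be pinned down in a few lines. The function below only states the problem and runs the trivial $n$-bit protocol in which Alice sends everything; it is not the paper's reduction:

```python
# The Augmented Indexing communication problem [MNSW98], stated via a
# trivial reference protocol. Any one-way protocol with constant failure
# probability requires Omega(n) bits; the paper's randomized reduction
# from this problem is far more subtle than what is shown here.

def augmented_indexing(x, i):
    """Alice holds a bit string x of length n; Bob holds an index i and
    the suffix x[i+1:]. Bob must output the bit x[i]."""
    message = x              # trivial protocol: Alice sends all n bits
    bob_suffix = x[i + 1:]   # Bob's side information (unused here)
    return message[i]

print(augmented_indexing("10110", 2))  # -> 1
```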
The Sketching Complexity of Graph and Hypergraph Counting
Subgraph counting is a fundamental primitive in graph processing, with
applications in social network analysis (e.g., estimating the clustering
coefficient of a graph), database processing and other areas. The space
complexity of subgraph counting has been studied extensively in the literature,
but many natural settings are still not well understood. In this paper we
revisit the subgraph (and hypergraph) counting problem in the sketching model,
where the algorithm's state as it processes a stream of updates to the graph is
a linear function of the stream. This model has recently received a lot of
attention in the literature, and has become a standard model for solving
dynamic graph streaming problems.
In this paper we give a tight bound on the sketching complexity of counting
the number of occurrences of a small subgraph $H$ in a bounded degree graph $G$
presented as a stream of edge updates. Specifically, we show that the space
complexity of the problem is governed by the fractional vertex cover number of
the graph $H$. Our subgraph counting algorithm implements a natural vertex
sampling approach, with sampling probabilities governed by the vertex cover of
$H$. Our main technical contribution lies in a new set of Fourier analytic
tools that we develop to analyze multiplayer communication protocols in the
simultaneous communication model, allowing us to prove a tight lower bound. We
believe that our techniques are likely to find applications in other settings.
Besides giving tight bounds for all graphs $H$, both our algorithm and lower
bounds extend to the hypergraph setting, albeit with some loss in space
complexity.
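To make the vertex sampling approach concrete, here is a toy instantiation for triangle counting with a uniform sampling probability $p$. The paper's algorithm tunes per-vertex probabilities using a fractional vertex cover of $H$; the code and names below are our own illustration, not the paper's algorithm:

```python
import itertools
import random

def estimate_triangles(edges, p, seed=0):
    """Estimate the triangle count of a graph: keep each vertex
    independently with probability p, count triangles in the induced
    subgraph, and rescale by 1/p^3 (each triangle survives with
    probability p^3, so the estimator is unbiased)."""
    rng = random.Random(seed)
    vertices = {v for e in edges for v in e}
    kept = {v for v in vertices if rng.random() < p}
    induced = {frozenset(e) for e in edges if set(e) <= kept}
    count = 0
    for a, b, c in itertools.combinations(sorted(kept), 3):
        if {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))} <= induced:
            count += 1
    return count / p ** 3

# K4 contains 4 triangles; with p = 1 every vertex is kept, so the
# estimate is exact.
k4 = list(itertools.combinations(range(4), 2))
print(estimate_triangles(k4, p=1.0))  # -> 4.0
```

The variance of this estimator is what a degree bound on $G$ controls, which is why the bounded-degree assumption appears in the statement above.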
Graph Sketching Against Adaptive Adversaries Applied to the Minimum Degree Algorithm
Motivated by the study of matrix elimination orderings in combinatorial
scientific computing, we utilize graph sketching and local sampling to give a
data structure that provides access to approximate fill degrees of a matrix
undergoing elimination in $O(\mathrm{polylog}\ n)$ time per elimination and
query. We then study the problem of using this data structure in the minimum
degree algorithm, which is a widely-used heuristic for producing elimination
orderings for sparse matrices by repeatedly eliminating the vertex with
(approximate) minimum fill degree. This leads to a nearly-linear time algorithm
for generating approximate greedy minimum degree orderings. Despite extensive
studies of algorithms for elimination orderings in combinatorial scientific
computing, our result is the first rigorous incorporation of randomized tools
in this setting, as well as the first nearly-linear time algorithm for
producing elimination orderings with provable approximation guarantees.
While our sketching data structure readily works in the oblivious adversary
model, by repeatedly querying and greedily updating itself, it enters the
adaptive adversarial model where the underlying sketches become prone to
failure due to dependency issues with their internal randomness. We show how to
use an additional sampling procedure to circumvent this problem and to create
an independent access sequence. Our technique for decorrelating the interleaved
queries and updates to this randomized data structure may be of independent
interest.
Comment: 58 pages, 3 figures. This is a substantially revised version of arXiv:1711.08446 with an emphasis on the underlying theoretical problem.
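For context, the exact greedy minimum degree heuristic that the data structure accelerates can be written naively as follows. This version recomputes degrees directly and pays far more than polylogarithmic time per elimination, which is precisely the cost the sketching approach avoids (a toy sketch of the classical heuristic, not the paper's algorithm):

```python
def min_degree_ordering(n, edges):
    """Naive minimum degree ordering: repeatedly eliminate the vertex of
    minimum (fill) degree, connecting its remaining neighbors into a
    clique to model the fill-in created by Gaussian elimination.
    Ties are broken by vertex id for determinism."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    order = []
    remaining = set(range(n))
    while remaining:
        v = min(remaining, key=lambda u: (len(adj[u]), u))  # min fill degree
        order.append(v)
        nbrs = adj[v] & remaining
        for a in nbrs:            # eliminating v makes its remaining
            for b in nbrs:        # neighbors pairwise adjacent (fill edges)
                if a != b:
                    adj[a].add(b)
        for a in nbrs:
            adj[a].discard(v)
        remaining.remove(v)
    return order

# On the path 0-1-2-3 the degree-1 endpoint 0 goes first, after which
# vertex 1 becomes degree 1 and follows.
print(min_degree_ordering(4, [(0, 1), (1, 2), (2, 3)]))  # -> [0, 1, 2, 3]
```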