
10 research outputs found

An Optimal Lower Bound on the Communication Complexity of Gap-Hamming-Distance

Author: Chakrabarti Amit
Regev Oded
Publication date: 01/01/2010

We prove an optimal $\Omega(n)$ lower bound on the randomized communication complexity of the much-studied Gap-Hamming-Distance problem. As a consequence, we obtain essentially optimal multi-pass space lower bounds in the data stream model for a number of fundamental problems, including the estimation of frequency moments. The Gap-Hamming-Distance problem is a communication problem, wherein Alice and Bob receive $n$-bit strings $x$ and $y$, respectively. They are promised that the Hamming distance between $x$ and $y$ is either at least $n/2+\sqrt{n}$ or at most $n/2-\sqrt{n}$, and their goal is to decide which of these is the case. Since the formal presentation of the problem by Indyk and Woodruff (FOCS, 2003), it had been conjectured that the naive protocol, which uses $n$ bits of communication, is asymptotically optimal. The conjecture was shown to be true in several special cases, e.g., when the communication is deterministic, or when the number of rounds of communication is limited. The proof of our aforementioned result, which settles this conjecture fully, is based on a new geometric statement regarding correlations in Gaussian space, related to a result of C. Borell (1985). To prove this geometric statement, we show that random projections of not-too-small sets in Gaussian space are close to a mixture of translated normal variables.
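As a purely illustrative sketch (not taken from the paper), the promise problem and the naive $n$-bit protocol described in the abstract can be written out as follows. The function names `hamming_distance`, `ghd_answer`, and `naive_protocol` are hypothetical, and bit strings are assumed to be Python lists of 0/1 values of equal length.

```python
import math


def hamming_distance(x, y):
    """Number of positions where the two equal-length bit lists differ."""
    return sum(a != b for a, b in zip(x, y))


def ghd_answer(x, y):
    """Return 'far' if the distance is at least n/2 + sqrt(n), 'close' if it is
    at most n/2 - sqrt(n), and None if the inputs violate the promise."""
    n = len(x)
    d = hamming_distance(x, y)
    if d >= n / 2 + math.sqrt(n):
        return "far"
    if d <= n / 2 - math.sqrt(n):
        return "close"
    return None


def naive_protocol(x, y):
    """Naive protocol: Alice sends all n bits of x, Bob decides locally.
    Returns (answer, number_of_bits_communicated)."""
    message = list(x)  # Alice's entire input is the message
    return ghd_answer(message, y), len(message)
```

For example, `naive_protocol([0] * 16, [1] * 16)` reports the "far" case after communicating 16 bits. The lower bound stated in the abstract says that no randomized protocol can improve on this cost by more than a constant factor.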

arXiv.org e-Print Archive

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Dartmouth Digital Commons (Dartmouth College)

Stochastic Streams: Sample Complexity vs. Space Complexity

Author: Crouch Michael
McGregor Andrew
Valiant Gregory
Woodruff David P.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 24th Annual European Symposium on Algorithms (ESA 2016)
Publication date: 01/01/2016

We address the trade-off between the computational resources needed to process a large data set and the number of samples available from the data set. Specifically, we consider the following abstraction: we receive a potentially infinite stream of IID samples from some unknown distribution D, and are tasked with computing some function f(D). If the stream is observed for time t, how much memory, s, is required to estimate f(D)? We refer to t as the sample complexity and s as the space complexity. The main focus of this paper is investigating the trade-offs between the space and sample complexity. We study these trade-offs for several canonical problems studied in the data stream model: estimating the collision probability, i.e., the second moment of a distribution, deciding if a graph is connected, and approximating the dimension of an unknown subspace. Our results are based on techniques for simulating different classical sampling procedures in this model, emulating random walks given a sequence of IID samples, as well as leveraging a characterization between communication bounded protocols and statistical query algorithms.
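To make the space versus sample trade-off concrete, here is a minimal sketch of one very low-memory way to estimate the collision probability $\sum_i p_i^2$ from an IID stream: compare disjoint consecutive pairs of samples and report the fraction that collide. This is a standard textbook estimator chosen for illustration, not the algorithm from the paper, and the name `collision_probability_pairs` is hypothetical.

```python
def collision_probability_pairs(stream):
    """Estimate sum_i p_i^2 from IID samples using disjoint consecutive pairs.

    Only one buffered sample and two counters are kept, so the memory use is
    constant; the price is that t samples yield only t // 2 comparisons.
    """
    pairs = 0
    collisions = 0
    previous = None
    have_previous = False
    for item in stream:
        if have_previous:
            # Second element of a pair: compare it with the buffered sample.
            pairs += 1
            collisions += int(item == previous)
            have_previous = False
        else:
            # First element of a pair: buffer it.
            previous = item
            have_previous = True
    return collisions / pairs if pairs else 0.0
```

Each pair collides with probability exactly $\sum_i p_i^2$, so the estimate is unbiased. Comparing all pairs of samples would use the same stream more efficiently but requires remembering many more items, which is the kind of trade-off the abstract refers to.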

Dagstuhl Research Online Publication Server

Communication Complexity of Inner Product in Symmetric Normed Spaces

Author: Andoni Alexandr
Błasiok Jarosław
Filtser Arnold
Publication date: 24/11/2022

We introduce and study the communication complexity of computing the inner product of two vectors, where the input is restricted w.r.t. a norm $N$ on the space $\mathbb{R}^n$. Here, Alice and Bob hold two vectors $v,u$ such that $\|v\|_N\le 1$ and $\|u\|_{N^*}\le 1$, where $N^*$ is the dual norm. They want to compute their inner product $\langle v,u \rangle$ up to an $\varepsilon$ additive term. The problem is denoted by $\mathrm{IP}_N$. We systematically study $\mathrm{IP}_N$, showing the following results:
- For any symmetric norm $N$, given $\|v\|_N\le 1$ and $\|u\|_{N^*}\le 1$ there is a randomized protocol for $\mathrm{IP}_N$ using $\tilde{\mathcal{O}}(\varepsilon^{-6} \log n)$ bits -- we will denote this by $\mathcal{R}_{\varepsilon,1/3}(\mathrm{IP}_{N}) \leq \tilde{\mathcal{O}}(\varepsilon^{-6} \log n)$.
- One way communication complexity $\overrightarrow{\mathcal{R}}(\mathrm{IP}_{\ell_p})\leq\mathcal{O}(\varepsilon^{-\max(2,p)}\cdot \log\frac n\varepsilon)$, and a nearly matching lower bound $\overrightarrow{\mathcal{R}}(\mathrm{IP}_{\ell_p}) \geq \Omega(\varepsilon^{-\max(2,p)})$ for $\varepsilon^{-\max(2,p)} \ll n$.
- One way communication complexity $\overrightarrow{\mathcal{R}}(N)$ for a symmetric norm $N$ is governed by embeddings $\ell_\infty^k$ into $N$. Specifically, while a small distortion embedding easily implies a lower bound $\Omega(k)$, we show that, conversely, non-existence of such an embedding implies protocol with communication $k^{\mathcal{O}(\log \log k)} \log^2 n$.
- For arbitrary origin symmetric convex polytope $P$, we show $\mathcal{R}(\mathrm{IP}_{N}) \le\mathcal{O}(\varepsilon^{-2} \log \mathrm{xc}(P))$, where $N$ is the unique norm for which $P$ is a unit ball, and $\mathrm{xc}(P)$ is the extension complexity of $P$.

Comment: Accepted to ITCS 202

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Communication Complexity of Inner Product in Symmetric Normed Spaces

Author: Andoni Alexandr
Błasiok Jarosław
Filtser Arnold
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)
Publication date: 01/01/2023

Dagstuhl Research Online Publication Server

Robust lower bounds for communication and stream computation

Author: Chakrabarti Amit
Cormode Graham
McGregor Andrew
Publication venue: 'Theory of Computing Exchange'
Publication date: 28/08/2016

We study the communication complexity of evaluating functions when the input data is randomly allocated (according to some known distribution) amongst two or more players, possibly with information overlap. This naturally extends previously studied variable partition models such as the best-case and worst-case partition models. We aim to understand whether the hardness of a communication problem holds for almost every allocation of the input, as opposed to holding for perhaps just a few atypical partitions. A key application is to the heavily studied data stream model. There is a strong connection between our communication lower bounds and lower bounds in the data stream model that are “robust” to the ordering of the data. That is, we prove lower bounds for when the order of the items in the stream is chosen not adversarially but rather uniformly (or near-uniformly) from the set of all permutations. This random-order data stream model has attracted recent interest, since lower bounds here give stronger evidence for the inherent hardness of streaming problems. Our results include the first random-partition communication lower bounds for problems including multi-party set disjointness and gap-Hamming-distance. Both are tight. We also extend and improve previous results for a form of pointer jumping that is relevant to the problem of selection (in particular, median finding). Collectively, these results yield lower bounds for a variety of problems in the random-order data stream model, including estimating the number of distinct elements, approximating frequency moments, and quantile estimation. A short version of this article is available in the Proceedings of the 40th Annual ACM Symposium on Theory of Computing (STOC'08), ACM, pp. 641-650. Compared to the conference presentation, this version considerably expands the detail of the discussion and in the proofs, and substantially changes some of the proof techniques.

Crossref

Warwick Research Archives Portal Repository

Some Communication Complexity Results and their Applications

Author: Brody Joshua E
Publication venue: Dartmouth Digital Commons
Publication date: 01/11/2010

Communication Complexity represents one of the premier techniques for proving lower bounds in theoretical computer science. Lower bounds on communication problems can be leveraged to prove lower bounds in several different areas. In this work, we study three different communication complexity problems. The lower bounds for these problems have applications in circuit complexity, wireless sensor networks, and streaming algorithms. First, we study the multiparty pointer jumping problem. We present the first nontrivial upper bound for this problem. We also provide a suite of strong lower bounds under several restricted classes of protocols. Next, we initiate the study of several non-monotone functions in the distributed functional monitoring setting and provide several lower bounds. In particular, we give a generic adversarial technique and show that when deletions are allowed, no nontrivial protocol is possible. Finally, we study the Gap-Hamming-Distance problem and give tight lower bounds for protocols that use a constant number of messages. As a result, we take a well-known lower bound for one-pass streaming algorithms for a host of problems and extend it so it applies to streaming algorithms that use a constant number of passes.

Dartmouth Digital Commons (Dartmouth College)

LIPIcs, Volume 251, ITCS 2023, Complete Volume

Author: Tauman Kalai Yael
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 14th Innovations in Theoretical Computer Science Conference (ITCS 2023)
Publication date: 01/01/2023

LIPIcs, Volume 251, ITCS 2023, Complete Volume

Dagstuhl Research Online Publication Server

The average-case complexity of counting distinct elements

Author: David P. Woodruff
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2009

We continue the study of approximating the number of distinct elements in a data stream of length n to within a (1±ε) factor. It is known that if the stream may consist of arbitrary data arriving in an arbitrary order, then any 1-pass algorithm requires Ω(1/ε²) bits of space to perform this task. To try to bypass this lower bound, the problem was recently studied in a model in which the stream may consist of arbitrary data, but it arrives to the algorithm in a random order. However, even in this model an Ω(1/ε²) lower bound was established. This is because the adversary can still choose the data arbitrarily. This leaves open the possibility that the problem is only hard under a pathological choice of data, which would be of little practical relevance. We study the average-case complexity of this problem under certain distributions. Namely, we study the case when each successive stream item is drawn independently and uniformly at random from an unknown subset of d items for an unknown value of d. This captures the notion of random uncorrelated data. For a wide range of values of d and n, we design a 1-pass algorithm that bypasses the Ω(1/ε²) lower bound that holds in the adversarial and random-order models, thereby showing that this model admits more space-efficient algorithms. Moreover, the update time of our algorithm is optimal. Despite these positive results, for a certain range of values of d and n we show that estimating the number of distinct elements requires Ω(1/ε²) bits of space even in this model. Our lower bound subsumes previous bounds, showing that even for natural choices of data the problem is hard.
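For readers unfamiliar with the task, the following is a minimal sketch of a generic one-pass distinct-elements estimator, the well-known k-minimum-values (KMV) sketch. It is not the algorithm from this paper; it is included only to show why space on the order of 1/ε² arises naturally, since a KMV sketch with k stored hash values has relative error of roughly 1/√k. The helper names `_hash01` and `kmv_distinct_estimate`, and the use of SHA-256 as a stand-in for a random hash function, are assumptions made for illustration.

```python
import hashlib
import heapq


def _hash01(item):
    """Map an item to a pseudo-random float in [0, 1) via SHA-256 (assumed hash)."""
    digest = hashlib.sha256(repr(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def kmv_distinct_estimate(stream, k=256):
    """One-pass estimate of the number of distinct items, keeping k hash values."""
    heap = []        # negated hashes, so heap[0] is the largest retained hash
    members = set()  # hashes currently retained, to skip repeated items
    for item in stream:
        h = _hash01(item)
        if h in members:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # New hash is among the k smallest: evict the current largest.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        # Fewer than k distinct hashes seen: the count is exact.
        return float(len(heap))
    kth_smallest = -heap[0]
    return (k - 1) / kth_smallest
```

Choosing k of about 1/ε² targets a (1±ε) approximation with constant probability, which matches the order of the worst-case lower bound quoted in the abstract up to the size of each stored hash value.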

CiteSeerX

Crossref