18,532 research outputs found
Static Data Structure Lower Bounds Imply Rigidity
We show that static data structure lower bounds in the group (linear) model
imply semi-explicit lower bounds on matrix rigidity. In particular, we prove
that an explicit lower bound of on the cell-probe
complexity of linear data structures in the group model, even against
arbitrarily small linear space , would already imply a
semi-explicit () construction of rigid matrices with
significantly better parameters than the current state of art (Alon, Panigrahy
and Yekhanin, 2009). Our results further assert that polynomial () data structure lower bounds against near-optimal space, would
imply super-linear circuit lower bounds for log-depth linear circuits (a
four-decade open question). In the succinct space regime , we show
that any improvement on current cell-probe lower bounds in the linear model
would also imply new rigidity bounds. Our results rely on a new connection
between the "inner" and "outer" dimensions of a matrix (Paturi and Pudlak,
2006), and on a new reduction from worst-case to average-case rigidity, which
is of independent interest
Equivalence of Systematic Linear Data Structures and Matrix Rigidity
Recently, Dvir, Golovnev, and Weinstein have shown that sufficiently strong
lower bounds for linear data structures would imply new bounds for rigid
matrices. However, their result utilizes an algorithm that requires an
oracle, and hence, the rigid matrices are not explicit. In this work, we derive
an equivalence between rigidity and the systematic linear model of data
structures. For the -dimensional inner product problem with queries, we
prove that lower bounds on the query time imply rigidity lower bounds for the
query set itself. In particular, an explicit lower bound of
for redundant storage bits would
yield better rigidity parameters than the best bounds due to Alon, Panigrahy,
and Yekhanin. We also prove a converse result, showing that rigid matrices
directly correspond to hard query sets for the systematic linear model. As an
application, we prove that the set of vectors obtained from rank one binary
matrices is rigid with parameters matching the known results for explicit sets.
This implies that the vector-matrix-vector problem requires query time
for redundancy in the systematic linear
model, improving a result of Chakraborty, Kamma, and Larsen. Finally, we prove
a cell probe lower bound for the vector-matrix-vector problem in the high error
regime, improving a result of Chattopadhyay, Kouck\'{y}, Loff, and
Mukhopadhyay.Comment: 23 pages, 1 tabl
Probabilistic Polynomials and Hamming Nearest Neighbors
We show how to compute any symmetric Boolean function on variables over
any field (as well as the integers) with a probabilistic polynomial of degree
and error at most . The degree
dependence on and is optimal, matching a lower bound of Razborov
(1987) and Smolensky (1987) for the MAJORITY function. The proof is
constructive: a low-degree polynomial can be efficiently sampled from the
distribution.
This polynomial construction is combined with other algebraic ideas to give
the first subquadratic time algorithm for computing a (worst-case) batch of
Hamming distances in superlogarithmic dimensions, exactly. To illustrate, let
. Suppose we are given a database
of vectors in and a collection of query vectors
in the same dimension. For all , we wish to compute a
with minimum Hamming distance from . We solve this problem in randomized time. Hence, the problem is in "truly subquadratic"
time for dimensions, and in subquadratic time for . We apply the algorithm to computing pairs with maximum
inner product, closest pair in for vectors with bounded integer
entries, and pairs with maximum Jaccard coefficients.Comment: 16 pages. To appear in 56th Annual IEEE Symposium on Foundations of
Computer Science (FOCS 2015
A Lower Bound for Sampling Disjoint Sets
Suppose Alice and Bob each start with private randomness and no other input, and they wish to engage in a protocol in which Alice ends up with a set x subseteq[n] and Bob ends up with a set y subseteq[n], such that (x,y) is uniformly distributed over all pairs of disjoint sets. We prove that for some constant beta0 of the uniform distribution over all pairs of disjoint sets of size sqrt{n}
Data Structures Lower Bounds and Popular Conjectures
In this paper, we investigate the relative power of several conjectures that attracted recently lot of interest. We establish a connection between the Network Coding Conjecture (NCC) of Li and Li [Li and Li, 2004] and several data structure problems such as non-adaptive function inversion of Hellman [M. Hellman, 1980] and the well-studied problem of polynomial evaluation and interpolation. In turn these data structure problems imply super-linear circuit lower bounds for explicit functions such as integer sorting and multi-point polynomial evaluation
Tight Lower Bounds for Data-Dependent Locality-Sensitive Hashing
We prove a tight lower bound for the exponent for data-dependent
Locality-Sensitive Hashing schemes, recently used to design efficient solutions
for the -approximate nearest neighbor search. In particular, our lower bound
matches the bound of for the space,
obtained via the recent algorithm from [Andoni-Razenshteyn, STOC'15].
In recent years it emerged that data-dependent hashing is strictly superior
to the classical Locality-Sensitive Hashing, when the hash function is
data-independent. In the latter setting, the best exponent has been already
known: for the space, the tight bound is , with the upper
bound from [Indyk-Motwani, STOC'98] and the matching lower bound from
[O'Donnell-Wu-Zhou, ITCS'11].
We prove that, even if the hashing is data-dependent, it must hold that
. To prove the result, we need to formalize the
exact notion of data-dependent hashing that also captures the complexity of the
hash functions (in addition to their collision properties). Without restricting
such complexity, we would allow for obviously infeasible solutions such as the
Voronoi diagram of a dataset. To preclude such solutions, we require our hash
functions to be succinct. This condition is satisfied by all the known
algorithmic results.Comment: 16 pages, no figure
An Optimal Lower Bound on the Communication Complexity of Gap-Hamming-Distance
We prove an optimal lower bound on the randomized communication
complexity of the much-studied Gap-Hamming-Distance problem. As a consequence,
we obtain essentially optimal multi-pass space lower bounds in the data stream
model for a number of fundamental problems, including the estimation of
frequency moments.
The Gap-Hamming-Distance problem is a communication problem, wherein Alice
and Bob receive -bit strings and , respectively. They are promised
that the Hamming distance between and is either at least
or at most , and their goal is to decide which of these is the
case. Since the formal presentation of the problem by Indyk and Woodruff (FOCS,
2003), it had been conjectured that the naive protocol, which uses bits of
communication, is asymptotically optimal. The conjecture was shown to be true
in several special cases, e.g., when the communication is deterministic, or
when the number of rounds of communication is limited.
The proof of our aforementioned result, which settles this conjecture fully,
is based on a new geometric statement regarding correlations in Gaussian space,
related to a result of C. Borell (1985). To prove this geometric statement, we
show that random projections of not-too-small sets in Gaussian space are close
to a mixture of translated normal variables
- …