Brief Announcement: Hamming Distance Completeness and Sparse Matrix Multiplication
We show that a broad class of (+, diamond) vector products (for binary integer functions diamond) are equivalent under one-to-polylog reductions to the computation of the Hamming distance. Examples include the dominance product, the threshold product and l_{2p+1} distances for constant p. Our results imply equivalence (up to poly log n factors) between the complexity of computing All Pairs: Hamming Distances, l_{2p+1} Distances, Dominance Products and Threshold Products. As a consequence, Yuster's (SODA'09) algorithm improves not only Matousek's (IPL'91), but also the results of Indyk, Lewenstein, Lipsky and Porat (ICALP'04) and Min, Kao and Zhu (COCOON'09). Furthermore, our reductions apply to the pattern matching setting, showing equivalence (up to poly log n factors) between pattern matching under Hamming Distance, l_{2p+1} Distance, Dominance Product and Threshold Product, with current best upper bounds due to results of Abrahamson (SICOMP'87), Amir and Farach (Ann. Math. Artif. Intell.'91), Atallah and Duket (IPL'11), Clifford, Clifford and Iliopoulos (CPM'05) and Amir, Lipsky, Porat and Umanski (CPM'05). The resulting algorithms for l_{2p+1} Pattern Matching and All Pairs l_{2p+1}, for 2p+1 = 3,5,7,... are new.
Additionally, we show that the complexity of AllPairsHammingDistances (and thus of the other aforementioned AllPairs- problems) is within poly log n of the time it takes to multiply matrices n x (n * d) and (n * d) x n, each with (n * d) non-zero entries. This means that the current upper bounds by Yuster (SODA'09) cannot be improved without improving the sparse matrix multiplication algorithm by Yuster and Zwick (ACM TALG'05), and vice versa.
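To make the link between all-pairs Hamming distances and sparse matrix products concrete, the following is a minimal sketch and not the reduction from the paper. It assumes equal-length input vectors over an arbitrary alphabet and encodes each vector as a sparse indicator row over (position, symbol) pairs, so that n vectors of length d contribute n * d non-zero entries in total. The number of matching positions between two vectors is then the inner product of their indicator rows. The function name `all_pairs_hamming` is hypothetical.

```python
from collections import defaultdict


def all_pairs_hamming(A, B):
    """Return H where H[i][j] is the Hamming distance between A[i] and B[j].

    Assumes every vector in A and B has the same length d.
    """
    d = len(A[0])

    # Sparse indicator "matrix" for B, stored column-wise:
    # (position, symbol) -> rows of B that carry that symbol at that position.
    index = defaultdict(list)
    for j, b in enumerate(B):
        for pos, sym in enumerate(b):
            index[(pos, sym)].append(j)

    # Sparse product of the two indicator matrices: matches[i][j] counts
    # the positions where A[i] and B[j] agree.
    matches = [[0] * len(B) for _ in A]
    for i, a in enumerate(A):
        for pos, sym in enumerate(a):
            for j in index.get((pos, sym), ()):
                matches[i][j] += 1

    # Hamming distance = length minus number of agreeing positions.
    return [[d - m for m in row] for row in matches]


if __name__ == "__main__":
    print(all_pairs_hamming(["abc", "abd"], ["abc", "xbd"]))
```

This sketch still takes quadratic time in the worst case (for example, when all vectors are identical); it only illustrates why the problem can be phrased as a product of two matrices with n * d non-zero entries each.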
Probabilistic Polynomials and Hamming Nearest Neighbors
We show how to compute any symmetric Boolean function on variables over
any field (as well as the integers) with a probabilistic polynomial of degree
and error at most . The degree
dependence on and is optimal, matching a lower bound of Razborov
(1987) and Smolensky (1987) for the MAJORITY function. The proof is
constructive: a low-degree polynomial can be efficiently sampled from the
distribution.
This polynomial construction is combined with other algebraic ideas to give
the first subquadratic time algorithm for computing a (worst-case) batch of
Hamming distances in superlogarithmic dimensions, exactly. To illustrate, let
. Suppose we are given a database
of vectors in and a collection of query vectors
in the same dimension. For all , we wish to compute a
with minimum Hamming distance from . We solve this problem in randomized time. Hence, the problem is in "truly subquadratic"
time for dimensions, and in subquadratic time for . We apply the algorithm to computing pairs with maximum
inner product, closest pair in for vectors with bounded integer
entries, and pairs with maximum Jaccard coefficients.
Comment: 16 pages. To appear in 56th Annual IEEE Symposium on Foundations of
Computer Science (FOCS 2015).
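For reference, the batch problem described in this abstract (for every query vector, find a database vector at minimum Hamming distance) has a straightforward quadratic-time baseline, which is the bound the paper's algorithm improves on. The sketch below is that naive baseline only, not the probabilistic-polynomial method. It assumes, for illustration, that each binary vector is packed into a Python integer; the name `hamming_nearest` is hypothetical.

```python
def hamming_nearest(database, queries):
    """For each query, return a database vector at minimum Hamming distance.

    Vectors are binary and packed into non-negative Python ints
    (an assumption made for this sketch). Ties go to the earliest entry.
    """
    result = []
    for q in queries:
        # XOR marks the differing bit positions; counting the ones
        # gives the Hamming distance.
        best = min(database, key=lambda v: bin(v ^ q).count("1"))
        result.append(best)
    return result


if __name__ == "__main__":
    db = [0b0000, 0b1111, 0b1010]
    print(hamming_nearest(db, [0b1011, 0b0001]))
```

With n database vectors and n queries this performs n^2 distance computations, which is the cost any subquadratic algorithm has to beat.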
On the hardness of learning sparse parities
This work investigates the hardness of computing sparse solutions to systems
of linear equations over F_2. Consider the k-EvenSet problem: given a
homogeneous system of linear equations over F_2 on n variables, decide if there
exists a nonzero solution of Hamming weight at most k (i.e. a k-sparse
solution). While there is a simple O(n^{k/2})-time algorithm for it,
establishing fixed parameter intractability for k-EvenSet has been a notorious
open problem. Towards this goal, we show that unless k-Clique can be solved in
n^{o(k)} time, k-EvenSet has no poly(n)2^{o(sqrt{k})} time algorithm and no
polynomial time algorithm when k = (log n)^{2+eta} for any eta > 0.
Our work also shows that the non-homogeneous generalization of the problem --
which we call k-VectorSum -- is W[1]-hard on instances where the number of
equations is O(k log n), improving on previous reductions which produced
Omega(n) equations. We also show that for any constant eps > 0, given a system
of O(exp(O(k))log n) linear equations, it is W[1]-hard to decide if there is a
k-sparse linear form satisfying all the equations or if every function on at
most k-variables (k-junta) satisfies at most (1/2 + eps)-fraction of the
equations. In the setting of computational learning, this shows hardness of
approximate non-proper learning of k-parities. In a similar vein, we use the
hardness of k-EvenSet to show that for any constant d, unless k-Clique can
be solved in n^{o(k)} time there is no poly(m, n)2^{o(sqrt{k})} time algorithm
to decide whether a given set of m points in F_2^n satisfies: (i) there exists
a non-trivial k-sparse homogeneous linear form evaluating to 0 on all the
points, or (ii) any non-trivial degree d polynomial P supported on at most k
variables evaluates to zero on approx. Pr_{F_2^n}[P(z) = 0] fraction of the
points, i.e., P is fooled by the set of points.
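The k-EvenSet problem defined at the start of this abstract can be stated in a few lines of code. The sketch below is a naive exhaustive search over all supports of size at most k, so it runs in roughly n^k time; it is not the O(n^{k/2}) algorithm the abstract mentions. It assumes each equation is given as an integer bitmask over the n variables, and the name `has_sparse_solution` is hypothetical.

```python
from itertools import combinations


def has_sparse_solution(rows, n, k):
    """Decide whether the homogeneous system over F_2 has a nonzero
    solution of Hamming weight at most k.

    rows: one int bitmask per equation; bit i set means variable i
          appears in that equation (an encoding assumed for this sketch).
    n:    number of variables.
    """
    for weight in range(1, k + 1):
        for support in combinations(range(n), weight):
            # Candidate solution x with ones exactly on `support`.
            x = sum(1 << i for i in support)
            # An equation holds iff it shares an even number of
            # variables with x (its parity over F_2 is 0).
            if all(bin(r & x).count("1") % 2 == 0 for r in rows):
                return True
    return False


if __name__ == "__main__":
    # x0 + x1 = 0 and x1 + x2 = 0 force x0 = x1 = x2.
    rows = [0b011, 0b110]
    print(has_sparse_solution(rows, 3, 2), has_sparse_solution(rows, 3, 3))
```

In the usage example the only nonzero solution is the all-ones vector, so the answer flips from negative to positive exactly when k reaches 3.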
Distributed PCP Theorems for Hardness of Approximation in P
We present a new distributed model of probabilistically checkable proofs
(PCP). A satisfying assignment to a CNF formula is
shared between two parties, where Alice knows , Bob knows
, and both parties know . The goal is to have
Alice and Bob jointly write a PCP that satisfies , while
exchanging little or no information. Unfortunately, this model as-is does not
allow for nontrivial query complexity. Instead, we focus on a non-deterministic
variant, where the players are helped by Merlin, a third party who knows all of
.
Using our framework, we obtain, for the first time, PCP-like reductions from
the Strong Exponential Time Hypothesis (SETH) to approximation problems in P.
In particular, under SETH we show that there are no truly-subquadratic
approximation algorithms for Bichromatic Maximum Inner Product over
{0,1}-vectors, Bichromatic LCS Closest Pair over permutations, Approximate
Regular Expression Matching, and Diameter in Product Metric. All our
inapproximability factors are nearly-tight. In particular, for the first two
problems we obtain nearly-polynomial factors of ; only
-factor lower bounds (under SETH) were known before.
- …