Search CORE

1,154 research outputs found

Brief Announcement: Hamming Distance Completeness and Sparse Matrix Multiplication

Author: Graf Daniel
Labib Karim
Uznanski Przemyslaw
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018)
Publication date: 01/01/2018
Field of study

We show that a broad class of (+, diamond) vector products (for binary integer functions diamond) are equivalent under one-to-polylog reductions to the computation of the Hamming distance. Examples include: the dominance product, the threshold product and l_{2p+1} distances for constant p. Our results imply equivalence (up to poly log n factors) between complexity of computation of All Pairs: Hamming Distances, l_{2p+1} Distances, Dominance Products and Threshold Products. As a consequence, Yuster\u27s (SODA\u2709) algorithm improves not only Matousek\u27s (IPL\u2791), but also the results of Indyk, Lewenstein, Lipsky and Porat (ICALP\u2704) and Min, Kao and Zhu (COCOON\u2709). Furthermore, our reductions apply to the pattern matching setting, showing equivalence (up to poly log n factors) between pattern matching under Hamming Distance, l_{2p+1} Distance, Dominance Product and Threshold Product, with current best upperbounds due to results of Abrahamson (SICOMP\u2787), Amir and Farach (Ann. Math. Artif. Intell.\u2791), Atallah and Duket (IPL\u2711), Clifford, Clifford and Iliopoulous (CPM\u2705) and Amir, Lipsky, Porat and Umanski (CPM\u2705). The resulting algorithms for l_{2p+1} Pattern Matching and All Pairs l_{2p+1}, for 2p+1 = 3,5,7,... are new. Additionally, we show that the complexity of AllPairsHammingDistances (and thus of other aforementioned AllPairs- problems) is within poly log n from the time it takes to multiply matrices n x (n * d) and (n * d) x n, each with (n * d) non-zero entries. This means that the current upperbounds by Yuster (SODA\u2709) cannot be improved without improving the sparse matrix multiplication algorithm by Yuster and Zwick (ACM TALG\u2705) and vice versa

Dagstuhl Research Online Publication Server

Probabilistic Polynomials and Hamming Nearest Neighbors

Author: Alman Josh
Williams Ryan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/07/2015
Field of study

We show how to compute any symmetric Boolean function on

n

variables over any field (as well as the integers) with a probabilistic polynomial of degree

O(\sqrt{n \log(1/\epsilon)})

and error at most

\epsilon

. The degree dependence on

n

and

\epsilon

is optimal, matching a lower bound of Razborov (1987) and Smolensky (1987) for the MAJORITY function. The proof is constructive: a low-degree polynomial can be efficiently sampled from the distribution. This polynomial construction is combined with other algebraic ideas to give the first subquadratic time algorithm for computing a (worst-case) batch of Hamming distances in superlogarithmic dimensions, exactly. To illustrate, let

c(n) : \mathbb{N} \rightarrow \mathbb{N}

. Suppose we are given a database

D

n

vectors in

\{0,1\}^{c(n) \log n}

and a collection of

n

query vectors

Q

in the same dimension. For all

u \in Q

, we wish to compute a

v \in D

with minimum Hamming distance from

u

. We solve this problem in

n^{2-1/O(c(n) \log^2 c(n))}

randomized time. Hence, the problem is in "truly subquadratic" time for

O(\log n)

dimensions, and in subquadratic time for

d = o((\log^2 n)/(\log \log n)^2)

. We apply the algorithm to computing pairs with maximum inner product, closest pair in

\ell_1

for vectors with bounded integer entries, and pairs with maximum Jaccard coefficients.Comment: 16 pages. To appear in 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2015

arXiv.org e-Print Archive

Crossref

On the hardness of learning sparse parities

Author: Bhattacharyya Arnab
Gadekar Ameet
Ghoshal Suprovat
Saket Rishi
Publication venue
Publication date: 25/11/2015
Field of study

This work investigates the hardness of computing sparse solutions to systems of linear equations over F_2. Consider the k-EvenSet problem: given a homogeneous system of linear equations over F_2 on n variables, decide if there exists a nonzero solution of Hamming weight at most k (i.e. a k-sparse solution). While there is a simple O(n^{k/2})-time algorithm for it, establishing fixed parameter intractability for k-EvenSet has been a notorious open problem. Towards this goal, we show that unless k-Clique can be solved in n^{o(k)} time, k-EvenSet has no poly(n)2^{o(sqrt{k})} time algorithm and no polynomial time algorithm when k = (log n)^{2+eta} for any eta > 0. Our work also shows that the non-homogeneous generalization of the problem -- which we call k-VectorSum -- is W[1]-hard on instances where the number of equations is O(k log n), improving on previous reductions which produced Omega(n) equations. We also show that for any constant eps > 0, given a system of O(exp(O(k))log n) linear equations, it is W[1]-hard to decide if there is a k-sparse linear form satisfying all the equations or if every function on at most k-variables (k-junta) satisfies at most (1/2 + eps)-fraction of the equations. In the setting of computational learning, this shows hardness of approximate non-proper learning of k-parities. In a similar vein, we use the hardness of k-EvenSet to show that that for any constant d, unless k-Clique can be solved in n^{o(k)} time there is no poly(m, n)2^{o(sqrt{k}) time algorithm to decide whether a given set of m points in F_2^n satisfies: (i) there exists a non-trivial k-sparse homogeneous linear form evaluating to 0 on all the points, or (ii) any non-trivial degree d polynomial P supported on at most k variables evaluates to zero on approx. Pr_{F_2^n}[P(z) = 0] fraction of the points i.e., P is fooled by the set of points

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Distributed PCP Theorems for Hardness of Approximation in P

Author: Abboud Amir
Rubinstein Aviad
Williams Ryan
Publication venue
Publication date: 01/01/1952
Field of study

We present a new distributed model of probabilistically checkable proofs (PCP). A satisfying assignment

x \in \{0,1\}^n

to a CNF formula

\varphi

is shared between two parties, where Alice knows

x_1, \dots, x_{n/2}

, Bob knows

x_{n/2+1},\dots,x_n

, and both parties know

\varphi

. The goal is to have Alice and Bob jointly write a PCP that

x

satisfies

\varphi

, while exchanging little or no information. Unfortunately, this model as-is does not allow for nontrivial query complexity. Instead, we focus on a non-deterministic variant, where the players are helped by Merlin, a third party who knows all of

x

. Using our framework, we obtain, for the first time, PCP-like reductions from the Strong Exponential Time Hypothesis (SETH) to approximation problems in P. In particular, under SETH we show that there are no truly-subquadratic approximation algorithms for Bichromatic Maximum Inner Product over {0,1}-vectors, Bichromatic LCS Closest Pair over permutations, Approximate Regular Expression Matching, and Diameter in Product Metric. All our inapproximability factors are nearly-tight. In particular, for the first two problems we obtain nearly-polynomial factors of

2^{(\log n)^{1-o(1)}}

; only

(1+o(1))

-factor lower bounds (under SETH) were known before

arXiv.org e-Print Archive

Biblioteca Virtual del Patrimonio Bibliográfico (Virtual Library of Bibliographical Heritage)

Crossref