34 research outputs found

Probabilistic Polynomials and Hamming Nearest Neighbors

Authors: Alman Josh, Williams Ryan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/07/2015

We show how to compute any symmetric Boolean function on $n$ variables over any field (as well as the integers) with a probabilistic polynomial of degree $O(\sqrt{n \log(1/\epsilon)})$ and error at most $\epsilon$. The degree dependence on $n$ and $\epsilon$ is optimal, matching a lower bound of Razborov (1987) and Smolensky (1987) for the MAJORITY function. The proof is constructive: a low-degree polynomial can be efficiently sampled from the distribution. This polynomial construction is combined with other algebraic ideas to give the first subquadratic time algorithm for computing a (worst-case) batch of Hamming distances in superlogarithmic dimensions, exactly. To illustrate, let $c(n) : \mathbb{N} \rightarrow \mathbb{N}$. Suppose we are given a database $D$ of $n$ vectors in $\{0,1\}^{c(n) \log n}$ and a collection of $n$ query vectors $Q$ in the same dimension. For all $u \in Q$, we wish to compute a $v \in D$ with minimum Hamming distance from $u$. We solve this problem in $n^{2-1/O(c(n) \log^2 c(n))}$ randomized time. Hence, the problem is in "truly subquadratic" time for $O(\log n)$ dimensions, and in subquadratic time for $d = o((\log^2 n)/(\log \log n)^2)$. We apply the algorithm to computing pairs with maximum inner product, closest pair in $\ell_1$ for vectors with bounded integer entries, and pairs with maximum Jaccard coefficients.

Comment: 16 pages. To appear in 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2015)
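For orientation, the batch problem described in this abstract can be stated as a quadratic-time baseline. The sketch below is not the paper's algorithm (which relies on probabilistic polynomials); it is only the brute-force comparison that a subquadratic method would improve on. The helper names `pack` and `batch_hamming_nn`, and the choice to store vectors as integer bitmasks, are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def pack(bits: str) -> int:
    """Pack a 0/1 string such as '0110' into an integer bitmask."""
    return int(bits, 2)


def batch_hamming_nn(database: List[int], queries: List[int]) -> List[Tuple[int, Optional[int]]]:
    """For each query, return (index of a nearest database vector, distance).

    Brute-force baseline: every query is compared against every database
    vector, so the running time is proportional to len(database) * len(queries).
    Ties are broken in favour of the earliest database index.
    """
    results = []
    for u in queries:
        best_idx, best_dist = -1, None
        for i, v in enumerate(database):
            dist = bin(u ^ v).count("1")  # Hamming distance = popcount of XOR
            if best_dist is None or dist < best_dist:
                best_idx, best_dist = i, dist
        results.append((best_idx, best_dist))
    return results


# Hypothetical toy instance: three database vectors, two queries, dimension 4.
D = [pack("0000"), pack("1111"), pack("1010")]
Q = [pack("1110"), pack("0001")]
nearest = batch_hamming_nn(D, Q)
```

Each entry of `nearest` pairs a query with one database index at minimum Hamming distance, which matches the "for all $u \in Q$, compute a $v \in D$" formulation above.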

arXiv.org e-Print Archive

Crossref

Hybrid LSH: Faster Near Neighbors Reporting in High-dimensional Space

Author: Pham Ninh
Publication date: 01/01/2017

We study the $r$-near neighbors reporting problem ($r$-NN), i.e., reporting all points in a high-dimensional point set $S$ that lie within a radius $r$ of a given query point $q$. Our approach builds upon the locality-sensitive hashing (LSH) framework due to its appealing asymptotic sublinear query time for near neighbor search problems in high-dimensional space. A bottleneck of the traditional LSH scheme for solving $r$-NN is that its performance is sensitive to data and query-dependent parameters. On datasets whose data distributions have diverse local density patterns, LSH with inappropriate tuning parameters can sometimes be outperformed by a simple linear search. In this paper, we introduce a hybrid search strategy between LSH-based search and linear search for $r$-NN in high-dimensional space. By integrating an auxiliary data structure into LSH hash tables, we can efficiently estimate the computational cost of LSH-based search for a given query regardless of the data distribution. This means that we are able to choose the appropriate search strategy between LSH-based search and linear search to achieve better performance. Moreover, the integrated data structure is time efficient and fits well with many recent state-of-the-art LSH-based approaches. Our experiments on real-world datasets show that the hybrid search approach outperforms (or is comparable to) both LSH-based search and linear search for a wide range of search radii and data distributions in high-dimensional space.

Comment: Accepted as a short paper in EDBT 201
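The idea of switching between LSH-based search and linear search can be illustrated with a small sketch. This is not the paper's data structure: the cost estimate used here is simply the total size of the colliding buckets, a crude stand-in assumed for illustration, and the class name `HybridBitSamplingLSH`, its parameters, and the use of bit-sampling LSH over Hamming space are likewise assumptions.

```python
import random
from collections import defaultdict
from typing import List, Sequence, Tuple

Point = Tuple[int, ...]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of coordinates in which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


class HybridBitSamplingLSH:
    """Bit-sampling LSH over {0,1}^d with a fallback to linear scan."""

    def __init__(self, points: Sequence[Point], num_tables: int = 4,
                 bits_per_key: int = 3, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.points = list(points)
        dim = len(self.points[0])
        # Each table hashes a point to the values of a few sampled coordinates.
        self.projections = [rng.sample(range(dim), bits_per_key)
                            for _ in range(num_tables)]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        for idx, p in enumerate(self.points):
            for proj, table in zip(self.projections, self.tables):
                table[tuple(p[i] for i in proj)].append(idx)

    def report(self, q: Point, r: int) -> Tuple[str, List[int]]:
        """Return (strategy used, sorted indices of reported points within r)."""
        buckets = [table.get(tuple(q[i] for i in proj), [])
                   for proj, table in zip(self.projections, self.tables)]
        # Assumed cost proxy: number of bucket entries the LSH search would touch.
        estimated_cost = sum(len(b) for b in buckets)
        if estimated_cost >= len(self.points):
            strategy, candidates = "linear", range(len(self.points))
        else:
            strategy, candidates = "lsh", set().union(*buckets)
        hits = sorted(i for i in candidates
                      if hamming(self.points[i], q) <= r)
        return strategy, hits


# Hypothetical toy data in dimension 6.
points = [(0, 0, 0, 0, 0, 0), (0, 0, 0, 0, 0, 1),
          (1, 1, 1, 1, 1, 1), (1, 1, 1, 0, 1, 1)]
index = HybridBitSamplingLSH(points, num_tables=3, bits_per_key=2, seed=1)
strategy, hits = index.report((0, 0, 0, 0, 0, 0), r=1)
```

When the linear branch is taken the report is exact; when the LSH branch is taken some near neighbors may be missed, as with any LSH scheme, but every reported index is verified to lie within radius `r`.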

arXiv.org e-Print Archive

Copenhagen University Research Information System

Fine-Grained Complexity Theory: Conditional Lower Bounds for Computational Geometry

Author: Bringmann K.
Publication date: 01/01/2021

Fine-grained complexity theory is the area of theoretical computer science that proves conditional lower bounds based on the Strong Exponential Time Hypothesis and similar conjectures. This area has been thriving in the last decade, leading to conditionally best-possible algorithms for a wide variety of problems on graphs, strings, numbers etc. This article is an introduction to fine-grained lower bounds in computational geometry, with a focus on lower bounds for polynomial-time problems based on the Orthogonal Vectors Hypothesis. Specifically, we discuss conditional lower bounds for nearest neighbor search under the Euclidean distance and Fréchet distance.
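The Orthogonal Vectors problem behind the hypothesis mentioned here asks whether two sets of 0/1 vectors contain a pair with inner product zero. A minimal brute-force sketch follows; the function name `find_orthogonal_pair` and the bitmask encoding are assumptions for illustration, and the hypothesis concerns whether this quadratic dependence on the number of vectors can be substantially improved.

```python
from typing import List, Optional, Tuple


def find_orthogonal_pair(A: List[int], B: List[int]) -> Optional[Tuple[int, int]]:
    """Return indices (i, j) with A[i] and B[j] orthogonal, or None.

    Vectors are 0/1 vectors stored as integer bitmasks, so two vectors are
    orthogonal exactly when their bitwise AND is zero. All pairs are
    checked, which takes time proportional to len(A) * len(B).
    """
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            if a & b == 0:
                return i, j
    return None


# Hypothetical toy instance in dimension 4.
A = [0b1100, 0b1010]
B = [0b0110, 0b0101]
pair = find_orthogonal_pair(A, B)
```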

MPG.PuRe

Strong ETH Breaks With Merlin and Arthur: Short Non-Interactive Proofs of Batch Evaluation

Author: Williams Ryan
Publication date: 01/01/2016

We present an efficient proof system for Multipoint Arithmetic Circuit Evaluation: for every arithmetic circuit $C(x_1,\ldots,x_n)$ of size $s$ and degree $d$ over a field $\mathbb{F}$, and any inputs $a_1,\ldots,a_K \in \mathbb{F}^n$,

• the Prover sends the Verifier the values $C(a_1), \ldots, C(a_K) \in \mathbb{F}$ and a proof of $\tilde{O}(K \cdot d)$ length, and

• the Verifier tosses $\textrm{poly}(\log(dK|\mathbb{F}|/\varepsilon))$ coins and can check the proof in about $\tilde{O}(K \cdot(n + d) + s)$ time, with probability of error less than $\varepsilon$.

For small degree $d$, this "Merlin-Arthur" proof system (a.k.a. MA-proof system) runs in nearly-linear time, and has many applications. For example, we obtain MA-proof systems that run in $c^{n}$ time (for various $c < 2$) for the Permanent, $\#$Circuit-SAT for all sublinear-depth circuits, counting Hamiltonian cycles, and infeasibility of $0$-$1$ linear programs. In general, the value of any polynomial in Valiant's class $\mathsf{VP}$ can be certified faster than "exhaustive summation" over all possible assignments. These results strongly refute a Merlin-Arthur Strong ETH and Arthur-Merlin Strong ETH posed by Russell Impagliazzo and others. We also give a three-round (AMA) proof system for quantified Boolean formulas running in $2^{2n/3+o(n)}$ time, nearly-linear time MA-proof systems for counting orthogonal vectors in a collection and finding Closest Pairs in the Hamming metric, and a MA-proof system running in $n^{k/2+O(1)}$-time for counting $k$-cliques in graphs. We point to some potential future directions for refuting the Nondeterministic Strong ETH.

Comment: 17 pages

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Distributed PCP Theorems for Hardness of Approximation in P

Authors: Abboud Amir, Rubinstein Aviad, Williams Ryan
Publication date: 01/01/1952

We present a new distributed model of probabilistically checkable proofs (PCP). A satisfying assignment $x \in \{0,1\}^n$ to a CNF formula $\varphi$ is shared between two parties, where Alice knows $x_1, \dots, x_{n/2}$, Bob knows $x_{n/2+1},\dots,x_n$, and both parties know $\varphi$. The goal is to have Alice and Bob jointly write a PCP that $x$ satisfies $\varphi$, while exchanging little or no information. Unfortunately, this model as-is does not allow for nontrivial query complexity. Instead, we focus on a non-deterministic variant, where the players are helped by Merlin, a third party who knows all of $x$. Using our framework, we obtain, for the first time, PCP-like reductions from the Strong Exponential Time Hypothesis (SETH) to approximation problems in P. In particular, under SETH we show that there are no truly-subquadratic approximation algorithms for Bichromatic Maximum Inner Product over {0,1}-vectors, Bichromatic LCS Closest Pair over permutations, Approximate Regular Expression Matching, and Diameter in Product Metric. All our inapproximability factors are nearly-tight. In particular, for the first two problems we obtain nearly-polynomial factors of $2^{(\log n)^{1-o(1)}}$; only $(1+o(1))$-factor lower bounds (under SETH) were known before.

arXiv.org e-Print Archive

Biblioteca Virtual del Patrimonio Bibliográfico (Virtual Library of Bibliographical Heritage)

Crossref