
899 research outputs found

Distributed PCP Theorems for Hardness of Approximation in P

Author: Abboud Amir
Rubinstein Aviad
Williams Ryan
Publication date: 01/01/1952

We present a new distributed model of probabilistically checkable proofs (PCP). A satisfying assignment $x \in \{0,1\}^n$ to a CNF formula $\varphi$ is shared between two parties, where Alice knows $x_1, \dots, x_{n/2}$, Bob knows $x_{n/2+1},\dots,x_n$, and both parties know $\varphi$. The goal is to have Alice and Bob jointly write a PCP that $x$ satisfies $\varphi$, while exchanging little or no information. Unfortunately, this model as-is does not allow for nontrivial query complexity. Instead, we focus on a non-deterministic variant, where the players are helped by Merlin, a third party who knows all of $x$. Using our framework, we obtain, for the first time, PCP-like reductions from the Strong Exponential Time Hypothesis (SETH) to approximation problems in P. In particular, under SETH we show that there are no truly-subquadratic approximation algorithms for Bichromatic Maximum Inner Product over {0,1}-vectors, Bichromatic LCS Closest Pair over permutations, Approximate Regular Expression Matching, and Diameter in Product Metric. All our inapproximability factors are nearly-tight. In particular, for the first two problems we obtain nearly-polynomial factors of $2^{(\log n)^{1-o(1)}}$; only $(1+o(1))$-factor lower bounds (under SETH) were known before.
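To make one of the problems named in the abstract concrete, here is a minimal sketch of the exact quadratic-time baseline for Bichromatic Maximum Inner Product over {0,1}-vectors: given two sets A and B, find the pair (a, b) with a in A and b in B maximizing the inner product. The function name and the sample vectors are illustrative assumptions, not taken from the paper. The abstract's claim is that, under SETH, no truly-subquadratic algorithm can even approximate this value within a nearly-polynomial factor.

```python
from itertools import product


def bichromatic_max_inner_product(A, B):
    """Exact brute force: examine every (a, b) pair, a from A and b from B.

    Runs in O(|A| * |B| * d) time for d-dimensional vectors.
    Hypothetical helper for illustration only.
    """
    best_value, best_a, best_b = -1, None, None
    for a, b in product(A, B):
        value = sum(x * y for x, y in zip(a, b))
        if value > best_value:
            best_value, best_a, best_b = value, a, b
    return best_value, best_a, best_b


# Illustrative {0,1}-vectors (made-up data).
A = [(1, 0, 1, 1), (0, 1, 0, 0)]
B = [(1, 1, 0, 1), (0, 0, 1, 1)]
result = bichromatic_max_inner_product(A, B)
```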

arXiv.org e-Print Archive

Biblioteca Virtual del Patrimonio Bibliográfico (Virtual Library of Bibliographical Heritage)

Crossref

High-dimensional approximate nearest neighbor: k-d Generalized Randomized Forests

Author: Avrithis Yannis
Emiris Ioannis Z.
Samaras Georgios
Publication date: 01/03/2016

We propose a new data structure, the generalized randomized kd forest, or kgeraf, for approximate nearest neighbor searching in high dimensions. In particular, we introduce new randomization techniques to specify a set of independently constructed trees where search is performed simultaneously, hence increasing accuracy. We omit backtracking, and we optimize distance computations, thus accelerating queries. We release public domain software geraf and we compare it to existing implementations of state-of-the-art methods including BBD-trees, Locality Sensitive Hashing, randomized kd forests, and product quantization. Experimental results indicate that our method would be the method of choice in dimensions around 1,000, and probably up to 10,000, and pointsets of cardinality up to a few hundred thousand or even one million; this range of inputs is encountered in many critical applications today. For instance, we handle a real dataset of $10^6$ images represented in 960 dimensions with a query time of less than $1$ sec on average and 90% of responses being true nearest neighbors.
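The general idea of searching several independently randomized trees without backtracking can be sketched as follows. This is a simplified illustration, not the authors' geraf implementation: the uniformly random choice of split coordinate, the median split, the leaf size, and all names are assumptions made for the example.

```python
import random


def build_tree(points, ids, leaf_size, rng):
    """Recursively split ids at the median of a randomly chosen coordinate."""
    if len(ids) <= leaf_size:
        return ("leaf", ids)
    dim = rng.randrange(len(points[0]))  # assumption: uniform random coordinate
    median = sorted(points[i][dim] for i in ids)[len(ids) // 2]
    left = [i for i in ids if points[i][dim] < median]
    right = [i for i in ids if points[i][dim] >= median]
    if not left or not right:  # degenerate split: stop and keep a larger leaf
        return ("leaf", ids)
    return ("node", dim, median,
            build_tree(points, left, leaf_size, rng),
            build_tree(points, right, leaf_size, rng))


def query_tree(tree, q):
    """Descend to a single leaf; no backtracking into sibling subtrees."""
    while tree[0] == "node":
        _, dim, median, left, right = tree
        tree = left if q[dim] < median else right
    return tree[1]


def forest_query(points, forest, q):
    """Union the leaf candidates of all trees, then take the closest one."""
    candidates = set()
    for tree in forest:
        candidates.update(query_tree(tree, q))
    return min(candidates,
               key=lambda i: sum((a - b) ** 2 for a, b in zip(points[i], q)))


# Illustrative data: 200 random points in 16 dimensions, 4 trees.
rng = random.Random(0)
points = [tuple(rng.random() for _ in range(16)) for _ in range(200)]
forest = [build_tree(points, list(range(len(points))), 8, rng) for _ in range(4)]
nearest = forest_query(points, forest, points[5])
```

Because each tree descends to one leaf only, a query costs roughly one root-to-leaf walk per tree plus a distance computation per candidate; using more trees trades query time for a better chance that the true nearest neighbor is among the candidates.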

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

String Indexing with Compressed Patterns

Author: Bille Philip
Steiner Teresa Anna
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 37th International Symposium on Theoretical Aspects of Computer Science (STACS 2020)
Publication date: 01/01/2020

Given a string S of length n, the classic string indexing problem is to preprocess S into a compact data structure that supports efficient subsequent pattern queries. In this paper we consider the basic variant where the pattern is given in compressed form and the goal is to achieve query time that is fast in terms of the compressed size of the pattern. This captures the common client-server scenario, where a client submits a query and communicates it in compressed form to a server. Instead of the server decompressing the query before processing it, we consider how to efficiently process the compressed query directly. Our main result is a novel linear space data structure that achieves near-optimal query time for patterns compressed with the classic Lempel-Ziv 1977 (LZ77) compression scheme. Along the way we develop several data structural techniques of independent interest, including a novel data structure that compactly encodes all LZ77 compressed suffixes of a string in linear space and a general decomposition of tries that reduces the search time from logarithmic in the size of the trie to logarithmic in the length of the pattern
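For contrast with the approach described in the abstract, the sketch below shows the naive baseline it aims to avoid: the server first decompresses the LZ77 pattern and only then searches the text. The phrase format used here, triples of (offset, length, next character), is one common textbook LZ77 encoding and is an assumption; the paper's exact variant may differ. Python's `str.find` stands in for a real index over S.

```python
def lz77_decompress(phrases):
    """Expand a list of (offset, length, next_char) phrases into a string.

    Each phrase copies `length` characters starting `offset` positions back
    in the output produced so far, then appends `next_char` (may be empty).
    Self-overlapping copies (length > offset) are allowed.
    """
    out = []
    for offset, length, next_char in phrases:
        start = len(out) - offset
        for k in range(length):
            out.append(out[start + k])
        if next_char:
            out.append(next_char)
    return "".join(out)


# Three phrases encode the 7-character pattern "abababa".
compressed_pattern = [(0, 0, "a"), (0, 0, "b"), (2, 5, "")]
pattern = lz77_decompress(compressed_pattern)

# Naive server-side query: decompress first, then search the text.
text = "xxabababayy"
position = text.find(pattern)
```

The decompression step alone takes time proportional to the uncompressed pattern length, which is exactly the cost a query bound stated in terms of the compressed size is meant to avoid.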

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Online Research Database In Technology

Neural Distributed Autoassociative Memories: A Survey

Author: Frolov A. A.
Gayler R.
Gritsenko V. I.
Kleyko D.
Osipov E.
Rachkovskij D. A.
Publication venue: 'National Academy of Sciences of Ukraine (Co. LTD Ukrinformnauka)'
Publication date: 01/01/2017

Introduction. Neural network models of autoassociative, distributed memory allow storage and retrieval of many items (vectors) where the number of stored items can exceed the vector dimension (the number of neurons in the network). This opens the possibility of a sublinear time search (in the number of stored items) for approximate nearest neighbors among vectors of high dimension. The purpose of this paper is to review models of autoassociative, distributed memory that can be naturally implemented by neural networks (mainly with local learning rules and iterative dynamics based on information locally available to neurons). Scope. The survey is focused mainly on the networks of Hopfield, Willshaw and Potts, which have connections between pairs of neurons and operate on sparse binary vectors. We discuss not only autoassociative memory, but also the generalization properties of these networks. We also consider neural networks with higher-order connections and networks with a bipartite graph structure for non-binary data with linear constraints. Conclusions. In conclusion we discuss the relations to similarity search, advantages and drawbacks of these techniques, and topics for further research. An interesting and still not completely resolved question is whether neural autoassociative memories can search for approximate nearest neighbors faster than other index structures for similarity search, in particular for the case of very high dimensional vectors. Comment: 31 pages
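As a small illustration of the kind of model the survey covers, the following sketch implements a Willshaw-style autoassociative memory over sparse binary vectors: storage sets a binary connection between every pair of co-active units, and retrieval completes a partial cue in one thresholded step. The threshold rule (number of active cue units), the single-step recall, and the sample patterns are simplifying assumptions chosen for the example rather than details taken from the survey.

```python
def store(patterns, n):
    """Build a binary n x n weight matrix with a clipped Hebbian rule:
    W[i][j] = 1 if units i and j are both active in any stored pattern."""
    W = [[0] * n for _ in range(n)]
    for pattern in patterns:
        active = [i for i, bit in enumerate(pattern) if bit]
        for i in active:
            for j in active:
                W[i][j] = 1
    return W


def recall(W, cue):
    """One-step retrieval: a unit fires if it is connected to every active
    cue unit (threshold = number of active cue units)."""
    active = [i for i, bit in enumerate(cue) if bit]
    threshold = len(active)
    return [1 if sum(W[i][j] for i in active) >= threshold else 0
            for j in range(len(cue))]


# Two sparse patterns over 8 units (made-up data).
p1 = [1, 1, 1, 0, 0, 0, 0, 0]
p2 = [0, 0, 0, 1, 1, 1, 0, 0]
W = store([p1, p2], 8)

# A partial cue containing two of p1's three active units.
completed = recall(W, [1, 1, 0, 0, 0, 0, 0, 0])
```

Retrieval touches only the rows of the active cue units, so its cost depends on the vector dimension and sparsity, not directly on how many patterns were stored.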

arXiv.org e-Print Archive

Наукова електронна бібліотека періодичних видань НАН України (Vernadsky National Library of Ukraine)

More Dynamic Data Structures for Geometric Set Cover with Sublinear Update Time

Author: Chan Timothy M.
He Qizheng
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 37th International Symposium on Computational Geometry (SoCG 2021)
Publication date: 01/01/2021

We study geometric set cover problems in dynamic settings, allowing insertions and deletions of points and objects. We present the first dynamic data structure that can maintain an $O(1)$-approximation in sublinear update time for set cover for axis-aligned squares in 2D. More precisely, we obtain randomized update time $O(n^{2/3+\delta})$ for an arbitrarily small constant $\delta > 0$. Previously, a dynamic geometric set cover data structure with sublinear update time was known only for unit squares by Agarwal, Chang, Suri, Xiao, and Xue [SoCG 2020]. If only an approximate size of the solution is needed, then we can also obtain sublinear amortized update time for disks in 2D and halfspaces in 3D. As a byproduct, our techniques for dynamic set cover also yield an optimal randomized $O(n \log n)$-time algorithm for static set cover for 2D disks and 3D halfspaces, improving our earlier $O(n \log n (\log\log n)^{O(1)})$ result [SoCG 2020].

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

SANNS: Scaling Up Secure Approximate k-Nearest Neighbors Search

Author: Chen Hao
Chillotti Ilaria
Dong Yihe
Poburinnaya Oxana
Razenshteyn Ilya
Riazi M. Sadegh
Publication date: 08/03/2020

The $k$-Nearest Neighbor Search ($k$-NNS) is the backbone of several cloud-based services such as recommender systems, face recognition, and database search on text and images. In these services, the client sends the query to the cloud server and receives the response, so both the query and the response are revealed to the service provider. Such data disclosures are unacceptable in several scenarios due to the sensitivity of data and/or privacy laws. In this paper, we introduce SANNS, a system for secure $k$-NNS that keeps the client's query and the search result confidential. SANNS comprises two protocols: an optimized linear scan and a protocol based on a novel sublinear time clustering-based algorithm. We prove the security of both protocols in the standard semi-honest model. The protocols are built upon several state-of-the-art cryptographic primitives such as lattice-based additively homomorphic encryption, distributed oblivious RAM, and garbled circuits. We provide several contributions to each of these primitives which are applicable to other secure computation tasks. Both of our protocols rely on a new circuit for the approximate top-$k$ selection from $n$ numbers that is built from $O(n + k^2)$ comparators. We have implemented our proposed system and performed extensive experiments on four datasets in two different computation environments, demonstrating more than $18$-$31\times$ faster response time compared to optimally implemented protocols from the prior work. Moreover, SANNS is the first work that scales to a database of 10 million entries, pushing the limit by more than two orders of magnitude. Comment: 18 pages, to appear at USENIX Security Symposium 202
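The functionality that the linear-scan protocol computes can be sketched in the clear as follows. This is only the plaintext computation (distances to every database point, then selection of the k smallest); it contains none of the cryptographic machinery that SANNS uses, and the exact selection via `heapq.nsmallest` is a stand-in for the approximate top-$k$ circuit mentioned in the abstract. The function name and the sample data are assumptions.

```python
import heapq


def knn_linear_scan(database, query, k):
    """Return the indices of the k database points closest to the query,
    ordered by increasing squared Euclidean distance.

    Plaintext illustration only: nothing here is encrypted or oblivious.
    """
    def squared_distance(point):
        return sum((a - b) ** 2 for a, b in zip(point, query))

    return heapq.nsmallest(k, range(len(database)),
                           key=lambda i: squared_distance(database[i]))


# Illustrative 2D database (made-up data).
database = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.0)]
neighbors = knn_linear_scan(database, (0.0, 0.0), 2)
```

In a secure setting the server must not learn the query and the client must learn nothing beyond the result, which is why the abstract describes evaluating the distance and selection steps under homomorphic encryption and garbled circuits instead of in the clear.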

arXiv.org e-Print Archive

Cryptology ePrint Archive