Probabilistic Polynomials and Hamming Nearest Neighbors

Author: Alman Josh
Williams Ryan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/07/2015

We show how to compute any symmetric Boolean function on $n$ variables over any field (as well as the integers) with a probabilistic polynomial of degree $O(\sqrt{n \log(1/\epsilon)})$ and error at most $\epsilon$. The degree dependence on $n$ and $\epsilon$ is optimal, matching a lower bound of Razborov (1987) and Smolensky (1987) for the MAJORITY function. The proof is constructive: a low-degree polynomial can be efficiently sampled from the distribution. This polynomial construction is combined with other algebraic ideas to give the first subquadratic time algorithm for computing a (worst-case) batch of Hamming distances in superlogarithmic dimensions, exactly. To illustrate, let $c(n) : \mathbb{N} \rightarrow \mathbb{N}$. Suppose we are given a database $D$ of $n$ vectors in $\{0,1\}^{c(n) \log n}$ and a collection of $n$ query vectors $Q$ in the same dimension. For all $u \in Q$, we wish to compute a $v \in D$ with minimum Hamming distance from $u$. We solve this problem in $n^{2-1/O(c(n) \log^2 c(n))}$ randomized time. Hence, the problem is in "truly subquadratic" time for $O(\log n)$ dimensions, and in subquadratic time for $d = o((\log^2 n)/(\log \log n)^2)$. We apply the algorithm to computing pairs with maximum inner product, closest pair in $\ell_1$ for vectors with bounded integer entries, and pairs with maximum Jaccard coefficients. Comment: 16 pages. To appear in 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2015)
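To make the batch problem in this abstract concrete, here is a minimal brute-force baseline in Python. It is not the paper's algorithm: it compares every query with every database vector, which is the quadratic behaviour the paper improves on. The names `hamming` and `batch_nearest`, and the choice to pack bit vectors into Python integers, are assumptions made for illustration.

```python
def hamming(u: int, v: int) -> int:
    """Hamming distance between two bit vectors packed into Python ints."""
    return bin(u ^ v).count("1")


def batch_nearest(database: list[int], queries: list[int]) -> list[int]:
    """For each query, return the index of a database vector at minimum
    Hamming distance (first index wins on ties).

    Brute force: every query is compared with every database vector.
    """
    return [
        min(range(len(database)), key=lambda i: hamming(database[i], u))
        for u in queries
    ]


# Tiny illustrative instance (hypothetical data, 4-bit vectors).
D = [0b0000, 0b1111, 0b1010]
Q = [0b0001, 0b0111, 0b1010]
print(batch_nearest(D, Q))  # [0, 1, 2]
```

With n queries, n database vectors, and d-bit vectors, this sketch performs n^2 distance evaluations, so it only serves as a reference point for what "subquadratic" means here.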

arXiv.org e-Print Archive

Crossref

Incremental and Decremental Maintenance of Planar Width

Author: Agarwal
Chan
Chazelle
David Eppstein
Houle
Janardan
Nievergelt
Overmars
Preparata
Rote
Schwarz
Toussaint
Publication venue: 'Elsevier BV'
Publication date: 03/05/2000

We present an algorithm for maintaining the width of a planar point set dynamically, as points are inserted or deleted. Our algorithm takes time O(kn^epsilon) per update, where k is the amount of change the update causes in the convex hull, n is the number of points in the set, and epsilon is any arbitrarily small constant. For incremental or decremental update sequences, the amortized time per update is O(n^epsilon). Comment: 7 pages; 2 figures. A preliminary version of this paper was presented at the 10th ACM/SIAM Symp. Discrete Algorithms (SODA '99); this is the journal version, and will appear in J. Algorithm
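For readers unfamiliar with the quantity being maintained, the following Python sketch computes the width of a static planar point set from scratch: it builds the convex hull and, for each hull edge, measures the farthest hull vertex from the line through that edge. This is a plain recomputation under assumed helper names (`cross`, `convex_hull`, `width`), not the dynamic data structure described in the abstract.

```python
from math import hypot


def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def width(points):
    """Minimum distance between two parallel lines enclosing all points.

    The width is attained with one line through a hull edge, so it is enough
    to try every edge and take the farthest hull vertex from it.
    """
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0  # zero, one, or collinear points
    best = float("inf")
    for i in range(len(hull)):
        a, b = hull[i], hull[(i + 1) % len(hull)]
        edge_len = hypot(b[0] - a[0], b[1] - a[1])
        farthest = max(abs(cross(a, b, p)) for p in hull) / edge_len
        best = min(best, farthest)
    return best


print(width([(0, 0), (4, 0), (4, 1), (0, 1), (2, 0.5)]))  # 1.0
```

The inner loop makes this quadratic in the hull size; it is kept simple on purpose, since the goal is only to pin down the definition of width.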

arXiv.org e-Print Archive

Crossref

Spanners for Geometric Intersection Graphs

Author: Furer Martin
Kasiviswanathan Shiva Prasad
Publication date: 07/05/2006

Efficient algorithms are presented for constructing spanners in geometric intersection graphs. For a unit ball graph in R^k, a (1+\epsilon)-spanner is obtained using efficient partitioning of the space into hypercubes and solving bichromatic closest pair problems. The spanner construction has almost equivalent complexity to the construction of Euclidean minimum spanning trees. The results are extended to arbitrary ball graphs with a sub-quadratic running time. For unit ball graphs, the spanners have a small separator decomposition which can be used to obtain efficient algorithms for approximating proximity problems like diameter and distance queries. The results on compressed quadtrees, geometric graph separators, and diameter approximation might be of independent interest. Comment: 16 pages, 5 figures, Late

arXiv.org e-Print Archive

CiteSeerX

A simple online competitive adaptation of Lempel-Ziv compression with efficient random access support

Author: Dutta Akashnil
Levi Reut
Ron Dana
Rubinfeld Ronitt
Publication date: 11/01/2013

We present a simple adaptation of the Lempel-Ziv '78 (LZ78) compression scheme (IEEE Transactions on Information Theory, 1978) that supports efficient random access to the input string. Namely, given query access to the compressed string, it is possible to efficiently recover any symbol of the input string. The compression algorithm is given as input a parameter \epsilon > 0, and with very high probability increases the length of the compressed string by at most a factor of (1+\epsilon). The access time is O(\log n + 1/\epsilon^2) in expectation, and O(\log n/\epsilon^2) with high probability. The scheme relies on sparse transitive-closure spanners. Any (consecutive) substring of the input string can be retrieved at an additional additive cost in the running time of the length of the substring. We also formally establish the necessity of modifying LZ78 so as to allow efficient random access. Specifically, we construct a family of strings for which \Omega(n/\log n) queries to the LZ78-compressed string are required in order to recover a single symbol in the input string. The main benefit of the proposed scheme is that it preserves the online nature and simplicity of LZ78, and that for every input string, the length of the compressed string is only a small factor larger than that obtained by running LZ78.
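As background for this abstract, here is a minimal Python sketch of textbook LZ78, the unmodified scheme the paper starts from. Each output pair is (index of the longest previously seen phrase, next symbol). The function names and the `symbol_at` helper are assumptions for illustration; `symbol_at` deliberately decodes everything, to show why plain LZ78 gives no cheap random access. It does not implement the paper's adaptation.

```python
def lz78_compress(text: str) -> list[tuple[int, str]]:
    """Textbook LZ78 parse into (phrase_index, symbol) pairs; index 0 is the empty phrase."""
    dictionary = {"": 0}
    output = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch  # keep extending the current match
        else:
            output.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:
        # Input ended inside a known phrase: emit it as (its prefix, its last symbol).
        output.append((dictionary[phrase[:-1]], phrase[-1]))
    return output


def lz78_decompress(pairs: list[tuple[int, str]]) -> str:
    """Rebuild the input by replaying the phrase dictionary."""
    phrases = [""]
    for index, ch in pairs:
        phrases.append(phrases[index] + ch)
    return "".join(phrases[1:])


def symbol_at(pairs: list[tuple[int, str]], i: int) -> str:
    """Naive random access: decode the whole string, then index into it."""
    return lz78_decompress(pairs)[i]


pairs = lz78_compress("abababa")
print(pairs)                   # [(0, 'a'), (0, 'b'), (1, 'b'), (3, 'a')]
print(lz78_decompress(pairs))  # abababa
print(symbol_at(pairs, 4))     # a
```

The online property mentioned in the abstract is visible in `lz78_compress`: each pair is emitted as soon as the current phrase stops matching, without looking ahead in the input.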

arXiv.org e-Print Archive

DSpace@MIT

Crossref

Convex Hull of Points Lying on Lines in o(n log n) Time after Preprocessing

Author: Afshani
Ali Abam
Arora
Basch
Ben-Or
Bern
Buchin
Chan
Chazelle
Chin
Chvátal
Clarkson
Cole
Cormen
de Berg
Devillers
Dey
Djidjev
Esther Ezra
Everett
Guibas
Held
Hoeffding
Khuller
Kirkpatrick
Klein
Löffler
Matoušek
McCallum
Preparata
Ramos
Seidel
Sharir
van Kreveld
Wolfgang Mulzer
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011

Motivated by the desire to cope with data imprecision, we study methods for taking advantage of preliminary information about point sets in order to speed up the computation of certain structures associated with them. In particular, we study the following problem: given a set L of n lines in the plane, we wish to preprocess L such that later, upon receiving a set P of n points, each of which lies on a distinct line of L, we can construct the convex hull of P efficiently. We show that in quadratic time and space it is possible to construct a data structure on L that enables us to compute the convex hull of any such point set P in O(n alpha(n) log* n) expected time. If we further assume that the points are "oblivious" with respect to the data structure, the running time improves to O(n alpha(n)). The analysis applies almost verbatim when L is a set of line-segments, and yields similar asymptotic bounds. We present several extensions, including a trade-off between space and query time and an output-sensitive algorithm. We also study the "dual problem" where we show how to efficiently compute the (<= k)-level of n lines in the plane, each of which lies on a distinct point (given in advance). We complement our results by Omega(n log n) lower bounds under the algebraic computation tree model for several related problems, including sorting a set of points (according to, say, their x-order), each of which lies on a given line known in advance. Therefore, the convex hull problem under our setting is easier than sorting, contrary to the "standard" convex hull and sorting problems, in which the two problems require Theta(n log n) steps in the worst case (under the algebraic computation tree model). Comment: 26 pages, 5 figures, 1 appendix; a preliminary version appeared at SoCG 201

arXiv.org e-Print Archive

CiteSeerX

Crossref

On trip planning queries in spatial databases

Author: Li Feifei
Cheng Dihan
Publication venue: Boston University Computer Science Department
Publication date: 01/01/1997

In this paper we discuss a new type of query in Spatial Databases, called Trip Planning Query (TPQ). Given a set of points P in space, where each point belongs to a category, and given two points s and e, TPQ asks for the best trip that starts at s, passes through exactly one point from each category, and ends at e. An example of a TPQ is when a user wants to visit a set of different places and at the same time minimize the total travelling cost, e.g. what is the shortest travelling plan for me to visit an automobile shop, a CVS pharmacy outlet, and a Best Buy shop along my trip from A to B? The trip planning query is an extension of the well-known TSP and therefore is NP-hard. The difficulty of this query lies in the existence of multiple choices for each category. In this paper, we first study fast approximation algorithms for the trip planning query in a metric space, assuming that the data set fits in main memory, and give a theoretical analysis of their approximation bounds. Then, the trip planning query is examined for data sets that do not fit in main memory and must be stored on disk. For the disk-resident data, we consider two cases. In one case, we assume that the points are located in Euclidean space and indexed with an R-tree. In the other case, we consider the problem of points that lie on the edges of a spatial network (e.g. road network) and the distance between two points is defined using the shortest distance over the network. Finally, we give an experimental evaluation of the proposed algorithms using synthetic data sets generated on real road networks.
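The TPQ definition above can be pinned down with a small exhaustive search in Python. This sketch tries every visiting order of the categories and every choice of one point per category, using Euclidean distance. It is exponential and only suitable for tiny inputs. It is not one of the approximation algorithms the abstract refers to, and the function name `best_trip` and the sample data are hypothetical.

```python
from itertools import permutations, product
from math import dist


def best_trip(start, end, categories):
    """Exact TPQ by exhaustive search.

    `categories` maps a category name to its candidate points. Returns
    (cost, route) for the cheapest route start -> one point per category -> end.
    """
    best_cost, best_route = float("inf"), None
    for order in permutations(categories):  # order in which categories are visited
        for picks in product(*(categories[c] for c in order)):  # one point per category
            route = [start, *picks, end]
            cost = sum(dist(a, b) for a, b in zip(route, route[1:]))
            if cost < best_cost:
                best_cost, best_route = cost, route
    return best_cost, best_route


# Hypothetical instance: two categories with two candidate points each.
places = {
    "shop": [(2, 0), (5, 5)],
    "pharmacy": [(6, 0), (1, 8)],
}
cost, route = best_trip((0, 0), (10, 0), places)
print(cost)   # 10.0
print(route)  # [(0, 0), (2, 0), (6, 0), (10, 0)]
```

With m categories of k points each, the search examines m! * k^m routes, which illustrates the "multiple choices for each category" difficulty the abstract mentions.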

Boston University Institutional Repository (OpenBU)