8,151 research outputs found
Probabilistic Polynomials and Hamming Nearest Neighbors
We show how to compute any symmetric Boolean function on variables over
any field (as well as the integers) with a probabilistic polynomial of degree
and error at most . The degree
dependence on and is optimal, matching a lower bound of Razborov
(1987) and Smolensky (1987) for the MAJORITY function. The proof is
constructive: a low-degree polynomial can be efficiently sampled from the
distribution.
This polynomial construction is combined with other algebraic ideas to give
the first subquadratic time algorithm for computing a (worst-case) batch of
Hamming distances in superlogarithmic dimensions, exactly. To illustrate, let
. Suppose we are given a database
of vectors in and a collection of query vectors
in the same dimension. For all , we wish to compute a
with minimum Hamming distance from . We solve this problem in randomized time. Hence, the problem is in "truly subquadratic"
time for dimensions, and in subquadratic time for . We apply the algorithm to computing pairs with maximum
inner product, closest pair in for vectors with bounded integer
entries, and pairs with maximum Jaccard coefficients.Comment: 16 pages. To appear in 56th Annual IEEE Symposium on Foundations of
Computer Science (FOCS 2015
Incremental and Decremental Maintenance of Planar Width
We present an algorithm for maintaining the width of a planar point set
dynamically, as points are inserted or deleted. Our algorithm takes time
O(kn^epsilon) per update, where k is the amount of change the update causes in
the convex hull, n is the number of points in the set, and epsilon is any
arbitrarily small constant. For incremental or decremental update sequences,
the amortized time per update is O(n^epsilon).Comment: 7 pages; 2 figures. A preliminary version of this paper was presented
at the 10th ACM/SIAM Symp. Discrete Algorithms (SODA '99); this is the
journal version, and will appear in J. Algorithm
Spanners for Geometric Intersection Graphs
Efficient algorithms are presented for constructing spanners in geometric
intersection graphs. For a unit ball graph in R^k, a (1+\epsilon)-spanner is
obtained using efficient partitioning of the space into hypercubes and solving
bichromatic closest pair problems. The spanner construction has almost
equivalent complexity to the construction of Euclidean minimum spanning trees.
The results are extended to arbitrary ball graphs with a sub-quadratic running
time.
For unit ball graphs, the spanners have a small separator decomposition which
can be used to obtain efficient algorithms for approximating proximity problems
like diameter and distance queries. The results on compressed quadtrees,
geometric graph separators, and diameter approximation might be of independent
interest.Comment: 16 pages, 5 figures, Late
A simple online competitive adaptation of Lempel-Ziv compression with efficient random access support
We present a simple adaptation of the Lempel Ziv 78' (LZ78) compression
scheme ({\em IEEE Transactions on Information Theory, 1978}) that supports
efficient random access to the input string. Namely, given query access to the
compressed string, it is possible to efficiently recover any symbol of the
input string. The compression algorithm is given as input a parameter \eps
>0, and with very high probability increases the length of the compressed
string by at most a factor of (1+\eps). The access time is O(\log n +
1/\eps^2) in expectation, and O(\log n/\eps^2) with high probability. The
scheme relies on sparse transitive-closure spanners. Any (consecutive)
substring of the input string can be retrieved at an additional additive cost
in the running time of the length of the substring. We also formally establish
the necessity of modifying LZ78 so as to allow efficient random access.
Specifically, we construct a family of strings for which
queries to the LZ78-compressed string are required in order to recover a single
symbol in the input string. The main benefit of the proposed scheme is that it
preserves the online nature and simplicity of LZ78, and that for {\em every}
input string, the length of the compressed string is only a small factor larger
than that obtained by running LZ78
Convex Hull of Points Lying on Lines in o(n log n) Time after Preprocessing
Motivated by the desire to cope with data imprecision, we study methods for
taking advantage of preliminary information about point sets in order to speed
up the computation of certain structures associated with them.
In particular, we study the following problem: given a set L of n lines in
the plane, we wish to preprocess L such that later, upon receiving a set P of n
points, each of which lies on a distinct line of L, we can construct the convex
hull of P efficiently. We show that in quadratic time and space it is possible
to construct a data structure on L that enables us to compute the convex hull
of any such point set P in O(n alpha(n) log* n) expected time. If we further
assume that the points are "oblivious" with respect to the data structure, the
running time improves to O(n alpha(n)). The analysis applies almost verbatim
when L is a set of line-segments, and yields similar asymptotic bounds. We
present several extensions, including a trade-off between space and query time
and an output-sensitive algorithm. We also study the "dual problem" where we
show how to efficiently compute the (<= k)-level of n lines in the plane, each
of which lies on a distinct point (given in advance).
We complement our results by Omega(n log n) lower bounds under the algebraic
computation tree model for several related problems, including sorting a set of
points (according to, say, their x-order), each of which lies on a given line
known in advance. Therefore, the convex hull problem under our setting is
easier than sorting, contrary to the "standard" convex hull and sorting
problems, in which the two problems require Theta(n log n) steps in the worst
case (under the algebraic computation tree model).Comment: 26 pages, 5 figures, 1 appendix; a preliminary version appeared at
SoCG 201
On trip planning queries in spatial databases
In this paper we discuss a new type of query in Spatial Databases, called Trip Planning Query (TPQ). Given a set of points P in space, where each point belongs to a category, and given two points s and e, TPQ asks for the best trip that starts at s, passes through exactly one point from each category, and ends at e. An example of a TPQ is when a user wants to visit a set of different places and at the same time minimize the total travelling cost, e.g. what is the shortest travelling plan for me to visit an automobile shop, a CVS pharmacy outlet, and a Best Buy shop along my trip from A to B? The trip planning query is an extension of the well-known TSP problem and therefore is NP-hard. The difficulty of this query lies in the existence of multiple choices for each category. In this paper, we first study fast approximation algorithms for the trip planning query in a metric space, assuming that the data set fits in main memory, and give the theory analysis of their approximation bounds. Then, the trip planning query is examined for data sets that do not fit in main memory and must be stored on disk. For the disk-resident data, we consider two cases. In one case, we assume that the points are located in Euclidean space and indexed with an Rtree. In the other case, we consider the problem of points that lie on the edges of a spatial network (e.g. road network) and the distance between two points is defined using the shortest distance over the network. Finally, we give an experimental evaluation of the proposed algorithms using synthetic data sets generated on real road networks
- …