    Approximate Nearest Neighbor Search for Low Dimensional Queries

    We study the Approximate Nearest Neighbor problem for metric spaces where the query points are constrained to lie on a subspace of low doubling dimension, while the data is high-dimensional. We show that this problem can be solved efficiently despite the high dimensionality of the data.

    Down the Rabbit Hole: Robust Proximity Search and Density Estimation in Sublinear Space

    For a set of $n$ points in $\Re^d$, and parameters $k$ and $\varepsilon$, we present a data structure that answers $(1+\varepsilon, k)$-ANN queries in logarithmic time. Surprisingly, the space used by the data structure is $\tilde{O}(n/k)$; that is, the space used is sublinear in the input size if $k$ is sufficiently large. Our approach provides a novel way to summarize geometric data, such that meaningful proximity queries on the data can be carried out using this sketch. Using this, we provide a sublinear-space data structure that can estimate the density of a point set under various measures, including: (i) the sum of distances of the $k$ closest points to the query point, and (ii) the sum of squared distances of the $k$ closest points to the query point. Our approach generalizes to other distance-based estimations of densities of similar flavor. We also study the problem of approximating some of these quantities when using sampling. In particular, we show that a sample of size $\tilde{O}(n/k)$ is sufficient, in some restricted cases, to estimate the above quantities. Remarkably, the sample size has only linear dependency on the dimension.
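
    To make the two density measures concrete, the following is a minimal brute-force sketch of the exact quantities that the abstract's data structure estimates. It is not the sublinear-space structure itself: it scans all points and keeps the whole input. The function name `knn_density_measures` and the sample points are hypothetical, chosen only for illustration.

```python
import heapq
import math


def knn_density_measures(points, query, k):
    """Return (sum of distances, sum of squared distances) from `query`
    to its k nearest points, computed exactly by a full scan."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    # Squared Euclidean distance from the query to every input point.
    squared = (
        sum((p - q) ** 2 for p, q in zip(point, query)) for point in points
    )
    # Keep only the k smallest squared distances.
    nearest = heapq.nsmallest(k, squared)
    return sum(math.sqrt(s) for s in nearest), sum(nearest)


# Hypothetical sample data: four points in the plane.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (1.0, 0.0)]
dist_sum, sq_sum = knn_density_measures(points, (0.0, 0.0), k=3)
print(dist_sum, sq_sum)
```

    A baseline like this is mainly useful as ground truth when checking an approximate estimator on small inputs.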

    A Matching Algorithm for Selecting Web Services Based on Non-Functional Features

    Searching for a Web service that meets the user requirements can be a complex task, especially when the system scales up through an increase in the number of Web services, w, in the UDDI registry and in the number of QoS features, f, by which each Web service is described. This can be viewed as the well-known nearest neighbor search problem, which typically imposes a time or storage complexity that is exponential in f. In this work, we present a new algorithm (wsSVD) that is founded on the algebraic matrix operation called Singular Value Decomposition (SVD). The basic idea is to encode the features of each Web service by a single value using the SVD. When a user seeks a Web service based on some specific requirements, these requirements are encoded by a single value using the same algorithm, and the matching process then finds the closest Web service that fulfills the user requirements. Our experiments show that the wsSVD algorithm performs and scales up well in comparison with other matching algorithms.
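
    The abstract does not say how wsSVD derives the single value, so the sketch below is only one plausible reading, not the authors' algorithm. It assumes each service is encoded as the projection of its feature vector onto the leading right singular vector of the service-by-feature matrix, approximated here by power iteration. All function names, service names, and feature values are hypothetical.

```python
import math


def leading_right_singular_vector(rows, iterations=200):
    """Approximate the top right singular vector of the matrix `rows`
    by power iteration on A^T A (assumes non-negative features)."""
    f = len(rows[0])
    v = [1.0 / math.sqrt(f)] * f
    for _ in range(iterations):
        # Compute w = A^T (A v) without forming A^T A explicitly.
        av = [sum(a * b for a, b in zip(row, v)) for row in rows]
        w = [sum(av[i] * rows[i][j] for i in range(len(rows))) for j in range(f)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def encode(features, v):
    """Collapse a feature vector to a single value by projecting onto v."""
    return sum(a * b for a, b in zip(features, v))


def closest_service(services, query):
    """Return the name of the service whose code is nearest the query's code."""
    v = leading_right_singular_vector(list(services.values()))
    codes = {name: encode(feats, v) for name, feats in services.items()}
    q = encode(query, v)
    return min(codes, key=lambda name: abs(codes[name] - q))


# Hypothetical registry: three services, three normalized QoS features each.
services = {
    "fast": [0.9, 0.8, 0.7],
    "medium": [0.5, 0.5, 0.4],
    "slow": [0.1, 0.2, 0.1],
}
print(closest_service(services, [0.45, 0.5, 0.5]))
```

    Once the codes are computed, matching reduces to a one-dimensional comparison. The cost of this simplification is that services with different feature vectors can receive equal or nearly equal codes.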

    Computing the visibility map of fat objects

    We give an output-sensitive algorithm for computing the visibility map of a set of n constant-complexity convex fat polyhedra or curved objects in 3-space. Our algorithm runs in O((n+k) polylog n) time, where k is the combinatorial complexity of the visibility map. This is the first algorithm for computing the visibility map of fat objects that does not require a depth order on the objects and is faster than the best known algorithm for general objects. It is also the first output-sensitive algorithm for curved objects that does not require a depth order.

    Clustering Under Perturbation Stability in Near-Linear Time

    We consider the problem of center-based clustering in low-dimensional Euclidean spaces under the perturbation stability assumption. An instance is α-stable if the underlying optimal clustering continues to remain optimal even when all pairwise distances are arbitrarily perturbed by a factor of at most α. Our main contribution is in presenting efficient exact algorithms for α-stable clustering instances whose running times depend near-linearly on the size of the data set when α ≥ 2 + √3. For k-center and k-means problems, our algorithms also achieve polynomial dependence on the number of clusters, k, when α ≥ 2 + √3 + ε for any constant ε > 0 in any fixed dimension. For k-median, our algorithms have polynomial dependence on k for α > 5 in any fixed dimension, and for α ≥ 2 + √3 in two dimensions. Our algorithms are simple, and only require applying techniques such as local search or dynamic programming to a suitably modified metric space, combined with careful choice of data structures.

    Lower Bounds for Intersection Reporting Among Flat Objects

    Bounded-Degree Polyhedronization of Point Sets

    In 1994 Grünbaum showed that, given a point set S in R^3, it is always possible to construct a polyhedron whose vertices are exactly S. Such a polyhedron is called a polyhedronization of S. Agarwal et al. extended this work in 2008 by showing that there always exists a polyhedronization that can be decomposed into a union of tetrahedra (tetrahedralizable). In the same work they introduced the notion of a serpentine polyhedronization, for which the dual of its tetrahedralization is a chain. In this work we present a randomized algorithm running in O(n log^6 n) expected time that constructs a serpentine polyhedronization that has vertices with degree at most 7, answering an open question by Agarwal et al.