9,594 research outputs found

    Optimal randomized incremental construction for guaranteed logarithmic planar point location

    Given a planar map of n segments in which we wish to efficiently locate points, we present the first randomized incremental construction of the well-known trapezoidal-map search structure that requires only expected O(n log n) preprocessing time while deterministically guaranteeing worst-case linear storage space and worst-case logarithmic query time. This settles a long-standing open problem; the best previously known construction time for such a structure, which is based on a directed acyclic graph (the so-called history DAG) and has the above worst-case space and query-time guarantees, was expected O(n log^2 n). The result is based on a deeper understanding of the structure of the history DAG, its depth in relation to the length of its longest search path, and its correspondence to the trapezoidal search tree. Our results immediately extend to planar maps induced by finite collections of pairwise interior-disjoint well-behaved curves.
    Comment: The article significantly extends the theoretical aspects of the work presented in http://arxiv.org/abs/1205.543

    Maximum Inner-Product Search using Tree Data-structures

    The problem of efficiently finding the best match for a query in a given set with respect to the Euclidean distance or the cosine similarity has been extensively studied in the literature. However, to the best of our knowledge, the closely related problem of efficiently finding the best match with respect to the inner product has never been explored in the general setting. In this paper we consider this general problem and contrast it with the existing best-match algorithms. First, we propose a general branch-and-bound algorithm using a tree data structure. Subsequently, we present a dual-tree algorithm for the case where there are multiple queries. Finally, we present a new data structure for increasing the efficiency of the dual-tree algorithm. These branch-and-bound algorithms involve novel bounds suited to best-matching with inner products. We evaluate our proposed algorithms on a variety of data sets from various applications, and exhibit up to five orders of magnitude improvement in query time over the naive search technique.
    Comment: Under submission in KDD 201
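    The branch-and-bound idea in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm or data structure: it assumes a plain ball tree (the `Ball` class, the median split, and `leaf_size` are hypothetical choices made here) and uses only the Cauchy-Schwarz bound <q, p> <= <q, c> + ||q|| r for any point p inside a ball with center c and radius r.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class Ball:
    """Ball-tree node (an illustrative assumption, not the paper's structure)."""

    def __init__(self, points, leaf_size=8):
        dim = len(points[0])
        self.center = [sum(p[i] for p in points) / len(points) for i in range(dim)]
        self.radius = max(math.dist(p, self.center) for p in points)
        self.points = points  # kept only at leaves
        self.left = self.right = None
        if len(points) > leaf_size:
            # Split at the median of the coordinate with the widest spread.
            axis = max(range(dim),
                       key=lambda i: max(p[i] for p in points) - min(p[i] for p in points))
            ordered = sorted(points, key=lambda p: p[axis])
            mid = len(ordered) // 2
            self.left = Ball(ordered[:mid], leaf_size)
            self.right = Ball(ordered[mid:], leaf_size)
            self.points = None


def upper_bound(query, qnorm, node):
    # Cauchy-Schwarz: no point in the ball can exceed <q, c> + ||q|| * r.
    return dot(query, node.center) + qnorm * node.radius


def max_inner_product(query, root):
    """Return (best inner product, best point) by depth-first branch and bound."""
    qnorm = math.sqrt(dot(query, query))
    best = [-math.inf, None]

    def visit(node):
        if upper_bound(query, qnorm, node) <= best[0]:
            return  # prune: this subtree cannot beat the current best
        if node.points is not None:
            for p in node.points:
                value = dot(query, p)
                if value > best[0]:
                    best[0], best[1] = value, p
            return
        # Visit the more promising child first so the other is more likely pruned.
        for child in sorted((node.left, node.right),
                            key=lambda c: upper_bound(query, qnorm, c),
                            reverse=True):
            visit(child)

    visit(root)
    return best[0], best[1]


# Usage on synthetic data (the data set is made up for illustration).
random.seed(0)
points = [[random.gauss(0, 1) for _ in range(5)] for _ in range(500)]
query = [random.gauss(0, 1) for _ in range(5)]
value, match = max_inner_product(query, Ball(points))
print(value, match)
```

    Unlike nearest-neighbor search, the bound depends on the norm of the query and on the position of the ball center relative to the origin, which is why a distance-based pruning rule cannot simply be reused here.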

    Improved Implementation of Point Location in General Two-Dimensional Subdivisions

    We present a major revamp of the point-location data structure for general two-dimensional subdivisions via randomized incremental construction, implemented in CGAL, the Computational Geometry Algorithms Library. We can now guarantee that the constructed directed acyclic graph G is of linear size and provides logarithmic query time. Via the construction of the Voronoi diagram for a given point set S of size n, this also enables nearest-neighbor queries in guaranteed O(log n) time. Another major innovation is the support of general unbounded subdivisions as well as subdivisions of two-dimensional parametric surfaces such as spheres, tori, and cylinders. The implementation is exact, complete, and general, i.e., it can also handle non-linear subdivisions. Like the previous version, the data structure supports modifications of the subdivision, such as insertions and deletions of edges, after the initial preprocessing. A major challenge is to retain the expected O(n log n) preprocessing time while providing the above (deterministic) space and query-time guarantees. We describe an efficient preprocessing algorithm, which explicitly verifies the length L of the longest query path in O(n log n) time. However, instead of using L, our implementation is based on the depth D of G. Although we prove that the worst-case ratio of D and L is Theta(n/log n), we conjecture, based on our experimental results, that this solution achieves expected O(n log n) preprocessing time.
    Comment: 21 page

    Angle Tree: Nearest Neighbor Search in High Dimensions with Low Intrinsic Dimensionality

    We propose an extension of tree-based space-partitioning indexing structures for data with low intrinsic dimensionality embedded in a high-dimensional space. We call this extension an Angle Tree. Our extension can be applied to both classical kd-trees and the more recent rp-trees. The key idea of our approach is to store the angle (the "dihedral angle") between the data region (which is a low-dimensional manifold) and the random hyperplane that splits the region (the "splitter"). We show that the dihedral angle can be used to obtain a tight lower bound on the distance between the query point and any point on the opposite side of the splitter. This in turn can be used to efficiently prune the search space. We introduce a novel randomized strategy to efficiently calculate the dihedral angle with a high degree of accuracy. Experiments and analysis on real and synthetic data sets show that the Angle Tree is the most efficient known indexing structure for nearest-neighbor queries in terms of preprocessing and space usage while achieving high accuracy and fast search time.
    Comment: To be submitted to IEEE Transactions on Pattern Analysis and Machine Intelligenc