3,463 research outputs found

    Robust Proximity Search for Balls using Sublinear Space

    Given a set of n disjoint balls b1, ..., bn in ℝ^d, we provide a data structure, of near-linear size, that can answer (1 ± ε)-approximate kth-nearest neighbor queries in O(log n + 1/ε^d) time, where k and ε are provided at query time. If k and ε are provided in advance, we provide a data structure to answer such queries that requires (roughly) O(n/k) space; that is, the data structure has a sublinear space requirement if k is sufficiently large.
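    For intuition about the query being answered, here is a minimal brute-force Python sketch (not from the paper): it computes the exact kth-nearest-ball distance in O(n log n) time, whereas the data structure above returns a (1 ± ε)-approximation in O(log n + 1/ε^d) time. The (center, radius) ball representation and the function name are illustrative assumptions.

        import math

        def kth_nearest_ball(query, balls, k):
            """Exact k-th nearest ball distance by brute force (reference only).

            `balls` is a list of (center, radius) pairs; the distance from the
            query point to a ball is the distance to its center minus the
            radius, clamped at zero in case the query lies inside the ball.
            The paper's data structure answers a (1 +- eps)-approximate version
            of this query in O(log n + 1/eps^d) time instead of O(n log n).
            """
            dists = []
            for center, radius in balls:
                d = math.dist(query, center) - radius
                dists.append(max(d, 0.0))
            dists.sort()
            return dists[k - 1]

        # Example: three disjoint balls in the plane, query the 2nd-nearest.
        balls = [((0.0, 0.0), 1.0), ((5.0, 0.0), 1.0), ((0.0, 7.0), 2.0)]
        print(kth_nearest_ball((2.0, 0.0), balls, k=2))  # -> 2.0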

    Triangulating the Square and Squaring the Triangle: Quadtrees and Delaunay Triangulations are Equivalent

    We show that Delaunay triangulations and compressed quadtrees are equivalent structures. More precisely, we give two algorithms: the first computes a compressed quadtree for a planar point set, given the Delaunay triangulation; the second finds the Delaunay triangulation, given a compressed quadtree. Both algorithms run in deterministic linear time on a pointer machine. Our work builds on and extends previous results by Krznaric and Levcopoulos and by Buchin and Mulzer. Our main tool for the second algorithm is the well-separated pair decomposition (WSPD), a structure that has been used previously to find Euclidean minimum spanning trees in higher dimensions (Eppstein). We show that knowing the WSPD (and a quadtree) suffices to compute a planar Euclidean minimum spanning tree (EMST) in linear time. With the EMST at hand, we can find the Delaunay triangulation in linear time. As a corollary, we obtain deterministic versions of many previous algorithms related to Delaunay triangulations, such as splitting planar Delaunay triangulations, preprocessing imprecise points for faster Delaunay computation, and transdichotomous Delaunay triangulations.
    Comment: 37 pages, 13 figures; full version of a paper that appeared in SODA 2011
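    The second algorithm rests on the classical fact that the Euclidean minimum spanning tree is a subgraph of the Delaunay triangulation, so an MST computation restricted to the O(n) Delaunay edges already yields the EMST. The SciPy sketch below illustrates that relationship, but in the opposite direction of the paper's pipeline (Delaunay → EMST rather than WSPD → EMST → Delaunay) and in O(n log n) time rather than the paper's linear time; the function name is an assumption.

        import numpy as np
        from scipy.spatial import Delaunay
        from scipy.sparse import csr_matrix
        from scipy.sparse.csgraph import minimum_spanning_tree

        def planar_emst_via_delaunay(points):
            """Euclidean minimum spanning tree of a planar point set.

            Relies on the fact that the EMST is a subgraph of the Delaunay
            triangulation, so it suffices to run an MST algorithm on the O(n)
            Delaunay edges.  This sketch is O(n log n); the paper's
            pointer-machine algorithms are linear given a compressed quadtree
            and its WSPD.
            """
            points = np.asarray(points, dtype=float)
            tri = Delaunay(points)
            # Collect the undirected edges of the triangulation.
            edges = set()
            for a, b, c in tri.simplices:
                for u, v in ((a, b), (b, c), (a, c)):
                    edges.add((min(u, v), max(u, v)))
            rows, cols, weights = [], [], []
            for u, v in edges:
                rows.append(u)
                cols.append(v)
                weights.append(np.linalg.norm(points[u] - points[v]))
            n = len(points)
            graph = csr_matrix((weights, (rows, cols)), shape=(n, n))
            return minimum_spanning_tree(graph)  # sparse matrix of EMST edges

        # Example usage on a few random points.
        rng = np.random.default_rng(0)
        emst = planar_emst_via_delaunay(rng.random((20, 2)))
        print(emst.nnz, "EMST edges")  # 19 edges for 20 points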

    Efficient Computation of Multiple Density-Based Clustering Hierarchies

    HDBSCAN*, a state-of-the-art density-based hierarchical clustering method, produces a hierarchical organization of clusters in a dataset w.r.t. a parameter mpts. While the performance of HDBSCAN* is robust w.r.t. mpts, in the sense that a small change in mpts typically leads to only a small change or no change in the clustering structure, choosing a "good" mpts value can be challenging: depending on the data distribution, a high or low value for mpts may be more appropriate, and certain data clusters may reveal themselves at different values of mpts. To explore results for a range of mpts values, however, one has to run HDBSCAN* independently for each value in the range, which is computationally inefficient. In this paper, we propose an efficient approach to compute all HDBSCAN* hierarchies for a range of mpts values by replacing the graph used by HDBSCAN* with a much smaller graph that is guaranteed to contain the required information. An extensive experimental evaluation shows that with our approach one can obtain over one hundred hierarchies at a computational cost equivalent to running HDBSCAN* about twice.
    Comment: A short version of this paper appears at IEEE ICDM 2017. Corrected typos. Revised abstract.
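    As a baseline illustrating what the paper speeds up, the NumPy/SciPy sketch below recomputes, for every mpts value, the core distances, the mutual reachability distances, and the MST from which the HDBSCAN* hierarchy is extracted. This is exactly the per-mpts redundancy the paper removes by using one much smaller shared graph; the function name and the convention that mpts counts the point itself are assumptions.

        import numpy as np
        from scipy.spatial.distance import pdist, squareform
        from scipy.sparse.csgraph import minimum_spanning_tree

        def mutual_reachability_msts(X, mpts_values):
            """Naive baseline: one mutual-reachability MST per mpts value.

            For each mpts, the core distance of a point is the distance to its
            mpts-th nearest neighbor (counting the point itself, by assumption),
            the mutual reachability distance of a pair is
            max(core(a), core(b), d(a, b)), and the HDBSCAN* hierarchy is
            derived from the MST of that graph.  Recomputing everything for
            every mpts is the inefficiency the paper addresses.
            """
            X = np.asarray(X, dtype=float)
            dist = squareform(pdist(X))          # dense pairwise distances
            sorted_dist = np.sort(dist, axis=1)  # row i: distances from point i, ascending
            msts = {}
            for mpts in mpts_values:
                core = sorted_dist[:, mpts - 1]  # distance to the mpts-th neighbor
                mreach = np.maximum(dist, np.maximum.outer(core, core))
                msts[mpts] = minimum_spanning_tree(mreach)
            return msts

        # Example: MSTs for a range of mpts values on random data.
        rng = np.random.default_rng(1)
        trees = mutual_reachability_msts(rng.random((50, 2)), mpts_values=range(2, 6))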