Robust Proximity Search for Balls using Sublinear Space
Given a set of n disjoint balls b_1, ..., b_n in \mathbb{R}^d, we provide a data
structure, of near linear size, that can answer (1 \pm \epsilon)-approximate
kth-nearest neighbor queries in O(log n + 1/\epsilon^d) time, where k and
\epsilon are provided at query time. If k and \epsilon are provided in advance,
we provide a data structure to answer such queries, that requires (roughly)
O(n/k) space; that is, the data structure has sublinear space requirement if k
is sufficiently large.
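To make the query in the abstract above concrete, here is a minimal brute-force sketch of the quantity being approximated. It is not the paper's data structure: it scans all balls in linear time per query. The function names, the (center, radius) representation, and the reading of "(1 \pm \epsilon)-approximate" as a multiplicative tolerance on the exact distance are assumptions made for illustration.

```python
import math

def ball_distance(q, center, radius):
    """Distance from point q to a ball; zero if q lies inside the ball."""
    return max(0.0, math.dist(q, center) - radius)

def kth_nearest_ball_distance(q, balls, k):
    """Exact distance from q to its k-th nearest ball (k is 1-based).

    Brute-force baseline: sorts all n distances, so it takes O(n log n)
    time per query, unlike the sublinear-time structure in the abstract.
    """
    if not 1 <= k <= len(balls):
        raise ValueError("k must be between 1 and the number of balls")
    distances = sorted(ball_distance(q, c, r) for c, r in balls)
    return distances[k - 1]

def is_valid_approximation(answer, exact, eps):
    """Assumed acceptance test: answer within a (1 +/- eps) factor of exact."""
    return (1 - eps) * exact <= answer <= (1 + eps) * exact
```

A baseline like this is mainly useful for checking the answers of a faster approximate structure on small inputs.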
Triangulating the Square and Squaring the Triangle: Quadtrees and Delaunay Triangulations are Equivalent
We show that Delaunay triangulations and compressed quadtrees are equivalent
structures. More precisely, we give two algorithms: the first computes a
compressed quadtree for a planar point set, given the Delaunay triangulation;
the second finds the Delaunay triangulation, given a compressed quadtree. Both
algorithms run in deterministic linear time on a pointer machine. Our work
builds on and extends previous results by Krznaric and Levcopoulos and Buchin
and Mulzer. Our main tool for the second algorithm is the well-separated pair
decomposition (WSPD), a structure that has been used previously to find
Euclidean minimum spanning trees in higher dimensions (Eppstein). We show that
knowing the WSPD (and a quadtree) suffices to compute a planar Euclidean
minimum spanning tree (EMST) in linear time. With the EMST at hand, we can find
the Delaunay triangulation in linear time.
As a corollary, we obtain deterministic versions of many previous algorithms
related to Delaunay triangulations, such as splitting planar Delaunay
triangulations, preprocessing imprecise points for faster Delaunay computation,
and transdichotomous Delaunay triangulations.
Comment: 37 pages, 13 figures, full version of a paper that appeared in SODA
201
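The abstract above uses the planar Euclidean minimum spanning tree (EMST) as a stepping stone to the Delaunay triangulation; the EMST is a subgraph of the Delaunay triangulation, which is why it is a useful intermediate structure. The sketch below only illustrates what an EMST is, using Prim's algorithm in quadratic time. It is a swapped-in textbook technique, not the linear-time WSPD-based method the abstract describes, and the function name is hypothetical.

```python
import math

def euclidean_mst(points):
    """Return EMST edges as (i, j) index pairs.

    Plain Prim's algorithm on the complete Euclidean graph: O(n^2) time,
    used here only as a simple reference implementation.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n  # cheapest known connection to the tree
    best_from = [-1] * n        # tree vertex realizing that connection
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] != -1:
            edges.append((best_from[u], u))
        # Relax connections from the newly added vertex.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges
```

On n points the result always has n - 1 edges, which makes it a convenient sanity check against any faster construction.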
Efficient Computation of Multiple Density-Based Clustering Hierarchies
HDBSCAN*, a state-of-the-art density-based hierarchical clustering method,
produces a hierarchical organization of clusters in a dataset w.r.t. a
parameter mpts. While the performance of HDBSCAN* is robust w.r.t. mpts in the
sense that a small change in mpts typically leads to only a small or no change
in the clustering structure, choosing a "good" mpts value can be challenging:
depending on the data distribution, a high or low value for mpts may be more
appropriate, and certain data clusters may reveal themselves at different
values of mpts. To explore results for a range of mpts values, however, one has
to run HDBSCAN* for each value in the range independently, which is
computationally inefficient. In this paper, we propose an efficient approach to
compute all HDBSCAN* hierarchies for a range of mpts values by replacing the
graph used by HDBSCAN* with a much smaller graph that is guaranteed to contain
the required information. An extensive experimental evaluation shows that with
our approach one can obtain over one hundred hierarchies for the computational
cost equivalent to running HDBSCAN* about 2 times.
Comment: A short version of this paper appears at IEEE ICDM 2017. Corrected
typos. Revised abstrac
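As background for the abstract above, the following sketch shows how the parameter mpts enters the distances HDBSCAN* works with. It assumes the standard definitions from the HDBSCAN* literature, which the abstract itself does not spell out: the core distance of a point is the distance to its mpts-th nearest neighbor (counting the point itself), and the mutual reachability distance of two points is the maximum of their core distances and their Euclidean distance. The function names are hypothetical, and this is a naive quadratic baseline, not the smaller-graph approach the paper proposes.

```python
import math

def core_distances_for_range(points, max_mpts):
    """Core distances for every mpts in 1..max_mpts from one sort per point.

    Returns a list `core` where core[m - 1][i] is the distance from
    points[i] to its m-th nearest neighbor, counting the point itself
    (an assumed convention).
    """
    if not 1 <= max_mpts <= len(points):
        raise ValueError("max_mpts must be between 1 and len(points)")
    sorted_dists = [
        sorted(math.dist(p, q) for q in points)[:max_mpts] for p in points
    ]
    return [[row[m] for row in sorted_dists] for m in range(max_mpts)]

def mutual_reachability(points, core, a, b):
    """Mutual reachability distance of points a and b for one mpts value."""
    return max(core[a], core[b], math.dist(points[a], points[b]))
```

Sorting each point's distances once and reading off every mpts value from the same sorted list is what makes a range of mpts values cheap at this stage; the expensive part that the paper addresses is the graph built on top of these distances.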