Squarepants in a Tree: Sum of Subtree Clustering and Hyperbolic Pants Decomposition
We provide efficient constant factor approximation algorithms for the
problems of finding a hierarchical clustering of a point set in any metric
space, minimizing the sum of minimum spanning tree lengths within each
cluster, and in the hyperbolic or Euclidean planes, minimizing the sum of
cluster perimeters. Our algorithms for the hyperbolic and Euclidean planes can
also be used to provide a pants decomposition, that is, a set of disjoint
simple closed curves partitioning the plane minus the input points into subsets
with exactly three boundary components, with approximately minimum total
length. In the Euclidean case, these curves are squares; in the hyperbolic
case, they combine our Euclidean square pants decomposition with our tree
clustering method for general metric spaces.
Comment: 22 pages, 14 figures. This version replaces the proof of what is now
Lemma 5.2, as the previous proof was erroneous.
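As a rough illustration of the first objective, the sketch below scores a given hierarchy by summing the minimum spanning tree lengths of its clusters. It assumes Euclidean distance and uses a quadratic Prim's algorithm purely for brevity; the names `mst_length` and `hierarchy_cost`, and the representation of a hierarchy as a list of point lists, are hypothetical and not taken from the paper. It only evaluates a clustering and does not implement the approximation algorithm itself.

```python
import math

def mst_length(points):
    """Total edge length of a Euclidean minimum spanning tree (Prim, O(n^2))."""
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection of each point to the tree
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total

def hierarchy_cost(clusters):
    """Sum of MST lengths over every cluster of a hierarchy (a list of point lists)."""
    return sum(mst_length(c) for c in clusters)

a, b, c, d = (0, 0), (1, 0), (5, 0), (6, 0)
balanced = [[a, b, c, d], [a, b], [c, d], [a], [b], [c], [d]]
skewed = [[a, b, c, d], [a, b, c], [a, b], [a], [b], [c], [d]]
print(hierarchy_cost(balanced), hierarchy_cost(skewed))
```

For these four collinear points, the hierarchy that first splits the two close pairs apart has cost 8.0, while the skewed hierarchy has cost 12.0, so the objective rewards separating distant groups early.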
Triangulating the Square and Squaring the Triangle: Quadtrees and Delaunay Triangulations are Equivalent
We show that Delaunay triangulations and compressed quadtrees are equivalent
structures. More precisely, we give two algorithms: the first computes a
compressed quadtree for a planar point set, given the Delaunay triangulation;
the second finds the Delaunay triangulation, given a compressed quadtree. Both
algorithms run in deterministic linear time on a pointer machine. Our work
builds on and extends previous results by Krznaric and Levcopoulos and Buchin
and Mulzer. Our main tool for the second algorithm is the well-separated pair
decomposition (WSPD), a structure that has been used previously to find
Euclidean minimum spanning trees in higher dimensions (Eppstein). We show that
knowing the WSPD (and a quadtree) suffices to compute a planar Euclidean
minimum spanning tree (EMST) in linear time. With the EMST at hand, we can find
the Delaunay triangulation in linear time.
As a corollary, we obtain deterministic versions of many previous algorithms
related to Delaunay triangulations, such as splitting planar Delaunay
triangulations, preprocessing imprecise points for faster Delaunay computation,
and transdichotomous Delaunay triangulations.
Comment: 37 pages, 13 figures, full version of a paper that appeared in SODA
201
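To make the notion of a well-separated pair concrete, here is a minimal sketch of the standard separation test: two point sets are s-well-separated if they fit in two balls of a common radius r whose gap is at least s times r. The sketch uses the ball circumscribing each set's bounding box, which makes the test conservative; the names `enclosing_ball` and `well_separated` and the default `s=2.0` are assumptions for illustration, and nothing here reproduces the paper's linear-time constructions.

```python
import math

def enclosing_ball(points):
    """Center and radius of the ball circumscribing the bounding box of points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    center = ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)
    radius = math.hypot(max(xs) - min(xs), max(ys) - min(ys)) / 2
    return center, radius

def well_separated(a, b, s=2.0):
    """Conservative test: True if balls of a common radius r around a and b
    are at least s * r apart."""
    (ca, ra), (cb, rb) = enclosing_ball(a), enclosing_ball(b)
    r = max(ra, rb)                      # use one radius for both balls
    gap = math.dist(ca, cb) - 2 * r      # distance between the two balls
    return gap >= s * r

near = [(0, 0), (1, 1)]
print(well_separated(near, [(10, 0), (11, 1)]))
print(well_separated(near, [(2, 0), (3, 1)]))
```

Because the balls are derived from bounding boxes, a pair rejected by this test might still be well-separated under a tighter enclosing ball; the test never accepts a pair that is not.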
A limit process for partial match queries in random quadtrees and $2$-d trees
We consider the problem of recovering items matching a partially specified
pattern in multidimensional trees (quadtrees and $k$-d trees). We assume the
traditional model where the data consist of independent and uniform points in
the unit square. For this model, in a structure on $n$ points, it is known that
the number of nodes $C_n(\xi)$ to visit in order to report the items matching
a random query $\xi$, independent and uniformly distributed on $[0,1]$,
satisfies $\mathbf{E}[C_n(\xi)] \sim \kappa n^{\beta}$, where $\kappa$ and $\beta$
are explicit constants. We develop an approach based on the analysis of
the cost $C_n(s)$ of any fixed query $s \in [0,1]$, and give precise estimates
for the variance and limit distribution of the cost $C_n(x)$. Our results
permit us to describe a limit process for the costs $C_n(x)$ as $x$ varies in
$[0,1]$; one of the consequences is that
$\mathbf{E}[\max_{x \in [0,1]} C_n(x)] \sim \gamma n^{\beta}$; this settles a question of
Devroye [Pers. Comm., 2000].
Comment: Published at http://dx.doi.org/10.1214/12-AAP912 in the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org). arXiv admin note: text
overlap with arXiv:1107.223
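The cost of a fixed partial match query can be explored empirically with a small simulation. The sketch below builds a point quadtree from uniform random points in the unit square and counts the nodes visited when searching for all points with a given first coordinate. The class and function names (`Node`, `insert`, `partial_match_cost`) and the quadrant indexing are assumptions made for this illustration; the code measures the cost on one sample and says nothing about the limit results themselves.

```python
import random

class Node:
    __slots__ = ("x", "y", "kids")
    def __init__(self, x, y):
        self.x, self.y = x, y
        # Quadrant index: 2 * (x >= node.x) + (y >= node.y)
        self.kids = [None, None, None, None]

def insert(root, x, y):
    """Insert a point into a point quadtree and return the root."""
    if root is None:
        return Node(x, y)
    node = root
    while True:
        q = 2 * (x >= node.x) + (y >= node.y)
        if node.kids[q] is None:
            node.kids[q] = Node(x, y)
            return root
        node = node.kids[q]

def partial_match_cost(root, s):
    """Number of nodes visited when searching for points with x-coordinate s."""
    cost = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        cost += 1
        # The line x = s meets only the two quadrants on one side of node.x.
        side = 2 * (s >= node.x)
        stack.append(node.kids[side])
        stack.append(node.kids[side + 1])
    return cost

random.seed(0)
root = None
for _ in range(1000):
    root = insert(root, random.random(), random.random())
print(partial_match_cost(root, 0.5))
```

Repeating the experiment over many independent trees and several sizes gives an empirical view of how the cost of a fixed query grows with the number of points.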
Optimal Joins Using Compact Data Structures
Worst-case optimal join algorithms have gained a lot of attention in the database literature. We now have several algorithms that are optimal in the worst case, and many of them have been implemented and validated in practice. However, the implementation of these algorithms often requires an enhanced indexing structure: to achieve optimality we either need to build completely new indexes, or we must populate the database with several instantiations of indexes such as B+-trees. Either way, this means spending an extra amount of storage space that may be non-negligible.
We show that optimal algorithms can be obtained directly from a representation that regards the relations as point sets in variable-dimensional grids, without the need for extra storage. Our representation is a compact quadtree for the static indexes, and a dynamic quadtree sharing subtrees (which we dub a qdag) for intermediate results. We develop a compositional algorithm to process full join queries under this representation, and show that the running time of this algorithm is worst-case optimal in data complexity. Remarkably, we can extend our framework to evaluate more expressive queries from relational algebra by introducing a lazy version of qdags (lqdags). Once again, we can show that the running time of our algorithms is worst-case optimal.
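The idea of treating a relation as a point set in a grid can be sketched with a plain pointer-based region quadtree. The example below stores two binary relations over the same pair of attributes as quadtrees on a 4 x 4 grid and intersects them by descending both trees in parallel, pruning as soon as either side is empty. This is a simplified illustration only: it is neither the compact representation nor the qdag of the paper, it handles only relations over identical attributes, and the names `build`, `intersect`, and `points_of` are hypothetical.

```python
def build(points, size, x0=0, y0=0):
    """Region quadtree over a size x size grid (size is a power of two).
    None = empty region, True = occupied cell, tuple = four child quadrants."""
    if not points:
        return None
    if size == 1:
        return True
    h = size // 2
    quads = [[], [], [], []]
    for (x, y) in points:
        quads[2 * (x >= x0 + h) + (y >= y0 + h)].append((x, y))
    return tuple(
        build(quads[q], h, x0 + h * (q // 2), y0 + h * (q % 2)) for q in range(4)
    )

def intersect(a, b):
    """Quadtree of the points present in both a and b (same grid size)."""
    if a is None or b is None:
        return None            # prune: one side has no points here
    if a is True:
        return True            # both are occupied cells at the same depth
    kids = tuple(intersect(x, y) for x, y in zip(a, b))
    return kids if any(k is not None for k in kids) else None

def points_of(tree, size, x0=0, y0=0):
    """List the grid points stored in a quadtree."""
    if tree is None:
        return []
    if tree is True:
        return [(x0, y0)]
    h = size // 2
    out = []
    for q, kid in enumerate(tree):
        out += points_of(kid, h, x0 + h * (q // 2), y0 + h * (q % 2))
    return out

R = build([(0, 1), (2, 3), (3, 3)], 4)
S = build([(2, 3), (0, 0), (3, 3)], 4)
print(sorted(points_of(intersect(R, S), 4)))
```

The pruning step is what makes the traversal skip empty regions of the grid without materializing them, which is the intuition behind operating directly on the quadtree rather than on auxiliary indexes.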