17,352 research outputs found
Methods of Hierarchical Clustering
We survey agglomerative hierarchical clustering algorithms and discuss
efficient implementations that are available in R and other software
environments. We look at hierarchical self-organizing maps, and mixture models.
We review grid-based clustering, focusing on hierarchical density-based
approaches. Finally we describe a recently developed very efficient (linear
time) hierarchical clustering algorithm, which can also be viewed as a
hierarchical grid-based algorithm.Comment: 21 pages, 2 figures, 1 table, 69 reference
Squarepants in a Tree: Sum of Subtree Clustering and Hyperbolic Pants Decomposition
We provide efficient constant factor approximation algorithms for the
problems of finding a hierarchical clustering of a point set in any metric
space, minimizing the sum of minimimum spanning tree lengths within each
cluster, and in the hyperbolic or Euclidean planes, minimizing the sum of
cluster perimeters. Our algorithms for the hyperbolic and Euclidean planes can
also be used to provide a pants decomposition, that is, a set of disjoint
simple closed curves partitioning the plane minus the input points into subsets
with exactly three boundary components, with approximately minimum total
length. In the Euclidean case, these curves are squares; in the hyperbolic
case, they combine our Euclidean square pants decomposition with our tree
clustering method for general metric spaces.Comment: 22 pages, 14 figures. This version replaces the proof of what is now
Lemma 5.2, as the previous proof was erroneou
- …