Quantized Compressive K-Means
The recent framework of compressive statistical learning aims at designing
tractable learning algorithms that use only a heavily compressed
representation, or sketch, of massive datasets. Compressive K-Means (CKM) is such
a method: it estimates the centroids of data clusters from pooled, non-linear,
random signatures of the learning examples. While this approach significantly
reduces computational time on very large datasets, its digital implementation
wastes acquisition resources because the learning examples are compressed only
after the sensing stage. The present work generalizes the sketching procedure
initially defined in Compressive K-Means to a large class of periodic
nonlinearities including hardware-friendly implementations that compressively
acquire entire datasets. This idea is exemplified in a Quantized Compressive
K-Means procedure, a variant of CKM that leverages 1-bit universal quantization
(i.e. retaining the least significant bit of a standard uniform quantizer) as
the periodic sketch nonlinearity. Trading for this resource-efficient signature
(standard in most acquisition schemes) has almost no impact on the clustering
performance, as illustrated by numerical experiments.
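A minimal NumPy sketch of the pooled-signature idea described above, under simplifying assumptions: Gaussian random frequencies `W`, a uniform random dither `xi`, and a square-wave (sign-of-cosine) form of the 1-bit universal quantizer. The function name `sketch` and the scaling and dithering conventions are illustrative choices, not the paper's exact definitions.

```python
import numpy as np

def sketch(X, W, xi, quantized=False):
    """Pool random periodic signatures of the examples in X (n x d).

    W  : (d, m) random frequency matrix (assumed Gaussian here).
    xi : (m,) random dither (assumed uniform on [0, 2*pi); an assumption).
    """
    Z = X @ W + xi  # random linear measurements of every example
    if quantized:
        # QCKM-style signature: a +/-1 square wave of the measurements,
        # i.e. (up to phase) the least significant bit of a uniform quantizer.
        signatures = np.sign(np.cos(Z))
    else:
        # standard CKM signature: complex exponential (random Fourier features)
        signatures = np.exp(1j * Z)
    return signatures.mean(axis=0)  # pooled sketch of the whole dataset

# toy usage: sketch a small 2-D dataset with m = 64 random frequencies
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
W = rng.normal(size=(2, 64))
xi = rng.uniform(0, 2 * np.pi, size=64)
z_ckm = sketch(X, W, xi)                    # complex-valued CKM sketch
z_qckm = sketch(X, W, xi, quantized=True)   # binary QCKM-style sketch
```

Centroid estimation then works only from such a pooled vector, which is why the binary signature can be produced directly at the sensing stage without storing the full dataset.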
Methods of Hierarchical Clustering
We survey agglomerative hierarchical clustering algorithms and discuss
efficient implementations that are available in R and other software
environments. We look at hierarchical self-organizing maps and mixture models.
We review grid-based clustering, focusing on hierarchical density-based
approaches. Finally we describe a recently developed very efficient (linear
time) hierarchical clustering algorithm, which can also be viewed as a
hierarchical grid-based algorithm.
Comment: 21 pages, 2 figures, 1 table, 69 references
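For the agglomerative methods the survey opens with, a minimal illustration in Python with SciPy (the survey itself points to R and other environments); the toy data, the Ward linkage criterion, and the cut into three clusters are arbitrary choices for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# toy data: three loose groups in the plane
rng = np.random.default_rng(1)
centers = [(0, 0), (3, 3), (0, 3)]
X = np.vstack([rng.normal(c, 0.3, size=(30, 2)) for c in centers])

# agglomerative clustering with Ward's criterion; 'single', 'complete',
# and 'average' are the other standard linkage choices such surveys compare
Z = linkage(X, method="ward")

# cut the dendrogram into 3 flat clusters and inspect the labels
labels = fcluster(Z, t=3, criterion="maxclust")
print(labels[:10])
```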