Adaptive Evolutionary Clustering

Author: AC Harvey
Alfred O. Hero III
DJ Fenn
GW Milligan
H Lütkepohl
H Ning
HW Kuhn
J Schäfer
J Shi
Kevin S. Xu
M Charikar
Mark Kliger
N Eagle
O Ledoit
PJ Mucha
S Haykin
S Tadepalli
T Hastie
T Yang
TW Anderson
U Luxburg von
Y Chen
Y Chi
YR Lin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

In many practical applications of clustering, the objects to be clustered evolve over time, and a clustering result is desired at each time step. In such applications, evolutionary clustering typically outperforms traditional static clustering by producing clustering results that reflect long-term trends while being robust to short-term variations. Several evolutionary clustering algorithms have recently been proposed, often by adding a temporal smoothness penalty to the cost function of a static clustering method. In this paper, we introduce a different approach to evolutionary clustering by accurately tracking the time-varying proximities between objects followed by static clustering. We present an evolutionary clustering framework that adaptively estimates the optimal smoothing parameter using shrinkage estimation, a statistical approach that improves a naive estimate using additional information. The proposed framework can be used to extend a variety of static clustering algorithms, including hierarchical, k-means, and spectral clustering, into evolutionary clustering algorithms. Experiments on synthetic and real data sets indicate that the proposed framework outperforms static clustering and existing evolutionary clustering algorithms in many scenarios.

Comment: To appear in Data Mining and Knowledge Discovery. MATLAB toolbox available at http://tbayes.eecs.umich.edu/xukevin/affec
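The "track proximities, then cluster statically" idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' toolbox: the smoothing parameter `alpha` is fixed by hand here (an assumption; the paper estimates it adaptively by shrinkage), the function names are hypothetical, and the static step is single-linkage clustering cut at a similarity `threshold`, chosen only because it needs no external libraries.

```python
def smooth_proximities(snapshots, alpha):
    """Blend each proximity matrix with the previous smoothed estimate.

    alpha is a fixed smoothing weight (assumption: the paper estimates it
    adaptively; this sketch does not).
    """
    smoothed = []
    previous = None
    for current in snapshots:
        if previous is None:
            # No history at the first time step: use the snapshot as is.
            estimate = [row[:] for row in current]
        else:
            estimate = [
                [alpha * p + (1.0 - alpha) * c for p, c in zip(prev_row, cur_row)]
                for prev_row, cur_row in zip(previous, current)
            ]
        smoothed.append(estimate)
        previous = estimate
    return smoothed


def single_linkage_clusters(similarity, threshold):
    """Static clustering: connected components of the graph that links
    objects whose similarity is at least `threshold`."""
    n = len(similarity)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and similarity[i][j] >= threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


def evolutionary_clustering(snapshots, alpha=0.5, threshold=0.5):
    """Cluster every time step on the smoothed proximities."""
    return [
        single_linkage_clusters(matrix, threshold)
        for matrix in smooth_proximities(snapshots, alpha)
    ]
```

With two snapshots in which the similarity between objects 0 and 1 briefly drops from 0.9 to 0.4, static clustering of the second snapshot separates them, while the smoothed version keeps them together, which is the robustness to short-term variation the abstract describes.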

arXiv.org e-Print Archive

CiteSeerX

Crossref

Natural data structure extracted from neighborhood-similarity graphs

Author: Karlis Kanders
Tom Lorimer
Ruedi Stoop
Publication venue: 'Elsevier BV'
Publication date: 15/02/2018

'Big' high-dimensional data are commonly analyzed in low dimensions, after performing a dimensionality-reduction step that inherently distorts the data structure. For the same purpose, clustering methods are also often used. These methods also introduce a bias, either by starting from the assumption of a particular geometric form of the clusters, or by using iterative schemes to enhance cluster contours, with uncontrollable consequences. The goal of data analysis should, however, be to encode and detect structural data features at all scales and densities simultaneously, without assuming a parametric form of data point distances, or modifying them. We propose a novel approach that directly encodes data point neighborhood similarities as a sparse graph. Our non-iterative framework permits a transparent interpretation of data, without altering the original data dimension and metric. Several natural and synthetic data applications demonstrate the efficacy of our novel approach.
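The abstract above does not specify how the neighborhood-similarity graph is built, so the following is only a generic illustration of the idea of encoding neighborhood similarities as a sparse graph, not the authors' construction. It assumes a fixed neighborhood size `k`, Euclidean distance, and Jaccard overlap of k-nearest-neighbor sets as the edge weight; all function names are hypothetical.

```python
from math import dist


def knn_sets(points, k):
    """Return, for each point, the index set of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        order = sorted(others, key=lambda j: dist(p, points[j]))
        neighbors.append(set(order[:k]))
    return neighbors


def neighborhood_similarity_graph(points, k):
    """Sparse graph as {(i, j): weight} with i < j.

    An edge exists when one point is among the other's k nearest neighbors.
    The weight is the Jaccard overlap of the two neighborhoods, each taken
    to include the point itself (an illustrative choice).
    """
    closed = [nbrs | {i} for i, nbrs in enumerate(knn_sets(points, k))]
    edges = {}
    for i, nbrs in enumerate(closed):
        for j in nbrs:
            if j == i:
                continue
            a, b = min(i, j), max(i, j)
            if (a, b) not in edges:
                overlap = len(closed[a] & closed[b])
                union = len(closed[a] | closed[b])
                edges[(a, b)] = overlap / union
    return edges
```

The graph is built in a single pass over the neighbor sets, with no iterative refinement, and only pairs that are neighbors receive an edge, so the result stays sparse as the number of points grows.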

arXiv.org e-Print Archive

ZORA