11 research outputs found
Eigenvalue-based Incremental Spectral Clustering
Our previous experiments demonstrated that subsets of collections of (short)
documents (with several hundred entries) share a common eigenvalue spectrum of
the combinatorial Laplacian, once that spectrum is suitably normalized. Based
on this insight, we propose a method of incremental spectral clustering. The
method consists of the following steps: (1) split the data into manageable
subsets, (2) cluster each of the subsets, (3) merge clusters from different
subsets based on eigenvalue spectrum similarity to form clusters of the entire
set. This method can be especially useful for clustering methods whose
complexity increases strongly with the size of the data sample, as is the case
for typical spectral clustering. Experiments showed that clustering and merging
the subsets indeed yields clusters close to those obtained by clustering the
entire dataset.
Comment: 14 tables, 6 figures
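The merging step (3) can be illustrated with a minimal sketch. The abstract does not say how the spectrum is normalized or how similarity is measured, so everything specific here is an assumption: eigenvalues are divided by the largest one, resampled to a fixed number of points so that clusters of different sizes are comparable, and compared by a root-mean-square distance. The function names are hypothetical and NumPy is assumed to be available.

```python
import numpy as np


def laplacian_spectrum(similarity, n_points=50):
    """Eigenvalues of the combinatorial Laplacian L = D - S, rescaled.

    Assumed normalization (not taken from the paper): divide by the
    largest eigenvalue, then resample to n_points values.
    """
    s = np.asarray(similarity, dtype=float)
    lap = np.diag(s.sum(axis=1)) - s
    eig = np.sort(np.linalg.eigvalsh(lap))
    top = eig[-1] if eig[-1] > 0 else 1.0
    eig = eig / top
    # Resample so that spectra of differently sized clusters align.
    src = np.linspace(0.0, 1.0, len(eig))
    dst = np.linspace(0.0, 1.0, n_points)
    return np.interp(dst, src, eig)


def spectrum_distance(a, b):
    """Root-mean-square difference between two resampled spectra."""
    return float(np.linalg.norm(a - b) / np.sqrt(len(a)))


def match_clusters(spectra_a, spectra_b):
    """For each cluster of subset A, index of the closest cluster in B."""
    return [
        int(np.argmin([spectrum_distance(sa, sb) for sb in spectra_b]))
        for sa in spectra_a
    ]
```

Clusters from two subsets whose matched spectra are close would then be merged; a real pipeline would also need a threshold for clusters that have no counterpart in the other subset.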
Incremental document map formation: multi-stage approach
The paper presents a methodology for incremental map formation in the multi-stage process of a search engine with a map-based user interface. The architecture of the experimental system allows for comparative evaluation of different constituent technologies at the various stages of the process. The quality of the map generation process has been investigated using a number of clustering and classification measures. Some conclusions concerning the impact of various technological solutions on map quality are presented.
Clustering Based on Eigenvectors of the Adjacency Matrix
The paper presents a novel spectral algorithm, EVSA (eigenvector structure analysis), which uses eigenvalues and eigenvectors of the adjacency matrix to discover clusters. Based on matrix perturbation theory and properties of graph spectra, we show that the adjacency matrix can be more suitable for partitioning than the Laplacian matrices. The main problem with using the adjacency matrix is the selection of the appropriate eigenvectors. We therefore propose an approach based on analysis of the adjacency matrix spectrum and pairwise eigenvector correlations. The formulated rules and heuristics allow choosing the right eigenvectors to represent clusters, i.e., automatically establishing the number of groups. The algorithm requires only one parameter: the number of nearest neighbors. Unlike many other spectral methods, our solution does not need an additional clustering algorithm for the final partitioning. We evaluate the proposed approach using real-world datasets of different sizes. Its performance is competitive with both standard and newer solutions, which require the number of clusters to be given as an input parameter.
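The following is not EVSA itself, only a small sketch of the ingredients the abstract names: a nearest-neighbor graph, its adjacency matrix, and the leading eigenvectors of that matrix. Two simplifications are assumptions made here for illustration: the number of eigenvectors is passed in by hand (the paper selects them automatically through rules that the abstract does not detail), and each point is assigned to the eigenvector in which it has the largest absolute entry. The function names are hypothetical.

```python
import numpy as np


def knn_adjacency(points, n_neighbors):
    """Symmetric 0/1 adjacency matrix of a k-nearest-neighbor graph."""
    x = np.asarray(points, dtype=float)
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    idx = np.argsort(dist, axis=1)[:, :n_neighbors]
    adj = np.zeros((len(x), len(x)))
    rows = np.repeat(np.arange(len(x)), n_neighbors)
    adj[rows, idx.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # keep an edge if either end chose it


def leading_eigenvectors(adjacency, n_vectors):
    """Largest eigenvalues of the adjacency matrix and their eigenvectors."""
    vals, vecs = np.linalg.eigh(adjacency)
    order = np.argsort(vals)[::-1][:n_vectors]
    return vals[order], vecs[:, order]


def labels_from_eigenvectors(vecs):
    """Assign each point to the eigenvector where its entry is largest."""
    return np.argmax(np.abs(vecs), axis=1)
```

When the graph falls apart into well-separated groups with distinct leading eigenvalues, each leading eigenvector is concentrated on one group, which is why no separate clustering step is needed in this toy setting.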
Eigenvalue based spectral classification.
This paper describes a new method of classification based on spectral analysis. The motivation for developing the new model was the failure of classical spectral cluster analysis, based on the combinatorial and normalized Laplacians, on a set of real-world datasets of textual documents. The reasons for these failures are analysed. While the known methods are all based on eigenvectors of graph Laplacians, a new classification method based on eigenvalues of graph Laplacians is proposed and studied.
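To make the idea of classifying by eigenvalues rather than eigenvectors concrete, here is a hedged sketch. The abstract does not describe the actual decision rule, so the one used here is an assumption: a sample of documents is given the label of the class whose reference spectrum is nearest in Euclidean distance, with all samples assumed to contain the same number of documents. Cosine similarity between term-count vectors and the normalized Laplacian are likewise illustrative choices, and the function names are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(term_counts):
    """Pairwise cosine similarity of documents, with a zero diagonal."""
    x = np.asarray(term_counts, dtype=float)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # leave empty documents as zero vectors
    unit = x / norms
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)
    return sim


def normalized_laplacian_eigenvalues(similarity):
    """Sorted eigenvalues of I - D^(-1/2) S D^(-1/2); they lie in [0, 2]."""
    s = np.asarray(similarity, dtype=float)
    deg = s.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = 1.0 / np.sqrt(deg[deg > 0])
    # Isolated documents get a zero row, following the usual convention.
    lap = np.diag((deg > 0).astype(float)) - inv_sqrt[:, None] * s * inv_sqrt[None, :]
    return np.sort(np.linalg.eigvalsh(lap))


def classify_by_spectrum(sample_eigs, class_eigs):
    """Label of the reference spectrum closest to the sample spectrum.

    class_eigs maps a label to a spectrum of the same length as sample_eigs.
    """
    return min(class_eigs, key=lambda k: np.linalg.norm(class_eigs[k] - sample_eigs))
```

Note that nothing in this rule looks at which document sits where in an eigenvector; only the shape of the spectrum is compared.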