Compressive Spectral Clustering
Spectral clustering has become a popular technique due to its high
performance in many contexts. It comprises three main steps: create a
similarity graph between N objects to cluster, compute the first k eigenvectors
of its Laplacian matrix to define a feature vector for each object, and run
k-means on these features to separate objects into k classes. Each of these
three steps becomes computationally intensive for large N and/or k. We propose
to speed up the last two steps based on recent results in the emerging field of
graph signal processing: graph filtering of random signals, and random sampling
of bandlimited graph signals. We prove that our method, with a gain in
computation time that can reach several orders of magnitude, is in fact an
approximation of spectral clustering, for which we are able to control the
error. We test the performance of our method on artificial and real-world
network data.
Comment: 12 pages, 2 figures
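As a point of reference, the three classical steps named in the abstract above (similarity graph, first k Laplacian eigenvectors, k-means) can be sketched as follows. This is a minimal illustration of plain spectral clustering, not the compressive method the paper proposes. The normalized Laplacian, the row normalization, the farthest-point k-means initialization, and the helper names `two_cliques`, `kmeans`, and `spectral_clustering` are all assumptions made for this example.

```python
import numpy as np

def two_cliques(n=5, bridge=0.1):
    """Hypothetical toy graph: two n-node cliques joined by one weak edge."""
    W = np.zeros((2 * n, 2 * n))
    W[:n, :n] = 1.0
    W[n:, n:] = 1.0
    np.fill_diagonal(W, 0.0)
    W[n - 1, n] = W[n, n - 1] = bridge
    return W

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd iterations with farthest-point initialization."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)

def spectral_clustering(W, k):
    """Cluster the nodes of a weighted graph W (no isolated nodes assumed)."""
    # Step 1 is assumed done: W is the similarity graph.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # Step 2: eigenvectors of the k smallest eigenvalues (eigh sorts ascending).
    _, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, :k]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)  # one feature row per node
    # Step 3: k-means on the spectral features.
    return kmeans(U, k)

labels = spectral_clustering(two_cliques(), k=2)
```

The dense `eigh` call costs on the order of N^3 operations, which is the kind of bottleneck the abstract refers to for large N.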
Accelerated Spectral Clustering Using Graph Filtering Of Random Signals
We build upon recent advances in graph signal processing to propose a faster
spectral clustering algorithm. Indeed, classical spectral clustering is based
on the computation of the first k eigenvectors of the similarity matrix's
Laplacian, whose computation cost, even for sparse matrices, becomes
prohibitive for large datasets. We show that we can estimate the spectral
clustering distance matrix without computing these eigenvectors: by graph
filtering random signals. Also, we take advantage of the stochasticity of these
random vectors to estimate the number of clusters k. We compare our method to
classical spectral clustering on synthetic data, and show that it reaches equal
performance while being faster by a factor of at least two for large datasets.
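The idea of estimating spectral clustering distances by graph filtering random signals can be illustrated with a small sketch. It assumes a normalized Laplacian (spectrum in [0, 2]), a cutoff between the k-th and (k+1)-th eigenvalues that is already known, and a truncated Chebyshev polynomial as the low-pass filter. None of these choices is stated in the abstract, and the function names and default parameters are hypothetical. No eigenvectors are computed; the filter is applied through matrix products only.

```python
import numpy as np

def two_cliques(n=5, bridge=0.1):
    """Hypothetical toy graph: two n-node cliques joined by one weak edge."""
    W = np.zeros((2 * n, 2 * n))
    W[:n, :n] = 1.0
    W[n:, n:] = 1.0
    np.fill_diagonal(W, 0.0)
    W[n - 1, n] = W[n, n - 1] = bridge
    return W

def normalized_laplacian(W):
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    return np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

def lowpass_cheby_coeffs(cutoff, degree):
    """Chebyshev coefficients of the ideal step filter 1[lambda <= cutoff] on [0, 2]."""
    theta = np.arccos(cutoff - 1.0)  # spectrum shifted from [0, 2] to [-1, 1]
    m = np.arange(1, degree + 1)
    return np.concatenate(([1.0 - theta / np.pi],
                           -2.0 * np.sin(m * theta) / (m * np.pi)))

def filter_random_signals(L, cutoff, n_signals=30, degree=30, seed=0):
    """Low-pass filter Gaussian random signals; row i is a feature vector for node i."""
    rng = np.random.default_rng(seed)
    n = L.shape[0]
    R = rng.normal(scale=1.0 / np.sqrt(n_signals), size=(n, n_signals))
    c = lowpass_cheby_coeffs(cutoff, degree)
    Ls = L - np.eye(n)  # shifted Laplacian with spectrum in [-1, 1]
    t_prev, t_curr = R, Ls @ R
    out = c[0] * t_prev + c[1] * t_curr
    for m in range(2, degree + 1):
        # Chebyshev recurrence: T_m = 2 * Ls * T_{m-1} - T_{m-2}
        t_prev, t_curr = t_curr, 2.0 * (Ls @ t_curr) - t_prev
        out += c[m] * t_curr
    return out

F = filter_random_signals(normalized_laplacian(two_cliques()), cutoff=0.5)
within = np.sum((F[0] - F[1]) ** 2)   # two nodes in the same clique
across = np.sum((F[0] - F[9]) ** 2)   # two nodes in different cliques
```

Squared distances between rows of `F` approximate the distances between spectral features, so in this toy graph `within` should come out much smaller than `across`. With a sparse Laplacian, each product `Ls @ t_curr` costs time proportional to the number of edges, which is where a speed-up over a full eigendecomposition would come from.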
Covariate-assisted spectral clustering
Biological and social systems consist of myriad interacting units. The
interactions can be represented in the form of a graph or network. Measurements
of these graphs can reveal the underlying structure of these interactions,
which provides insight into the systems that generated the graphs. Moreover, in
applications such as connectomics, social networks, and genomics, graph data
are accompanied by contextualizing measures on each node. We utilize these node
covariates to help uncover latent communities in a graph, using a modification
of spectral clustering. Statistical guarantees are provided under a joint
mixture model that we call the node-contextualized stochastic blockmodel,
including a bound on the mis-clustering rate. The bound is used to derive
conditions for achieving perfect clustering. For most simulated cases,
covariate-assisted spectral clustering yields results superior to regularized
spectral clustering without node covariates and to an adaptation of canonical
correlation analysis. We apply our clustering method to large brain graphs
derived from diffusion MRI data, using the node locations or neurological
region membership as covariates. In both cases, covariate-assisted spectral
clustering yields clusters that are easier to interpret neurologically.
Comment: 28 pages, 4 figures, includes substantial changes to theoretical
results