Consistency of spectral clustering in stochastic block models
We analyze the performance of spectral clustering for community extraction in
stochastic block models. We show that, under mild conditions, spectral
clustering applied to the adjacency matrix of the network can consistently
recover hidden communities even when the order of the maximum expected degree
is as small as log n, with n the number of nodes. This result applies to
some popular polynomial time spectral clustering algorithms and is further
extended to degree-corrected stochastic block models using a spherical
k-median spectral clustering method. A key component of our analysis is a
combinatorial bound on the spectrum of binary random matrices, which is sharper
than the conventional matrix Bernstein inequality and may be of independent
interest.
Comment: Published at http://dx.doi.org/10.1214/14-AOS1274 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
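To make the idea in the abstract above concrete, here is a minimal sketch of spectral clustering on the adjacency matrix of a two-block stochastic block model. It is an illustration under stated assumptions, not the paper's procedure: it handles only two equal-sized communities, replaces the k-means step on the leading eigenvectors with a sign split of the second eigenvector, and uses plain power iteration instead of a library eigensolver. The function names and the parameters n, p and q are hypothetical and chosen for the example.

```python
import math
import random


def sample_sbm(n, p, q, rng):
    """Sample a symmetric two-block SBM; the first n // 2 nodes form block 0."""
    labels = [0 if i < n // 2 else 1 for i in range(n)]
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if labels[i] == labels[j] else q
            if rng.random() < prob:
                adj[i][j] = adj[j][i] = 1.0
    return adj, labels


def matvec(mat, vec):
    return [sum(a * x for a, x in zip(row, vec)) for row in mat]


def normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def power_iteration(mat, rng, orthogonal_to=(), iters=300):
    """Approximate the dominant eigenvector within the complement of `orthogonal_to`."""
    vec = normalize([rng.gauss(0.0, 1.0) for _ in mat])
    for _ in range(iters):
        vec = matvec(mat, vec)
        # Project out previously found eigenvectors (simple deflation).
        for u in orthogonal_to:
            dot = sum(a * b for a, b in zip(u, vec))
            vec = [a - dot * b for a, b in zip(vec, u)]
        vec = normalize(vec)
    return vec


def spectral_two_communities(adj, rng):
    """Split nodes by the sign of the second leading eigenvector (assumed simplification)."""
    first = power_iteration(adj, rng)
    second = power_iteration(adj, rng, orthogonal_to=[first])
    return [0 if x >= 0 else 1 for x in second]


def accuracy(pred, truth):
    """Fraction of correctly clustered nodes, up to swapping the two labels."""
    agree = sum(1 for a, b in zip(pred, truth) if a == b) / len(truth)
    return max(agree, 1.0 - agree)


rng = random.Random(0)
adj, labels = sample_sbm(n=60, p=0.8, q=0.1, rng=rng)
pred = spectral_two_communities(adj, rng)
print(f"fraction of nodes correctly clustered: {accuracy(pred, labels):.2f}")
```

The example uses a dense, well-separated graph so that the sign split is reliable at small n; the sparse regime discussed in the abstract (expected degree of order log n) would need a much larger n to show the same behaviour.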
Community detection and stochastic block models: recent developments
The stochastic block model (SBM) is a random graph model with planted
clusters. It is widely employed as a canonical model to study clustering and
community detection, and more generally provides fertile ground for studying the
statistical and computational tradeoffs that arise in network and data
sciences.
This note surveys the recent developments that establish the fundamental
limits for community detection in the SBM, both with respect to
information-theoretic and computational thresholds, and for various recovery
requirements such as exact, partial and weak recovery (a.k.a., detection). The
main results discussed are the phase transitions for exact recovery at the
Chernoff-Hellinger threshold, the phase transition for weak recovery at the
Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial
recovery, the learning of the SBM parameters and the gap between
information-theoretic and computational thresholds.
The note also covers some of the algorithms developed in the quest to
achieve these limits, in particular two-round algorithms via graph-splitting,
semi-definite programming, linearized belief propagation, classical and
nonbacktracking spectral methods. A few open problems are also discussed
- …
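As a small worked illustration of the two phase transitions named in the abstract above, the sketch below evaluates them in the simplest special case: the symmetric SBM with two equal-sized communities. The parameter names a and b are assumptions introduced here and do not appear in the abstract. In the logarithmic-degree regime (within-community edge probability a log n / n, across-community b log n / n) the Chernoff-Hellinger condition for exact recovery reduces to |sqrt(a) - sqrt(b)| > sqrt(2); in the constant-degree regime (a / n and b / n) the Kesten-Stigum condition for weak recovery reads (a - b)^2 > 2(a + b). Behaviour exactly at the boundary is not handled by this sketch.

```python
import math


def exact_recovery_possible(a, b):
    """Two symmetric communities, p = a*log(n)/n and q = b*log(n)/n."""
    return abs(math.sqrt(a) - math.sqrt(b)) > math.sqrt(2)


def weak_recovery_possible(a, b):
    """Two symmetric communities, p = a/n and q = b/n (Kesten-Stigum condition)."""
    return (a - b) ** 2 > 2 * (a + b)


# sqrt(9) - sqrt(1) = 2, which exceeds sqrt(2)
print(exact_recovery_possible(9, 1))
# (3 - 1)^2 = 4, which is below 2 * (3 + 1) = 8
print(weak_recovery_possible(3, 1))
```

Note that the two functions refer to different degree regimes, so their arguments are not directly comparable: the same pair (a, b) describes a much denser graph in the first function than in the second.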