Community detection and stochastic block models: recent developments
The stochastic block model (SBM) is a random graph model with planted
clusters. It is widely employed as a canonical model to study clustering and
community detection, and generally provides fertile ground for studying the
statistical and computational tradeoffs that arise in network and data
sciences.
This note surveys the recent developments that establish the fundamental
limits for community detection in the SBM, both with respect to
information-theoretic and computational thresholds, and for various recovery
requirements such as exact, partial and weak recovery (a.k.a., detection). The
main results discussed are the phase transitions for exact recovery at the
Chernoff-Hellinger threshold, the phase transition for weak recovery at the
Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial
recovery, the learning of the SBM parameters and the gap between
information-theoretic and computational thresholds.
The note also covers some of the algorithms developed in the quest to
achieve these limits, in particular two-round algorithms via graph-splitting,
semi-definite programming, linearized belief propagation, and classical and
nonbacktracking spectral methods. A few open problems are also discussed.
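As a hedged illustration of the thresholds named in this abstract, the sketch below covers only the symmetric two-community SBM. It assumes the standard parametrization for that special case: within- and across-community edge probabilities a/n and b/n for weak recovery, and a log(n)/n and b log(n)/n for exact recovery. The function names are hypothetical and are not taken from the survey.

```python
import math
import random

def sample_sbm(n, p_in, p_out, seed=0):
    """Sample a two-community SBM; node i gets label i % 2 (an assumption)."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # Same-community pairs connect with p_in, others with p_out.
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                edges.append((i, j))
    return labels, edges

def above_kesten_stigum(a, b):
    """Weak recovery condition for probabilities a/n and b/n (two communities)."""
    return (a - b) ** 2 > 2 * (a + b)

def above_exact_recovery(a, b):
    """Exact recovery condition for probabilities a*log(n)/n and b*log(n)/n."""
    return abs(math.sqrt(a) - math.sqrt(b)) > math.sqrt(2)

labels, edges = sample_sbm(200, 5 / 200, 1 / 200)
print(len(edges), above_kesten_stigum(5, 1), above_exact_recovery(9, 1))
```

With more than two communities or asymmetric parameters the conditions take a more general form, which this sketch does not attempt to reproduce.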
Spectral Detection on Sparse Hypergraphs
We consider the problem of the assignment of nodes into communities from a
set of hyperedges, where every hyperedge is a noisy observation of the
community assignment of the adjacent nodes. We focus in particular on the
sparse regime where the number of edges is of the same order as the number of
vertices. We propose a spectral method based on a generalization of the
non-backtracking Hashimoto matrix into hypergraphs. We analyze its performance
on a planted generative model and compare it with other spectral methods and
with Bayesian belief propagation (which was conjectured to be asymptotically
optimal for this model). We conclude that the proposed spectral method detects
communities whenever belief propagation does, with the important advantages
of being simpler, entirely nonparametric, and able to learn the rule according
to which the hyperedges were generated without prior information.
Comment: 8 pages, 5 figures
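For orientation, here is a minimal sketch of the ordinary (graph) non-backtracking Hashimoto matrix that the abstract generalizes. It is not the hypergraph construction proposed in the paper. It assumes a simple undirected graph given as a list of distinct edges, and the function name is hypothetical.

```python
def non_backtracking_matrix(edges):
    """Build the non-backtracking matrix of a simple undirected graph.

    Rows and columns are indexed by directed edges. The entry for
    (u -> v, w -> x) is 1 when v == w and x != u, that is, when the walk
    continues from v without immediately returning to u.
    """
    directed = []
    for u, v in edges:
        directed.append((u, v))
        directed.append((v, u))
    index = {e: k for k, e in enumerate(directed)}
    m = len(directed)
    B = [[0] * m for _ in range(m)]
    for (u, v), row in index.items():
        for (w, x), col in index.items():
            if v == w and x != u:
                B[row][col] = 1
    return directed, B

# Triangle: each directed edge has exactly one non-backtracking continuation.
directed, B = non_backtracking_matrix([(0, 1), (1, 2), (0, 2)])
print([sum(row) for row in B])
```

In a spectral method, the leading eigenvectors of this matrix would then be passed to a numerical eigensolver; that step is omitted here.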
Super-resolution community detection for layer-aggregated multilayer networks
Applied network science often involves preprocessing network data before
applying a network-analysis method, and there is typically a theoretical
disconnect between these steps. For example, it is common to aggregate
time-varying network data into windows prior to analysis, and the tradeoffs of
this preprocessing are not well understood. Focusing on the problem of
detecting small communities in multilayer networks, we study the effects of
layer aggregation by developing random-matrix theory for modularity matrices
associated with layer-aggregated networks with $N$ nodes and $L$ layers, which
are drawn from an ensemble of Erdős-Rényi networks. We study phase
transitions in which eigenvectors localize onto communities (allowing their
detection) and which occur for a given community provided its size surpasses a
detectability limit $K^*$. When layers are aggregated via a summation, we
obtain $K^* \varpropto \mathcal{O}(\sqrt{NL}/T)$, where $T$ is the number of
layers across which the community persists. Interestingly, if $T$ is allowed to
vary with $L$ then summation-based layer aggregation enhances small-community
detection even if the community persists across a vanishing fraction of layers,
provided that $T/L$ decays more slowly than $\mathcal{O}(L^{-1/2})$. Moreover,
we find that thresholding the summation can in some cases cause $K^*$ to decay
exponentially, decreasing by orders of magnitude in a phenomenon we call
super-resolution community detection. That is, layer aggregation with
thresholding is a nonlinear data filter enabling detection of communities that
are otherwise too small to detect. Importantly, different thresholds generally
enhance the detectability of communities having different properties,
illustrating that community detection can be obscured if one analyzes network
data using a single threshold.
Comment: 11 pages, 8 figures
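The two aggregation schemes contrasted in this abstract, summation and thresholded summation, can be sketched as follows. This is a minimal illustration under stated assumptions: layers are dense 0/1 adjacency matrices given as nested lists, and an aggregated edge is kept when its summed weight is at least a threshold `tau`. The abstract does not fix the thresholding rule, so that choice and the function names are hypothetical.

```python
def aggregate_by_sum(layers):
    """Sum the adjacency matrices of all layers entrywise."""
    n = len(layers[0])
    return [[sum(layer[i][j] for layer in layers) for j in range(n)]
            for i in range(n)]

def apply_threshold(summed, tau):
    """Keep an edge only if it appears in at least tau layers (assumed rule)."""
    return [[1 if w >= tau else 0 for w in row] for row in summed]

layers = [
    [[0, 1], [1, 0]],
    [[0, 1], [1, 0]],
    [[0, 0], [0, 0]],
]
summed = aggregate_by_sum(layers)
print(summed)                      # [[0, 2], [2, 0]]
print(apply_threshold(summed, 2))  # [[0, 1], [1, 0]]
```

Sweeping `tau` over several values, rather than fixing one, matches the abstract's remark that different thresholds favor communities with different properties.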
Matrix Completion from Fewer Entries: Spectral Detectability and Rank Estimation
The completion of low rank matrices from few entries is a task with many
practical applications. We consider here two aspects of this problem:
detectability, i.e. the ability to estimate the rank reliably from the
fewest possible random entries, and performance in achieving small
reconstruction error. We propose a spectral algorithm for these two tasks
called MaCBetH (for Matrix Completion with the Bethe Hessian). The rank is
estimated as the number of negative eigenvalues of the Bethe Hessian matrix,
and the corresponding eigenvectors are used as initial condition for the
minimization of the discrepancy between the estimated matrix and the revealed
entries. We analyze the performance in a random matrix setting using results
from the statistical mechanics of the Hopfield neural network, and show in
particular that MaCBetH efficiently detects the rank $r$ of a large $n\times m$
matrix from $C(r)r\sqrt{nm}$ entries, where $C(r)$ is a constant close to $1$.
We also evaluate the corresponding root-mean-square error empirically and show
that MaCBetH compares favorably to other existing approaches.
Comment: NIPS Conference 201