Community detection and stochastic block models: recent developments

Author: Abbe Emmanuel
Publication date: 29/03/2017

The stochastic block model (SBM) is a random graph model with planted clusters. It is widely employed as a canonical model to study clustering and community detection, and more generally provides fertile ground for studying the statistical and computational tradeoffs that arise in network and data sciences. This note surveys the recent developments that establish the fundamental limits for community detection in the SBM, both with respect to information-theoretic and computational thresholds, and for various recovery requirements such as exact, partial and weak recovery (a.k.a. detection). The main results discussed are the phase transition for exact recovery at the Chernoff-Hellinger threshold, the phase transition for weak recovery at the Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial recovery, the learning of the SBM parameters, and the gap between information-theoretic and computational thresholds. The note also covers some of the algorithms developed in the quest to achieve these limits, in particular two-round algorithms via graph-splitting, semi-definite programming, linearized belief propagation, and classical and nonbacktracking spectral methods. A few open problems are also discussed.
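As a rough illustration of the model this abstract describes, the sketch below samples a symmetric SBM with balanced planted communities and checks the Kesten-Stigum condition in its usual form for the sparse symmetric case, `(a - b)^2 > k (a + (k - 1) b)`, where the within- and across-community edge probabilities are `a/n` and `b/n`. The function names, the round-robin label assignment, and the parameter names are assumptions made for this example; none of them come from the survey itself.

```python
import random


def sample_sbm(n, k, p_in, p_out, seed=None):
    """Sample a symmetric SBM with n nodes and k balanced planted communities.

    Hypothetical helper: nodes are assigned to communities round-robin,
    and each pair is joined independently with probability p_in (same
    community) or p_out (different communities).
    """
    rng = random.Random(seed)
    labels = [i % k for i in range(n)]
    edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if labels[u] == labels[v] else p_out
            if rng.random() < p:
                edges.add((u, v))
    return labels, edges


def above_kesten_stigum(a, b, k=2):
    """Check (a - b)^2 > k * (a + (k - 1) * b) for p_in = a/n, p_out = b/n."""
    return (a - b) ** 2 > k * (a + (k - 1) * b)


labels, edges = sample_sbm(200, 2, p_in=0.10, p_out=0.01, seed=0)
print(len(edges), above_kesten_stigum(a=20, b=2))
```

The quadratic loop over node pairs is only meant for small graphs; a sparse sampler would be needed at the scales where the threshold results are asymptotically meaningful.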

arXiv.org e-Print Archive

Spectral Detection on Sparse Hypergraphs

Author: Angelini Maria Chiara
Caltagirone Francesco
Krzakala Florent
Zdeborová Lenka
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2015

We consider the problem of assigning nodes to communities from a set of hyperedges, where every hyperedge is a noisy observation of the community assignment of the adjacent nodes. We focus in particular on the sparse regime, where the number of edges is of the same order as the number of vertices. We propose a spectral method based on a generalization of the non-backtracking Hashimoto matrix to hypergraphs. We analyze its performance on a planted generative model and compare it with other spectral methods and with Bayesian belief propagation (which was conjectured to be asymptotically optimal for this model). We conclude that the proposed spectral method detects communities whenever belief propagation does, while having the important advantages of being simpler, entirely nonparametric, and able to learn the rule according to which the hyperedges were generated without prior information. Comment: 8 pages, 5 figures
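For orientation, here is a minimal sketch of the non-backtracking (Hashimoto) matrix for an ordinary undirected graph, which is the object the abstract generalizes. The hypergraph generalization itself is not attempted here, and the function name and the dense list-of-lists representation are assumptions made for this example.

```python
def non_backtracking_matrix(edges):
    """Return (directed_edges, B) for an undirected simple graph.

    B[i][j] = 1 when directed edge i = (u -> v) can be followed by
    directed edge j = (v -> w) with w != u, and 0 otherwise.
    """
    # Each undirected edge contributes two directed edges.
    directed = []
    for u, v in edges:
        directed.append((u, v))
        directed.append((v, u))

    size = len(directed)
    B = [[0] * size for _ in range(size)]
    for i, (u, v) in enumerate(directed):
        for j, (w, x) in enumerate(directed):
            # Continue from v, but do not step straight back to u.
            if v == w and x != u:
                B[i][j] = 1
    return directed, B


# On a triangle, every directed edge has exactly one non-backtracking successor.
directed, B = non_backtracking_matrix([(0, 1), (1, 2), (0, 2)])
print([sum(row) for row in B])
```

A spectral method would then examine the leading eigenvalues and eigenvectors of `B`; that step is omitted because it needs a numerical linear algebra library.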

arXiv.org e-Print Archive

HAL-CEA

Archivio della ricerca- Università di Roma La Sapienza

Hal-Diderot

Super-resolution community detection for layer-aggregated multilayer networks

Author: Caceres Rajmonda S.
Mucha Peter J.
Taylor Dane
Publication venue: American Physical Society (APS)
Publication date: 01/01/2017

Applied network science often involves preprocessing network data before applying a network-analysis method, and there is typically a theoretical disconnect between these steps. For example, it is common to aggregate time-varying network data into windows prior to analysis, and the tradeoffs of this preprocessing are not well understood. Focusing on the problem of detecting small communities in multilayer networks, we study the effects of layer aggregation by developing random-matrix theory for modularity matrices associated with layer-aggregated networks with $N$ nodes and $L$ layers, which are drawn from an ensemble of Erdős-Rényi networks. We study phase transitions in which eigenvectors localize onto communities (allowing their detection) and which occur for a given community provided its size surpasses a detectability limit $K^*$. When layers are aggregated via a summation, we obtain $K^* \varpropto \mathcal{O}(\sqrt{NL}/T)$, where $T$ is the number of layers across which the community persists. Interestingly, if $T$ is allowed to vary with $L$ then summation-based layer aggregation enhances small-community detection even if the community persists across a vanishing fraction of layers, provided that $T/L$ decays more slowly than $\mathcal{O}(L^{-1/2})$. Moreover, we find that thresholding the summation can in some cases cause $K^*$ to decay exponentially, decreasing by orders of magnitude in a phenomenon we call super-resolution community detection. That is, layer aggregation with thresholding is a nonlinear data filter enabling detection of communities that are otherwise too small to detect. Importantly, different thresholds generally enhance the detectability of communities having different properties, illustrating that community detection can be obscured if one analyzes network data using a single threshold. Comment: 11 pages, 8 figures

arXiv.org e-Print Archive

DSpace@MIT

Directory of Open Access Journals

Carolina Digital Repository

Matrix Completion from Fewer Entries: Spectral Detectability and Rank Estimation

Author: Krzakala Florent
Saade Alaa
Zdeborová Lenka
Publication date: 01/01/2015

The completion of low rank matrices from few entries is a task with many practical applications. We consider here two aspects of this problem: detectability, i.e. the ability to estimate the rank $r$ reliably from the fewest possible random entries, and performance in achieving small reconstruction error. We propose a spectral algorithm for these two tasks called MaCBetH (for Matrix Completion with the Bethe Hessian). The rank is estimated as the number of negative eigenvalues of the Bethe Hessian matrix, and the corresponding eigenvectors are used as initial condition for the minimization of the discrepancy between the estimated matrix and the revealed entries. We analyze the performance in a random matrix setting using results from the statistical mechanics of the Hopfield neural network, and show in particular that MaCBetH efficiently detects the rank $r$ of a large $n \times m$ matrix from $C(r) r \sqrt{nm}$ entries, where $C(r)$ is a constant close to $1$. We also evaluate the corresponding root-mean-square error empirically and show that MaCBetH compares favorably to other existing approaches. Comment: NIPS Conference 201
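To give a sense of scale for the sample count quoted in this abstract, the sketch below evaluates `C(r) * r * sqrt(n * m)` and the fraction of the matrix that this represents. The function names are hypothetical, and setting `c = 1.0` is an assumption standing in for the constant `C(r)`, which the abstract only describes as close to 1.

```python
import math


def entries_for_rank_detection(n, m, r, c=1.0):
    """Evaluate c * r * sqrt(n * m), with c an assumed stand-in for C(r)."""
    return c * r * math.sqrt(n * m)


def revealed_fraction(n, m, r, c=1.0):
    """Fraction of the n * m matrix entries that this count represents."""
    return entries_for_rank_detection(n, m, r, c) / (n * m)


n, m, r = 10_000, 10_000, 5
print(entries_for_rank_detection(n, m, r))  # about 5e4 entries
print(revealed_fraction(n, m, r))           # about 0.05% of the matrix
```

For a square matrix the revealed fraction is `c * r / n`, so the fraction of entries needed shrinks as the matrix grows at fixed rank.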

arXiv.org e-Print Archive

CiteSeerX

HAL-CEA