2,564 research outputs found

MMSE of probabilistic low-rank matrix estimation: Universality with respect to the output channel

Author: Krzakala Florent
Lesieur Thibault
Zdeborová Lenka
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 29/10/2015

This paper considers probabilistic estimation of a low-rank matrix from non-linear element-wise measurements of its elements. We derive the corresponding approximate message passing (AMP) algorithm and its state evolution. Relying on non-rigorous but standard assumptions motivated by statistical physics, we characterize the minimum mean squared error (MMSE) achievable information-theoretically and with the AMP algorithm. Unlike in related problems of linear estimation, in the present setting the MMSE depends on the output channel only through a single parameter: its Fisher information. We illustrate this striking finding by analysis of submatrix localization, and of detection of communities hidden in a dense stochastic block model. For this example we locate the computational and statistical boundaries, which are not equal for rank larger than four.

Comment: 10 pages, Allerton Conference on Communication, Control, and Computing 201
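To make the "single parameter" concrete, the following is a minimal sketch of how the Fisher information of an element-wise output channel could be evaluated numerically. It is not taken from the paper: the function names, the reference point `w0 = 0`, the integration grid, and the Gaussian test channel are all assumptions chosen for illustration.

```python
import numpy as np


def fisher_information(log_p_out, w0=0.0, y_min=-20.0, y_max=20.0,
                       n_points=200001, eps=1e-4):
    """Numerically evaluate E_y[(d/dw log P_out(y | w))^2] at w = w0.

    log_p_out(y, w) is a hypothetical callable returning log P_out(y | w)
    for an array of outputs y and a scalar input w.
    """
    y = np.linspace(y_min, y_max, n_points)
    dy = y[1] - y[0]
    # Central finite difference of the log-likelihood with respect to w.
    score = (log_p_out(y, w0 + eps) - log_p_out(y, w0 - eps)) / (2.0 * eps)
    density = np.exp(log_p_out(y, w0))
    # Riemann sum of score^2 weighted by the channel density.
    return float(np.sum(score ** 2 * density) * dy)


def gaussian_log_p(y, w, delta=0.5):
    """Log-density of an additive Gaussian channel y = w + noise of variance delta."""
    return -0.5 * np.log(2.0 * np.pi * delta) - (y - w) ** 2 / (2.0 * delta)


# For the Gaussian channel the Fisher information is 1 / delta.
print(fisher_information(gaussian_log_p))
```

Under this reading, two channels with the same Fisher information would be interchangeable for the purposes of the MMSE; the Gaussian channel is used here only because its Fisher information, 1/delta, is known in closed form.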

arXiv.org e-Print Archive

Crossref

HAL-CEA

Bayesian stochastic blockmodeling

Author: Airoldi E. M.
Catherine Matias
Erdős P.
Jeffreys H.
MacKay D. J. C.
Newman M. E. J.
Shtar'kov Y. M.
Yan X.
Publication venue: 'Wiley'
Publication date: 06/02/2020

This chapter provides a self-contained introduction to the use of Bayesian inference to extract large-scale modular structures from network data, based on the stochastic blockmodel (SBM), as well as its degree-corrected and overlapping generalizations. We focus on nonparametric formulations that allow their inference in a manner that prevents overfitting, and enables model selection. We discuss aspects of the choice of priors, in particular how to avoid underfitting via increased Bayesian hierarchies, and we contrast the task of sampling network partitions from the posterior distribution with finding the single point estimate that maximizes it, while describing efficient algorithms to perform either one. We also show how inferring the SBM can be used to predict missing and spurious links, and shed light on the fundamental limitations of the detectability of modular structures in networks.

Comment: 44 pages, 16 figures. Code is freely available as part of graph-tool at https://graph-tool.skewed.de . See also the HOWTO at https://graph-tool.skewed.de/static/doc/demos/inference/inference.htm

arXiv.org e-Print Archive

Crossref

Spectral Clustering of Graphs with the Bethe Hessian

Author: Krzakala Florent
Saade Alaa
Zdeborová Lenka
Publication date: 08/09/2014

Spectral clustering is a standard approach to label nodes on a graph by studying the (largest or lowest) eigenvalues of a symmetric real matrix such as the adjacency matrix or the Laplacian. Recently, it has been argued that using instead a more complicated, non-symmetric and higher-dimensional operator, related to the non-backtracking walk on the graph, leads to improved performance in detecting clusters, and even to optimal performance for the stochastic block model. Here, we propose to use instead a simpler object, a symmetric real matrix known as the Bethe Hessian operator, or deformed Laplacian. We show that this approach combines the performance of the non-backtracking operator, thus detecting clusters all the way down to the theoretical limit in the stochastic block model, with the computational, theoretical and memory advantages of real symmetric matrices.

Comment: 8 pages, 2 figure
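As a rough illustration of the idea in this abstract, here is a minimal sketch of two-way clustering with a Bethe Hessian of the form H(r) = (r^2 - 1) I - r A + D. The choice r = sqrt(average degree), the helper names, and the toy two-clique graph are assumptions made for this example; it only handles the two-cluster case by taking the sign of a single eigenvector and is not a reproduction of the authors' code.

```python
import numpy as np


def bethe_hessian(adjacency, r=None):
    """Return H(r) = (r^2 - 1) I - r A + D for a symmetric 0/1 adjacency matrix."""
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)
    if r is None:
        # Assumed default: square root of the average degree.
        r = np.sqrt(degrees.mean())
    n = A.shape[0]
    return (r ** 2 - 1.0) * np.eye(n) - r * A + np.diag(degrees)


def bethe_hessian_two_clusters(adjacency):
    """Count negative eigenvalues and split nodes by the second eigenvector's sign."""
    H = bethe_hessian(adjacency)
    eigenvalues, eigenvectors = np.linalg.eigh(H)  # ascending eigenvalues
    n_negative = int(np.sum(eigenvalues < 0))
    labels = (eigenvectors[:, 1] > 0).astype(int)
    return n_negative, labels


def two_cliques(size=5):
    """Toy graph: two cliques of the given size joined by a single edge."""
    n = 2 * size
    A = np.zeros((n, n))
    A[:size, :size] = 1.0
    A[size:, size:] = 1.0
    np.fill_diagonal(A, 0.0)
    A[size - 1, size] = A[size, size - 1] = 1.0
    return A


n_negative, labels = bethe_hessian_two_clusters(two_cliques())
print(n_negative, labels)
```

The number of negative eigenvalues serves here as an estimate of the number of clusters. For more than two clusters, one would typically cluster the rows of all eigenvectors with negative eigenvalues (for example with k-means) rather than use a single sign.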

arXiv.org e-Print Archive

HAL-CEA

Finding communities in sparse networks

Author: Humphries Mark
Singh Abhinav
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Spectral algorithms based on matrix representations of networks are often used to detect communities, but classic spectral methods based on the adjacency matrix and its variants fail to detect communities in sparse networks. New spectral methods based on non-backtracking random walks have recently been introduced that successfully detect communities in many sparse networks. However, the spectrum of non-backtracking random walks ignores hanging trees in networks, which can contain information about the community structure. We introduce the reluctant backtracking operators, which explicitly account for hanging trees: they admit a small probability of returning to the immediately previous node, unlike the non-backtracking operators, which forbid an immediate return. We show that the reluctant backtracking operators can detect communities in certain sparse networks where the non-backtracking operators cannot, while performing comparably on benchmark stochastic block model networks and real-world networks. We also show that the spectrum of the reluctant backtracking operator approximately optimises the standard modularity function, similar to the flow matrix. Interestingly, for this family of non- and reluctant-backtracking operators, the main determinant of performance on real-world networks is whether or not they are normalised to conserve probability at each node.

Comment: 11 pages, 4 figure
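For readers unfamiliar with the operator family discussed here, the following is a minimal sketch of the standard non-backtracking matrix on directed edges: entry ((u, v), (w, x)) is 1 when v == w and x != u. The function name and the triangle example are assumptions for illustration. The reluctant variant described in the abstract would replace the zero on the backtracking entry (x == u) with a small weight; its exact definition is given in the paper and is not reproduced here.

```python
import numpy as np


def non_backtracking_matrix(edges):
    """Build the non-backtracking matrix of an undirected simple graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    Returns the matrix B and the list of directed edges indexing its rows.
    """
    # Each undirected edge contributes two directed edges.
    directed = list(edges) + [(v, u) for u, v in edges]
    size = len(directed)
    B = np.zeros((size, size))
    for i, (u, v) in enumerate(directed):
        for j, (w, x) in enumerate(directed):
            # Step u -> v may be followed by v -> x unless it returns to u.
            if v == w and x != u:
                B[i, j] = 1.0
    return B, directed


triangle = [(0, 1), (1, 2), (2, 0)]
B, directed = non_backtracking_matrix(triangle)
print(B.shape, B.sum(axis=1))
```

Each row sum equals the degree of the edge's head node minus one, so on a tree every walk eventually dies out at the leaves. This is one way to see why hanging trees are invisible to the non-backtracking spectrum.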

arXiv.org e-Print Archive

PubMed Central

The University of Manchester - Institutional Repository

Recovery, detection and confidence sets of communities in a sparse stochastic block model

Author: Kleijn B. J. K.
van Waaij J.
Publication date: 22/10/2018

Posterior distributions for community assignment in the planted bi-section model are shown to achieve frequentist exact recovery and detection under sharp lower bounds on sparsity. Assuming posterior recovery (or detection), one may interpret credible sets (or enlarged credible sets) as consistent confidence sets. If credible levels grow to one quickly enough, credible sets can be interpreted as frequentist confidence sets without conditions on the parameters. In the regime where within-class and between-class edge-probabilities are very close, credible sets may be enlarged to achieve frequentist asymptotic coverage. The diameters of credible sets are controlled and match rates of posterior convergence.

Comment: 22 pp., 2 fi

arXiv.org e-Print Archive

Online Research Database In Technology