Spectral Detection on Sparse Hypergraphs
We consider the problem of the assignment of nodes into communities from a
set of hyperedges, where every hyperedge is a noisy observation of the
community assignment of the adjacent nodes. We focus in particular on the
sparse regime where the number of edges is of the same order as the number of
vertices. We propose a spectral method based on a generalization of the
non-backtracking Hashimoto matrix into hypergraphs. We analyze its performance
on a planted generative model and compare it with other spectral methods and
with Bayesian belief propagation (which was conjectured to be asymptotically
optimal for this model). We conclude that the proposed spectral method detects
communities whenever belief propagation does, while having the important
advantages of being simpler, entirely nonparametric, and able to learn the
rule according to which the hyperedges were generated without prior
information.
Comment: 8 pages, 5 figures
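
As a point of reference for the abstract above, the following is a minimal sketch of the ordinary non-backtracking (Hashimoto) matrix on a simple undirected graph, i.e. the special case in which every hyperedge joins exactly two nodes. It is not the hypergraph generalization proposed in the paper; the function name `non_backtracking_matrix` and the edge-list input format are assumptions made for illustration.

```python
def non_backtracking_matrix(edges):
    """Build the non-backtracking matrix of a simple undirected graph.

    `edges` is a list of (u, v) pairs (hypothetical input format).
    Rows and columns are indexed by directed edges; the entry for
    (u -> v), (w -> x) is 1 when v == w and x != u, so a walk may
    continue along any edge except the one it just arrived on.
    """
    # Each undirected edge contributes two directed edges.
    directed = []
    for u, v in edges:
        directed.append((u, v))
        directed.append((v, u))

    m = len(directed)
    B = [[0] * m for _ in range(m)]
    for i, (u, v) in enumerate(directed):
        for j, (w, x) in enumerate(directed):
            if v == w and x != u:
                B[i][j] = 1
    return directed, B


# Toy example: the path 0 - 1 - 2.
path_edges, path_B = non_backtracking_matrix([(0, 1), (1, 2)])

# Toy example: a triangle, where every directed edge has one continuation.
tri_edges, tri_B = non_backtracking_matrix([(0, 1), (1, 2), (2, 0)])
```

In spectral methods of this family, communities are read off from the leading eigenvectors of such a matrix; that step is omitted here to keep the sketch dependency-free.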
MMSE of probabilistic low-rank matrix estimation: Universality with respect to the output channel
This paper considers probabilistic estimation of a low-rank matrix from
non-linear element-wise measurements of its elements. We derive the
corresponding approximate message passing (AMP) algorithm and its state
evolution. Relying on non-rigorous but standard assumptions motivated by
statistical physics, we characterize the minimum mean squared error (MMSE)
achievable information theoretically and with the AMP algorithm. Unlike in
related problems of linear estimation, in the present setting the MMSE depends
on the output channel only through a single parameter: its Fisher information.
We illustrate this striking finding by analysis of submatrix localization, and
of detection of communities hidden in a dense stochastic block model. For this
example we locate the computational and statistical boundaries that are not
equal for rank larger than four.
Comment: 10 pages, Allerton Conference on Communication, Control, and
Computing 201
Community detection and stochastic block models: recent developments
The stochastic block model (SBM) is a random graph model with planted
clusters. It is widely employed as a canonical model to study clustering and
community detection, and generally provides fertile ground to study the
statistical and computational tradeoffs that arise in network and data
sciences.
This note surveys the recent developments that establish the fundamental
limits for community detection in the SBM, both with respect to
information-theoretic and computational thresholds, and for various recovery
requirements such as exact, partial and weak recovery (a.k.a., detection). The
main results discussed are the phase transitions for exact recovery at the
Chernoff-Hellinger threshold, the phase transition for weak recovery at the
Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial
recovery, the learning of the SBM parameters and the gap between
information-theoretic and computational thresholds.
The note also covers some of the algorithms developed in the quest to
achieve these limits, in particular two-round algorithms via graph-splitting,
semi-definite programming, linearized belief propagation, and classical and
nonbacktracking spectral methods. A few open problems are also discussed.
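
For concreteness, two of the thresholds named in the abstract above take a simple closed form in the symmetric two-community SBM. This is the standard special case, given here only as an illustration; the symbols $n$ (number of nodes), $p$ and $q$ (within- and between-community edge probabilities), and the constants $a$, $b$ are notation introduced for this note and do not appear in the abstract.

```latex
\begin{align*}
\text{Weak recovery (Kesten-Stigum):}\quad
  & p = \frac{a}{n},\ q = \frac{b}{n},
  & \text{possible when } (a-b)^2 &> 2(a+b), \\
\text{Exact recovery:}\quad
  & p = \frac{a \log n}{n},\ q = \frac{b \log n}{n},
  & \text{possible when } \bigl|\sqrt{a}-\sqrt{b}\bigr| &> \sqrt{2}.
\end{align*}
```

For example, $a = 5$ and $b = 1$ give $(a-b)^2 = 16 > 12 = 2(a+b)$, so weak recovery is possible in the sparse regime with those parameters.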
Information-theoretic bounds and phase transitions in clustering, sparse PCA, and submatrix localization
We study the problem of detecting a structured, low-rank signal matrix
corrupted with additive Gaussian noise. This includes clustering in a Gaussian
mixture model, sparse PCA, and submatrix localization. Each of these problems
is conjectured to exhibit a sharp information-theoretic threshold, below which
the signal is too weak for any algorithm to detect. We derive upper and lower
bounds on these thresholds by applying the first and second moment methods to
the likelihood ratio between these "planted models" and null models where the
signal matrix is zero. Our bounds differ by at most a factor of root two when
the rank is large (in the clustering and submatrix localization problems, when
the number of clusters or blocks is large) or the signal matrix is very sparse.
Moreover, our upper bounds show that for each of these problems there is a
significant regime where reliable detection is information-theoretically
possible but where known algorithms such as PCA fail completely, since the
spectrum of the observed matrix is uninformative. This regime is analogous to
the conjectured 'hard but detectable' regime for community detection in sparse
graphs.
Comment: For sparse PCA and submatrix localization, we determine the
information-theoretic threshold exactly in the limit where the number of
blocks is large or the signal matrix is very sparse based on a conditional
second moment method, closing the factor of root two gap in the first version
Matrix Completion from Fewer Entries: Spectral Detectability and Rank Estimation
The completion of low rank matrices from few entries is a task with many
practical applications. We consider here two aspects of this problem:
detectability, i.e. the ability to estimate the rank $r$ reliably from the
fewest possible random entries, and performance in achieving small
reconstruction error. We propose a spectral algorithm for these two tasks
called MaCBetH (for Matrix Completion with the Bethe Hessian). The rank is
estimated as the number of negative eigenvalues of the Bethe Hessian matrix,
and the corresponding eigenvectors are used as initial condition for the
minimization of the discrepancy between the estimated matrix and the revealed
entries. We analyze the performance in a random matrix setting using results
from the statistical mechanics of the Hopfield neural network, and show in
particular that MaCBetH efficiently detects the rank $r$ of a large $n\times m$
matrix from $C(r)r\sqrt{nm}$ entries, where $C(r)$ is a constant close to $1$.
We also evaluate the corresponding root-mean-square error empirically and show
that MaCBetH compares favorably to other existing approaches.
Comment: NIPS Conference 201
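
The abstract above describes minimizing the discrepancy between an estimated low-rank matrix and the revealed entries, and reporting a root-mean-square error. The sketch below shows what such an objective could look like, assuming a squared-error discrepancy and a factorization of the estimate as `X Y^T`; both choices, along with the function names and the dictionary format for revealed entries, are assumptions for illustration and are not taken from the paper.

```python
import math


def revealed_discrepancy(X, Y, revealed):
    """Sum of squared errors between X Y^T and the revealed entries.

    X has one length-r row per matrix row, Y one length-r row per
    matrix column, and `revealed` maps (i, j) to the observed value
    (hypothetical format). Unrevealed entries do not contribute.
    """
    total = 0.0
    for (i, j), value in revealed.items():
        estimate = sum(x * y for x, y in zip(X[i], Y[j]))
        total += (estimate - value) ** 2
    return total


def rmse(X, Y, entries):
    """Root-mean-square error of X Y^T over the given entries."""
    return math.sqrt(revealed_discrepancy(X, Y, entries) / len(entries))


# Rank-1 toy example: X Y^T = [[1, 3], [2, 6]].
X = [[1.0], [2.0]]
Y = [[1.0], [3.0]]
revealed = {(0, 0): 1.0, (1, 1): 6.0, (0, 1): 2.0}

loss = revealed_discrepancy(X, Y, revealed)
error = rmse(X, Y, revealed)
```

In the method described above, the factors would be initialized from the Bethe Hessian eigenvectors and then refined by a local optimizer; here `X` and `Y` are fixed by hand so the objective can be checked directly.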