Clustering from Sparse Pairwise Measurements
We consider the problem of grouping items into clusters based on few random
pairwise comparisons between the items. We introduce three closely related
algorithms for this task: a belief propagation algorithm approximating the
Bayes optimal solution, and two spectral algorithms based on the
non-backtracking and Bethe Hessian operators. For the case of two symmetric
clusters, we conjecture that these algorithms are asymptotically optimal in
that they detect the clusters as soon as it is information theoretically
possible to do so. We substantiate this claim for one of the spectral
approaches we introduce.
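For orientation, the two operators named in this abstract are usually defined as follows for an unweighted graph. This is the standard textbook form, not necessarily the weighted variant the paper applies to pairwise measurements; the symbols $A$ (adjacency matrix), $D$ (diagonal degree matrix), $I$ (identity) and the parameter $r$ are notation assumed for this sketch.

```latex
% Standard unweighted definitions (assumed notation, not taken from the abstract).
% Non-backtracking operator, indexed by directed edges (u -> v) and (w -> x):
B_{(u \to v),(w \to x)} = \mathbf{1}[v = w]\,\mathbf{1}[x \neq u]
% Bethe Hessian, an n x n symmetric matrix depending on a real parameter r:
H(r) = (r^{2} - 1)\,I - r\,A + D
```

In the unweighted setting, cluster information is typically read from the eigenvectors of $B$ whose eigenvalues lie outside the bulk of its spectrum, or from the eigenvectors of $H(r)$ with negative eigenvalues.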
Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results
The classical setting of community detection consists of networks exhibiting
a clustered structure. To more accurately model real systems we consider a
class of networks (i) whose edges may carry labels and (ii) which may lack a
clustered structure. Specifically we assume that nodes possess latent
attributes drawn from a general compact space and edges between two nodes are
randomly generated and labeled according to some unknown distribution as a
function of their latent attributes. Our goal is then to infer the edge label
distributions from a partially observed network. We propose a computationally
efficient spectral algorithm and show it allows for asymptotically correct
inference when the average node degree could be as low as logarithmic in the
total number of nodes. Conversely, if the average node degree is below a
specific constant threshold, we show that no algorithm can achieve better
inference than guessing without using the observations. As a byproduct of our
analysis, we show that our model provides a general procedure to construct
random graph models with a spectrum asymptotic to a pre-specified eigenvalue
distribution such as a power-law distribution. Comment: 17 pages
Community detection thresholds and the weak Ramanujan property
Decelle et al.\cite{Decelle11} conjectured the existence of a sharp threshold
for community detection in sparse random graphs drawn from the stochastic block
model. Mossel et al.\cite{Mossel12} established the negative part of the
conjecture, proving impossibility of meaningful detection below the threshold.
However, the positive part of the conjecture has so far remained elusive. Here we
solve the positive part of the conjecture. We introduce a modified adjacency
matrix $B$ that counts self-avoiding paths of a given length $\ell$ between
pairs of nodes and prove that for logarithmic $\ell$, the leading eigenvectors
of this modified matrix provide non-trivial detection, thereby settling the
conjecture. A key step in the proof consists in establishing a {\em weak
Ramanujan property} of matrix $B$. Namely, the spectrum of $B$ consists of two
leading eigenvalues $\rho(B)$, $\lambda_2$ and $n-2$ eigenvalues of a lower
order $O(n^{\epsilon}\sqrt{\rho(B)})$ for all $\epsilon>0$, $\rho(B)$ denoting
$B$'s spectral radius. $d$-regular graphs are Ramanujan when their second
eigenvalue verifies $|\lambda|\le 2\sqrt{d-1}$. Random $d$-regular graphs have
a second largest eigenvalue $\lambda$ of $2\sqrt{d-1}+o(1)$ (see
Friedman\cite{friedman08}), thus being {\em almost} Ramanujan.
Erd\H{o}s-R\'enyi graphs with average degree $d$ at least logarithmic
($d=\Omega(\log n)$) have a second eigenvalue of $O(\sqrt{d})$ (see Feige and
Ofek\cite{Feige05}), a slightly weaker version of the Ramanujan property.
However, this spectrum separation property fails for sparse ($d=O(1)$)
Erd\H{o}s-R\'enyi graphs. Our result thus shows that by constructing matrix $B$
through neighborhood expansion, we regularize the original adjacency matrix to
eventually recover a weak form of the Ramanujan property.
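To make the modified matrix described above concrete, the following minimal sketch counts self-avoiding paths of a fixed length between every pair of nodes by brute-force depth-first search. The function name `self_avoiding_path_matrix`, the adjacency-dictionary input and the 4-cycle example are assumptions made for illustration; the search is exponential in the path length and is not the construction or proof technique of the paper.

```python
from typing import Dict, List, Set


def self_avoiding_path_matrix(adj: Dict[int, Set[int]], length: int) -> List[List[int]]:
    """Return M where M[i][j] counts paths with `length` edges from node i
    to node j that never visit the same node twice (hypothetical helper)."""
    nodes = sorted(adj)
    index = {v: k for k, v in enumerate(nodes)}
    n = len(nodes)
    M = [[0] * n for _ in range(n)]

    def extend(start: int, current: int, visited: Set[int], remaining: int) -> None:
        # A path is complete once it has used the requested number of edges.
        if remaining == 0:
            M[index[start]][index[current]] += 1
            return
        for nxt in adj[current]:
            if nxt not in visited:  # self-avoiding: skip nodes already on the path
                visited.add(nxt)
                extend(start, nxt, visited, remaining - 1)
                visited.remove(nxt)

    for v in nodes:
        extend(v, v, {v}, length)
    return M


# Toy example (assumed): the 4-cycle 0-1-2-3-0.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
B1 = self_avoiding_path_matrix(cycle, 1)  # coincides with the adjacency matrix
B2 = self_avoiding_path_matrix(cycle, 2)  # two routes between opposite corners
```

On the 4-cycle, `B2[0][2]` is 2 (the routes 0-1-2 and 0-3-2) while `B2[0][0]` is 0. The square of the adjacency matrix would instead put 2 on the diagonal, because it also counts the backtracking walks 0-1-0 and 0-3-0.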