
    Clustering from Sparse Pairwise Measurements

    We consider the problem of grouping items into clusters based on few random pairwise comparisons between the items. We introduce three closely related algorithms for this task: a belief propagation algorithm approximating the Bayes optimal solution, and two spectral algorithms based on the non-backtracking and Bethe Hessian operators. For the case of two symmetric clusters, we conjecture that these algorithms are asymptotically optimal in that they detect the clusters as soon as it is information-theoretically possible to do so. We substantiate this claim for one of the spectral approaches we introduce.
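
As a rough illustration of the Bethe Hessian approach mentioned in the abstract above, here is a minimal sketch on a plain unweighted graph. The abstract does not define the operator used for pairwise measurements, so this assumes the standard unweighted form H(r) = (r^2 - 1) I - r A + D with r set to the square root of the average degree. The function names, the two-block test graph, and its parameters are all hypothetical choices made for this example.

```python
import numpy as np


def two_block_sbm(n, c_in, c_out, rng):
    """Sample a symmetric two-community graph (n must be even).

    Nodes in the same community are linked with probability c_in / n,
    nodes in different communities with probability c_out / n.
    """
    labels = np.repeat([1, -1], n // 2)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, c_in / n, c_out / n)
    # Draw each unordered pair once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(float)
    return adj, labels


def bethe_hessian_partition(adj):
    """Split nodes in two using the assumed unweighted Bethe Hessian."""
    degrees = adj.sum(axis=1)
    r = np.sqrt(degrees.mean())  # assumed choice: r = sqrt(average degree)
    h = (r**2 - 1) * np.eye(len(adj)) + np.diag(degrees) - r * adj
    # eigh returns eigenvalues in ascending order; the eigenvector of the
    # second-smallest eigenvalue is used here as the community indicator.
    _, eigvecs = np.linalg.eigh(h)
    return np.where(eigvecs[:, 1] >= 0, 1, -1)


def agreement(pred, truth):
    """Fraction of correctly labeled nodes, up to a global label swap."""
    acc = np.mean(pred == truth)
    return max(acc, 1 - acc)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    adj, labels = two_block_sbm(600, 18.0, 2.0, rng)
    print(agreement(bethe_hessian_partition(adj), labels))
```

The test graph is deliberately far above the detection threshold so that the sign of a single eigenvector is enough; near the threshold the choice of r and the number of negative eigenvalues would need more care.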

    Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results

    The classical setting of community detection consists of networks exhibiting a clustered structure. To more accurately model real systems, we consider a class of networks (i) whose edges may carry labels and (ii) which may lack a clustered structure. Specifically, we assume that nodes possess latent attributes drawn from a general compact space, and that edges between two nodes are randomly generated and labeled according to some unknown distribution as a function of their latent attributes. Our goal is then to infer the edge label distributions from a partially observed network. We propose a computationally efficient spectral algorithm and show that it allows for asymptotically correct inference even when the average node degree is as low as logarithmic in the total number of nodes. Conversely, if the average node degree is below a specific constant threshold, we show that no algorithm can achieve better inference than guessing without using the observations. As a byproduct of our analysis, we show that our model provides a general procedure to construct random graph models with a spectrum asymptotic to a pre-specified eigenvalue distribution such as a power-law distribution. Comment: 17 pages

    Community detection thresholds and the weak Ramanujan property

    Decelle et al.\cite{Decelle11} conjectured the existence of a sharp threshold for community detection in sparse random graphs drawn from the stochastic block model. Mossel et al.\cite{Mossel12} established the negative part of the conjecture, proving impossibility of meaningful detection below the threshold. However the positive part of the conjecture remained elusive so far. Here we solve the positive part of the conjecture. We introduce a modified adjacency matrix $B$ that counts self-avoiding paths of a given length $\ell$ between pairs of nodes and prove that for logarithmic $\ell$, the leading eigenvectors of this modified matrix provide non-trivial detection, thereby settling the conjecture. A key step in the proof consists in establishing a {\em weak Ramanujan property} of matrix $B$. Namely, the spectrum of $B$ consists in two leading eigenvalues $\rho(B)$, $\lambda_2$ and $n-2$ eigenvalues of a lower order $O(n^{\epsilon}\sqrt{\rho(B)})$ for all $\epsilon>0$, $\rho(B)$ denoting $B$'s spectral radius. $d$-regular graphs are Ramanujan when their second eigenvalue verifies $|\lambda|\le 2\sqrt{d-1}$. Random $d$-regular graphs have a second largest eigenvalue $\lambda$ of $2\sqrt{d-1}+o(1)$ (see Friedman\cite{friedman08}), thus being {\em almost} Ramanujan. Erd\H{o}s-R\'enyi graphs with average degree $d$ at least logarithmic ($d=\Omega(\log n)$) have a second eigenvalue of $O(\sqrt{d})$ (see Feige and Ofek\cite{Feige05}), a slightly weaker version of the Ramanujan property. However this spectrum separation property fails for sparse ($d=O(1)$) Erd\H{o}s-R\'enyi graphs. Our result thus shows that by constructing matrix $B$ through neighborhood expansion, we regularize the original adjacency matrix to eventually recover a weak form of the Ramanujan property.
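
The Ramanujan condition for $d$-regular graphs quoted in the abstract above can be checked numerically on small examples. The following is a minimal sketch, not part of the paper: it reads "second eigenvalue" as the second-largest eigenvalue in absolute value, which is an assumption (under this reading a bipartite graph, whose spectrum contains $-d$, fails the check). The function names and the two test graphs are hypothetical choices for illustration.

```python
import numpy as np


def second_eigenvalue_modulus(adj):
    """Second-largest absolute value among the adjacency eigenvalues."""
    eigvals = np.linalg.eigvalsh(adj)
    return np.sort(np.abs(eigvals))[-2]


def is_ramanujan(adj, tol=1e-9):
    """Check |lambda| <= 2 * sqrt(d - 1) for a d-regular graph.

    Assumes adj is a symmetric 0/1 adjacency matrix; raises if the
    graph is not regular.
    """
    degrees = adj.sum(axis=1)
    d = degrees[0]
    if not np.all(degrees == d):
        raise ValueError("graph is not regular")
    return second_eigenvalue_modulus(adj) <= 2 * np.sqrt(d - 1) + tol


def petersen_graph():
    """Adjacency matrix of the Petersen graph (3-regular, 10 nodes)."""
    adj = np.zeros((10, 10))
    for i in range(5):
        edges = [
            (i, (i + 1) % 5),          # outer 5-cycle
            (i, i + 5),                # spokes
            (5 + i, 5 + (i + 2) % 5),  # inner pentagram
        ]
        for u, v in edges:
            adj[u, v] = adj[v, u] = 1.0
    return adj


def two_disjoint_k4():
    """Two disconnected copies of K4 (3-regular, 8 nodes)."""
    k4 = np.ones((4, 4)) - np.eye(4)
    zeros = np.zeros((4, 4))
    return np.block([[k4, zeros], [zeros, k4]])


if __name__ == "__main__":
    print(is_ramanujan(petersen_graph()))
    print(is_ramanujan(two_disjoint_k4()))
```

The Petersen graph has adjacency eigenvalues 3, 1 and -2, so it satisfies the bound $2\sqrt{2}$, while the disconnected pair of $K_4$ graphs has the eigenvalue 3 twice and violates it.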