
    Community detection thresholds and the weak Ramanujan property

    Decelle et al.\cite{Decelle11} conjectured the existence of a sharp threshold for community detection in sparse random graphs drawn from the stochastic block model. Mossel et al.\cite{Mossel12} established the negative part of the conjecture, proving that meaningful detection is impossible below the threshold. The positive part of the conjecture, however, has remained open. Here we settle it. We introduce a modified adjacency matrix $B$ that counts self-avoiding paths of a given length $\ell$ between pairs of nodes, and prove that for logarithmic $\ell$ the leading eigenvectors of this modified matrix provide non-trivial detection, thereby settling the conjecture. A key step in the proof is establishing a {\em weak Ramanujan property} of the matrix $B$. Namely, the spectrum of $B$ consists of two leading eigenvalues $\rho(B)$, $\lambda_2$ and $n-2$ eigenvalues of lower order $O(n^{\epsilon}\sqrt{\rho(B)})$ for all $\epsilon>0$, where $\rho(B)$ denotes the spectral radius of $B$. $d$-regular graphs are Ramanujan when their second eigenvalue satisfies $|\lambda|\le 2\sqrt{d-1}$. Random $d$-regular graphs have a second largest eigenvalue $\lambda$ of $2\sqrt{d-1}+o(1)$ (see Friedman\cite{friedman08}), and are thus {\em almost} Ramanujan. Erd\H{o}s-R\'enyi graphs with average degree $d$ at least logarithmic ($d=\Omega(\log n)$) have a second eigenvalue of $O(\sqrt{d})$ (see Feige and Ofek\cite{Feige05}), a slightly weaker version of the Ramanujan property. However, this spectrum separation property fails for sparse ($d=O(1)$) Erd\H{o}s-R\'enyi graphs. Our result thus shows that by constructing the matrix $B$ through neighborhood expansion, we regularize the original adjacency matrix and eventually recover a weak form of the Ramanujan property.
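
As an illustration of the matrix described in this abstract, the following minimal Python sketch counts self-avoiding paths of a fixed length between every ordered pair of nodes by exhaustive depth-first search. The function name `self_avoiding_path_matrix` and the edge-list input format are assumptions made for this example, not the authors' construction, and the brute-force enumeration is only practical for small graphs.

```python
def self_avoiding_path_matrix(n, edges, length):
    """Return an n x n matrix whose (s, t) entry counts the self-avoiding
    paths with exactly `length` edges from node s to node t.

    Illustrative helper (hypothetical name); exhaustive search, small graphs only.
    """
    neighbours = [[] for _ in range(n)]
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    counts = [[0] * n for _ in range(n)]

    def extend(start, current, visited, remaining):
        # A path is recorded once it has used exactly `length` edges.
        if remaining == 0:
            counts[start][current] += 1
            return
        for nxt in neighbours[current]:
            if nxt not in visited:  # self-avoiding: never revisit a node
                visited.add(nxt)
                extend(start, nxt, visited, remaining - 1)
                visited.remove(nxt)

    for s in range(n):
        extend(s, s, {s}, length)
    return counts
```

With `length=1` the result is the ordinary adjacency matrix; larger lengths expand each node's neighborhood, which is the regularizing effect the abstract refers to.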

    Projected Power Iteration for Network Alignment

    The network alignment problem asks for the best correspondence between two given graphs, so that the largest possible number of edges are matched. This problem arises in many scientific applications (such as the study of protein-protein interactions) and is closely related to the quadratic assignment problem, which has the graph isomorphism, traveling salesman and minimum bisection problems as particular cases. The graph matching problem is NP-hard in general. However, under some restrictive models for the graphs, algorithms can approximate the alignment efficiently. In that spirit, the recent work by Feizi and collaborators introduces EigenAlign, a fast spectral method with convergence guarantees for Erd\H{o}s-R\'enyi graphs. In this work we propose the algorithm Projected Power Alignment, which is a projected power iteration version of EigenAlign. We show numerically that it improves the recovery rates of EigenAlign, and we describe the theory that may be used to provide performance guarantees for Projected Power Alignment.
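
The general idea of a projected power iteration for alignment can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the algorithm of the paper: the power step `X <- A X B` scores only matched edges (the EigenAlign alignment matrix also weights mismatches), the projection onto permutations is done by brute force over all permutations instead of an assignment solver, and all function names are hypothetical.

```python
from itertools import permutations


def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
    return A


def matmul(X, Y):
    """Plain-Python matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def project_to_permutation(X):
    """Brute-force projection: the permutation pi maximising sum_i X[i][pi[i]].
    (Assumption: stands in for an assignment solver; small n only.)"""
    n = len(X)
    return max(permutations(range(n)),
               key=lambda pi: sum(X[i][pi[i]] for i in range(n)))


def projected_power_alignment(A, B, iterations=10):
    """Alternate a power step X <- A X B with a projection onto permutations."""
    n = len(A)
    X = [[1] * n for _ in range(n)]  # uninformative starting point
    pi = tuple(range(n))
    for _ in range(iterations):
        X = matmul(matmul(A, X), B)          # power step
        pi = project_to_permutation(X)       # projection step
        X = [[1 if pi[i] == j else 0 for j in range(n)] for i in range(n)]
    return pi


def matched_edges(A, B, pi):
    """Number of edges of A that pi maps onto edges of B."""
    n = len(A)
    return sum(A[i][j] * B[pi[i]][pi[j]]
               for i in range(n) for j in range(i + 1, n))
```

A call such as `projected_power_alignment(adjacency(4, edges_a), adjacency(4, edges_b))` returns a tuple `pi` where node `i` of the first graph is aligned with node `pi[i]` of the second; `matched_edges` then reports the quality of that alignment.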

    Non-Backtracking Spectrum of Degree-Corrected Stochastic Block Models

    Motivated by community detection, we characterise the spectrum of the non-backtracking matrix $B$ in the Degree-Corrected Stochastic Block Model. Specifically, we consider a random graph on $n$ vertices partitioned into two equal-sized clusters. The vertices have i.i.d. weights $\{ \phi_u \}_{u=1}^n$ with second moment $\Phi^{(2)}$. The intra-cluster connection probability for vertices $u$ and $v$ is $\frac{\phi_u \phi_v}{n}a$ and the inter-cluster connection probability is $\frac{\phi_u \phi_v}{n}b$. We show that with high probability, the following holds: The leading eigenvalue of the non-backtracking matrix $B$ is asymptotic to $\rho = \frac{a+b}{2} \Phi^{(2)}$. The second eigenvalue is asymptotic to $\mu_2 = \frac{a-b}{2} \Phi^{(2)}$ when $\mu_2^2 > \rho$, but asymptotically bounded by $\sqrt{\rho}$ when $\mu_2^2 \leq \rho$. All the remaining eigenvalues are asymptotically bounded by $\sqrt{\rho}$. As a result, a clustering positively correlated with the true communities can be obtained based on the second eigenvector of $B$ in the regime where $\mu_2^2 > \rho$. In a previous work we obtained that detection is impossible when $\mu_2^2 < \rho$, meaning that a phase transition occurs in the sparse regime of the Degree-Corrected Stochastic Block Model. As a corollary, we obtain that Degree-Corrected Erd\H{o}s-R\'enyi graphs asymptotically satisfy the graph Riemann hypothesis, a quasi-Ramanujan property. A by-product of our proof is a weak law of large numbers for local functionals on Degree-Corrected Stochastic Block Models, which could be of independent interest.
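
The quantities in this abstract can be made concrete with a short Python sketch. It builds the non-backtracking matrix of a small undirected graph from its standard definition (directed edge $(u,v)$ is followed by $(v,w)$ whenever $w \neq u$) and evaluates the stated eigenvalue predictions and the detectability condition. The function names and the edge-list input are assumptions made for illustration; nothing here reproduces the paper's proof technique.

```python
def nonbacktracking_matrix(edges):
    """Return (directed_edges, B) for an undirected simple graph.

    B[i][j] = 1 when directed edge i = (u, v) can be followed by
    directed edge j = (v, w) with w != u (no immediate backtracking).
    """
    directed = []
    for u, v in edges:
        directed.append((u, v))
        directed.append((v, u))
    size = len(directed)
    B = [[0] * size for _ in range(size)]
    for i, (u, v) in enumerate(directed):
        for j, (x, w) in enumerate(directed):
            if x == v and w != u:
                B[i][j] = 1
    return directed, B


def predicted_eigenvalues(a, b, phi2):
    """Asymptotic predictions from the abstract: rho and mu_2,
    where phi2 is the second moment of the vertex weights."""
    rho = (a + b) / 2 * phi2
    mu2 = (a - b) / 2 * phi2
    return rho, mu2


def detectable(a, b, phi2):
    """Detectability condition from the abstract: mu_2^2 > rho."""
    rho, mu2 = predicted_eigenvalues(a, b, phi2)
    return mu2 ** 2 > rho
```

For unit weights (`phi2 = 1.0`) the condition reduces to $(a-b)^2 > 2(a+b)$, so for example `detectable(5, 1, 1.0)` holds while `detectable(3, 1, 1.0)` does not.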

    Inference on graphs via semidefinite programming

    Inference problems on graphs arise naturally when trying to make sense of network data. Oftentimes, these problems are formulated as intractable optimization programs. This creates a need for fast heuristics that find adequate solutions, and for the study of their performance. For a certain class of problems, Javanmard et al. (1) successfully use tools from statistical physics to analyze the performance of semidefinite programming relaxations, an important heuristic for intractable problems. National Science Foundation (U.S.) (Grant DMS-1317308)

    Community detection and stochastic block models: recent developments

    The stochastic block model (SBM) is a random graph model with planted clusters. It is widely employed as a canonical model to study clustering and community detection, and more generally provides fertile ground for studying the statistical and computational tradeoffs that arise in network and data sciences. This note surveys the recent developments that establish the fundamental limits for community detection in the SBM, both with respect to information-theoretic and computational thresholds, and for various recovery requirements such as exact, partial and weak recovery (a.k.a. detection). The main results discussed are the phase transition for exact recovery at the Chernoff-Hellinger threshold, the phase transition for weak recovery at the Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial recovery, the learning of the SBM parameters and the gap between information-theoretic and computational thresholds. The note also covers some of the algorithms developed in the quest to achieve these limits, in particular two-round algorithms via graph-splitting, semidefinite programming, linearized belief propagation, and classical and nonbacktracking spectral methods. A few open problems are also discussed.