
    Spectral redemption: clustering sparse networks

    Spectral algorithms are classic approaches to clustering and community detection in networks. However, for sparse networks the standard versions of these algorithms are suboptimal, in some cases completely failing to detect communities even when other algorithms such as belief propagation can do so. Here we introduce a new class of spectral algorithms based on a non-backtracking walk on the directed edges of the graph. The spectrum of this operator is much better-behaved than that of the adjacency matrix or other commonly used matrices, maintaining a strong separation between the bulk eigenvalues and the eigenvalues relevant to community structure even in the sparse case. We show that our algorithm is optimal for graphs generated by the stochastic block model, detecting communities all the way down to the theoretical limit. We also show the spectrum of the non-backtracking operator for some real-world networks, illustrating its advantages over traditional spectral clustering.
    Comment: 11 pages, 6 figures. Clarified to what extent our claims are rigorous, and to what extent they are conjectures; also added an interpretation of the eigenvectors of the 2n-dimensional version of the non-backtracking matrix
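
The non-backtracking walk described in this abstract can be illustrated with a small sketch. It is not the authors' code: the function names, the dense NumPy construction, the choice of the eigenvalue with the second-largest real part, and the rule of summing eigenvector entries over incoming edges are all assumptions made for illustration, and the input is assumed to be a simple undirected graph given as a list of vertex pairs.

```python
import numpy as np

def non_backtracking_matrix(edges):
    """Dense non-backtracking matrix on the 2m directed edges of a simple graph.

    Entry [(u->v), (w->x)] is 1 when v == w and x != u, i.e. the walk may
    continue from u->v along v->x as long as it does not step straight back.
    """
    directed = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    size = len(directed)
    B = np.zeros((size, size))
    for i, (u, v) in enumerate(directed):
        for j, (w, x) in enumerate(directed):
            if v == w and x != u:
                B[i, j] = 1.0
    return B, directed

def two_group_labels(edges, n):
    """Hypothetical two-group labelling from one eigenvector of B.

    Assumption: use the eigenvector whose eigenvalue has the second-largest
    real part, sum its entries over the edges pointing into each vertex,
    and label each vertex by the sign of that sum.
    """
    B, directed = non_backtracking_matrix(edges)
    vals, vecs = np.linalg.eig(B)
    order = np.argsort(-vals.real)
    second = vecs[:, order[1]].real
    score = np.zeros(n)
    for i, (_, v) in enumerate(directed):
        score[v] += second[i]
    return np.where(score >= 0, 1, 0)

# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels = two_group_labels(edges, n=6)
```

The dense double loop is quadratic in the number of directed edges, so it only suits toy graphs; a sparse matrix would be the natural choice for the sparse networks the abstract is about.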

    Spectral Thresholds in the Bipartite Stochastic Block Model

    We consider a bipartite stochastic block model on vertex sets $V_1$ and $V_2$, with planted partitions in each, and ask at what densities efficient algorithms can recover the partition of the smaller vertex set. When $|V_2| \gg |V_1|$, multiple thresholds emerge. We first locate a sharp threshold for detection of the partition, in the sense of the results of \cite{mossel2012stochastic,mossel2013proof} and \cite{massoulie2014community} for the stochastic block model. We then show that at a higher edge density, the singular vectors of the rectangular biadjacency matrix exhibit a localization / delocalization phase transition, giving recovery above the threshold and no recovery below. Nevertheless, we propose a simple spectral algorithm, Diagonal Deletion SVD, which recovers the partition at a nearly optimal edge density. The bipartite stochastic block model studied here was used by \cite{feldman2014algorithm} to give a unified algorithm for recovering planted partitions and assignments in random hypergraphs and random $k$-SAT formulae respectively. Our results give the best known bounds for the clause density at which solutions can be found efficiently in these models as well as showing a barrier to further improvement via this reduction to the bipartite block model.
    Comment: updated version, will appear in COLT 201
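
The abstract names Diagonal Deletion SVD without spelling out its steps, so the following is only a guess at the general idea suggested by the name: form the Gram matrix of the biadjacency matrix, zero its diagonal, and read a partition of the smaller vertex set off an eigenvector. The function names, the use of the second-largest eigenvector, and the sign rounding are all assumptions, not details taken from the paper.

```python
import numpy as np

def diagonal_deleted_gram(M):
    """Return M @ M.T with its diagonal set to zero.

    M is assumed to be the |V_1| x |V_2| biadjacency matrix, so the result
    counts common neighbours between distinct vertices of V_1.
    """
    M = np.asarray(M, dtype=float)
    B = M @ M.T
    np.fill_diagonal(B, 0.0)
    return B

def diagonal_deletion_labels(M):
    """Hypothetical rounding step: sign of the second-largest eigenvector.

    Assumption: the top eigenvector mostly reflects overall degree, so the
    next one is used to split V_1 into two groups. Requires |V_1| >= 2.
    """
    B = diagonal_deleted_gram(M)
    vals, vecs = np.linalg.eigh(B)  # eigenvalues in ascending order
    second = vecs[:, -2]
    return np.where(second >= 0, 1, 0)

# Toy biadjacency matrix: 4 vertices in V_1, 6 in V_2.
M = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 1, 0, 1, 1, 0],
])
labels = diagonal_deletion_labels(M)
```

Zeroing the diagonal removes the per-vertex degree terms from the Gram matrix. That is one plausible reading of why such a step would help when $|V_2| \gg |V_1|$, where the abstract reports that the plain singular vectors can localize.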