    Community Detection by $L_0$-penalized Graph Laplacian

    Community detection in network analysis aims at partitioning the nodes of a network into $K$ disjoint communities. Most currently available algorithms assume that $K$ is known, but choosing a correct $K$ is generally very difficult for real networks. In addition, many real networks contain outlier nodes that do not belong to any community, yet very few algorithms can handle networks with outliers. In this paper, we propose a novel model-free tightness criterion and an efficient algorithm that maximizes this criterion for community detection. The tightness criterion is closely related to the graph Laplacian with an $L_0$ penalty. Unlike most community detection methods, our method does not require a known $K$ and can properly detect communities in networks with outliers. Both theoretical and numerical properties of the method are analyzed. The theoretical result guarantees that, under the degree-corrected stochastic block model, the maximizer of the tightness criterion extracts communities with small misclassification rates, even for networks with outliers and even when the number of communities grows to infinity as the network size grows. A simulation study shows that the proposed method recovers the true communities more accurately than other methods. Applications to a college football dataset and a yeast protein-protein interaction dataset also show that the proposed method performs significantly better.
    Comment: 40 pages, 15 Postscript figures

    A spectral method for community detection in moderately-sparse degree-corrected stochastic block models

    We consider community detection in Degree-Corrected Stochastic Block Models (DC-SBM). We propose a spectral clustering algorithm based on a suitably normalized adjacency matrix. We show that this algorithm consistently recovers the block membership of all but a vanishing fraction of nodes in the regime where the lowest degree is of order $\log(n)$ or higher. Recovery succeeds even for very heterogeneous degree distributions. The algorithm requires no input parameters; in particular, it does not need to know the number of communities.
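
To make the idea of spectral clustering on a normalized adjacency matrix concrete, here is a minimal Python sketch. It is not the algorithm from the abstract above: it assumes exactly two communities, uses the standard symmetric normalization $D^{-1/2} A D^{-1/2}$, and splits nodes by the sign of the second eigenvector, whereas the paper's method needs no knowledge of the number of communities. The function names `sample_dcsbm` and `spectral_bipartition`, and all parameter values, are illustrative assumptions.

```python
import numpy as np


def sample_dcsbm(labels, theta, B, rng):
    """Sample a symmetric 0/1 adjacency matrix without self-loops.

    Edge (i, j) appears with probability theta[i] * theta[j] * B[g_i, g_j],
    the usual degree-corrected stochastic block model parameterization.
    """
    P = np.outer(theta, theta) * B[np.ix_(labels, labels)]
    upper = np.triu(rng.random(P.shape) < P, k=1)  # keep strict upper triangle
    return (upper | upper.T).astype(float)


def spectral_bipartition(A):
    """Split nodes into two groups via the normalized adjacency matrix.

    Assumption: two assortative communities, so the eigenvector of the
    second-largest eigenvalue of D^{-1/2} A D^{-1/2} changes sign across them.
    """
    deg = np.maximum(A.sum(axis=1), 1.0)  # guard against isolated nodes
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    M = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(M)  # eigenvalues returned in ascending order
    second = vecs[:, -2]
    return (second > 0).astype(int)


# Illustrative parameters (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
n = 200
labels = np.repeat([0, 1], n // 2)
theta = rng.uniform(0.5, 1.5, size=n)  # heterogeneous degree weights
B = np.array([[0.30, 0.05],
              [0.05, 0.30]])

A = sample_dcsbm(labels, theta, B, rng)
pred = spectral_bipartition(A)

# Community labels are only identified up to swapping 0 and 1.
agreement = np.mean(pred == labels)
accuracy = max(agreement, 1.0 - agreement)
print(f"fraction of nodes grouped correctly: {accuracy:.3f}")
```

The degree normalization is what keeps high-degree nodes from dominating the leading eigenvectors; with more than two communities one would typically keep several eigenvectors and cluster their rows instead of using a sign split.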