10 research outputs found

    A semidefinite program for unbalanced multisection in the stochastic block model

    We propose a semidefinite programming (SDP) algorithm for community detection in the stochastic block model, a popular model for networks with latent community structure. We prove that our algorithm achieves exact recovery of the latent communities, up to the information-theoretic limits determined by Abbe and Sandon (2015). Our result extends prior SDP approaches by allowing for many communities of different sizes. By virtue of a semidefinite approach, our algorithms succeed against a semirandom variant of the stochastic block model, guaranteeing a form of robustness and generalization. We further explore how semirandom models can lend insight into both the strengths and limitations of SDPs in this setting. Comment: 29 pages

    Community detection and stochastic block models: recent developments

    The stochastic block model (SBM) is a random graph model with planted clusters. It is widely employed as a canonical model to study clustering and community detection, and generally provides fertile ground for studying the statistical and computational tradeoffs that arise in network and data sciences. This note surveys the recent developments that establish the fundamental limits for community detection in the SBM, with respect to both information-theoretic and computational thresholds, and for various recovery requirements such as exact, partial and weak recovery (a.k.a. detection). The main results discussed are the phase transition for exact recovery at the Chernoff-Hellinger threshold, the phase transition for weak recovery at the Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial recovery, the learning of the SBM parameters and the gap between information-theoretic and computational thresholds. The note also covers some of the algorithms developed in the quest to achieve these limits, in particular two-round algorithms via graph-splitting, semidefinite programming, linearized belief propagation, and classical and nonbacktracking spectral methods. A few open problems are also discussed.
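
As a rough illustration of the model and thresholds this abstract refers to, the sketch below samples a graph with planted clusters and checks the two best-known threshold conditions in the special case of two equal-sized communities. The function names are hypothetical, and the formulas are the standard two-community special cases (exact recovery with edge probabilities a*log(n)/n and b*log(n)/n; weak recovery with a/n and b/n), not the general Chernoff-Hellinger or Kesten-Stigum statements.

```python
import math
import random


def sample_sbm(sizes, p_in, p_out, seed=0):
    """Sample an undirected SBM graph (hypothetical helper).

    sizes: list of community sizes; p_in / p_out: edge probabilities
    within / between communities. Returns (labels, edges), where edges
    is a set of pairs (u, v) with u < v.
    """
    rng = random.Random(seed)
    labels = [c for c, size in enumerate(sizes) for _ in range(size)]
    n = len(labels)
    edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if labels[u] == labels[v] else p_out
            if rng.random() < p:
                edges.add((u, v))
    return labels, edges


def exact_recovery_possible(a, b):
    """Two equal communities, p = a*log(n)/n, q = b*log(n)/n.

    Assumed special case: exact recovery is possible when
    |sqrt(a) - sqrt(b)| > sqrt(2).
    """
    return abs(math.sqrt(a) - math.sqrt(b)) > math.sqrt(2)


def weak_recovery_possible(a, b):
    """Two equal communities, p = a/n, q = b/n.

    Assumed special case of the Kesten-Stigum condition:
    (a - b)^2 > 2 * (a + b).
    """
    return (a - b) ** 2 > 2 * (a + b)
```

For example, `sample_sbm([50, 50], 0.3, 0.05)` draws a 100-vertex graph with two planted clusters, and `exact_recovery_possible(9, 1)` evaluates the logarithmic-degree condition for a = 9, b = 1.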

    How robust are reconstruction thresholds for community detection?

    The stochastic block model is one of the oldest and most ubiquitous models for studying clustering and community detection. In an exciting sequence of developments, motivated by deep but non-rigorous ideas from statistical physics, Decelle et al. conjectured a sharp threshold for when community detection is possible in the sparse regime. Mossel, Neeman and Sly and Massoulié proved the conjecture and gave matching algorithms and lower bounds. Here we revisit the stochastic block model from the perspective of semirandom models where we allow an adversary to make 'helpful' changes that strengthen ties within each community and break ties between them. We show a surprising result that these 'helpful' changes can shift the information-theoretic threshold, making the community detection problem strictly harder. We complement this by showing that an algorithm based on semidefinite programming (which was known to get close to the threshold) continues to work in the semirandom model (even for partial recovery). This suggests that algorithms based on semidefinite programming are robust in ways that any algorithm meeting the information-theoretic threshold cannot be. These results point to an interesting new direction: Can we find robust, semirandom analogues to some of the classical, average-case thresholds in statistics? We also explore this question in the broadcast tree model, and we show that the viewpoint of semirandom models can help explain why some algorithms are preferred to others in practice, in spite of the gaps in their statistical performance on random models. National Science Foundation (U.S.), Faculty Early Career Development Program (Award CCF-1453261); Google Faculty Research Award; Nihon Denki Kabushiki Kaish
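
The 'helpful' changes described in this abstract can be sketched as a small operation on an edge set. This is a minimal illustration, not the paper's construction: the helper name and the representation (a label list plus a set of vertex pairs) are assumptions. The adversary may only add edges inside a community or delete edges between communities; any other change is rejected.

```python
def apply_helpful_changes(labels, edges, to_add=(), to_remove=()):
    """Apply semirandom ('helpful') changes to a graph (hypothetical helper).

    labels: community label per vertex.
    edges: set of pairs (u, v) with u < v.
    to_add: pairs to insert; each must join two vertices of the same community.
    to_remove: pairs to delete; each must join two different communities.
    Returns a new edge set and leaves the input untouched.
    """
    result = set(edges)
    for u, v in to_add:
        if u == v or labels[u] != labels[v]:
            raise ValueError(f"adding ({u}, {v}) is not a helpful change")
        result.add((min(u, v), max(u, v)))
    for u, v in to_remove:
        if labels[u] == labels[v]:
            raise ValueError(f"removing ({u}, {v}) is not a helpful change")
        result.discard((min(u, v), max(u, v)))
    return result
```

Restricting the adversary this way is what makes the abstract's result notable: every permitted change moves the graph closer to the planted partition, yet the abstract reports that such changes can still shift the information-theoretic threshold.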

    Learning Communities in the Presence of Errors

    No full text
    I will talk about learning hidden communities in the presence of modeling errors in the Stochastic Block Model. This model, which is also known as the Planted Partition Model, is widely used for community detection and graph partitioning in various fields, including machine learning, statistics, and social sciences. We consider graphs generated according to the Stochastic Block Model and then modified by an adversary. We allow two types of adversarial errors: Feige-Kilian (monotone) errors and edge outlier errors. In the talk, I will present a simple SDP-based algorithm for finding planted communities in sparse graphs from the Stochastic Block Model and prove that this algorithm is robust to adversarial errors. The talk is based on joint work with Yury Makarychev and Aravindan Vijayaraghavan (COLT 2016). Non UBC. Unreviewed. Author affiliation: Northwestern University. Facult