34 research outputs found

The minimum bisection in the planted bisection model

Author: Coja-Oghlan Amin
Cooley Oliver
Kang Mihyun
Skubch Kathrin
Publication date: 01/01/2015

In the planted bisection model a random graph $G(n,p_+,p_-)$ with $n$ vertices is created by partitioning the vertices randomly into two classes of equal size (up to $\pm 1$). Any two vertices that belong to the same class are linked by an edge with probability $p_+$ and any two that belong to different classes with probability $p_- < p_+$ independently. The planted bisection model has been used extensively to benchmark graph partitioning algorithms. If $p_{\pm} = 2d_{\pm}/n$ for numbers $0 \leq d_- < d_+$ that remain fixed as $n \to \infty$, then w.h.p. the "planted" bisection (the one used to construct the graph) will not be a minimum bisection. In this paper we derive an asymptotic formula for the minimum bisection width under the assumption that $d_+ - d_- > c\sqrt{d_+ \ln d_+}$ for a certain constant $c > 0$.
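The construction described in this abstract can be sketched in a few lines of Python. This is only an illustrative sampler, not code from the paper: the function names `sample_planted_bisection` and `cut_width` are hypothetical, and the quadratic loop over vertex pairs is chosen for clarity rather than speed.

```python
import random


def sample_planted_bisection(n, p_plus, p_minus, seed=None):
    """Sample a graph from the planted bisection model (illustrative sketch).

    Returns (side, edges): side[v] is 0 or 1, the planted class of vertex v;
    edges is a list of pairs (u, v) with u < v.
    """
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    # The first n // 2 shuffled vertices form class 0 and the rest class 1,
    # so the two class sizes differ by at most one.
    side = [0] * n
    for v in order[n // 2:]:
        side[v] = 1
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            # Same class: probability p_plus; different classes: p_minus.
            p = p_plus if side[u] == side[v] else p_minus
            if rng.random() < p:
                edges.append((u, v))
    return side, edges


def cut_width(side, edges):
    """Number of edges whose endpoints lie in different classes."""
    return sum(1 for u, v in edges if side[u] != side[v])
```

For the sparse regime mentioned in the abstract, one would call it with `p_plus = 2 * d_plus / n` and `p_minus = 2 * d_minus / n` for fixed `d_plus > d_minus >= 0`; `cut_width(side, edges)` then gives the width of the planted bisection, which the abstract says is w.h.p. not the minimum.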

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

The Geometric Block Model

Author: Galhotra Sainyam
Mazumdar Arya
Pal Soumyabrata
Saha Barna
Publication date: 24/01/2018

To capture the inherent geometric features of many community detection problems, we propose to use a new random graph model of communities that we call a Geometric Block Model. The geometric block model generalizes the random geometric graphs in the same way that the well-studied stochastic block model generalizes the Erdős-Rényi random graphs. It is also a natural extension of random community models inspired by the recent theoretical and practical advancement in community detection. While the model is of fundamental theoretical interest, our main contribution is to show that many practical community structures are better explained by the geometric block model. We also show that a simple triangle-counting algorithm to detect communities in the geometric block model is near-optimal. Indeed, even in the regime where the average degree of the graph grows only logarithmically with the number of vertices (sparse graph), we show that this algorithm performs extremely well, both theoretically and practically. In contrast, the triangle-counting algorithm is far from optimal for the stochastic block model. We simulate our results on both real and synthetic datasets to show superior performance of both the new model and our algorithm. Comment: A shorter version of this paper has appeared in 32nd AAAI Conference on Artificial Intelligence. The AAAI proceedings version as well as the previous version in arXiv contained some errors that have been corrected in this versio
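The abstract does not spell out the triangle-counting algorithm, so the following is only a sketch of its basic ingredient under that caveat: for every edge, count the triangles it lies in, which equals the number of common neighbors of its endpoints. The function name `triangles_per_edge` is hypothetical, and how the counts are turned into community labels is left open here because the abstract does not say.

```python
from collections import defaultdict


def triangles_per_edge(edges):
    """Map each edge (u, v) to the number of triangles containing it."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    # A triangle through (u, v) corresponds to a common neighbor of u and v.
    return {(u, v): len(adjacency[u] & adjacency[v]) for u, v in edges}
```

Intuitively, in a geometric model two nearby vertices tend to share many neighbors, so these counts carry information about community membership.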

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Consistency Thresholds for the Planted Bisection Model

Author: Mossel Elchanan
Neeman Joe
Sly Allan
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 25/11/2019

The planted bisection model is a random graph model in which the nodes are divided into two equal-sized communities and then edges are added randomly in a way that depends on the community membership. We establish necessary and sufficient conditions for the asymptotic recoverability of the planted bisection in this model. When the bisection is asymptotically recoverable, we give an efficient algorithm that successfully recovers it. We also show that the planted bisection is recoverable asymptotically if and only if with high probability every node belongs to the same community as the majority of its neighbors. Our algorithm for finding the planted bisection runs in time almost linear in the number of edges. It has three stages: spectral clustering to compute an initial guess, a "replica" stage to get almost every vertex correct, and then some simple local moves to finish the job. An independent work by Abbe, Bandeira, and Hall establishes similar (slightly weaker) results but only in the case of logarithmic average degree. Comment: latest version contains an erratum, addressing an error pointed out by Jan van Waai
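The majority condition in this abstract is easy to check on a concrete graph. The sketch below lists the nodes that violate it; the function name `minority_nodes` is hypothetical, and treating ties (including nodes with no neighbors) as violations is an assumption, since the abstract does not say how ties are handled.

```python
from collections import defaultdict


def minority_nodes(labels, edges):
    """Return nodes that do not share a community with a strict majority
    of their neighbors. labels[v] is the community of node v."""
    same = defaultdict(int)   # neighbors in the node's own community
    other = defaultdict(int)  # neighbors in the other community
    for u, v in edges:
        counter = same if labels[u] == labels[v] else other
        counter[u] += 1
        counter[v] += 1
    # Assumption: a tie (or an isolated node) counts as a violation.
    return [v for v in range(len(labels)) if same[v] <= other[v]]
```

An empty result means every node agrees with the majority of its neighbors, which is the event the abstract ties to asymptotic recoverability.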

arXiv.org e-Print Archive

The Australian National University

Stochastic Block Model and Community Detection in the Sparse Graphs: A spectral algorithm with optimal rate of recovery

Author: Chin Peter
Rao Anup
Vu Van
Publication date: 01/01/2015

In this paper, we present and analyze a simple and robust spectral algorithm for the stochastic block model with $k$ blocks, for any fixed $k$. Our algorithm works with graphs having constant edge density, under an optimal condition on the gap between the density inside a block and the density between the blocks. As a by-product, we settle an open question posed by Abbe et al. concerning censored block models.

arXiv.org e-Print Archive

CiteSeerX

Recovery, detection and confidence sets of communities in a sparse stochastic block model

Author: Kleijn B. J. K.
van Waaij J.
Publication date: 22/10/2018

Posterior distributions for community assignment in the planted bisection model are shown to achieve frequentist exact recovery and detection under sharp lower bounds on sparsity. Assuming posterior recovery (or detection), one may interpret credible sets (or enlarged credible sets) as consistent confidence sets. If credible levels grow to one quickly enough, credible sets can be interpreted as frequentist confidence sets without conditions on the parameters. In the regime where within-class and between-class edge probabilities are very close, credible sets may be enlarged to achieve frequentist asymptotic coverage. The diameters of credible sets are controlled and match rates of posterior convergence. Comment: 22 pp., 2 fi

arXiv.org e-Print Archive

Online Research Database In Technology

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Global and Local Information in Clustering Labeled Block Models

Author: Kanade Varun
Mossel Elchanan
Schramm Tselil
Publication date: 01/01/2014

The stochastic block model is a classical cluster-exhibiting random graph model that has been widely studied in statistics, physics and computer science. In its simplest form, the model is a random graph with two equal-sized clusters, with intra-cluster edge probability p and inter-cluster edge probability q. We focus on the sparse case, i.e., p, q = O(1/n), which is practically more relevant and also mathematically more challenging. A conjecture of Decelle, Krzakala, Moore and Zdeborová, based on ideas from statistical physics, predicted a specific threshold for clustering. The negative direction of the conjecture was proved by Mossel, Neeman and Sly (2012), and more recently the positive direction was proved independently by Massoulié and by Mossel, Neeman and Sly. In many real network clustering problems, nodes contain information as well. We study the interplay between node and network information in clustering by studying a labeled block model, where in addition to the edge information, the true cluster labels of a small fraction of the nodes are revealed. In the case of two clusters, we show that below the threshold, a small amount of node information does not affect recovery. On the other hand, we show that for any small amount of information, efficient local clustering is achievable as long as the number of clusters is sufficiently large (as a function of the amount of revealed information). Comment: 24 pages, 2 figures. A short abstract describing these results will appear in proceedings of RANDOM 201
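Two ingredients of this abstract can be illustrated with a short sketch. The abstract does not state the threshold; the form used below, $(a-b)^2 > 2(a+b)$ with $p = a/n$ and $q = b/n$, is the commonly cited statement of the two-cluster conjecture and is supplied here as background. Revealing each label independently with a fixed probability is an assumption about how "a small fraction" is chosen, and both function names are hypothetical.

```python
import random


def above_threshold(a, b):
    """Two-cluster sparse case with p = a/n and q = b/n: clustering better
    than chance is possible exactly when (a - b)^2 > 2 (a + b)."""
    return (a - b) ** 2 > 2 * (a + b)


def reveal_labels(labels, fraction, seed=None):
    """Reveal the true label of each node independently with probability
    `fraction` (an assumed revelation mechanism). Returns {node: label}."""
    rng = random.Random(seed)
    return {v: lab for v, lab in enumerate(labels) if rng.random() < fraction}
```

For example, `above_threshold(5, 1)` holds because 16 > 12, while `above_threshold(3, 1)` fails because 4 < 8; the second case is the regime where, per the abstract, a small amount of revealed node information does not help with two clusters.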

arXiv.org e-Print Archive

CiteSeerX

Dagstuhl Research Online Publication Server