Graph Matching in Correlated Stochastic Block Models for Improved Graph Clustering
We consider community detection from multiple correlated graphs sharing the
same community structure. The correlated graphs are generated by independent
subsampling of a parent graph sampled from the stochastic block model. The
vertex correspondence between the correlated graphs is assumed to be unknown.
We consider the two-step procedure where the vertex correspondence between the
correlated graphs is first revealed, and the communities are recovered from the
union of the correlated graphs, which becomes denser than each single graph. We
derive the information-theoretic limits for exact graph matching for general
density regimes and numbers of communities, and then analyze the regime of
graph parameters in which matching the correlated graphs helps recover the
latent community structure of the graphs.
Comment: Allerton Conference 202
Exact Clustering of Weighted Graphs via Semidefinite Programming
As a model problem for clustering, we consider the densest k-disjoint-clique
problem of partitioning a weighted complete graph into k disjoint subgraphs
such that the sum of the densities of these subgraphs is maximized. We
establish that such subgraphs can be recovered from the solution of a
particular semidefinite relaxation with high probability if the input graph is
sampled from a distribution of clusterable graphs. Specifically, the
semidefinite relaxation is exact if the graph consists of k large disjoint
subgraphs, corresponding to clusters, with weight concentrated within these
subgraphs, plus a moderate number of outliers. Further, we establish that if
noise is weakly obscuring these clusters, i.e., the between-cluster edges are
assigned very small weights, then we can recover significantly smaller
clusters. For example, we show that in approximately sparse graphs, where the
between-cluster weights tend to zero as the size n of the graph tends to
infinity, we can recover clusters of size polylogarithmic in n. Empirical
evidence from numerical simulations is also provided to support these
theoretical phase transitions to perfect recovery of the cluster structure.
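The objective in this abstract, the sum of the densities of k disjoint subgraphs, can be made concrete with a small sketch. It assumes that the density of a subgraph is its total internal edge weight divided by its number of vertices; the abstract does not state the definition, so treat that as an assumption. The helper names `density` and `objective` and the toy weights are hypothetical. The code only evaluates the objective for a given partition and does not solve the semidefinite relaxation.

```python
def density(weights, cluster):
    """Total within-cluster edge weight divided by cluster size (assumed definition)."""
    total = sum(
        weights[u][v]
        for i, u in enumerate(cluster)
        for v in cluster[i + 1:]
    )
    return total / len(cluster)

def objective(weights, clusters):
    """Sum of the densities of the given disjoint clusters."""
    return sum(density(weights, c) for c in clusters)

# Toy weighted complete graph on 4 vertices: heavy weight inside {0, 1} and {2, 3},
# light weight between them.
w = [
    [0.0, 1.0, 0.1, 0.1],
    [1.0, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 1.0],
    [0.1, 0.1, 1.0, 0.0],
]

good = objective(w, [[0, 1], [2, 3]])  # clusters follow the heavy edges
bad = objective(w, [[0, 2], [1, 3]])   # clusters cut across them
print(good, bad)
```

On this toy input, the partition that follows the heavy edges scores higher than the one that cuts across them, which is the behavior the objective is meant to reward.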
A semidefinite program for unbalanced multisection in the stochastic block model
We propose a semidefinite programming (SDP) algorithm for community detection
in the stochastic block model, a popular model for networks with latent
community structure. We prove that our algorithm achieves exact recovery of the
latent communities, up to the information-theoretic limits determined by Abbe
and Sandon (2015). Our result extends prior SDP approaches by allowing for many
communities of different sizes. By virtue of a semidefinite approach, our
algorithms succeed against a semirandom variant of the stochastic block model,
guaranteeing a form of robustness and generalization. We further explore how
semirandom models can lend insight into both the strengths and limitations of
SDPs in this setting.
Comment: 29 page