Overlapping stochastic block models with application to the French political blogosphere
Complex systems in nature and in society are often represented as networks,
describing the rich set of interactions between objects of interest. Many
deterministic and probabilistic clustering methods have been developed to
analyze such structures. Given a network, almost all of them partition the
vertices into disjoint clusters, according to their connection profile.
However, recent studies have shown that these techniques were too restrictive
and that most of the existing networks contained overlapping clusters. To
tackle this issue, we present in this paper the Overlapping Stochastic Block
Model. Our approach allows the vertices to belong to multiple clusters, and, to
some extent, generalizes the well-known Stochastic Block Model [Nowicki and
Snijders (2001)]. We show that the model is generically identifiable within
classes of equivalence and we propose an approximate inference procedure, based
on global and local variational techniques. Using toy data sets as well as the
French Political Blogosphere network and the transcriptional network of
Saccharomyces cerevisiae, we compare our work with other approaches.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS382 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
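
To make the idea of overlapping membership concrete, here is a minimal generative sketch, not the authors' implementation. It assumes each vertex joins each cluster independently and that the probability of a directed edge is a logistic function of the pairwise cluster interactions. The function name and the parameters `alpha`, `W`, and `bias` are hypothetical choices for illustration.

```python
import numpy as np

def sample_overlapping_block_graph(n, alpha, W, bias, seed=0):
    """Sample memberships Z and a directed adjacency matrix X.

    Assumed (illustrative) model:
      Z[i, q] ~ Bernoulli(alpha[q]), independently for each cluster q
      X[i, j] ~ Bernoulli(sigmoid(Z[i] @ W @ Z[j] + bias)) for i != j
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    # A row of Z may contain several ones (overlap) or none at all.
    Z = (rng.random((n, alpha.size)) < alpha).astype(int)
    logits = Z @ np.asarray(W, dtype=float) @ Z.T + bias
    P = 1.0 / (1.0 + np.exp(-logits))
    X = (rng.random((n, n)) < P).astype(int)
    np.fill_diagonal(X, 0)  # no self-loops
    return Z, X
```

Setting every row of `Z` to contain exactly one 1 would recover a disjoint partition of the vertices, which is the sense in which a model of this kind extends the usual block model.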
On sparsity, power-law and clustering properties of graphex processes
This paper investigates properties of the class of graphs based on
exchangeable point processes. We provide asymptotic expressions for the number
of edges, number of nodes and degree distributions, identifying four regimes:
(i) a dense regime, (ii) a sparse almost dense regime, (iii) a sparse regime
with power-law behaviour, and (iv) an almost extremely sparse regime. We show
that under mild assumptions, both the global and local clustering coefficients
converge to constants which may or may not be the same. We also derive a
central limit theorem for the number of nodes. Finally, we propose a class of
models within this framework where one can separately control the latent
structure and the global sparsity/power-law properties of the graph.
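
The distinction between global and local clustering coefficients can be illustrated on a small undirected graph. The sketch below uses the standard textbook definitions (closed triples over all triples, and the average per-node triangle density over nodes of degree at least 2); these are assumed here and may differ in detail from the paper's. The function name is hypothetical.

```python
import numpy as np

def clustering_coefficients(A):
    """Return (global, average local) clustering for a simple undirected graph.

    A is a symmetric 0/1 adjacency matrix with a zero diagonal.
    """
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    closed = np.diag(A @ A @ A)  # twice the number of triangles at each node
    pairs = deg * (deg - 1)      # twice the number of neighbour pairs at each node
    global_cc = closed.sum() / pairs.sum() if pairs.sum() > 0 else 0.0
    mask = deg >= 2              # local coefficient is undefined for degree < 2
    local_cc = float(np.mean(closed[mask] / pairs[mask])) if mask.any() else 0.0
    return float(global_cc), local_cc

# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
g, l = clustering_coefficients(A)  # g = 3/5, l = 7/9
```

On this graph the two quantities already differ (0.6 versus about 0.78), which shows why the two coefficients need not share the same limit.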