Exact Recovery for a Family of Community-Detection Generative Models
Generative models for networks with communities have been studied extensively
for being a fertile ground to establish information-theoretic and computational
thresholds. In this paper we propose a new toy model for planted generative
models called the planted Random Energy Model (P-REM), inspired by Derrida's REM. For
this model we provide the asymptotic behaviour of the probability of error for
the maximum likelihood estimator and hence the exact recovery threshold. As an
application, we further consider the Weighted Stochastic Block Model with two
non-equally sized communities (2-WSBM) on h-uniform hypergraphs, which is equivalent
to the P-REM at both ends of the spectrum, for high and low edge cardinality
h. We provide upper and lower bounds for exact recoverability for any
h, mapping these problems to the aforementioned P-REM. To the best of our
knowledge these are the first consistency results for the 2-WSBM on graphs and
on hypergraphs with non-equally sized communities.
Community Detection in Hypergraphs, Spiked Tensor Models, and Sum-of-Squares
We study the problem of community detection in hypergraphs under a stochastic
block model. Similarly to how the stochastic block model in graphs suggests
studying spiked random matrices, our model motivates investigating statistical
and computational limits of exact recovery in a certain spiked tensor model. In
contrast with the matrix case, the spiked model naturally arising from
community detection in hypergraphs is different from the one arising in the
so-called tensor Principal Component Analysis model. We investigate the
effectiveness of algorithms in the Sum-of-Squares hierarchy on these models.
Interestingly, our results suggest that these two apparently similar models
exhibit significantly different computational-to-statistical gaps.
Comment: In proceedings of 2017 International Conference on Sampling Theory
and Applications (SampTA)
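For readers unfamiliar with the hypergraph stochastic block model named in the abstract above, the following is a minimal sketch of how such data could be sampled. It is an illustration only and not the construction used in the paper: the function name `sample_uniform_hsbm`, the parameters `p_in` and `p_out`, and the rule that only hyperedges lying entirely inside one community receive the higher probability are all assumptions made here.

```python
import itertools
import random


def sample_uniform_hsbm(n, k, p_in, p_out, seed=0):
    """Sample a k-uniform hypergraph with two planted communities.

    Assumed model (illustrative only): every k-subset of the n nodes
    becomes a hyperedge with probability p_in when all of its nodes
    share the same label, and with probability p_out otherwise.
    """
    rng = random.Random(seed)
    # Hidden community label (0 or 1) for each node.
    labels = [rng.choice((0, 1)) for _ in range(n)]
    hyperedges = []
    # Enumerating all k-subsets is only feasible for small n and k.
    for subset in itertools.combinations(range(n), k):
        same_community = len({labels[v] for v in subset}) == 1
        prob = p_in if same_community else p_out
        if rng.random() < prob:
            hyperedges.append(subset)
    return labels, hyperedges


labels, hyperedges = sample_uniform_hsbm(n=12, k=3, p_in=0.5, p_out=0.05)
print(len(hyperedges), "hyperedges sampled")
```

Exact recovery then asks whether the hidden `labels` can be reconstructed, up to swapping the two community names, from `hyperedges` alone.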
A framework to generate hypergraphs with community structure
In recent years hypergraphs have emerged as a powerful tool to study systems
with multi-body interactions which cannot be trivially reduced to pairs. While
highly structured methods to generate synthetic data have proved fundamental
for the standardized evaluation of algorithms and the statistical study of
real-world networked data, these are scarcely available in the context of
hypergraphs. Here we propose a flexible and efficient framework for the
generation of hypergraphs with many nodes and large hyperedges, which allows
specifying general community structures and tuning different local statistics. We
illustrate how to use our model to sample synthetic data with desired features
(assortative or disassortative communities, mixed or hard community
assignments, etc.), analyze community detection algorithms, and generate
hypergraphs structurally similar to real-world data. Overcoming previous
limitations on the generation of synthetic hypergraphs, our work constitutes a
substantial advancement in the statistical modeling of higher-order systems.
Comment: 18 pages, 8 figures, revised version