122 research outputs found
Super-resolution community detection for layer-aggregated multilayer networks
Applied network science often involves preprocessing network data before
applying a network-analysis method, and there is typically a theoretical
disconnect between these steps. For example, it is common to aggregate
time-varying network data into windows prior to analysis, and the tradeoffs of
this preprocessing are not well understood. Focusing on the problem of
detecting small communities in multilayer networks, we study the effects of
layer aggregation by developing random-matrix theory for modularity matrices
associated with layer-aggregated networks with nodes and layers, which
are drawn from an ensemble of Erd\H{o}s-R\'enyi networks. We study phase
transitions in which eigenvectors localize onto communities (allowing their
detection) and which occur for a given community provided its size surpasses a
detectability limit . When layers are aggregated via a summation, we
obtain , where is the number of
layers across which the community persists. Interestingly, if is allowed to
vary with then summation-based layer aggregation enhances small-community
detection even if the community persists across a vanishing fraction of layers,
provided that decays more slowly than . Moreover,
we find that thresholding the summation can in some cases cause to decay
exponentially, decreasing by orders of magnitude in a phenomenon we call
super-resolution community detection. That is, layer aggregation with
thresholding is a nonlinear data filter enabling detection of communities that
are otherwise too small to detect. Importantly, different thresholds generally
enhance the detectability of communities having different properties,
illustrating that community detection can be obscured if one analyzes network
data using a single threshold.Comment: 11 pages, 8 figure
Computational barriers in minimax submatrix detection
This paper studies the minimax detection of a small submatrix of elevated
mean in a large matrix contaminated by additive Gaussian noise. To investigate
the tradeoff between statistical performance and computational cost from a
complexity-theoretic perspective, we consider a sequence of discretized models
which are asymptotically equivalent to the Gaussian model. Under the hypothesis
that the planted clique detection problem cannot be solved in randomized
polynomial time when the clique size is of smaller order than the square root
of the graph size, the following phase transition phenomenon is established:
when the size of the large matrix , if the submatrix size
for any , computational complexity
constraints can incur a severe penalty on the statistical performance in the
sense that any randomized polynomial-time test is minimax suboptimal by a
polynomial factor in ; if for any
, minimax optimal detection can be attained within
constant factors in linear time. Using Schatten norm loss as a representative
example, we show that the hardness of attaining the minimax estimation rate can
crucially depend on the loss function. Implications on the hardness of support
recovery are also obtained.Comment: Published at http://dx.doi.org/10.1214/14-AOS1300 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
A Tensor Approach to Learning Mixed Membership Community Models
Community detection is the task of detecting hidden communities from observed
interactions. Guaranteed community detection has so far been mostly limited to
models with non-overlapping communities such as the stochastic block model. In
this paper, we remove this restriction, and provide guaranteed community
detection for a family of probabilistic network models with overlapping
communities, termed as the mixed membership Dirichlet model, first introduced
by Airoldi et al. This model allows for nodes to have fractional memberships in
multiple communities and assumes that the community memberships are drawn from
a Dirichlet distribution. Moreover, it contains the stochastic block model as a
special case. We propose a unified approach to learning these models via a
tensor spectral decomposition method. Our estimator is based on low-order
moment tensor of the observed network, consisting of 3-star counts. Our
learning method is fast and is based on simple linear algebraic operations,
e.g. singular value decomposition and tensor power iterations. We provide
guaranteed recovery of community memberships and model parameters and present a
careful finite sample analysis of our learning method. As an important special
case, our results match the best known scaling requirements for the
(homogeneous) stochastic block model
- β¦