94 research outputs found
Covariance regularization by thresholding
This paper considers regularizing a covariance matrix of $p$ variables
estimated from $n$ observations, by hard thresholding. We show that the
thresholded estimate is consistent in the operator norm as long as the true
covariance matrix is sparse in a suitable sense, the variables are Gaussian or
sub-Gaussian, and $(\log p)/n \to 0$, and obtain explicit rates. The results are
uniform over families of covariance matrices which satisfy a fairly natural
notion of sparsity. We discuss an intuitive resampling scheme for threshold
selection and prove a general cross-validation result that justifies this
approach. We also compare thresholding to other covariance estimators in
simulations and on an example from climate data.
Comment: Published at http://dx.doi.org/10.1214/08-AOS600 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
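As a rough illustration of hard thresholding, the sketch below zeroes every entry of the sample covariance matrix whose absolute value falls below a threshold t. The function name `hard_threshold_covariance`, the constant 2.0, and the threshold of order sqrt(log p / n) used in the demo are assumptions made for this example; the abstract itself selects the threshold by a resampling scheme, which is not reproduced here.

```python
import numpy as np

def hard_threshold_covariance(X, t):
    """Sample covariance of the columns of X, with entries of magnitude < t set to 0.

    X is an (n, p) array: n observations of p variables.
    The name and signature are hypothetical, chosen for this sketch.
    """
    S = np.cov(X, rowvar=False)               # p x p sample covariance
    return np.where(np.abs(S) >= t, S, 0.0)   # keep only entries with |s_ij| >= t

# Demo on simulated data with identity covariance (an assumption for illustration).
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
t = 2.0 * np.sqrt(np.log(p) / n)              # threshold of order sqrt(log p / n); 2.0 is arbitrary
T = hard_threshold_covariance(X, t)
print("nonzero entries before:", np.count_nonzero(np.cov(X, rowvar=False)))
print("nonzero entries after: ", np.count_nonzero(T))
```

In practice the threshold would be tuned on held-out data rather than fixed in advance.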
Optimization via Low-rank Approximation for Community Detection in Networks
Community detection is one of the fundamental problems of network analysis,
for which a number of methods have been proposed. Most model-based or
criteria-based methods have to solve an optimization problem over a discrete
set of labels to find communities, which is computationally infeasible. Some
fast spectral algorithms have been proposed for specific methods or models, but
only on a case-by-case basis. Here we propose a general approach for maximizing
a function of a network adjacency matrix over discrete labels by projecting the
set of labels onto a subspace approximating the leading eigenvectors of the
expected adjacency matrix. This projection onto a low-dimensional space makes
the feasible set of labels much smaller and the optimization problem much
easier. We prove a general result about this method and show how to apply it to
several previously proposed community detection criteria, establishing its
consistency for label estimation in each case and demonstrating the fundamental
connection between spectral properties of the network and various model-based
approaches to community detection. Simulations and applications to real-world
data are included to demonstrate our method performs well for multiple problems
over a wide range of parameters.
Comment: 45 pages, 7 figures; added discussions about computational complexity
and extension to more than two communities
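The projection idea can be sketched for two communities as follows. This is a simplified illustration, not the authors' algorithm: it uses the eigenvectors of the observed adjacency matrix as the low-dimensional subspace, generates candidate labelings by splitting the projected nodes along a grid of directions, and scores each candidate with Newman-Girvan modularity. The function names, the angle grid, and the choice of criterion are all assumptions.

```python
import numpy as np

def modularity_two_groups(A, s):
    """Newman-Girvan modularity of a two-group split; s has entries +1 or -1."""
    d = A.sum(axis=1)                  # node degrees
    two_m = d.sum()                    # twice the number of edges
    B = A - np.outer(d, d) / two_m     # modularity matrix
    return float(s @ B @ s) / (2.0 * two_m)

def low_rank_label_search(A, n_angles=180):
    """Search a small candidate set of two-group labelings (hypothetical helper).

    Nodes are projected onto the two eigenvectors of A with the largest
    eigenvalues in magnitude; each direction in that plane yields one
    candidate labeling via the sign of the projection.
    """
    A = np.asarray(A, dtype=float)
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[-2:]
    U = vecs[:, top]                   # n x 2 embedding of the nodes
    best_s, best_q = None, -np.inf
    for theta in np.linspace(0.0, np.pi, n_angles, endpoint=False):
        proj = U @ np.array([np.cos(theta), np.sin(theta)])
        s = np.where(proj >= 0, 1.0, -1.0)
        if abs(s.sum()) == len(s):     # skip labelings with one empty group
            continue
        q = modularity_two_groups(A, s)
        if q > best_q:
            best_s, best_q = s, q
    return best_s, best_q

# Demo: two 4-node cliques joined by a single edge.
A = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1.0
A[3, 4] = A[4, 3] = 1.0
labels, q = low_rank_label_search(A)
print(labels, q)
```

The search evaluates at most `n_angles` labelings instead of all 2^n, which is the sense in which projection shrinks the feasible set.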
Community extraction for social networks
Analysis of networks and in particular discovering communities within
networks has been a focus of recent work in several fields, with applications
ranging from citation and friendship networks to food webs and gene regulatory
networks. Most of the existing community detection methods focus on
partitioning the entire network into communities, with the expectation of many
ties within communities and few ties between. However, many networks contain
nodes that do not fit in with any of the communities, and forcing every node
into a community can distort results. Here we propose a new framework that
focuses on community extraction instead of partition, extracting one community
at a time. The main idea behind extraction is that the strength of a community
should not depend on ties between members of other communities, but only on
ties within that community and its ties to the outside world. We show that the
new extraction criterion performs well on simulated and real networks, and
establish asymptotic consistency of our method under the block model
assumption.
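One way to make the extraction idea concrete is to score a candidate set S by the density of ties inside S minus the density of ties from S to the rest of the network, so that ties among outside nodes never enter the score. The exact form and normalization below are assumptions for illustration and are not stated in the abstract; `extraction_score` is a hypothetical name.

```python
import numpy as np

def extraction_score(A, S):
    """Within-set tie density minus density of ties from S to its complement.

    A is a symmetric 0/1 adjacency matrix and S is a list of node indices.
    Ties between nodes outside S are deliberately ignored.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    in_S = np.zeros(n, dtype=bool)
    in_S[list(S)] = True
    k = int(in_S.sum())
    if k == 0 or k == n:
        raise ValueError("S must be a nonempty proper subset of the nodes")
    within = A[np.ix_(in_S, in_S)].sum() / k**2
    outside = A[np.ix_(in_S, ~in_S)].sum() / (k * (n - k))
    return within - outside

# Demo: a triangle {0, 1, 2} attached to node 3 by one tie; node 4 is isolated.
A = np.zeros((5, 5))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
print(extraction_score(A, [0, 1, 2]))
```

Adding or removing a tie between nodes 3 and 4 leaves the score of {0, 1, 2} unchanged, which matches the stated principle that a community's strength should not depend on ties between members of other communities.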
- …
