Covariance regularization by thresholding
This paper considers regularizing a covariance matrix of $p$ variables
estimated from $n$ observations, by hard thresholding. We show that the
thresholded estimate is consistent in the operator norm as long as the true
covariance matrix is sparse in a suitable sense, the variables are Gaussian or
sub-Gaussian, and $(\log p)/n \to 0$, and obtain explicit rates. The results are
uniform over families of covariance matrices which satisfy a fairly natural
notion of sparsity. We discuss an intuitive resampling scheme for threshold
selection and prove a general cross-validation result that justifies this
approach. We also compare thresholding to other covariance estimators in
simulations and on an example from climate data.
Comment: Published at http://dx.doi.org/10.1214/08-AOS600 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
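As a rough illustration of the hard-thresholding idea in the abstract above, the following Python sketch zeroes out small entries of a sample covariance matrix. It is a minimal sketch under stated assumptions, not the paper's implementation: the function name `hard_threshold_covariance`, the fixed threshold `t`, and the choice to leave the diagonal untouched are all introduced here for illustration. The abstract's resampling scheme for choosing the threshold is not reproduced.

```python
import numpy as np


def hard_threshold_covariance(X, t):
    """Hard-threshold the sample covariance of X.

    X : array of shape (n_observations, p_variables)
    t : non-negative threshold (hypothetical fixed value; the abstract
        describes a data-driven resampling choice instead)
    """
    S = np.cov(X, rowvar=False)            # p x p sample covariance
    T = np.where(np.abs(S) >= t, S, 0.0)   # keep only entries with |s_ij| >= t
    # Assumption of this sketch: variances on the diagonal are never zeroed.
    np.fill_diagonal(T, np.diag(S))
    return T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 5))      # independent variables, so the true covariance is diagonal
    print(hard_threshold_covariance(X, 0.5))
```

Because thresholding acts entrywise, the result stays symmetric, but it is not guaranteed to be positive semidefinite for every threshold.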
Optimization via Low-rank Approximation for Community Detection in Networks
Community detection is one of the fundamental problems of network analysis,
for which a number of methods have been proposed. Most model-based or
criteria-based methods have to solve an optimization problem over a discrete
set of labels to find communities, which is computationally infeasible. Some
fast spectral algorithms have been proposed for specific methods or models, but
only on a case-by-case basis. Here we propose a general approach for maximizing
a function of a network adjacency matrix over discrete labels by projecting the
set of labels onto a subspace approximating the leading eigenvectors of the
expected adjacency matrix. This projection onto a low-dimensional space makes
the feasible set of labels much smaller and the optimization problem much
easier. We prove a general result about this method and show how to apply it to
several previously proposed community detection criteria, establishing its
consistency for label estimation in each case and demonstrating the fundamental
connection between spectral properties of the network and various model-based
approaches to community detection. Simulations and applications to real-world
data are included to demonstrate our method performs well for multiple problems
over a wide range of parameters.
Comment: 45 pages, 7 figures; added discussions about computational complexity
and extension to more than two communities
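To make the link between eigenvectors and community labels in the abstract above concrete, here is a small Python sketch of plain spectral partitioning: it splits nodes by the sign of the eigenvector for the second-largest eigenvalue of the adjacency matrix. This is a simpler stand-in, not the projection-based optimization the abstract proposes, and the helper names `spectral_two_communities` and `two_cliques_with_bridge` are hypothetical.

```python
import numpy as np


def spectral_two_communities(A):
    """Label nodes 0/1 by the sign of the second leading eigenvector of A.

    A : symmetric adjacency matrix of an undirected graph.
    """
    A = np.asarray(A, dtype=float)
    # eigh returns eigenvalues in ascending order, so column -2 belongs
    # to the second-largest eigenvalue.
    _, eigvecs = np.linalg.eigh(A)
    second = eigvecs[:, -2]
    return (second > 0).astype(int)


def two_cliques_with_bridge(k):
    """Toy graph: two k-node cliques joined by a single edge."""
    n = 2 * k
    A = np.zeros((n, n))
    A[:k, :k] = 1.0
    A[k:, k:] = 1.0
    np.fill_diagonal(A, 0.0)
    A[k - 1, k] = A[k, k - 1] = 1.0   # the bridging edge
    return A


if __name__ == "__main__":
    labels = spectral_two_communities(two_cliques_with_bridge(4))
    print(labels)   # the two cliques receive different labels
```

Which clique gets label 0 and which gets label 1 is arbitrary, since an eigenvector is only determined up to sign.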
The method of moments and degree distributions for network models
Probability models on graphs are becoming increasingly important in many
applications, but statistical tools for fitting such models are not yet well
developed. Here we propose a general method of moments approach that can be
used to fit a large class of probability models through empirical counts of
certain patterns in a graph. We establish some general asymptotic properties of
empirical graph moments and prove consistency of the estimates as the graph
size grows for all ranges of the average degree including $\Omega(1)$.
Additional results are obtained for the important special case of degree
distributions.
Comment: Published at http://dx.doi.org/10.1214/11-AOS904 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
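The "empirical counts of certain patterns in a graph" mentioned in the abstract above can be illustrated with a short Python sketch that counts edges, two-stars (paths of length two), and triangles from an adjacency matrix. The function name `graph_pattern_counts` and the choice of these three patterns are assumptions made for illustration. The paper's normalization of the counts and its estimating equations are not reproduced here.

```python
import numpy as np


def graph_pattern_counts(A):
    """Count edges, two-stars, and triangles in a simple undirected graph.

    A : symmetric 0/1 adjacency matrix with a zero diagonal.
    """
    A = np.asarray(A, dtype=np.int64)
    degrees = A.sum(axis=1)
    edges = int(A.sum() // 2)                          # each edge appears twice in A
    two_stars = int((degrees * (degrees - 1) // 2).sum())  # neighbor pairs around each node
    triangles = int(np.trace(A @ A @ A) // 6)          # each triangle yields 6 closed 3-walks
    return {"edges": edges, "two_stars": two_stars, "triangles": triangles}


if __name__ == "__main__":
    triangle = [[0, 1, 1],
                [1, 0, 1],
                [1, 1, 0]]
    path = [[0, 1, 0],
            [1, 0, 1],
            [0, 1, 0]]
    print(graph_pattern_counts(triangle))  # {'edges': 3, 'two_stars': 3, 'triangles': 1}
    print(graph_pattern_counts(path))      # {'edges': 2, 'two_stars': 1, 'triangles': 0}
```

In a method of moments fit, counts like these would be matched to their expectations under the model. That matching step depends on the model and is left out of this sketch.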