7,735 research outputs found
Sparse random graphs: regularization and concentration of the Laplacian
We study random graphs with possibly different edge probabilities in the
challenging sparse regime of bounded expected degrees. Unlike in the dense
case, neither the graph adjacency matrix nor its Laplacian concentrates around
its expectation, due to the highly irregular distribution of node degrees. It
has been empirically observed that simply adding a small constant to
each entry of the adjacency matrix substantially improves the behavior of the
Laplacian. Here we prove that this regularization indeed forces the Laplacian to
concentrate even in sparse graphs. As an immediate consequence in network
analysis, we establish the validity of one of the simplest and fastest
approaches to community detection -- regularized spectral clustering -- under the
stochastic block model. Our proof of concentration of the regularized Laplacian is
based on Grothendieck's inequality and factorization, combined with paving
arguments.
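The regularization the abstract describes, adding a small constant to every entry of the adjacency matrix before forming the Laplacian, can be sketched in a few lines. The following is a minimal numpy illustration, not the authors' implementation; the choice of regularizer tau (here the average degree) and the toy stochastic block model parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-block stochastic block model (hypothetical parameters).
n = 60
z = np.repeat([0, 1], n // 2)                    # true community labels
P = np.where(z[:, None] == z[None, :], 0.8, 0.05)
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                                      # symmetric, no self-loops

def regularized_spectral_clustering(A, tau=None):
    """Two-community spectral clustering on the regularized Laplacian."""
    n = A.shape[0]
    if tau is None:
        tau = A.sum() / n                        # average degree, one common choice
    A_tau = A + tau / n                          # add a small constant to every entry
    d = A_tau.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    # Normalized Laplacian of the regularized adjacency matrix.
    L = np.eye(n) - D_inv_sqrt @ A_tau @ D_inv_sqrt
    _, vecs = np.linalg.eigh(L)
    # For two blocks, the sign pattern of the second eigenvector splits the nodes.
    return (vecs[:, 1] > 0).astype(int)

labels = regularized_spectral_clustering(A)
```

On a well-separated toy model like this one the sign split recovers the planted communities up to a global label flip; the point of the paper is that concentration of the regularized Laplacian makes this behavior provable even at bounded expected degree.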
Concentration of random graphs and application to community detection
Random matrix theory has played an important role in recent work on
statistical network analysis. In this paper, we review recent results on
regimes of concentration of random graphs around their expectation, showing
that dense graphs concentrate and sparse graphs concentrate after
regularization. We also review relevant network models that may be of interest
to probabilists considering directions for new random matrix theory
developments, and random matrix theory tools that may be of interest to
statisticians looking to prove properties of network algorithms. Applications
of concentration results to the problem of community detection in networks are
discussed in detail.
Comment: Submission for International Congress of Mathematicians, Rio de Janeiro, Brazil 201
Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models
A challenging problem in estimating high-dimensional graphical models is to
choose the regularization parameter in a data-dependent way. The standard
techniques include K-fold cross-validation (K-CV), Akaike information
criterion (AIC), and Bayesian information criterion (BIC). Though these methods
work well for low-dimensional problems, they are not suitable in
high-dimensional settings. In this paper, we present StARS: a new stability-based
method for choosing the regularization parameter in high-dimensional inference
for undirected graphs. The method has a clear interpretation: we use the least
amount of regularization that simultaneously makes a graph sparse and
replicable under random sampling. This interpretation requires essentially no
conditions. Under mild conditions, we show that StARS is partially sparsistent
in terms of graph estimation: i.e., with high probability, all the true edges
will be included in the selected model even when the graph size diverges with
the sample size. Empirically, the performance of StARS is compared with the
state-of-the-art model selection procedures, including K-CV, AIC, and BIC, on
both synthetic data and a real microarray dataset. StARS outperforms all these
competing procedures.
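The StARS rule, use the least regularization for which the estimated graph stays stable across random subsamples, can be sketched briefly. Below is a minimal numpy illustration of that selection loop only: the graph estimator is a stand-in that thresholds absolute sample correlations, not the graphical-model estimator the paper pairs StARS with, and the subsample size and instability cutoff beta are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def graph_estimate(X, lam):
    # Stand-in estimator (not the one used in the paper): declare an edge
    # (i, j) whenever the absolute sample correlation exceeds lam.
    C = np.corrcoef(X, rowvar=False)
    np.fill_diagonal(C, 0.0)
    return np.abs(C) > lam

def stars(X, lambdas, n_subsamples=20, beta=0.05):
    """Pick the least regularization whose graph is stable under subsampling."""
    n, p = X.shape
    b = min(n, int(10 * np.sqrt(n)))             # subsample size (assumed choice)
    best = max(lambdas)                          # fall back to most regularization
    for lam in sorted(lambdas, reverse=True):    # most to least regularization
        # Edge selection frequency across random subsamples.
        freq = np.zeros((p, p))
        for _ in range(n_subsamples):
            idx = rng.choice(n, size=b, replace=False)
            freq += graph_estimate(X[idx], lam)
        theta = freq / n_subsamples
        # Instability: average variance of the edge indicator over subsamples.
        xi = 2.0 * theta * (1.0 - theta)
        instability = xi[np.triu_indices(p, 1)].mean()
        if instability <= beta:
            best = lam                           # still stable: keep relaxing
        else:
            break                                # graph stopped replicating
    return best

# Usage: five variables, one strong dependence between columns 0 and 1.
X = rng.standard_normal((200, 5))
X[:, 1] = 0.9 * X[:, 0] + 0.4 * rng.standard_normal(200)
chosen = stars(X, [0.2, 0.4, 0.6, 0.8])
```

The descending sweep mirrors the interpretation in the abstract: starting from heavy regularization (a sparse, trivially stable graph), it stops at the first point where edges no longer replicate under random sampling.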