12,275 research outputs found

    Sparse random graphs: regularization and concentration of the Laplacian

    We study random graphs with possibly different edge probabilities in the challenging sparse regime of bounded expected degrees. Unlike in the dense case, neither the graph adjacency matrix nor its Laplacian concentrates around its expectation, due to the highly irregular distribution of node degrees. It has been empirically observed that simply adding a constant of order 1/n to each entry of the adjacency matrix substantially improves the behavior of the Laplacian. Here we prove that this regularization indeed forces the Laplacian to concentrate even in sparse graphs. As an immediate consequence in network analysis, we establish the validity of one of the simplest and fastest approaches to community detection -- regularized spectral clustering, under the stochastic block model. Our proof of concentration of the regularized Laplacian is based on Grothendieck's inequality and factorization, combined with paving arguments. Comment: Added reference
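    The regularization described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `regularized_laplacian`, the parameter `tau`, and the choice of the symmetric normalized form I - D^{-1/2} A D^{-1/2} are assumptions made here, since the abstract only says that a constant of order 1/n is added to every entry of the adjacency matrix.

```python
import numpy as np


def regularized_laplacian(adjacency, tau=1.0):
    """Normalized Laplacian of the adjacency matrix after adding tau/n to every entry.

    Assumption: the symmetric normalized Laplacian I - D^{-1/2} A D^{-1/2} is used;
    `tau` is a hypothetical tuning constant of order 1.
    """
    a = np.asarray(adjacency, dtype=float)
    n = a.shape[0]
    a_tau = a + tau / n                  # the regularization step: add tau/n entrywise
    degrees = a_tau.sum(axis=1)          # every regularized degree is at least tau > 0
    inv_sqrt = 1.0 / np.sqrt(degrees)
    return np.eye(n) - inv_sqrt[:, None] * a_tau * inv_sqrt[None, :]
```

    Because every regularized degree is at least `tau`, the normalization stays well defined even for isolated nodes, which are common in the sparse regime.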

    Accounting for the Role of Long Walks on Networks via a New Matrix Function

    We introduce a new matrix function for studying graphs and real-world networks based on a double-factorial penalization of walks between nodes in a graph. This new matrix function is based on the matrix error function. We find a very good approximation of this function using a matrix hyperbolic tangent function. We derive a communicability function, a subgraph centrality and a double-factorial Estrada index based on this new matrix function. We obtain upper and lower bounds for the double-factorial Estrada index of graphs, showing that they are similar to those of the single-factorial Estrada index. We then compare these indices with the single-factorial one for simple graphs and real-world networks. We conclude that for networks containing chordless cycles---holes---the two penalization schemes produce significantly different results. In particular, we study two series of real-world networks representing urban street networks and protein residue networks. We observe that the subgraph centralities based on the two indices produce significantly different rankings of the nodes. The use of the double-factorial penalization of walks opens new possibilities for studying important structural properties of real-world networks where long walks play a fundamental role, such as networks containing chordless cycles.
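    To make the two penalization schemes concrete, the sketch below sums a truncated power series in which walks of length k are weighted by 1/k!! (double factorial) or by 1/k! (single factorial, which gives the matrix exponential). It assumes the new matrix function is this power series; the abstract does not spell out the formula, and the names `walk_penalized_series`, `estrada_index`, and the truncation length `terms` are hypothetical.

```python
import math

import numpy as np


def double_factorial(k):
    """k!! = k (k-2) (k-4) ..., with 0!! = 1!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def walk_penalized_series(adjacency, penalty, terms=60):
    """Truncated series sum_k A^k / penalty(k), for k = 0 .. terms-1."""
    a = np.asarray(adjacency, dtype=float)
    power = np.eye(a.shape[0])           # holds A^k, starting from A^0 = I
    total = np.zeros_like(a)
    for k in range(terms):
        total += power / float(penalty(k))
        power = power @ a
    return total


def estrada_index(adjacency, penalty):
    """Trace of the series; its diagonal entries are the subgraph centralities."""
    return float(np.trace(walk_penalized_series(adjacency, penalty)))
```

    Calling `estrada_index(A, math.factorial)` gives the single-factorial index and `estrada_index(A, double_factorial)` the double-factorial one. Since k!! grows more slowly than k!, long walks contribute more under the second scheme.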

    Penalized Likelihood Methods for Estimation of Sparse High Dimensional Directed Acyclic Graphs

    Directed acyclic graphs (DAGs) are commonly used to represent causal relationships among random variables in graphical models. Applications of these models arise in the study of physical, as well as biological, systems, where directed edges between nodes represent the influence of components of the system on each other. The general problem of estimating DAGs from observed data is computationally NP-hard. Moreover, two directed graphs may be observationally equivalent. When the nodes exhibit a natural ordering, the problem of estimating directed graphs reduces to the problem of estimating the structure of the network. In this paper, we propose a penalized likelihood approach that directly estimates the adjacency matrix of DAGs. Both lasso and adaptive lasso penalties are considered and an efficient algorithm is proposed for estimation of high dimensional DAGs. We study variable selection consistency of the two penalties when the number of variables grows to infinity with the sample size. We show that although lasso can only consistently estimate the true network under stringent assumptions, adaptive lasso achieves this task under mild regularity conditions. The performance of the proposed methods is compared to alternative methods in simulated, as well as real, data examples. Comment: 19 pages, 8 figures
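    One way to picture the ordered case is as a sequence of lasso regressions, each variable regressed on the variables that precede it in the ordering. The sketch below is an illustration under that assumption and is not the algorithm proposed in the paper: it uses a plain lasso penalty solved by coordinate descent (no adaptive weights), and the function names, the `sweeps` count, and the convention that `adjacency[i, j]` is nonzero for an edge from node j to node i are all choices made here.

```python
import numpy as np


def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold`."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def lasso_coordinate_descent(x, y, lam, sweeps=200):
    """Minimize (1/2n) ||y - x b||^2 + lam ||b||_1 by cyclic coordinate descent."""
    n, p = x.shape
    beta = np.zeros(p)
    col_scale = (x ** 2).sum(axis=0) / n
    residual = y.copy()
    for _ in range(sweeps):
        for j in range(p):
            if col_scale[j] == 0.0:
                continue                          # constant column: leave coefficient at 0
            residual += x[:, j] * beta[j]         # remove coordinate j from the fit
            rho = x[:, j] @ residual / n
            beta[j] = soft_threshold(rho, lam) / col_scale[j]
            residual -= x[:, j] * beta[j]         # add the updated coordinate back
    return beta


def estimate_ordered_dag(data, lam):
    """Lower-triangular adjacency estimate; columns of `data` follow the node ordering."""
    centered = data - data.mean(axis=0)
    p = centered.shape[1]
    adjacency = np.zeros((p, p))
    for i in range(1, p):
        # Only predecessors of node i may be its parents, so the result is acyclic.
        adjacency[i, :i] = lasso_coordinate_descent(centered[:, :i], centered[:, i], lam)
    return adjacency
```

    Restricting each regression to predecessors is what keeps the estimate acyclic; the penalty `lam` then controls how many of the candidate edges survive.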