    Phase Transitions in Semidefinite Relaxations

    Statistical inference problems arising in signal processing, data mining, and machine learning naturally give rise to hard combinatorial optimization problems. These problems become intractable when the dimensionality of the data is large, as is often the case for modern datasets. A popular idea is to construct convex relaxations of these combinatorial problems, which can be solved efficiently for large-scale datasets. Semidefinite programming (SDP) relaxations are among the most powerful methods in this family, and are surprisingly well-suited for a broad range of problems where data take the form of matrices or graphs. It has been observed several times that, when the `statistical noise' is small enough, SDP relaxations correctly detect the underlying combinatorial structure. In this paper we develop asymptotic predictions for several `detection thresholds,' as well as for the estimation error above these thresholds. We study some classical SDP relaxations for statistical problems motivated by graph synchronization and community detection in networks. We map these optimization problems to statistical mechanics models with vector spins, and use non-rigorous techniques from statistical mechanics to characterize the corresponding phase transitions. Our results clarify the effectiveness of SDP relaxations in solving high-dimensional statistical problems.

    Comment: 71 pages, 24 pdf figures
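The vector-spin mapping described in the abstract above can be illustrated numerically. The sketch below (not the paper's code; the model size, SNR value, number of iterations, and rounding step are all assumptions chosen for illustration) runs a Burer-Monteiro-style surrogate of the SDP for Z2 synchronization: each row of V is a unit spin vector, and the iteration amplifies the planted signal when the noise is small enough.

```python
import numpy as np

# Illustrative sketch (assumed parameters): Z2 synchronization observed
# through a spiked symmetric matrix Y = (snr/n) x x^T + noise.
rng = np.random.default_rng(0)
n, snr, k = 200, 3.0, 8
x = rng.choice([-1.0, 1.0], size=n)            # hidden +-1 signs
W = rng.normal(size=(n, n))
W = (W + W.T) / np.sqrt(2 * n)                 # symmetric noise, O(1) spectrum
Y = (snr / n) * np.outer(x, x) + W

# Vector-spin / Burer-Monteiro surrogate of the SDP: iterate the linear map
# and re-normalize each row of V back to the unit sphere.
V = rng.normal(size=(n, k))
V /= np.linalg.norm(V, axis=1, keepdims=True)
for _ in range(200):
    V = Y @ V
    V /= np.linalg.norm(V, axis=1, keepdims=True)

# Round to +-1 along the leading direction of the spin configuration.
u = np.linalg.svd(V, full_matrices=False)[0][:, 0]
xhat = np.sign(u)
overlap = abs(xhat @ x) / n                    # 1.0 = perfect recovery
```

Above the detection threshold (here snr = 3), the rounded estimate correlates strongly with the planted signs; near or below the threshold the overlap collapses toward zero, which is the phase-transition behavior the paper characterizes.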

    Deep Learning for Community Detection: Progress, Challenges and Opportunities

    As communities represent shared opinions, functions, purposes, and the like, community detection is an important and extremely useful tool in both scientific inquiry and data analytics. However, classic methods of community detection, such as spectral clustering and statistical inference, are falling by the wayside as deep learning techniques demonstrate an increasing capacity to handle high-dimensional graph data with impressive performance. A survey of current progress in community detection through deep learning is therefore timely. Structured around the three broad research streams in this domain (deep neural networks, deep graph embedding, and graph neural networks), this article summarizes the contributions of the various frameworks, models, and algorithms in each stream, along with the current challenges that remain unsolved and the future research opportunities yet to be explored.

    Comment: Accepted paper at the 29th International Joint Conference on Artificial Intelligence (IJCAI 20), Survey Track

    Concentration of random graphs and application to community detection

    Random matrix theory has played an important role in recent work on statistical network analysis. In this paper, we review recent results on regimes of concentration of random graphs around their expectation, showing that dense graphs concentrate and that sparse graphs concentrate after regularization. We also review relevant network models that may be of interest to probabilists considering directions for new random matrix theory developments, and random matrix theory tools that may be of interest to statisticians looking to prove properties of network algorithms. Applications of the concentration results to the problem of community detection in networks are discussed in detail.

    Comment: Submission for the International Congress of Mathematicians, Rio de Janeiro, Brazil, 2018
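The regularization idea summarized above can be sketched in a few lines. The snippet below (parameters and the specific regularizer are illustrative assumptions, not taken from the paper) draws a sparse two-block stochastic block model, shifts every entry of the adjacency matrix by tau/n, and then reads the community split off the second eigenvector:

```python
import numpy as np

# Illustrative sketch (assumed parameters): sparse two-block SBM with
# constant average degree, plus additive regularization before the
# spectral step.
rng = np.random.default_rng(1)
n = 400
labels = np.repeat([0, 1], n // 2)
p_in, p_out = 24 / n, 4 / n                    # sparse: O(1) average degree
P = np.where(labels[:, None] == labels[None, :], p_in, p_out)
A = np.triu((rng.random((n, n)) < P), 1).astype(float)
A = A + A.T                                    # symmetric, no self-loops

tau = A.sum() / n                              # regularizer ~ average degree
A_reg = A + tau / n                            # shift every entry by tau/n

vals, vecs = np.linalg.eigh(A_reg)
v2 = vecs[:, -2]                               # 2nd eigenvector carries the split
pred = (v2 > np.median(v2)).astype(int)
acc = max((pred == labels).mean(), (pred != labels).mean())  # up to label swap
```

In the sparse regime the unregularized adjacency matrix need not concentrate (high-degree vertices dominate its spectrum), which is exactly why regularization of this flavor appears in the concentration results the paper reviews.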

    A non-backtracking method for long matrix and tensor completion

    We consider the problem of low-rank rectangular matrix completion in the regime where the matrix $M$ of size $n \times m$ is ``long'', i.e., the aspect ratio $m/n$ diverges to infinity. Such matrices are of particular interest in the study of tensor completion, where they arise from the unfolding of a low-rank tensor. In the case where the sampling probability is $\frac{d}{\sqrt{mn}}$, we propose a new spectral algorithm for recovering the singular values and left singular vectors of the original matrix $M$ based on a variant of the standard non-backtracking operator of a suitably defined bipartite weighted random graph, which we call a \textit{non-backtracking wedge operator}. When $d$ is above a Kesten-Stigum-type sampling threshold, our algorithm recovers a correlated version of the singular value decomposition of $M$ with quantifiable error bounds. This is the first result in the regime of bounded $d$ for weak recovery, and the first for weak consistency when $d \to \infty$ arbitrarily slowly without any polylog factors. As an application, for low-rank orthogonal $k$-tensor completion, we efficiently achieve weak recovery with sample size $O(n^{k/2})$, and weak consistency with sample size $\omega(n^{k/2})$.
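The connection between tensors and "long" matrices mentioned above is just an unfolding. A minimal sketch (sizes assumed for illustration): the mode-1 unfolding of an n x n x n rank-1 tensor is an n x n^2 matrix, so its aspect ratio n^2 / n = n diverges as n grows, and the low-rank structure survives the reshape.

```python
import numpy as np

# Mode-1 unfolding of a rank-1 order-3 tensor (sizes are illustrative).
rng = np.random.default_rng(2)
n = 10
u, v, w = (rng.normal(size=n) for _ in range(3))
T = np.einsum('i,j,k->ijk', u, v, w)       # rank-1 tensor T[i,j,k] = u_i v_j w_k
M = T.reshape(n, n * n)                     # mode-1 unfolding: n x n^2, "long"
# Row i equals u[i] * kron(v, w), so the unfolding is again rank 1.
rank = np.linalg.matrix_rank(M)
```

For a general order-$k$ tensor the analogous unfolding has aspect ratio $n^{k-1}/n$, which is the regime the non-backtracking wedge operator is designed for.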