
    Deterministic Approximation of Random Walks in Small Space

    Get PDF
    We give a deterministic, nearly logarithmic-space algorithm that, given an undirected graph G, a positive integer r, and a set S of vertices, approximates the conductance of S in the r-step random walk on G to within a factor of 1+epsilon, where epsilon > 0 is an arbitrarily small constant. More generally, our algorithm computes an epsilon-spectral approximation to the normalized Laplacian of the r-step walk. Our algorithm combines the derandomized square graph operation [Eyal Rozenman and Salil Vadhan, 2005], which we recently used for solving Laplacian systems in nearly logarithmic space [Murtagh et al., 2017], with ideas from [Cheng et al., 2015], which gave an algorithm that is time-efficient (while ours is space-efficient) and randomized (while ours is deterministic) for the case of even r (while ours works for all r). Along the way, we provide some new results that generalize technical machinery and yield improvements over previous work. First, we obtain a nearly linear-time randomized algorithm for computing a spectral approximation to the normalized Laplacian for odd r. Second, we define and analyze a generalization of the derandomized square for irregular graphs and for sparsifying the product of two distinct graphs. As part of this generalization, we also give a strongly explicit construction of expander graphs of every size.
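
    As a concrete companion to the abstract, here is a brute-force numpy sketch of the two quantities being approximated: the normalized Laplacian of the r-step walk and the conductance of a set S in that walk. Dense matrix powers stand in for the paper's small-space machinery, and the escape-probability form of conductance used below is one common convention, not necessarily the paper's exact definition; all variable names are illustrative.

    import numpy as np

    def r_step_walk_quantities(A, r, S):
        """A: symmetric nonnegative adjacency matrix of an undirected graph.
           r: number of walk steps.  S: list of vertex indices."""
        d = A.sum(axis=1)                        # degrees (row sums)
        D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
        M = D_inv_sqrt @ A @ D_inv_sqrt          # normalized adjacency of G
        M_r = np.linalg.matrix_power(M, r)       # normalized adjacency of the r-step walk
        N_r = np.eye(len(d)) - M_r               # normalized Laplacian of the r-step walk

        # Weighted adjacency of the r-step walk graph; its degrees equal those of G.
        A_r = np.diag(np.sqrt(d)) @ M_r @ np.diag(np.sqrt(d))

        # Conductance of S in the r-step walk: weight of r-step edges leaving S
        # divided by the volume of S (escape-probability convention, an assumption).
        S = np.asarray(S)
        T = np.setdiff1d(np.arange(len(d)), S)
        return N_r, A_r[np.ix_(S, T)].sum() / d[S].sum()

    # Toy usage: 2-step walk on a 4-cycle, S = {0, 1}.
    A = np.array([[0,1,0,1],[1,0,1,0],[0,1,0,1],[1,0,1,0]], dtype=float)
    N_2, phi = r_step_walk_quantities(A, 2, [0, 1])
    print(phi)   # 0.5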

    Improved large-scale graph learning through ridge spectral sparsification

    Get PDF
    The representation and learning benefits of methods based on graph Laplacians, such as Laplacian smoothing or the harmonic function solution for semi-supervised learning (SSL), are empirically and theoretically well supported. Nonetheless, the exact versions of these methods scale poorly with the number of nodes n of the graph. In this paper, we combine a spectral sparsification routine with Laplacian learning. Given a graph G as input, our algorithm computes a sparsifier in a distributed way in O(n log^3 n) time, O(m log^3 n) work, and O(n log n) memory, using only log(n) rounds of communication. Furthermore, motivated by the regularization often employed in learning algorithms, we show that constructing sparsifiers that preserve the spectrum of the Laplacian only up to the regularization level may drastically reduce the size of the final graph. By constructing a spectrally similar graph, we are able to bound the error induced by the sparsification for a variety of downstream tasks (e.g., SSL). We empirically validate the theoretical guarantees on the Amazon co-purchase graph and compare to state-of-the-art heuristics.
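
    To make the object being accelerated concrete, here is a minimal dense sketch of a Laplacian-smoothing estimator of the kind the abstract refers to, f = (I + lambda*L)^{-1} y; in the paper the Laplacian would be replaced by a spectral sparsifier that only needs to match L up to the regularization level. The names A, y, and lam are illustrative assumptions, and this is a toy reference implementation, not the distributed routine described above.

    import numpy as np

    def laplacian_smoothing(A, y, lam):
        """A: symmetric weighted adjacency matrix; y: observed labels
           (zeros for unlabeled nodes is a simple convention); lam: regularization."""
        L = np.diag(A.sum(axis=1)) - A               # combinatorial Laplacian
        return np.linalg.solve(np.eye(len(y)) + lam * L, y)

    # Toy usage: a path on 4 nodes with labels on the endpoints.
    A = np.array([[0,1,0,0],[1,0,1,0],[0,1,0,1],[0,0,1,0]], dtype=float)
    y = np.array([1.0, 0.0, 0.0, -1.0])
    print(laplacian_smoothing(A, y, lam=0.5))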

    Linear algebraic techniques in theoretical computer science and population genetics

    Get PDF
    Thesis (Ph.D.)--Massachusetts Institute of Technology, Department of Mathematics, 2013. Cataloged from the PDF version of the thesis. Includes bibliographical references (pages 149-155). In this thesis, we present several algorithmic results for problems in spectral graph theory and computational biology. The first part concerns the problem of spectral sparsification. It is known that every dense graph can be approximated in a strong sense by a sparse subgraph, known as a spectral sparsifier of the graph. Furthermore, researchers have recently developed efficient algorithms for computing such approximations. We show how to make these algorithms faster, and also give a substantial improvement in space efficiency. Since sparsification is an important first step in speeding up approximation algorithms for many graph problems, our results have numerous applications. In the second part of the thesis, we consider the problem of inferring human population history from genetic data. We give an efficient and principled algorithm for using single nucleotide polymorphism (SNP) data to infer the admixture history of various populations, and apply it to show that Europeans have evidence of mixture with ancient Siberians. Finally, we turn to the problem of RNA secondary structure design. In this problem, we want to find RNA sequences that fold to a given secondary structure. We propose a novel global sampling approach, based on the recently developed RNAmutants algorithm, and show that it has numerous desirable properties when compared to existing solutions. Our method can prove useful for developing the next generation of RNA design algorithms. By Alex Levin, Ph.D.
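
    The spectral sparsification guarantee mentioned in the first part of the thesis can be stated as a simple quadratic-form condition: H is an epsilon-spectral sparsifier of G if (1-eps) x'L_G x <= x'L_H x <= (1+eps) x'L_G x for all vectors x. The randomized check below only illustrates that definition on given Laplacians; it is not one of the thesis's algorithms, and all names are assumptions.

    import numpy as np

    def looks_like_spectral_sparsifier(L_G, L_H, eps, trials=1000, seed=0):
        """Sanity-check the (1 +/- eps) quadratic-form condition on random vectors."""
        rng = np.random.default_rng(seed)
        for _ in range(trials):
            x = rng.standard_normal(L_G.shape[0])
            qg, qh = x @ L_G @ x, x @ L_H @ x
            if not ((1 - eps) * qg - 1e-9 <= qh <= (1 + eps) * qg + 1e-9):
                return False
        return True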

    Sampling Random Spanning Trees Faster than Matrix Multiplication

    Full text link
    We present an algorithm that, with high probability, generates a random spanning tree from an edge-weighted undirected graph in $\tilde{O}(n^{4/3}m^{1/2}+n^{2})$ time (the $\tilde{O}(\cdot)$ notation hides $\operatorname{polylog}(n)$ factors). The tree is sampled from a distribution where the probability of each tree is proportional to the product of its edge weights. This improves upon the previous best algorithm due to Colbourn et al. that runs in matrix multiplication time, $O(n^{\omega})$. For the special case of unweighted graphs, this improves upon the best previously known running time of $\tilde{O}(\min\{n^{\omega}, m\sqrt{n}, m^{4/3}\})$ for $m \gg n^{5/3}$ (Colbourn et al. '96, Kelner-Madry '09, Madry et al. '15). The effective resistance metric is essential to our algorithm, as in the work of Madry et al., but we eschew determinant-based and random walk-based techniques used by previous algorithms. Instead, our algorithm is based on Gaussian elimination, and the fact that effective resistance is preserved in the graph resulting from eliminating a subset of vertices (called a Schur complement). As part of our algorithm, we show how to compute $\epsilon$-approximate effective resistances for a set $S$ of vertex pairs via approximate Schur complements in $\tilde{O}(m+(n+|S|)\epsilon^{-2})$ time, without using the Johnson-Lindenstrauss lemma, which requires $\tilde{O}(\min\{(m+|S|)\epsilon^{-2},\ m+n\epsilon^{-4}+|S|\epsilon^{-2}\})$ time. We combine this approximation procedure with an error correction procedure for handling edges where our estimate isn't sufficiently accurate.
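
    The structural fact the algorithm leans on, that effective resistances among remaining vertices are preserved when a subset of vertices is eliminated via a Schur complement, can be checked directly with dense linear algebra. The sketch below is exact and only illustrative; the paper works with approximate Schur complements, and all names are assumptions.

    import numpy as np

    def effective_resistance(L, u, v):
        """Effective resistance between u and v via the Laplacian pseudoinverse."""
        Lp = np.linalg.pinv(L)
        return Lp[u, u] + Lp[v, v] - 2 * Lp[u, v]

    def schur_complement(L, keep):
        """Schur complement of the Laplacian onto the index set `keep`."""
        elim = np.setdiff1d(np.arange(L.shape[0]), keep)
        B = L[np.ix_(keep, elim)]
        return L[np.ix_(keep, keep)] - B @ np.linalg.inv(L[np.ix_(elim, elim)]) @ B.T

    # Toy check: weighted triangle on {0,1,2} with a pendant vertex 3; eliminating
    # vertex 3 leaves the effective resistance between 0 and 1 unchanged.
    A = np.array([[0,2,1,0],[2,0,3,0],[1,3,0,4],[0,0,4,0]], dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    keep = np.array([0, 1, 2])
    print(effective_resistance(L, 0, 1),
          effective_resistance(schur_complement(L, keep), 0, 1))   # equal values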

    Kirchhoff Index As a Measure of Edge Centrality in Weighted Networks: Nearly Linear Time Algorithms

    Full text link
    Most previous work on centrality focuses on measures of vertex importance and methods for identifying influential vertices, while related work on edges is far scarcer, especially for weighted networks, due to the computational challenge. In this paper, we propose to use the well-known Kirchhoff index as a measure of edge centrality in weighted networks, called $\theta$-Kirchhoff edge centrality. The Kirchhoff index of a network is defined as the sum of effective resistances over all vertex pairs. The centrality of an edge $e$ is reflected in the increase of the Kirchhoff index of the network when the edge $e$ is partially deactivated, characterized by a parameter $\theta$. We define two equivalent measures of $\theta$-Kirchhoff edge centrality. Both are global metrics and have better discriminating power than commonly used measures based on local or partial structural information of networks, e.g., edge betweenness and spanning edge centrality. Despite the strong advantages of the Kirchhoff index as a centrality measure and its wide applications, computing the exact value of Kirchhoff edge centrality for each edge in a graph is computationally demanding. To solve this problem, for each of the $\theta$-Kirchhoff edge centrality metrics, we present an efficient algorithm to compute its $\epsilon$-approximation for all $m$ edges in time nearly linear in $m$. The proposed $\theta$-Kirchhoff edge centrality is the first global metric of edge importance that can be provably approximated in nearly linear time. Moreover, based on the $\theta$-Kirchhoff edge centrality, we present a $\theta$-Kirchhoff vertex centrality measure, together with a fast algorithm that computes $\epsilon$-approximate Kirchhoff vertex centrality for all $n$ vertices in time nearly linear in $m$.
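
    A brute-force numpy sketch of the quantities defined above: the Kirchhoff index as the sum of effective resistances over all vertex pairs, and an edge's centrality as the increase in the Kirchhoff index when that edge is partially deactivated. Scaling the edge weight by theta is an illustrative reading of "partially deactivated, characterized by a parameter theta"; the paper's exact definition and its nearly linear-time algorithms are not reproduced here.

    import numpy as np
    from itertools import combinations

    def kirchhoff_index(A):
        """Sum of effective resistances over all vertex pairs (graph must be connected)."""
        L = np.diag(A.sum(axis=1)) - A
        Lp = np.linalg.pinv(L)
        return sum(Lp[u, u] + Lp[v, v] - 2 * Lp[u, v]
                   for u, v in combinations(range(A.shape[0]), 2))

    def theta_edge_centrality(A, u, v, theta=0.5):
        """Increase in Kirchhoff index when edge (u, v) is damped by theta (assumed model)."""
        A_damped = A.copy()
        A_damped[u, v] *= theta
        A_damped[v, u] *= theta
        return kirchhoff_index(A_damped) - kirchhoff_index(A)

    # Toy usage on a weighted 4-cycle.
    A = np.array([[0,1,0,2],[1,0,3,0],[0,3,0,1],[2,0,1,0]], dtype=float)
    print(theta_edge_centrality(A, 0, 1))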