Deterministic Approximation of Random Walks in Small Space
We give a deterministic, nearly logarithmic-space algorithm that given an undirected graph G, a positive integer r, and a set S of vertices, approximates the conductance of S in the r-step random walk on G to within a factor of 1+epsilon, where epsilon>0 is an arbitrarily small constant. More generally, our algorithm computes an epsilon-spectral approximation to the normalized Laplacian of the r-step walk.
Our algorithm combines the derandomized square graph operation [Eyal Rozenman and Salil Vadhan, 2005], which we recently used for solving Laplacian systems in nearly logarithmic space [Murtagh et al., 2017], with ideas from [Cheng et al., 2015], which gave an algorithm that is time-efficient (while ours is space-efficient) and randomized (while ours is deterministic) for the case of even r (while ours works for all r). Along the way, we provide some new results that generalize technical machinery and yield improvements over previous work. First, we obtain a nearly linear-time randomized algorithm for computing a spectral approximation to the normalized Laplacian for odd r. Second, we define and analyze a generalization of the derandomized square for irregular graphs and for sparsifying the product of two distinct graphs. As part of this generalization, we also give a strongly explicit construction of expander graphs of every size.
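To make the quantity being approximated concrete, the sketch below computes the conductance of a set S in the r-step random walk exactly, by brute force. It is not the deterministic small-space algorithm from the abstract. It assumes the common definition: the probability that a walk started from the stationary distribution conditioned on S lies outside S after r steps. The function names and the adjacency-matrix input format are illustrative choices, not taken from the paper.

```python
from fractions import Fraction


def transition_matrix(adj):
    """Row-stochastic matrix P = D^{-1} A of the simple random walk."""
    n = len(adj)
    return [[Fraction(adj[u][v], sum(adj[u])) for v in range(n)] for u in range(n)]


def mat_mul(A, B):
    """Dense product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def r_step_conductance(adj, S, r):
    """Assumed definition: Pr[V_r not in S | V_0 in S], with V_0 stationary."""
    n = len(adj)
    P = transition_matrix(adj)
    Pr = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(r):          # Pr becomes P^r
        Pr = mat_mul(Pr, P)
    degs = [sum(row) for row in adj]
    vol_S = sum(degs[u] for u in S)
    # Stationary mass is proportional to degree, so conditioning on S
    # gives each u in S the weight deg(u) / vol(S).
    return sum(Fraction(degs[u], vol_S) * Pr[u][v]
               for u in S for v in range(n) if v not in S)


# Path graph 0 - 1 - 2 with S = {0}.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
print(r_step_conductance(path, {0}, 1))  # 1
print(r_step_conductance(path, {0}, 2))  # 1/2
```

Exact fractions keep the small example free of rounding error. The dense matrix powers use quadratic space, which is exactly the cost the abstract's algorithm is designed to avoid.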
Improved large-scale graph learning through ridge spectral sparsification
The representation and learning benefits of methods based on graph Laplacians, such as Laplacian smoothing or harmonic function solution for semi-supervised learning (SSL), are empirically and theoretically well supported. Nonetheless, the exact versions of these methods scale poorly with the number of nodes n of the graph. In this paper, we combine a spectral sparsification routine with Laplacian learning. Given a graph G as input, our algorithm computes a sparsifier in a distributed way in O(n log^3(n)) time, O(m log^3(n)) work and O(n log(n)) memory, using only log(n) rounds of communication. Furthermore, motivated by the regularization often employed in learning algorithms, we show that constructing sparsifiers that preserve the spectrum of the Laplacian only up to the regularization level may drastically reduce the size of the final graph. By constructing a spectrally-similar graph, we are able to bound the error induced by the sparsification for a variety of downstream tasks (e.g., SSL). We empirically validate the theoretical guarantees on the Amazon co-purchase graph and compare to the state-of-the-art heuristics.
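As an illustration of the "harmonic function solution" mentioned in the abstract, here is a minimal sketch of the exact (unsparsified) version on a tiny graph. It fixes the labeled vertices and solves the Laplacian system restricted to the unlabeled ones. The function names, the dense weight-matrix input, and the exact Gauss-Jordan solver are assumptions made for this example. The paper's contribution, running such a solver on a sparsifier instead of the full graph, is not shown.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination over exact fractions."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def harmonic_solution(weights, labels):
    """weights: symmetric matrix with zero diagonal; labels: {vertex: value}.

    Solves L_uu f_u = W_ul f_l, so every unlabeled vertex ends up with the
    weighted average of its neighbors' values.
    """
    n = len(weights)
    unlabeled = [u for u in range(n) if u not in labels]
    L_uu = [[sum(weights[u]) if u == v else -weights[u][v] for v in unlabeled]
            for u in unlabeled]
    rhs = [sum(weights[u][l] * value for l, value in labels.items())
           for u in unlabeled]
    f = dict(labels)
    f.update(zip(unlabeled, solve(L_uu, rhs)))
    return f


# Path 0 - 1 - 2 - 3 with the two endpoints labeled 0 and 1.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
f = harmonic_solution(W, {0: 0, 3: 1})
print(f[1], f[2])  # 1/3 2/3
```

The dense solve costs cubic time in the number of unlabeled vertices, which is the poor scaling the abstract refers to.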
Linear algebraic techniques in theoretical computer science and population genetics
Thesis (Ph.D.), Massachusetts Institute of Technology, Department of Mathematics, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (pages 149-155). In this thesis, we present several algorithmic results for problems in spectral graph theory and computational biology. The first part concerns the problem of spectral sparsification. It is known that every dense graph can be approximated in a strong sense by a sparse subgraph, known as a spectral sparsifier of the graph. Furthermore, researchers have recently developed efficient algorithms for computing such approximations. We show how to make these algorithms faster, and also give a substantial improvement in space efficiency. Since sparsification is an important first step in speeding up approximation algorithms for many graph problems, our results have numerous applications. In the second part of the thesis, we consider the problem of inferring human population history from genetic data. We give an efficient and principled algorithm for using single nucleotide polymorphism (SNP) data to infer admixture history of various populations, and apply it to show that Europeans have evidence of mixture with ancient Siberians. Finally, we turn to the problem of RNA secondary structure design. In this problem, we want to find RNA sequences that fold to a given secondary structure. We propose a novel global sampling approach, based on the recently developed RNAmutants algorithm, and show that it has numerous desirable properties when compared to existing solutions. Our method can prove useful for developing the next generation of RNA design algorithms. By Alex Levin. Ph.D.
Sampling Random Spanning Trees Faster than Matrix Multiplication
We present an algorithm that, with high probability, generates a random
spanning tree from an edge-weighted undirected graph in
time (The notation hides
factors). The tree is sampled from a distribution
where the probability of each tree is proportional to the product of its edge
weights. This improves upon the previous best algorithm due to Colbourn et al.
that runs in matrix multiplication time, . For the special case of
unweighted graphs, this improves upon the best previously known running time of
for (Colbourn
et al. '96, Kelner-Madry '09, Madry et al. '15).
The effective resistance metric is essential to our algorithm, as in the work
of Madry et al., but we eschew determinant-based and random walk-based
techniques used by previous algorithms. Instead, our algorithm is based on
Gaussian elimination, and the fact that effective resistance is preserved in
the graph resulting from eliminating a subset of vertices (called a Schur
complement). As part of our algorithm, we show how to compute
-approximate effective resistances for a set of vertex pairs via
approximate Schur complements in time,
without using the Johnson-Lindenstrauss lemma which requires time. We
combine this approximation procedure with an error correction procedure for
handling edges where our estimate isn't sufficiently accurate.
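The fact the abstract relies on, that effective resistance is preserved when a subset of vertices is eliminated, can be checked on a small example. The sketch below uses exact dense arithmetic and is not the paper's approximate, fast procedure. All function names and the edge-list format `(u, v, weight)` are assumptions chosen for illustration.

```python
from fractions import Fraction


def laplacian(n, edges):
    """Weighted graph Laplacian from a list of (u, v, weight) edges."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v, w in edges:
        L[u][u] += w
        L[v][v] += w
        L[u][v] -= w
        L[v][u] -= w
    return L


def schur_complement(L, keep):
    """Eliminate every vertex not in `keep`, one Gaussian pivot at a time."""
    M = [row[:] for row in L]
    alive = list(range(len(M)))
    for v in range(len(L)):
        if v in keep:
            continue
        alive.remove(v)
        pivot = M[v][v]
        for i in alive:
            for j in alive:
                M[i][j] -= M[i][v] * M[v][j] / pivot
    return [[M[i][j] for j in keep] for i in keep]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination over exact fractions."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def effective_resistance(L, a, b):
    """Ground vertex b, inject one unit of current at a, read the potential at a."""
    idx = [i for i in range(len(L)) if i != b]
    A = [[L[i][j] for j in idx] for i in idx]
    x = solve(A, [1 if i == a else 0 for i in idx])
    return x[idx.index(a)]


# Path 0 - 1 - 2 with unit weights; eliminate the middle vertex.
L = laplacian(3, [(0, 1, 1), (1, 2, 1)])
S = schur_complement(L, [0, 2])
print(effective_resistance(L, 0, 2), effective_resistance(S, 0, 1))  # 2 2
```

In the Schur complement the two unit edges in series collapse into a single edge of weight 1/2, so the resistance between the endpoints is 2 in both graphs.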
Kirchhoff Index As a Measure of Edge Centrality in Weighted Networks: Nearly Linear Time Algorithms
Most previous work on centralities focuses on metrics of vertex importance
and methods for identifying powerful vertices, while related work for edges is
much scarcer, especially for weighted networks, due to the computational
challenge. In this paper, we propose to use the well-known Kirchhoff index as
the measure of edge centrality in weighted networks, called -Kirchhoff
edge centrality. The Kirchhoff index of a network is defined as the sum of
effective resistances over all vertex pairs. The centrality of an edge is
reflected in the increase of Kirchhoff index of the network when the edge
is partially deactivated, characterized by a parameter . We define two
equivalent measures for -Kirchhoff edge centrality. Both are global
metrics and have a better discriminating power than commonly used measures,
based on local or partial structural information of networks, e.g. edge
betweenness and spanning edge centrality.
Despite the strong advantages of the Kirchhoff index as a centrality measure and
its wide applications, computing the exact value of Kirchhoff edge centrality
for each edge in a graph is computationally demanding. To solve this problem,
for each of the -Kirchhoff edge centrality metrics, we present an
efficient algorithm to compute its -approximation for all the
edges in nearly linear time in . The proposed -Kirchhoff edge
centrality is the first global metric of edge importance that can be provably
approximated in nearly-linear time. Moreover, according to the
-Kirchhoff edge centrality, we present a -Kirchhoff vertex
centrality measure, as well as a fast algorithm that can compute
-approximate Kirchhoff vertex centrality for all the vertices in
nearly linear time in
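The definition in the abstract (Kirchhoff index as the sum of effective resistances over all vertex pairs, and edge centrality as the increase in that index when an edge is partially deactivated) can be sketched by brute force as follows. This is an exact, slow computation on a toy graph, not the nearly-linear-time approximation from the paper. The parameter name `theta` is hypothetical, and modeling partial deactivation as multiplying the edge weight by `theta` is an assumption about the paper's definition. The function names and edge-list format are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def laplacian(n, edges):
    """Weighted graph Laplacian from a list of (u, v, weight) edges."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v, w in edges:
        L[u][u] += w
        L[v][v] += w
        L[u][v] -= w
        L[v][u] -= w
    return L


def effective_resistance(L, a, b):
    """Eliminate all vertices except a and b; the remaining single edge has
    weight w and the resistance is 1 / w. Assumes a connected graph."""
    n = len(L)
    M = [row[:] for row in L]
    for v in range(n):
        if v in (a, b):
            continue
        pivot = M[v][v]
        for i in range(n):
            for j in range(n):
                if i != v and j != v:
                    M[i][j] -= M[i][v] * M[v][j] / pivot
        for k in range(n):      # clear the eliminated row and column
            M[v][k] = M[k][v] = Fraction(0)
    return 1 / -M[a][b]


def kirchhoff_index(n, edges):
    """Sum of effective resistances over all unordered vertex pairs."""
    L = laplacian(n, edges)
    return sum(effective_resistance(L, a, b)
               for a, b in combinations(range(n), 2))


def edge_centrality(n, edges, e, theta):
    """Increase in the Kirchhoff index when edge number e has its weight
    scaled by theta (assumed model of partial deactivation)."""
    scaled = list(edges)
    u, v, w = scaled[e]
    scaled[e] = (u, v, theta * w)
    return kirchhoff_index(n, scaled) - kirchhoff_index(n, edges)


# Path 0 - 1 - 2 with unit weights.
edges = [(0, 1, 1), (1, 2, 1)]
print(kirchhoff_index(3, edges))                      # 4
print(edge_centrality(3, edges, 0, Fraction(1, 2)))   # 2
```

Halving the weight of edge (0, 1) doubles its resistance from 1 to 2, which raises the resistances of pairs (0, 1) and (0, 2) by 1 each and so increases the index from 4 to 6.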