    Probabilistic Spectral Sparsification In Sublinear Time

    In this paper, we introduce a variant of spectral sparsification, called probabilistic (\varepsilon,\delta)-spectral sparsification. Roughly speaking, it preserves the value of any cut (S,S^{c}) up to a 1\pm\varepsilon multiplicative error and a \delta\left|S\right| additive error. We show how to produce a probabilistic (\varepsilon,\delta)-spectral sparsifier with O(n\log n/\varepsilon^{2}) edges in \tilde{O}(n/\varepsilon^{2}\delta) time for unweighted undirected graphs. This gives the fastest known sublinear-time algorithms for several cut problems on unweighted undirected graphs, such as:
    - An \tilde{O}(n/OPT+n^{3/2+t}) time O(\sqrt{\log n/t})-approximation algorithm for the sparsest cut problem and the balanced separator problem.
    - An n^{1+o(1)}/\varepsilon^{4} time approximate minimum s-t cut algorithm with an \varepsilon n additive error.
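    The cut guarantee stated above can be made concrete with a small check. The following is a minimal sketch, assuming the guarantee is read as |cut_H(S) - cut_G(S)| <= \varepsilon cut_G(S) + \delta|S| for a given vertex set S; the abstract only says "roughly speaking", so the paper's precise definition may differ. The function names and the toy graphs are illustrative and do not come from the paper.

```python
def cut_value(edges, S):
    """Total weight of edges with exactly one endpoint in S."""
    S = set(S)
    return sum(w for u, v, w in edges if (u in S) != (v in S))


def satisfies_cut_bound(g_edges, h_edges, S, eps, delta):
    """Check |cut_H(S) - cut_G(S)| <= eps * cut_G(S) + delta * |S| (assumed reading)."""
    g = cut_value(g_edges, S)
    h = cut_value(h_edges, S)
    return abs(h - g) <= eps * g + delta * len(S)


# Toy data: G is an unweighted 4-cycle; H keeps two opposite edges with weight 2.
G = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]
H = [(0, 1, 2.0), (2, 3, 2.0)]

for S in ({0}, {0, 1}, {0, 3}):
    print(sorted(S), cut_value(G, S), cut_value(H, S),
          satisfies_cut_bound(G, H, S, eps=0.5, delta=0.5))
```

    The check only evaluates one cut at a time; it says nothing about how a sparsifier is built, and the toy graph H is hand-picked rather than produced by any sampling procedure.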

    Minimum Cuts in Near-Linear Time

    We significantly improve known time bounds for solving the minimum cut problem on undirected graphs. We use a "semi-duality" between minimum cuts and maximum spanning tree packings, combined with our previously developed random sampling techniques. We give a randomized algorithm that finds a minimum cut in an m-edge, n-vertex graph with high probability in O(m log^3 n) time. We also give a simpler randomized algorithm that finds all minimum cuts with high probability in O(n^2 log n) time. This variant has an optimal RNC parallelization. Both variants improve on the previous best time bound of O(n^2 log^3 n). Other applications of the tree-packing approach are new, nearly tight bounds on the number of near-minimum cuts a graph may have and a new data structure for representing them in a space-efficient manner.
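    For orientation, here is a small randomized minimum cut baseline. It is not the tree-packing algorithm of this abstract: it is the classical random contraction technique, implemented by scanning the edges in random order with a union-find structure, and it is far slower than the bounds quoted above. The function names, the number of trials, and the example graph are assumptions made for illustration; the graph is assumed to be connected and unweighted.

```python
import random


def contract_once(n, edges, rng):
    """One run of random contraction; returns the size of the cut it ends with."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    order = edges[:]
    rng.shuffle(order)  # random edge order stands in for repeated random contractions
    for u, v in order:
        if components == 2:
            break
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    # Edges whose endpoints lie in different super-vertices form the cut.
    return sum(1 for u, v in edges if find(u) != find(v))


def min_cut(n, edges, trials=200, seed=0):
    """Smallest cut seen over several independent contraction runs."""
    rng = random.Random(seed)
    return min(contract_once(n, edges, rng) for _ in range(trials))


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(min_cut(6, edges))
```

    Each run returns a valid cut, so the minimum over runs never underestimates the true minimum cut; it can only overestimate it if every run is unlucky.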

    Electrical Flows, Laplacian Systems, and Faster Approximation of Maximum Flow in Undirected Graphs

    We introduce a new approach to computing an approximately maximum s-t flow in a capacitated, undirected graph. This flow is computed by solving a sequence of electrical flow problems. Each electrical flow is given by the solution of a system of linear equations in a Laplacian matrix, and thus may be approximately computed in nearly-linear time. Using this approach, we develop the fastest known algorithm for computing approximately maximum s-t flows. For a graph having n vertices and m edges, our algorithm computes a (1-\epsilon)-approximately maximum s-t flow in time \tilde{O}(mn^{1/3} \epsilon^{-11/3}). A dual version of our approach computes a (1+\epsilon)-approximately minimum s-t cut in time \tilde{O}(m+n^{4/3}\epsilon^{-8/3}), which is the fastest known algorithm for this problem as well. Previously, the best dependence on m and n was achieved by the algorithm of Goldberg and Rao (J. ACM 1998), which can be used to compute approximately maximum s-t flows in time \tilde{O}(m\sqrt{n}\epsilon^{-1}), and approximately minimum s-t cuts in time \tilde{O}(m+n^{3/2}\epsilon^{-3}).
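    A single electrical flow computation of the kind mentioned above can be sketched as follows. The sketch assumes each edge carries a resistance, grounds the sink t, and solves the reduced Laplacian system exactly with dense Gauss-Jordan elimination; that solver is a stand-in chosen for readability, not the nearly-linear-time Laplacian solver the abstract relies on. How the sequence of electrical flow problems is chosen is not shown. All names and the example graph are illustrative.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination over exact fractions."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def electrical_flow(n, edges, s, t):
    """edges: (u, v, resistance). Returns potentials and per-edge flows
    when one unit of current enters at s and leaves at t."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v, r in edges:
        c = 1 / Fraction(r)  # conductance of the edge
        L[u][u] += c
        L[v][v] += c
        L[u][v] -= c
        L[v][u] -= c
    keep = [i for i in range(n) if i != t]  # ground the sink: phi[t] = 0
    A = [[L[i][j] for j in keep] for i in keep]
    b = [Fraction(1) if i == s else Fraction(0) for i in keep]
    phi = [Fraction(0)] * n
    for i, value in zip(keep, solve(A, b)):
        phi[i] = value
    # Ohm's law: flow on (u, v) is the potential drop divided by the resistance.
    flows = [(phi[u] - phi[v]) / Fraction(r) for u, v, r in edges]
    return phi, flows


# Triangle with unit resistances: a direct edge 0-2 and a two-edge path 0-1-2.
phi, flows = electrical_flow(3, [(0, 1, 1), (1, 2, 1), (0, 2, 1)], s=0, t=2)
print(phi, flows)
```

    In the example, the direct edge has half the resistance of the two-edge path, so it carries twice the current; the potential at s equals the effective resistance between s and t.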

    Graph Sample and Hold: A Framework for Big-Graph Analytics

    Sampling is a standard approach in big-graph analytics; the goal is to efficiently estimate graph properties by consulting a sample of the whole population. A perfect sample is assumed to mirror every property of the whole population. Unfortunately, such a perfect sample is hard to collect in complex populations such as graphs (e.g., web graphs, social networks, etc.), where an underlying network connects the units of the population. Therefore, a good sample will be representative in the sense that graph properties of interest can be estimated with a known degree of accuracy. While previous work focused particularly on sampling schemes used to estimate certain graph properties (e.g., triangle count), much less is known for the case when we need to estimate various graph properties with the same sampling scheme. In this paper, we propose a generic stream sampling framework for big-graph analytics, called Graph Sample and Hold (gSH). To begin, the proposed framework samples from massive graphs sequentially in a single pass, one edge at a time, while maintaining a small state. We then show how to produce unbiased estimators for various graph properties from the sample. Given that the graph analysis algorithms will run on a sample instead of the whole population, the runtime complexity of these algorithms is kept under control. Moreover, given that the estimators of graph properties are unbiased, the approximation error is kept under control. Finally, we show the performance of the proposed framework (gSH) on various types of graphs, such as social graphs, among others.
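    The single-pass sampling idea can be illustrated with a short sketch. The sampling rule used here is an assumption made for illustration, not a rule taken from the abstract: an arriving edge is kept with probability q if it touches an already kept edge and with probability p otherwise. Each kept edge stores the probability with which it was kept, and counts are estimated by weighting with inverse probabilities (a Horvitz-Thompson style estimator). All function names and parameters are hypothetical, and the stream is assumed to contain no duplicate edges.

```python
import random


def sample_and_hold(edge_stream, p, q, rng):
    """Single pass over the stream: keep an edge with probability q if it
    touches an already kept edge, otherwise with probability p."""
    kept = {}        # (u, v) with u < v  ->  probability with which it was kept
    touched = set()  # endpoints of kept edges
    for u, v in edge_stream:
        prob = q if (u in touched or v in touched) else p
        if rng.random() < prob:
            kept[(min(u, v), max(u, v))] = prob
            touched.update((u, v))
    return kept


def estimate_edge_count(kept):
    """Each kept edge contributes the inverse of its sampling probability."""
    return sum(1.0 / prob for prob in kept.values())


def estimate_triangle_count(kept):
    """Each fully sampled triangle contributes the inverse of the product
    of its three edge sampling probabilities."""
    adj = {}
    for u, v in kept:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = 0.0
    for (u, v), puv in kept.items():
        for w in adj[u] & adj[v]:
            if w > v:  # count each triangle once, as u < v < w
                total += 1.0 / (puv * kept[(u, w)] * kept[(v, w)])
    return total


# Complete graph on 4 vertices: 6 edges and 4 triangles.
stream = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
sample = sample_and_hold(stream, p=0.5, q=0.9, rng=random.Random(7))
print(estimate_edge_count(sample), estimate_triangle_count(sample))
```

    With p = q = 1 every edge is kept and both estimators return the exact counts, which is a convenient sanity check. For smaller probabilities, a single run can be far from the truth; unbiasedness only concerns the average over many runs.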