1,350 research outputs found

    Streaming Complexity of Spanning Tree Computation

    Get PDF
    The semi-streaming model is a variant of the streaming model frequently used for the computation of graph problems. It allows the edges of an n-node input graph to be read sequentially in p passes using OĢƒ(n) space. If the list of edges includes deletions, then the model is called the turnstile model; otherwise it is called the insertion-only model. In both models, some graph problems, such as spanning trees, k-connectivity, densest subgraph, degeneracy, cut-sparsifier, and (Ī”+1)-coloring, can be exactly solved or (1+Īµ)-approximated in a single pass; while other graph problems, such as triangle detection and unweighted all-pairs shortest paths, are known to require Ī©Ģƒ(n) passes to compute. For many fundamental graph problems, the tractability in these models is open. In this paper, we study the tractability of computing some standard spanning trees, including BFS, DFS, and maximum-leaf spanning trees. Our results, in both the insertion-only and the turnstile models, are as follows. Maximum-Leaf Spanning Trees: This problem is known to be APX-complete with inapproximability constant Ļ āˆˆ [245/244, 2). By constructing an Īµ-MLST sparsifier, we show that for every constant Īµ > 0, MLST can be approximated in a single pass to within a factor of 1+Īµ w.h.p. (albeit in super-polynomial time for Īµ ā‰¤ Ļ-1 assuming P ā‰  NP) and can be approximated in polynomial time in a single pass to within a factor of Ļ_n+Īµ w.h.p., where Ļ_n is the supremum constant that MLST cannot be approximated to within using polynomial time and OĢƒ(n) space. In the insertion-only model, these algorithms can be deterministic. BFS Trees: It is known that BFS trees require Ļ‰(1) passes to compute, but the naĆÆve approach needs O(n) passes. We devise a new randomized algorithm that reduces the pass complexity to O(āˆšn), and it offers a smooth tradeoff between pass complexity and space usage. This gives a polynomial separation between single-source and all-pairs shortest paths for unweighted graphs. DFS Trees: It is unknown whether DFS trees require more than one pass. The current best algorithm by Khan and Mehta [STACS 2019] takes OĢƒ(h) passes, where h is the height of computed DFS trees. Note that h can be as large as Ī©(m/n) for n-node m-edge graphs. Our contribution is twofold. First, we provide a simple alternative proof of this result, via a new connection to sparse certificates for k-node-connectivity. Second, we present a randomized algorithm that reduces the pass complexity to O(āˆšn), and it also offers a smooth tradeoff between pass complexity and space usage.ISSN:1868-896

    Low Diameter Graph Decompositions by Approximate Distance Computation

    Get PDF
    In many models for large-scale computation, decomposition of the problem is key to efficient algorithms. For distance-related graph problems, it is often crucial that such a decomposition results in clusters of small diameter, while the probability that an edge is cut by the decomposition scales linearly with the length of the edge. There is a large body of literature on low diameter graph decomposition with small edge cutting probabilities, with all existing techniques heavily building on single source shortest paths (SSSP) computations. Unfortunately, in many theoretical models for large-scale computations, the SSSP task constitutes a complexity bottleneck. Therefore, it is desirable to replace exact SSSP computations with approximate ones. However this imposes a fundamental challenge since the existing constructions of low diameter graph decomposition with small edge cutting probabilities inherently rely on the subtractive form of the triangle inequality, which fails to hold under distance approximation. The current paper overcomes this obstacle by developing a technique termed blurry ball growing. By combining this technique with a clever algorithmic idea of Miller et al. (SPAA 2013), we obtain a construction of low diameter decompositions with small edge cutting probabilities which replaces exact SSSP computations by (a small number of) approximate ones. The utility of our approach is showcased by deriving efficient algorithms that work in the CONGEST, PRAM, and semi-streaming models of computation. As an application, we obtain metric tree embedding algorithms in the vein of Bartal (FOCS 1996) whose computational complexities in these models are optimal up to polylogarithmic factors. Our embeddings have the additional useful property that the tree can be mapped back to the original graph such that each edge is "used" only logaritmically many times, which is of interest for capacitated problems and simulating CONGEST algorithms on the tree into which the graph is embedded

    Parallel Algorithms for Geometric Graph Problems

    Full text link
    We give algorithms for geometric graph problems in the modern parallel models inspired by MapReduce. For example, for the Minimum Spanning Tree (MST) problem over a set of points in the two-dimensional space, our algorithm computes a (1+Ļµ)(1+\epsilon)-approximate MST. Our algorithms work in a constant number of rounds of communication, while using total space and communication proportional to the size of the data (linear space and near linear time algorithms). In contrast, for general graphs, achieving the same result for MST (or even connectivity) remains a challenging open problem, despite drawing significant attention in recent years. We develop a general algorithmic framework that, besides MST, also applies to Earth-Mover Distance (EMD) and the transportation cost problem. Our algorithmic framework has implications beyond the MapReduce model. For example it yields a new algorithm for computing EMD cost in the plane in near-linear time, n1+oĻµ(1)n^{1+o_\epsilon(1)}. We note that while recently Sharathkumar and Agarwal developed a near-linear time algorithm for (1+Ļµ)(1+\epsilon)-approximating EMD, our algorithm is fundamentally different, and, for example, also solves the transportation (cost) problem, raised as an open question in their work. Furthermore, our algorithm immediately gives a (1+Ļµ)(1+\epsilon)-approximation algorithm with nĪ“n^{\delta} space in the streaming-with-sorting model with 1/Ī“O(1)1/\delta^{O(1)} passes. As such, it is tempting to conjecture that the parallel models may also constitute a concrete playground in the quest for efficient algorithms for EMD (and other similar problems) in the vanilla streaming model, a well-known open problem

    Near Optimal Parallel Algorithms for Dynamic DFS in Undirected Graphs

    Full text link
    Depth first search (DFS) tree is a fundamental data structure for solving graph problems. The classical algorithm [SiComp74] for building a DFS tree requires O(m+n)O(m+n) time for a given graph GG having nn vertices and mm edges. Recently, Baswana et al. [SODA16] presented a simple algorithm for updating DFS tree of an undirected graph after an edge/vertex update in O~(n)\tilde{O}(n) time. However, their algorithm is strictly sequential. We present an algorithm achieving similar bounds, that can be adopted easily to the parallel environment. In the parallel model, a DFS tree can be computed from scratch using mm processors in expected O~(1)\tilde{O}(1) time [SiComp90] on an EREW PRAM, whereas the best deterministic algorithm takes O~(n)\tilde{O}(\sqrt{n}) time [SiComp90,JAlg93] on a CRCW PRAM. Our algorithm can be used to develop optimal (upto polylog n factors deterministic algorithms for maintaining fully dynamic DFS and fault tolerant DFS, of an undirected graph. 1- Parallel Fully Dynamic DFS: Given an arbitrary online sequence of vertex/edge updates, we can maintain a DFS tree of an undirected graph in O~(1)\tilde{O}(1) time per update using mm processors on an EREW PRAM. 2- Parallel Fault tolerant DFS: An undirected graph can be preprocessed to build a data structure of size O(m) such that for a set of kk updates (where kk is constant) in the graph, the updated DFS tree can be computed in O~(1)\tilde{O}(1) time using nn processors on an EREW PRAM. Moreover, our fully dynamic DFS algorithm provides, in a seamless manner, nearly optimal (upto polylog n factors) algorithms for maintaining a DFS tree in semi-streaming model and a restricted distributed model. These are the first parallel, semi-streaming and distributed algorithms for maintaining a DFS tree in the dynamic setting.Comment: Accepted to appear in SPAA'17, 32 Pages, 5 Figure

    Equivalence Classes and Conditional Hardness in Massively Parallel Computations

    Get PDF
    The Massively Parallel Computation (MPC) model serves as a common abstraction of many modern large-scale data processing frameworks, and has been receiving increasingly more attention over the past few years, especially in the context of classical graph problems. So far, the only way to argue lower bounds for this model is to condition on conjectures about the hardness of some specific problems, such as graph connectivity on promise graphs that are either one cycle or two cycles, usually called the one cycle vs. two cycles problem. This is unlike the traditional arguments based on conjectures about complexity classes (e.g., P ? NP), which are often more robust in the sense that refuting them would lead to groundbreaking algorithms for a whole bunch of problems. In this paper we present connections between problems and classes of problems that allow the latter type of arguments. These connections concern the class of problems solvable in a sublogarithmic amount of rounds in the MPC model, denoted by MPC(o(log N)), and some standard classes concerning space complexity, namely L and NL, and suggest conjectures that are robust in the sense that refuting them would lead to many surprisingly fast new algorithms in the MPC model. We also obtain new conditional lower bounds, and prove new reductions and equivalences between problems in the MPC model

    Approximate F_2-Sketching of Valuation Functions

    Get PDF
    We study the problem of constructing a linear sketch of minimum dimension that allows approximation of a given real-valued function f : F_2^n - > R with small expected squared error. We develop a general theory of linear sketching for such functions through which we analyze their dimension for most commonly studied types of valuation functions: additive, budget-additive, coverage, alpha-Lipschitz submodular and matroid rank functions. This gives a characterization of how many bits of information have to be stored about the input x so that one can compute f under additive updates to its coordinates. Our results are tight in most cases and we also give extensions to the distributional version of the problem where the input x in F_2^n is generated uniformly at random. Using known connections with dynamic streaming algorithms, both upper and lower bounds on dimension obtained in our work extend to the space complexity of algorithms evaluating f(x) under long sequences of additive updates to the input x presented as a stream. Similar results hold for simultaneous communication in a distributed setting
    • ā€¦
    corecore