10,180 research outputs found

    Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

    Full text link
    There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even the largest publicly-available real-world graph (the Hyperlink Web graph with over 3.5 billion vertices and 128 billion edges) can fit in the memory of a single commodity multicore server. Nevertheless, most experimental work in the literature report results on much smaller graphs, and the ones for the Hyperlink graph use distributed or external memory. Therefore, it is natural to ask whether we can efficiently solve a broad class of graph problems on this graph in memory. This paper shows that theoretically-efficient parallel graph algorithms can scale to the largest publicly-available graphs using a single machine with a terabyte of RAM, processing them in minutes. We give implementations of theoretically-efficient parallel algorithms for 20 important graph problems. We also present the optimizations and techniques that we used in our implementations, which were crucial in enabling us to process these large graphs quickly. We show that the running times of our implementations outperform existing state-of-the-art implementations on the largest real-world graphs. For many of the problems that we consider, this is the first time they have been solved on graphs at this scale. We have made the implementations developed in this work publicly-available as the Graph-Based Benchmark Suite (GBBS).Comment: This is the full version of the paper appearing in the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 201

    Deterministic Approximation of Random Walks in Small Space

    Get PDF
    We give a deterministic, nearly logarithmic-space algorithm that given an undirected graph G, a positive integer r, and a set S of vertices, approximates the conductance of S in the r-step random walk on G to within a factor of 1+epsilon, where epsilon>0 is an arbitrarily small constant. More generally, our algorithm computes an epsilon-spectral approximation to the normalized Laplacian of the r-step walk. Our algorithm combines the derandomized square graph operation [Eyal Rozenman and Salil Vadhan, 2005], which we recently used for solving Laplacian systems in nearly logarithmic space [Murtagh et al., 2017], with ideas from [Cheng et al., 2015], which gave an algorithm that is time-efficient (while ours is space-efficient) and randomized (while ours is deterministic) for the case of even r (while ours works for all r). Along the way, we provide some new results that generalize technical machinery and yield improvements over previous work. First, we obtain a nearly linear-time randomized algorithm for computing a spectral approximation to the normalized Laplacian for odd r. Second, we define and analyze a generalization of the derandomized square for irregular graphs and for sparsifying the product of two distinct graphs. As part of this generalization, we also give a strongly explicit construction of expander graphs of every size

    Algorithms and Lower Bounds for Cycles and Walks: Small Space and Sparse Graphs

    Get PDF

    An Improved Algorithm for Incremental DFS Tree in Undirected Graphs

    Get PDF
    Depth first search (DFS) tree is one of the most well-known data structures for designing efficient graph algorithms. Given an undirected graph G=(V,E)G=(V,E) with nn vertices and mm edges, the textbook algorithm takes O(n+m)O(n+m) time to construct a DFS tree. In this paper, we study the problem of maintaining a DFS tree when the graph is undergoing incremental updates. Formally, we show: Given an arbitrary online sequence of edge or vertex insertions, there is an algorithm that reports a DFS tree in O(n)O(n) worst case time per operation, and requires O(min{mlogn,n2})O\left(\min\{m \log n, n^2\}\right) preprocessing time. Our result improves the previous O(nlog3n)O(n \log^3 n) worst case update time algorithm by Baswana et al. and the O(nlogn)O(n \log n) time by Nakamura and Sadakane, and matches the trivial Ω(n)\Omega(n) lower bound when it is required to explicitly output a DFS tree. Our result builds on the framework introduced in the breakthrough work by Baswana et al., together with a novel use of a tree-partition lemma by Duan and Zhan, and the celebrated fractional cascading technique by Chazelle and Guibas
    corecore