2,311 research outputs found

    Tree Contraction, Connected Components, Minimum Spanning Trees: a GPU Path to Vertex Fitting

    Get PDF
    Standard parallel computing operations are considered in the context of algorithms for solving 3D graph problems which have applications, e.g., in vertex finding in HEP. Exploiting GPUs for tree-accumulation and graph algorithms is challenging: GPUs offer extreme computational power and high memory-access bandwidth, combined with a model of fine-grained parallelism perhaps not suiting the irregular distribution of linked representations of graph data structures. Achieving data-race free computations may demand serialization through atomic transactions, inevitably producing poor parallel performance. A Minimum Spanning Tree algorithm for GPUs is presented, its implementation discussed, and its efficiency evaluated on GPU and multicore architectures

    Near-linear Time Algorithm for Approximate Minimum Degree Spanning Trees

    Full text link
    Given a graph G=(V,E)G = (V, E), we wish to compute a spanning tree whose maximum vertex degree, i.e. tree degree, is as small as possible. Computing the exact optimal solution is known to be NP-hard, since it generalizes the Hamiltonian path problem. For the approximation version of this problem, a O~(mn)\tilde{O}(mn) time algorithm that computes a spanning tree of degree at most Δ+1\Delta^* +1 is previously known [F\"urer \& Raghavachari 1994]; here Δ\Delta^* denotes the minimum tree degree of all the spanning trees. In this paper we give the first near-linear time approximation algorithm for this problem. Specifically speaking, we propose an O~(1ϵ7m)\tilde{O}(\frac{1}{\epsilon^7}m) time algorithm that computes a spanning tree with tree degree (1+ϵ)Δ+O(1ϵ2logn)(1+\epsilon)\Delta^* + O(\frac{1}{\epsilon^2}\log n) for any constant ϵ(0,16)\epsilon \in (0,\frac{1}{6}). Thus, when Δ=ω(logn)\Delta^*=\omega(\log n), we can achieve approximate solutions with constant approximate ratio arbitrarily close to 1 in near-linear time.Comment: 17 page

    Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

    Full text link
    There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even the largest publicly-available real-world graph (the Hyperlink Web graph with over 3.5 billion vertices and 128 billion edges) can fit in the memory of a single commodity multicore server. Nevertheless, most experimental work in the literature report results on much smaller graphs, and the ones for the Hyperlink graph use distributed or external memory. Therefore, it is natural to ask whether we can efficiently solve a broad class of graph problems on this graph in memory. This paper shows that theoretically-efficient parallel graph algorithms can scale to the largest publicly-available graphs using a single machine with a terabyte of RAM, processing them in minutes. We give implementations of theoretically-efficient parallel algorithms for 20 important graph problems. We also present the optimizations and techniques that we used in our implementations, which were crucial in enabling us to process these large graphs quickly. We show that the running times of our implementations outperform existing state-of-the-art implementations on the largest real-world graphs. For many of the problems that we consider, this is the first time they have been solved on graphs at this scale. We have made the implementations developed in this work publicly-available as the Graph-Based Benchmark Suite (GBBS).Comment: This is the full version of the paper appearing in the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 201

    Spanning Trees with Many Leaves in Graphs without Diamonds and Blossoms

    Full text link
    It is known that graphs on n vertices with minimum degree at least 3 have spanning trees with at least n/4+2 leaves and that this can be improved to (n+4)/3 for cubic graphs without the diamond K_4-e as a subgraph. We generalize the second result by proving that every graph with minimum degree at least 3, without diamonds and certain subgraphs called blossoms, has a spanning tree with at least (n+4)/3 leaves, and generalize this further by allowing vertices of lower degree. We show that it is necessary to exclude blossoms in order to obtain a bound of the form n/3+c. We use the new bound to obtain a simple FPT algorithm, which decides in O(m)+O^*(6.75^k) time whether a graph of size m has a spanning tree with at least k leaves. This improves the best known time complexity for MAX LEAF SPANNING TREE.Comment: 25 pages, 27 Figure

    Near Optimal Parallel Algorithms for Dynamic DFS in Undirected Graphs

    Full text link
    Depth first search (DFS) tree is a fundamental data structure for solving graph problems. The classical algorithm [SiComp74] for building a DFS tree requires O(m+n)O(m+n) time for a given graph GG having nn vertices and mm edges. Recently, Baswana et al. [SODA16] presented a simple algorithm for updating DFS tree of an undirected graph after an edge/vertex update in O~(n)\tilde{O}(n) time. However, their algorithm is strictly sequential. We present an algorithm achieving similar bounds, that can be adopted easily to the parallel environment. In the parallel model, a DFS tree can be computed from scratch using mm processors in expected O~(1)\tilde{O}(1) time [SiComp90] on an EREW PRAM, whereas the best deterministic algorithm takes O~(n)\tilde{O}(\sqrt{n}) time [SiComp90,JAlg93] on a CRCW PRAM. Our algorithm can be used to develop optimal (upto polylog n factors deterministic algorithms for maintaining fully dynamic DFS and fault tolerant DFS, of an undirected graph. 1- Parallel Fully Dynamic DFS: Given an arbitrary online sequence of vertex/edge updates, we can maintain a DFS tree of an undirected graph in O~(1)\tilde{O}(1) time per update using mm processors on an EREW PRAM. 2- Parallel Fault tolerant DFS: An undirected graph can be preprocessed to build a data structure of size O(m) such that for a set of kk updates (where kk is constant) in the graph, the updated DFS tree can be computed in O~(1)\tilde{O}(1) time using nn processors on an EREW PRAM. Moreover, our fully dynamic DFS algorithm provides, in a seamless manner, nearly optimal (upto polylog n factors) algorithms for maintaining a DFS tree in semi-streaming model and a restricted distributed model. These are the first parallel, semi-streaming and distributed algorithms for maintaining a DFS tree in the dynamic setting.Comment: Accepted to appear in SPAA'17, 32 Pages, 5 Figure
    corecore