
    A parallel priority queue with fast updates for GPU architectures

    The high computational throughput of modern graphics processing units (GPUs) makes them the de facto architecture for high-performance computing applications. However, to achieve peak performance, GPUs require highly parallel workloads, as well as memory access patterns that exhibit good locality of reference. As a result, many state-of-the-art algorithms and data structures designed for GPUs sacrifice work-optimality to achieve the necessary parallelism. Furthermore, some abstract data types are avoided completely because no corresponding data structure performs well on the GPU. One such abstract data type is the priority queue. Many well-known algorithms rely on priority queue operations as a building block. While various priority queue structures have been developed that are parallel, cache-aware, or cache-oblivious, none has been shown to be efficient on GPUs. In this paper, we present the parBucketHeap, a parallel, cache-efficient data structure designed for modern GPU architectures that supports standard priority queue operations, as well as bulk update. We analyze the structure in several well-known computational models and show that it provides optimal parallelism and is cache-efficient. We implement the parBucketHeap and use it to solve the single-source shortest path (SSSP) problem. Experimental results indicate that, for sufficiently large, dense graphs with high diameter, we outperform current state-of-the-art SSSP algorithms on the GPU by up to a factor of 5. Unlike existing GPU SSSP algorithms, our approach is work-optimal and places significantly less load on the GPU, reducing power consumption.

    A Scalable Distributed Parallel Breadth-First Search Algorithm on BlueGene/L.

    Many emerging large-scale data science applications require searching large graphs distributed across multiple memories and processors. This paper presents a distributed breadth-first search (BFS) scheme that scales for random graphs with up to three billion vertices and 30 billion edges. Scalability was tested on IBM BlueGene/L with 32,768 nodes at the Lawrence Livermore National Laboratory. Scalability was obtained through a series of optimizations, in particular those that ensure scalable use of memory. We use 2D (edge) partitioning of the graph instead of conventional 1D (vertex) partitioning to reduce communication overhead. For Poisson random graphs, we show that the expected size of the messages is scalable for both 2D and 1D partitionings. Finally, we have developed efficient collective communication functions for the 3D torus architecture of BlueGene/L that also take advantage of the structure in the problem. The performance and characteristics of the algorithm are measured and reported.

    Design and analysis of sequential and parallel single-source shortest-paths algorithms

    We study the performance of algorithms for the Single-Source Shortest-Paths (SSSP) problem on graphs with n nodes and m edges with nonnegative random weights. All previously known SSSP algorithms for directed graphs required superlinear time. We give the first SSSP algorithms that provably achieve linear O(n+m) average-case execution time on arbitrary directed graphs with random edge weights. For independent edge weights, the linear-time bound holds with high probability, too. Additionally, our result implies improved average-case bounds for the All-Pairs Shortest-Paths (APSP) problem on sparse graphs, and it yields the first theoretical average-case analysis for the "Approximate Bucket Implementation" of Dijkstra's SSSP algorithm (ABI-Dijkstra). Furthermore, we give constructive proofs for the existence of graph classes with random edge weights on which ABI-Dijkstra and several other well-known SSSP algorithms require superlinear average-case time. Besides the classical sequential (single processor) model of computation, we also consider parallel computing: we give the currently fastest average-case linear-work parallel SSSP algorithms for large graph classes with random edge weights, e.g., sparse random graphs and graphs modeling the WWW, telephone calls, or social networks.

    On the Design, Analysis, and Implementation of Algorithms for Selected Problems in Graphs and Networks

    This thesis studies three problems in network optimization, viz., the minimum spanning tree verification (MSTV) problem, the undirected negative cost cycle detection (UNCCD) problem, and the negative cost girth (NCG) problem. These problems find applications in several domains including program verification, proof theory, real-time scheduling, social networking, and operations research. The MSTV problem is defined as follows: Given an undirected graph G = (V,E) and a spanning tree T, is T a minimum spanning tree of G? We focus on the case where the number of distinct edge weights is bounded. Using a bucketed data structure to organize the edge weights, we present an efficient algorithm for the MSTV problem, which runs in O(|E| + |V| · K) time, where K is the number of distinct edge weights. When K is a fixed constant, this algorithm runs in linear time. We also profile our MSTV algorithm against the current fastest known MSTV implementation. Our results demonstrate the superiority of our algorithm when K ≤ 24. The UNCCD problem is defined as follows: Given an undirected graph G = (V,E) with arbitrarily weighted edges, does G contain a negative cost cycle? We discuss two polynomial time algorithms for solving the UNCCD problem: the b-matching approach and the T-join approach. We obtain new results for the case where the edge costs are integers in the range {−K, …, K}, where K is a positive constant. We also provide the first extensive empirical study that profiles the discussed UNCCD algorithms for various graph types, sizes, and experiments. The NCG problem is defined as follows: Given a directed graph G = (V,E) with arbitrarily weighted edges, find the length, or number of edges, of the negative cost cycle having the least number of edges. We discuss three strongly polynomial NCG algorithms. The first NCG algorithm is known as the matrix multiplication approach in the literature. We present two new NCG algorithms that are asymptotically and empirically superior to the matrix multiplication approach for sparse graphs. We also provide a parallel implementation of the matrix multiplication approach that runs in polylogarithmic parallel time using a polynomial number of processors. We include an implementation profile to demonstrate the efficiency of the parallel implementation as we increase the graph size and number of processors. We also present an NCG algorithm for planar graphs that is asymptotically faster than the fastest topology-oblivious algorithm when restricted to planar graphs.
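
To make the MSTV problem statement in the abstract above concrete, here is a minimal Python sketch of a naive verifier based on the cycle property: T is a minimum spanning tree exactly when no edge of G is lighter than the heaviest edge on the tree path between its endpoints. This is not the thesis's bucketed O(|E| + |V| · K) algorithm; it is a simple O(|E| · |V|) check. The function name, the `(u, v, weight)` edge format, and the assumption that `tree_edges` already forms a spanning tree are all illustrative choices, not taken from the source.

```python
from collections import defaultdict


def is_minimum_spanning_tree(edges, tree_edges):
    """Naive MST verification via the cycle property.

    Assumes (hypothetically) that edges are (u, v, weight) triples and that
    tree_edges is already known to be a spanning tree of the graph.
    """
    # Adjacency list of the candidate tree T.
    adj = defaultdict(list)
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    def max_weight_on_tree_path(src, dst):
        # Iterative DFS over T, carrying the heaviest edge seen on the path.
        stack = [(src, None, float("-inf"))]
        while stack:
            node, parent, heaviest = stack.pop()
            if node == dst:
                return heaviest
            for nxt, w in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, max(heaviest, w)))
        raise ValueError("tree_edges does not connect the two endpoints")

    # T is minimum iff no edge of G is strictly lighter than the heaviest
    # edge on the tree path joining its endpoints. Tree edges pass trivially.
    for u, v, w in edges:
        if w < max_weight_on_tree_path(u, v):
            return False
    return True


if __name__ == "__main__":
    triangle = [(0, 1, 1), (1, 2, 2), (0, 2, 3)]
    print(is_minimum_spanning_tree(triangle, [(0, 1, 1), (1, 2, 2)]))
    print(is_minimum_spanning_tree(triangle, [(0, 1, 1), (0, 2, 3)]))
```

On the triangle in the usage example, the tree using weights 1 and 2 passes, while the tree using weights 1 and 3 fails because the non-tree edge of weight 2 is lighter than the weight-3 edge on its tree path.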
