Gunrock: GPU Graph Analytics
For large-scale graph analytics on the GPU, the irregularity of data access
and control flow, and the complexity of programming GPUs, have presented two
significant challenges to developing a programmable high-performance graph
library. "Gunrock", our graph-processing system designed specifically for the
GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on
operations on a vertex or edge frontier. Gunrock achieves a balance between
performance and expressiveness by coupling high-performance GPU computing
primitives and optimization strategies with a high-level programming model that
allows programmers to quickly develop new graph primitives with small code size
and minimal GPU programming knowledge. We characterize the performance of
various optimization strategies and evaluate Gunrock's overall performance on
different GPU architectures on a wide range of graph primitives that span from
traversal-based algorithms and ranking algorithms, to triangle counting and
bipartite-graph-based algorithms. The results show that on a single GPU,
Gunrock has on average at least an order of magnitude speedup over Boost and
PowerGraph, comparable performance to the fastest GPU hardwired primitives and
CPU shared-memory graph libraries such as Ligra and Galois, and better
performance than any other GPU high-level graph library.
Comment: 52 pages, invited paper to ACM Transactions on Parallel Computing
(TOPC), an extended version of PPoPP'16 paper "Gunrock: A High-Performance
Graph Processing Library on the GPU"
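The frontier-focused, bulk-synchronous abstraction described in the Gunrock abstract can be illustrated with a small CPU-only sketch of breadth-first search. This is not Gunrock's API: the function names `advance` and `filter_unvisited`, the adjacency-dict graph format, and the sample graph are assumptions made for illustration. Each loop iteration stands for one bulk-synchronous step that turns the current vertex frontier into the next one.

```python
# Illustrative sketch only: a sequential, CPU-side model of frontier-based BFS.
# Function names and the graph representation are hypothetical, not Gunrock's API.

def advance(graph, frontier):
    """Expand every vertex in the frontier to its outgoing neighbors."""
    return [dst for src in frontier for dst in graph.get(src, [])]

def filter_unvisited(candidates, depth, level):
    """Keep vertices seen for the first time and label them with this level."""
    next_frontier = []
    for v in candidates:
        if v not in depth:
            depth[v] = level
            next_frontier.append(v)
    return next_frontier

def bfs(graph, source):
    """Run BFS as a sequence of frontier-to-frontier steps."""
    depth = {source: 0}
    frontier = [source]
    level = 0
    while frontier:  # one iteration models one bulk-synchronous step
        level += 1
        frontier = filter_unvisited(advance(graph, frontier), depth, level)
    return depth

# Hypothetical sample graph as an adjacency dict.
graph = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
depths = bfs(graph, 0)
```

On a GPU, the work inside each step would run in parallel across the frontier; the sketch only shows how an algorithm is phrased as operations on a frontier.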
Closed Quasi-Fuchsian Surfaces In Hyperbolic Knot Complements
We show that every hyperbolic knot complement contains a closed
quasi-Fuchsian surface.
Comment: 69 pages, 27 figures. Made small changes suggested by refere
Algorithms for Rapidly Dispersing Robot Swarms in Unknown Environments
We develop and analyze algorithms for dispersing a swarm of primitive robots
in an unknown environment, R. The primary objective is to minimize the
makespan, that is, the time to fill the entire region. An environment is
composed of pixels that form a connected subset of the integer grid.
There is at most one robot per pixel and robots move horizontally or
vertically at unit speed. Robots enter R by means of k>=1 door pixels.
Robots are primitive finite automata, only having local communication, local
sensors, and a constant-sized memory.
We first give algorithms for the single-door case (i.e., k=1), analyzing the
algorithms both theoretically and experimentally. We prove that our algorithms
have optimal makespan 2A-1, where A is the area of R.
We next give an algorithm for the multi-door case (k>1), based on a
wall-following version of the leader-follower strategy. We prove that our
strategy is O(log(k+1))-competitive, and that this bound is tight for our
strategy and other related strategies.
Comment: 17 pages, 4 figures, Latex, to appear in Workshop on Algorithmic
Foundations of Robotics, 200
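As a minimal sketch of the environment model in the abstract above, the following code represents a region R as a set of integer grid pixels, checks that it is connected, and evaluates the stated single-door makespan 2A-1. It does not implement the authors' dispersal algorithms. The function names, the sample region, and the use of 4-neighbor connectivity (chosen because robots move only horizontally or vertically) are assumptions.

```python
from collections import deque

def is_connected(pixels):
    """Check 4-neighbor connectivity of a set of (x, y) grid pixels.

    Assumption: connectivity means horizontal/vertical adjacency.
    """
    pixels = set(pixels)
    if not pixels:
        return False
    start = next(iter(pixels))
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt in pixels and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == len(pixels)

def single_door_makespan(pixels):
    """Evaluate the makespan 2A-1 stated for the single-door case (k=1)."""
    area = len(set(pixels))  # A is the number of pixels in R
    return 2 * area - 1

# Hypothetical L-shaped region with A = 4 pixels.
region = [(0, 0), (1, 0), (2, 0), (2, 1)]
```

For this sample region, A = 4, so the formula gives 2*4 - 1 = 7 time steps.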