Search CORE

8,151 research outputs found

An O(1) Solution to the Prefix Sum Problem on a Specialized Memory Architecture

Author: Brodnik Andrej
Karlsson Johan
Munro J. Ian
Nilsson Andreas
Publication venue
Publication date: 01/01/2006
Field of study

\Theta(\lg N)

time under the comparison based model of computation.Comment: 12 page

arXiv.org e-Print Archive

CiteSeerX

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Luleå University of Technology Publications

An O(1) solution to the prefix sum problem on a specialized memory architecture

Author: Brodnik Andrej
Karlsson Johan
Munro J. Ian
Nilsson Andreas
Publication venue
Publication date: 20/11/2012
Field of study

In this paper we study the Prefix Sum problem introduced by Fredman. We show that it is possible to perform both update and retrieval in O(1) time simultaneously under a memory model in which individual bits may be shared by several words. We also show that two variants (generalizations) of the problem can be solved optimally in Θ (lgN) time under the comparison based model of computation.4th IFIP International Conference on Theoretical Computer ScienceRed de Universidades con Carreras en Informática (RedUNCI

Servicio de Difusión de la Creación Intelectual

Gunrock: GPU Graph Analytics

Author: Davidson Andrew
Liu Weitang
Osama Muhammad
Owens John D.
Pan Yuechao
Riffel Andy T.
Wang Leyuan
Wang Yangzihao
Wu Yuduo
Yang Carl
Yuan Chenshan
Publication venue
Publication date: 04/01/2017
Field of study

For large-scale graph analytics on the GPU, the irregularity of data access and control flow, and the complexity of programming GPUs, have presented two significant challenges to developing a programmable high-performance graph library. "Gunrock", our graph-processing system designed specifically for the GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on operations on a vertex or edge frontier. Gunrock achieves a balance between performance and expressiveness by coupling high performance GPU computing primitives and optimization strategies with a high-level programming model that allows programmers to quickly develop new graph primitives with small code size and minimal GPU programming knowledge. We characterize the performance of various optimization strategies and evaluate Gunrock's overall performance on different GPU architectures on a wide range of graph primitives that span from traversal-based algorithms and ranking algorithms, to triangle counting and bipartite-graph-based algorithms. The results show that on a single GPU, Gunrock has on average at least an order of magnitude speedup over Boost and PowerGraph, comparable performance to the fastest GPU hardwired primitives and CPU shared-memory graph libraries such as Ligra and Galois, and better performance than any other GPU high-level graph library.Comment: 52 pages, invited paper to ACM Transactions on Parallel Computing (TOPC), an extended version of PPoPP'16 paper "Gunrock: A High-Performance Graph Processing Library on the GPU

arXiv.org e-Print Archive

eScholarship - University of California

FigShare