8,151 research outputs found
An O(1) Solution to the Prefix Sum Problem on a Specialized Memory Architecture
In this paper we study the Prefix Sum problem introduced by Fredman.
We show that it is possible to perform both update and retrieval in O(1) time
simultaneously under a memory model in which individual bits may be shared by
several words.
We also show that two variants (generalizations) of the problem can be solved
optimally in time under the comparison based model of
computation.Comment: 12 page
An O(1) solution to the prefix sum problem on a specialized memory architecture
In this paper we study the Prefix Sum problem introduced by Fredman. We show that it is possible to perform both update and retrieval in O(1) time simultaneously under a memory model in which individual bits may be shared by several words. We also show that two variants (generalizations) of the problem can be solved optimally in Θ (lgN) time under the comparison based model of computation.4th IFIP International Conference on Theoretical Computer ScienceRed de Universidades con Carreras en Informática (RedUNCI
Gunrock: GPU Graph Analytics
For large-scale graph analytics on the GPU, the irregularity of data access
and control flow, and the complexity of programming GPUs, have presented two
significant challenges to developing a programmable high-performance graph
library. "Gunrock", our graph-processing system designed specifically for the
GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on
operations on a vertex or edge frontier. Gunrock achieves a balance between
performance and expressiveness by coupling high performance GPU computing
primitives and optimization strategies with a high-level programming model that
allows programmers to quickly develop new graph primitives with small code size
and minimal GPU programming knowledge. We characterize the performance of
various optimization strategies and evaluate Gunrock's overall performance on
different GPU architectures on a wide range of graph primitives that span from
traversal-based algorithms and ranking algorithms, to triangle counting and
bipartite-graph-based algorithms. The results show that on a single GPU,
Gunrock has on average at least an order of magnitude speedup over Boost and
PowerGraph, comparable performance to the fastest GPU hardwired primitives and
CPU shared-memory graph libraries such as Ligra and Galois, and better
performance than any other GPU high-level graph library.Comment: 52 pages, invited paper to ACM Transactions on Parallel Computing
(TOPC), an extended version of PPoPP'16 paper "Gunrock: A High-Performance
Graph Processing Library on the GPU
- …