Contract-Based General-Purpose GPU Programming

Authors: Kolesnichenko Alexey, Meyer Bertrand, Nanz Sebastian, Poskitt Christopher M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 24/10/2014

Using GPUs as general-purpose processors has revolutionized parallel computing by offering, for a large and growing set of algorithms, massive data-parallelization on desktop machines. An obstacle to widespread adoption, however, is the difficulty of programming them and the low-level control of the hardware required to achieve good performance. This paper proposes a programming library, SafeGPU, that aims to strike a balance between programmer productivity and performance by making GPU data-parallel operations accessible from within a classical object-oriented programming language. The solution is integrated with the design-by-contract approach, which increases confidence in functional program correctness by embedding executable program specifications into the program text. We show that our library leads to modular and maintainable code that is accessible to GPGPU non-experts, while providing performance that is comparable with hand-written CUDA code. Furthermore, runtime contract checking turns out to be feasible, as the contracts can be executed on the GPU.
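
To make the design-by-contract idea in this abstract concrete, here is a minimal sketch in plain Python. It is not SafeGPU's API: the helpers `require` and `ensure` and the function `vector_add` are hypothetical names chosen for illustration, and everything runs on the CPU. The point it illustrates is that the postcondition is itself an element-wise check followed by a reduction, which is the kind of data-parallel computation that could plausibly be run on a GPU.

```python
def require(condition, message):
    """Precondition check (hypothetical helper, not part of any library)."""
    if not condition:
        raise AssertionError("precondition failed: " + message)


def ensure(condition, message):
    """Postcondition check (hypothetical helper, not part of any library)."""
    if not condition:
        raise AssertionError("postcondition failed: " + message)


def vector_add(a, b):
    """Element-wise sum of two equal-length integer vectors, with executable contracts."""
    # Precondition: the operation is only defined for vectors of equal length.
    require(len(a) == len(b), "vectors must have equal length")

    # Data-parallel body: each output element depends only on one pair of inputs.
    result = [x + y for x, y in zip(a, b)]

    # Postcondition: an element-wise comparison reduced with all(),
    # i.e. the contract is a data-parallel computation as well.
    ensure(len(result) == len(a), "result has the same length as the inputs")
    ensure(all(r - x == y for r, x, y in zip(result, a, b)),
           "each element is the sum of the corresponding inputs")
    return result
```

Calling `vector_add([1, 2, 3], [10, 20, 30])` passes both contracts, while `vector_add([1], [1, 2])` raises an `AssertionError` at the precondition.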

arXiv.org e-Print Archive

CiteSeerX

Repository for Publications and Research Data

Crossref

Institutional Knowledge at Singapore Management University

Distributed Training Large-Scale Deep Architectures

Authors: Chang Edward Y., Chen Chun-Yen, Chou Chun-Nan, Lin Ting-Wei, Sung Cheng-Lung, Tsao Chia-Chin, Tung Kuan-Chieh, Wu Jui-Lin, Zou Shang-Xuan
Publication date: 10/08/2017

Scale of data and scale of computation infrastructures together enable the current deep learning renaissance. However, training large-scale deep architectures demands both algorithmic improvement and careful system configuration. In this paper, we focus on employing the system approach to speed up large-scale training. Via lessons learned from our routine benchmarking effort, we first identify bottlenecks and overheads that hinder data parallelism. We then devise guidelines that help practitioners configure an effective system and fine-tune parameters to achieve the desired speedup. Specifically, we develop a procedure for setting minibatch size and choosing computation algorithms. We also derive lemmas for determining the quantity of key components such as the number of GPUs and parameter servers. Experiments and examples show that these guidelines help effectively speed up large-scale deep learning training.

arXiv.org e-Print Archive

Crossref

GPU-Accelerated Algorithms for Compressed Signals Recovery with Application to Astronomical Imagery Deblurring

Authors: Fiandrotti Attilio, Fosson Sophie M., Magli Enrico, Ravazzi Chiara
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2017

Compressive sensing promises to enable bandwidth-efficient on-board compression of astronomical data by lifting the encoding complexity from the source to the receiver. The signal is recovered off-line, exploiting GPUs' parallel computation capabilities to speed up the reconstruction process. However, inherent GPU hardware constraints limit the size of the recoverable signal and the speedup practically achievable. In this work, we design parallel algorithms that exploit the properties of circulant matrices for efficient GPU-accelerated sparse signal recovery. Our approach reduces the memory requirements, allowing us to recover very large signals with limited memory. In addition, it achieves a tenfold signal recovery speedup thanks to ad-hoc parallelization of matrix-vector multiplications and matrix inversions. Finally, we practically demonstrate our algorithms in a typical application of circulant matrices: deblurring a sparse astronomical image in the compressed domain.
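
As an illustration of why circulant structure can reduce memory, the sketch below uses a standard property: a circulant matrix is fully determined by its first column, so a matrix-vector product needs only n stored values instead of n*n. This is a plain-Python sketch of that general property, not the authors' GPU algorithm; the function names `circulant_matvec` and `dense_circulant` are hypothetical.

```python
def circulant_matvec(first_col, x):
    """Compute C @ x, where C is the circulant matrix with the given first column.

    Entry C[i][j] equals first_col[(i - j) % n], so the full n-by-n matrix is
    never materialized. Each output element is independent of the others,
    which is what makes the product easy to parallelize (one row per thread).
    """
    n = len(first_col)
    if len(x) != n:
        raise ValueError("dimension mismatch")
    return [
        sum(first_col[(i - j) % n] * x[j] for j in range(n))
        for i in range(n)
    ]


def dense_circulant(first_col):
    """Build the full circulant matrix explicitly (for checking only; O(n^2) memory)."""
    n = len(first_col)
    return [[first_col[(i - j) % n] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    c = [1, 2, 3]
    x = [0, 1, 0]
    dense = dense_circulant(c)
    reference = [sum(row[j] * x[j] for j in range(len(x))) for row in dense]
    # The structured product agrees with the explicit dense product.
    print(circulant_matvec(c, x) == reference)
```

The direct sum above costs O(n^2) operations. Because a circulant product is a circular convolution, it can also be computed in O(n log n) with a fast Fourier transform; whether and how the paper does so is not stated in the abstract.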

arXiv.org e-Print Archive

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Institutional Research Information System University of Turin

PORTO Publications Open Repository TOrino