Parallel 3D Fast Wavelet Transform comparison on CPUs and GPUs

Bernabé, Gregorio

Authors: Gregorio Bernabé
Publication date: 1 January 2015
Publisher: University of Granada-University of Cadiz

Abstract

This paper presents several implementations of the 3D Fast Wavelet Transform (3D-FWT) on multicore CPUs and manycore GPUs. On the GPU side, we focus on CUDA and OpenCL programming to develop methods for an efficient mapping onto manycore architectures. On multicore CPUs, OpenMP and Pthreads are used as counterparts to maximize parallelism, and well-known techniques such as tiling and blocking are exploited to optimize memory use. We evaluate these proposals by comparing a new Fermi Tesla C2050 with an Intel Core 2 Quad Q6700. The CUDA version achieves the best results, with speedups over the CPU ranging from 5.3x to 7.4x for different image sizes, and up to 81x when communications are neglected. Meanwhile, OpenCL obtains solid gains, ranging from 2x on small frame sizes to 3x on larger ones.

Available Versions

DIALNET

oai:dialnet.unirioja.es:ART000...

Last updated on 11/07/2019