Streaming Data from HDD to GPUs for Sustained Peak Performance
In the context of genome-wide association studies (GWAS), one has to
solve long sequences of generalized least-squares problems; such a task has two
limiting factors: execution time (often in the range of days or weeks) and
data management (data sets on the order of terabytes). We present an algorithm
that obviates both issues. By pipelining the computation, and thanks to a
sophisticated transfer strategy, we stream data from hard disk to main memory
to GPUs and achieve sustained peak performance; with respect to a
highly-optimized CPU implementation, our algorithm shows a speedup of 2.6x.
Moreover, the approach lends itself to multiple GPUs and attains almost perfect
scalability. When using 4 GPUs, we observe speedups of 9x over the
aforementioned implementation, and 488x over a widespread biology library.
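
The pipelining idea in the abstract, loading the next block of data while the current one is being processed, can be illustrated with a minimal double-buffering sketch. This is not the authors' implementation: the function `stream_pipeline`, its parameters, and the callables `load_block` and `compute` are hypothetical stand-ins for a disk read and for the GPU computation, and plain Python threads are used here only to show how the stages overlap.

```python
import queue
import threading


def stream_pipeline(load_block, compute, num_blocks, depth=2):
    """Overlap loading of later blocks with computation on the current one.

    load_block(i) stands in for a disk read and compute(block) for the GPU
    work; both are hypothetical callables supplied by the caller.
    depth is the number of blocks that may be buffered at once
    (depth=2 corresponds to classic double buffering).
    """
    buffers = queue.Queue(maxsize=depth)  # bounded, so the loader cannot run far ahead
    done = object()                       # sentinel marking the end of the stream

    def producer():
        try:
            for i in range(num_blocks):
                buffers.put(load_block(i))  # blocks while all buffers are full
        finally:
            buffers.put(done)               # always signal the consumer to stop

    loader = threading.Thread(target=producer, daemon=True)
    loader.start()

    results = []
    while True:
        block = buffers.get()               # waits only if the loader has fallen behind
        if block is done:
            break
        results.append(compute(block))      # runs while the loader fetches the next block
    loader.join()
    return results


# Toy usage: "loading" builds a small list, "computing" sums it.
totals = stream_pipeline(lambda i: list(range(i, i + 3)), sum, num_blocks=4)
```

The bounded queue is the design choice that matters in this sketch: it caps memory use at `depth` blocks regardless of the total data size, which is the property needed when the data set is far larger than main memory. When loading and computing take comparable time, the consumer rarely waits; when loading is slower, the pipeline runs at the loader's rate.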