research

Entropy-based High Performance Computation of Boolean SNP-SNP Interactions Using GPUs

Abstract

It is being increasingly accepted that traditional statistical Single Nucleotide Polymorphism (SNP) analysis of Genome-Wide Association Studies (GWAS) reveals just a small part of the heritability in complex diseases. Study of SNPs interactions identify additional SNPs that contribute to disease but that do not reach genome-wide significance or exhibit only epistatic effects. We have introduced a methodology for genome-wide screening of epistatic interactions which is feasible to be handled by state-of-art high performance computing technology. Unlike standard software, our method computes all boolean binary interactions between SNPs across the whole genome without assuming a particular model of interaction. Our extensive search for epistasis comes at the expense of higher computational complexity, which we tackled using graphics processors (GPUs) to reduce the computational time from several months in a cluster of CPUs to 3-4 days on a multi-GPU platform. Here, we contribute with a new entropy-based function to evaluate the interaction between SNPs which does not compromise findings about the most significant SNP interactions, but is more than 4000 times lighter in terms of computational time when running on GPUs and provides more than 100x faster code than a CPU of similar cost. We deploy a number of optimization techniques to tune the implementation of this function using CUDA and show the way to enhance scalability on larger data sets.Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. This work was also supported by the Australian Research Council Future Fellowship to Prof. Moscato, by a funded grant from the ARC Discovery Project Scheme and by the Ministry of Education of Spain under Project TIN2006-01078 and mobility grant PR2011-0144. We also thank NVIDIA for hardware donation under CUDA Teaching and Research Center awards

    Similar works