To associate complex traits with genetic polymorphisms, genome-wide
association studies process huge datasets involving tens of thousands of
individuals genotyped for millions of polymorphisms. When handling these
datasets, which exceed the main memory of contemporary computers, one faces two
distinct challenges: (1) millions of polymorphisms amount to hundreds of
gigabytes of genotype data, which can only reside in secondary storage; (2)
the relatedness of the test population is represented by a covariance matrix,
which, for large populations, can only fit in the combined main memory of a
distributed architecture. In this paper, we present solutions for both
challenges: The genotype data is streamed from and to secondary storage using a
double buffering technique, while the covariance matrix is kept across the main
memory of a distributed memory system. We show that these methods sustain
high performance and enable the analysis of enormous datasets.
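The double-buffering idea described above, overlapping computation on one block of genotype data with the I/O that fetches the next block, can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation; all names (`read_block`, `process_block`) are hypothetical.

```python
# Minimal sketch of double buffering: while the current buffer is being
# processed, a background thread prefetches the next block from secondary
# storage into the other buffer, so I/O and computation overlap.
import threading

def stream_blocks(read_block, process_block, n_blocks):
    """Process n_blocks sequentially using two alternating buffers."""
    buffers = [None, None]
    buffers[0] = read_block(0)  # fill the first buffer synchronously
    for i in range(n_blocks):
        loader = None
        if i + 1 < n_blocks:
            # Prefetch the next block into the idle buffer in the background.
            def load(slot=(i + 1) % 2, idx=i + 1):
                buffers[slot] = read_block(idx)
            loader = threading.Thread(target=load)
            loader.start()
        process_block(buffers[i % 2])  # compute on the current buffer
        if loader is not None:
            loader.join()  # ensure the prefetch finished before reuse
```

In a real out-of-core GWAS pipeline, `read_block` would issue asynchronous reads of genotype blocks from disk and `process_block` would run the statistical computation, but the alternation pattern is the same.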