research

Matrix eQTL: Ultra fast eQTL analysis via large matrix operations

Abstract

Expression quantitative trait loci (eQTL) mapping aims to determine genomic regions that regulate gene transcription. Expression QTL is used to study the regulatory structure of normal tissues and to search for genetic factors in complex diseases such as cancer, diabetes, and cystic fibrosis. A modern eQTL dataset contains millions of SNPs and thousands of transcripts measured for hundreds of samples. This makes the analysis computationally complex as it involves independent testing for association for every transcript-SNP pair. The heavy computational burden makes eQTL analysis less popular, often forces analysts to restrict their attention to just a subset of transcripts and SNPs. As larger genotype and gene expression datasets become available, the demand for fast tools for eQTL analysis increases. We present a new method for fast eQTL analysis via linear models, called Matrix eQTL. Matrix eQTL can model and test for association using both linear regression and ANOVA models. The models can include covariates to account for such factors as population structure, gender, and clinical variables. It also supports testing of heteroscedastic models and models with correlated errors. In our experiment on large datasets Matrix eQTL was thousands of times faster than the existing popular software for QTL/eQTL analysis. Matrix eQTL is implemented as both Matlab and R packages and thus can easily be run on Windows, Mac OS, and Linux systems. The software is freely available at the following address: http://www.bios.unc.edu/research/genomic_software/Matrix_eQTLComment: 9 pages, 1 figur

    Similar works