68 research outputs found

Robust Linear Models for Cis-eQTL Analysis

Author: Cecilia M. Lindgren (145706)
Christopher C. Holmes (29850)
Mattias Rantalainen (166786)
Publication date: 18/05/2015

Expression Quantitative Trait Loci (eQTL) analysis enables characterisation of functional genetic variation influencing expression levels of individual genes. In outbred populations, including humans, eQTLs are commonly analysed using the conventional linear model, adjusting for relevant covariates, assuming an allelic dosage model and a Gaussian error term. However, gene expression data generally have noise that induces heavy-tailed errors relative to the Gaussian distribution and often include atypical observations, or outliers. Such departures from modelling assumptions can lead to an increased rate of type II errors (false negatives), and to some extent also type I errors (false positives). Careful model checking can reduce the risk of type I errors but often not type II errors, since it is generally too time-consuming to carefully check all models with a non-significant effect in large-scale and genome-wide studies. Here we propose the application of a robust linear model for eQTL analysis to reduce adverse effects of deviations from the assumption of Gaussian residuals. We present results from a simulation study as well as results from the analysis of real eQTL data sets. Our findings suggest that in many situations robust models have the potential to provide more reliable eQTL results compared to conventional linear models, particularly with respect to reducing type II errors due to non-Gaussian noise. Post-genomic data, such as that generated in genome-wide eQTL studies, are often noisy and frequently contain atypical observations. Robust statistical models have the potential to provide more reliable results and increased statistical power under non-Gaussian conditions. The results presented here suggest that robust models should be considered routinely alongside other commonly used methodologies for eQTL analysis.
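
To make the contrast between a conventional and a robust linear model concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single-SNP allelic dosage model with no covariates, and it swaps in a Huber M-estimator fitted by iteratively reweighted least squares as one common robust estimator; the abstract does not say which estimator the paper uses. The function names, the tuning constant `k = 1.345`, and the toy data are all illustrative assumptions.

```python
import statistics


def weighted_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def huber_fit(x, y, k=1.345, iterations=50):
    """Huber M-estimate via iteratively reweighted least squares (sketch)."""
    # Start from the conventional (unweighted) least-squares fit.
    intercept, slope = weighted_fit(x, y, [1.0] * len(x))
    for _ in range(iterations):
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        # Robust residual scale: median absolute residual / 0.6745.
        scale = statistics.median(abs(r) for r in resid) / 0.6745
        if scale == 0.0:
            break
        cutoff = k * scale
        # Huber weights: 1 inside the cutoff, shrinking with |residual| outside.
        w = [1.0 if abs(r) <= cutoff else cutoff / abs(r) for r in resid]
        intercept, slope = weighted_fit(x, y, w)
    return intercept, slope


# Toy data (hypothetical): allelic dosage 0/1/2, expression roughly
# 1 + 0.5 * dosage, with one atypical observation in the last sample.
dosage = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
expression = [0.9, 1.1, 1.0, 1.0, 1.4, 1.6, 1.5, 1.5, 1.9, 2.1, 2.0, 10.0]

ols_intercept, ols_slope = weighted_fit(dosage, expression, [1.0] * len(dosage))
rob_intercept, rob_slope = huber_fit(dosage, expression)
print("conventional slope:", ols_slope)
print("robust slope:", rob_slope)
```

In this toy setting the single outlier pulls the conventional slope well away from the value of 0.5 used to build the clean points, while the robust fit downweights it. A real analysis would also need covariates, standard errors, and p-values, which this sketch omits.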

Directory of Open Access Journals

FigShare

P-value correspondence in Myers et al. data set [25].

Author: Cecilia M. Lindgren (145706)
Christopher C. Holmes (29850)
Mattias Rantalainen (166786)

Scatter plot of −log10(p-values) from Myers et al. data set [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127882#pone.0127882.ref025" target="_blank">25</a>]. (Key: green = significant in both models, red = significant in the conventional model only, blue = significant in the robust model only, data from points marked with black squares are shown in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127882#pone.0127882.g005" target="_blank">Fig 5</a>).

FigShare

Results from comparative analysis of Myers et al. data set [25].

Author: Cecilia M. Lindgren (145706)
Christopher C. Holmes (29850)
Mattias Rantalainen (166786)

SNP effect size estimates and standard errors for eQTLs significant in both models (A, D), in the robust model only (B, E), and in the linear model only (C, F).

FigShare

P-value correspondence in Grundberg et al. data set [26].

Author: Cecilia M. Lindgren (145706)
Christopher C. Holmes (29850)
Mattias Rantalainen (166786)

Scatter plot of −log10(p-values) from MuTHER data set [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127882#pone.0127882.ref026" target="_blank">26</a>]. (Key: green = significant in both models, red = significant in the conventional model only, blue = significant in the robust model only, data from points marked with black squares are shown in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127882#pone.0127882.s002" target="_blank">S1 Fig</a>).

FigShare

Power analysis results (empirical residuals from robust model fit).

Author: Cecilia M. Lindgren (145706)
Christopher C. Holmes (29850)
Mattias Rantalainen (166786)

A) Residuals from a random sample of eQTL models. B) Residuals from a random sample from models found to be significant only in the robust eQTL model. (‘cont’ = residual from robust model fit of Myers et al. [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127882#pone.0127882.ref025" target="_blank">25</a>] data set; ‘no cont’ = Gaussian residuals.)

FigShare

Power analysis results (mixture contamination model).

Author: Cecilia M. Lindgren (145706)
Christopher C. Holmes (29850)
Mattias Rantalainen (166786)

A) Power as a function of contamination proportion. B) Power as a function of study size. C) Power as a function of the genetic effect size. (Simulation parameters: 10000 samples; A, B and D: N = 100; B, C and D: π = 0.95)
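
For readers unfamiliar with the term, a mixture contamination model for the error term can be sketched as below. This is an illustration under stated assumptions, not the simulation code behind the figure: it reads π as the proportion of uncontaminated (Gaussian) observations, and the function name `contaminated_noise` and the outlier standard deviation `sd_outlier` are hypothetical choices that the caption does not specify.

```python
import random


def contaminated_noise(n, clean_fraction=0.95, sd_clean=1.0, sd_outlier=5.0, rng=None):
    """Draw n errors from a two-component Gaussian mixture (sketch).

    Each error comes from N(0, sd_clean^2) with probability clean_fraction,
    otherwise from the wider N(0, sd_outlier^2). All defaults are assumptions.
    """
    rng = rng if rng is not None else random.Random()
    noise = []
    for _ in range(n):
        sd = sd_clean if rng.random() < clean_fraction else sd_outlier
        noise.append(rng.gauss(0.0, sd))
    return noise


# Seeded generator so the draw is reproducible.
sample = contaminated_noise(1000, rng=random.Random(0))
print(len(sample))
```

Power under such a model would then be estimated by repeatedly simulating expression values with this noise, fitting each model, and counting how often the SNP effect is declared significant.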

FigShare

Concordance (number and proportion of mRNAs with at least one eQTL SNP) between the conventional and robust models (Myers et al. data set [25]).

Author: Cecilia M. Lindgren (145706)
Christopher C. Holmes (29850)
Mattias Rantalainen (166786)

Concordance (number and proportion of mRNAs with at least one eQTL SNP) between the conventional and robust models (Myers et al. data set [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0127882#pone.0127882.ref025" target="_blank">25</a>]).

FigShare

Power analysis results (heavy-tailed).

Author: Cecilia M. Lindgren (145706)
Christopher C. Holmes (29850)
Mattias Rantalainen (166786)

A) Power as a function of degrees of freedom in the Student's t-distribution. B) Power as a function of study size. C) Power as a function of the genetic effect size. (Simulation parameters: 10000 samples; A-B, D: N = 100; B-D: df = 4)

FigShare

Comparison of miRNA profiles across tissues.

Author: Anna L. Gloyn (146286)
Cecilia M. Lindgren (145706)
Ignasi Moran (277253)
Jorge Ferrer (25183)
Kyle J. Gaulton (277252)
Leopold Parts (144647)
Mark I. McCarthy (145748)
Martijn van de Bunt (166776)
Paul R. Johnson (277254)

The left panel (A) shows the single-linkage hierarchical clustering of inter-tissue profile correlations. The right panel (B) displays the 10 most tissue-specific islet miRNAs in descending order. The colors indicate the normalized expression levels of these miRNAs across the different profiles used in the analysis.

FigShare

Association results with FTO SNP rs9939609 and NEGR1 SNP rs2815752 in categories of BMI, compared with normal-weight controls.

Author: Ahmed Yousseif (391941)
Andrea Pucci (5444)
Caterina Pelosini (442990)
Cecilia M. Lindgren (145706)
Efthimia Karra (391940)
Ferruccio Santini (77058)
Giorgia Querci (442989)
Mark I. McCarthy (145748)
Rachel L. Batterham (247258)
Reedik Mägi (667294)
Sean Manning (442988)

β, effect size; SE, standard error.

FigShare
