arXiv.org e-Print Archive

An Empirical Bayes Approach for Multiple Tissue eQTL Analysis

Author: Li Gen
Nobel Andrew B.
Rusyn Ivan
Shabalin Andrey A.
Wright Fred A.
Publication date: 06/09/2017

Expression quantitative trait loci (eQTL) analyses, which identify genetic markers associated with the expression of a gene, are an important tool in the understanding of diseases in human and other populations. While most eQTL studies to date consider the connection between genetic variation and expression in a single tissue, complex, multi-tissue data sets are now being generated by the GTEx initiative. These data sets have the potential to improve the findings of single tissue analyses by borrowing strength across tissues, and the potential to elucidate the genotypic basis of differences between tissues. In this paper we introduce and study a multivariate hierarchical Bayesian model (MT-eQTL) for multi-tissue eQTL analysis. MT-eQTL directly models the vector of correlations between expression and genotype across tissues. It explicitly captures patterns of variation in the presence or absence of eQTLs, as well as the heterogeneity of effect sizes across tissues. Moreover, the model is applicable to complex designs in which the set of donors can (i) vary from tissue to tissue, and (ii) exhibit incomplete overlap between tissues. The MT-eQTL model is marginally consistent, in the sense that the model for a subset of tissues can be obtained from the full model via marginalization. Fitting of the MT-eQTL model is carried out via empirical Bayes, using an approximate EM algorithm. Inferences concerning eQTL detection and the configuration of eQTLs across tissues are derived from adaptive thresholding of local false discovery rates, and maximum a-posteriori estimation, respectively. We investigate the MT-eQTL model through a simulation study, and rigorously establish the FDR control of the local FDR testing procedure under mild assumptions appropriate for dependent data. Comment: accepted by Biostatistics
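
The abstract above mentions adaptive thresholding of local false discovery rates. A minimal sketch of one common form of that idea follows: sort the local FDR values in ascending order and reject the largest leading set whose average local FDR stays at or below the target level. The function name `lfdr_threshold` and the running-average rule are assumptions made for illustration; the abstract does not specify the exact rule used by MT-eQTL.

```python
from typing import List


def lfdr_threshold(lfdr: List[float], alpha: float) -> List[int]:
    """Return the indices rejected by a running-average local FDR rule.

    Hypothetical helper: the rule below is a common adaptive procedure,
    not necessarily the one implemented in MT-eQTL.
    """
    # Order hypotheses from most to least significant (smallest lfdr first).
    order = sorted(range(len(lfdr)), key=lambda i: lfdr[i])
    running_sum = 0.0
    n_reject = 0
    for k, idx in enumerate(order, start=1):
        running_sum += lfdr[idx]
        # The average lfdr of the rejected set estimates its FDR.
        if running_sum / k <= alpha:
            n_reject = k
    return sorted(order[:n_reject])


if __name__ == "__main__":
    values = [0.01, 0.9, 0.03, 0.2, 0.5]
    print(lfdr_threshold(values, alpha=0.1))
```

Because the values are sorted, the running average never decreases, so the rejected set is always a prefix of the sorted list and the cutoff adapts to the data rather than being fixed in advance.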

Reconstruction of a low-rank matrix in the presence of Gaussian noise

Author: Nobel Andrew B.
Shabalin Andrey A.
Publication date: 01/01/2013

This paper addresses the problem of reconstructing a low-rank signal matrix observed with additive Gaussian noise. We first establish that, under mild assumptions, one can restrict attention to orthogonally equivariant reconstruction methods, which act only on the singular values of the observed matrix and do not affect its singular vectors. Using recent results in random matrix theory, we then propose a new reconstruction method that aims to reverse the effect of the noise on the singular value decomposition of the signal matrix. In conjunction with the proposed reconstruction method we also introduce a Kolmogorov–Smirnov based estimator of the noise variance.

arXiv.org e-Print Archive

Finding large average submatrices in high dimensional data

Author: Nobel Andrew B.
Perou Charles M.
Shabalin Andrey A.
Weigman Victor J.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2009

The search for sample-variable associations is an important problem in the exploratory analysis of high dimensional data. Biclustering methods search for sample-variable associations in the form of distinguished submatrices of the data matrix. (The rows and columns of a submatrix need not be contiguous.) In this paper we propose and evaluate a statistically motivated biclustering procedure (LAS) that finds large average submatrices within a given real-valued data matrix. The procedure operates in an iterative-residual fashion, and is driven by a Bonferroni-based significance score that effectively trades off between submatrix size and average value. We examine the performance and potential utility of LAS, and compare it with a number of existing methods, through an extensive three-part validation study using two gene expression datasets. The validation study examines quantitative properties of biclusters, biological and clinical assessments using auxiliary information, and classification of disease subtypes using bicluster membership. In addition, we carry out a simulation study to assess the effectiveness and noise sensitivity of the LAS search procedure. These results suggest that LAS is an effective exploratory tool for the discovery of biologically relevant structures in high dimensional data. Software is available at https://genome.unc.edu/las/. Comment: Published at http://dx.doi.org/10.1214/09-AOAS239 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
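
To make the trade-off between submatrix size and average value concrete, here is a minimal sketch of a Bonferroni-style score for a k × l submatrix with average `avg` inside an m × n matrix. It assumes the entries are standardized so that they are roughly standard normal under the null, and uses the form -log[C(m,k) · C(n,l) · Φ(-avg · sqrt(k·l))]. That form and the helper names are assumptions for illustration and are not given in the abstract.

```python
import math


def log_binom(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def log_normal_sf(x: float) -> float:
    """Log of P(Z > x) for a standard normal Z.

    Note: math.erfc underflows to 0 for very large x, so this simple
    version only works for moderate arguments (roughly x < 37).
    """
    return math.log(0.5 * math.erfc(x / math.sqrt(2.0)))


def las_score(m: int, n: int, k: int, l: int, avg: float) -> float:
    """Bonferroni-style score (assumed form): higher means more significant.

    The binomial terms count how many k x l submatrices exist (the
    multiple-testing penalty); the tail term measures how unlikely the
    observed average is for a single fixed submatrix.
    """
    tail = log_normal_sf(avg * math.sqrt(k * l))
    return -(log_binom(m, k) + log_binom(n, l) + tail)


if __name__ == "__main__":
    print(las_score(100, 100, 5, 5, 2.0))  # larger block, same average
    print(las_score(100, 100, 2, 2, 2.0))  # smaller block, same average
```

At a fixed average, the larger block scores higher here because the tail probability shrinks faster than the count of candidate submatrices grows; at a fixed size, a higher average always increases the score.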

Crossref

Computational tools for discovery and interpretation of expression quantitative trait loci

Author: Rusyn Ivan
Shabalin Andrey A
Wright Fred A
Publication date: 01/01/2012

Expression quantitative trait locus (eQTL) analysis is rapidly moving from a cutting-edge concept in genomics to a mature area of investigation, with important connections to genome-wide association studies for human disease, pharmacogenomics and toxicogenomics. Despite the importance of the topic, many investigators must develop their own code or use tools not specifically suited for eQTL analysis. Convenient computational tools are becoming available, but they are not widely publicized, and investigators who are interested in discovery of eQTL, or in using them to interpret genome-wide association study results, may have difficulty navigating the available resources. The purpose of this review is to help investigators find appropriate programs for eQTL analysis and interpretation.

Helsingin yliopiston digitaalinen arkisto

Genome-wide association study meta-analysis of suicide death and suicidal behavior

Author: Coon Hilary
DiBlasi Emily
FinnGen
Int Suicide Genetics Consortium
Li Qingqin S.
Palotie Aarno
Shabalin Andrey A.
Publication date: 01/02/2023

Suicide is a worldwide health crisis. We aimed to identify genetic risk variants associated with suicide death and suicidal behavior. Meta-analysis for suicide death was performed using 3765 cases from Utah and matching 6572 controls of European ancestry. Meta-analysis for suicidal behavior using data across five cohorts (n = 8315 cases and 256,478 psychiatric or populational controls of European ancestry) was also performed. One locus in neuroligin 1 (NLGN1) passing the genome-wide significance threshold for suicide death was identified (top SNP rs73182688, with p = 5.48 × 10^-8 before and p = 4.55 × 10^-8 after mtCOJO analysis conditioning on MDD to remove genetic effects on suicide mediated by MDD). Conditioning on suicidal attempts did not significantly change the association strength (p = 6.02 × 10^-8), suggesting suicide death specificity. NLGN1 encodes a member of a family of neuronal cell surface proteins. Members of this family act as splice site-specific ligands for beta-neurexins and may be involved in synaptogenesis. The NRXN-NLGN pathway was previously implicated in suicide, autism, and schizophrenia. We additionally identified ROBO2 and ZNF28 associations with suicidal behavior in the meta-analysis across five cohorts in gene-based association analysis using MAGMA. Lastly, we replicated two loci including variants near SOX5 and LOC101928519 associated with suicidal attempts identified in the ISGC and MVP meta-analysis using the independent FinnGen samples. Suicide death and suicidal behavior showed positive genetic correlations with depression, schizophrenia, pain, and suicidal attempt, and negative genetic correlation with educational attainment. These correlations remained significant after conditioning on depression, suggesting pleiotropic effects among these traits. Bidirectional generalized summary-data-based Mendelian randomization analysis suggests that genetic risk for suicidal attempt and genetic risk for suicide death are both bi-directionally causal for MDD. Peer reviewed

seeQTL: a searchable database for human eQTLs

Author: Andrey A. Shabalin
Fei Zou
Fred A. Wright
Kai Xia
Patrick F. Sullivan
Shunping Huang
Vered Madar
Wei Sun
Wei Wang
Yi-Hui Zhou
Publication venue: Oxford University Press
Publication date: 01/01/2012

Summary: seeQTL is a comprehensive and versatile eQTL database, including various eQTL studies and a meta-analysis of HapMap eQTL information. The database presents eQTL association results in a convenient browser, using both segmented local-association plots and genome-wide Manhattan plots.

Crossref

Deep Sequencing of Three Loci Implicated in Large-Scale Genome-Wide Association Study Smoking Meta-Analyses

Author: Aberg Karolina
Adkins Daniel
Clark Shaunna L.
Collins Ann
Copeland William
Crowley James
Elizabeth J. Costello
Gao Guimin
Hillard Christopher
Kumar Gaurav
Maes Hermine
McClay Joseph
Nerella Sri
Peterson Roseann
Quakenbush Corey
Shabalin Andrey
Silberg Judy
Sullivan Patrick
van den Oord Edwin J.
Xie Linying
Publication date: 01/01/2016

Genome-wide association study meta-analyses have robustly implicated three loci that affect susceptibility for smoking: CHRNA5\CHRNA3\CHRNB4, CHRNB3\CHRNA6 and EGLN2\CYP2A6. Functional follow-up studies of these loci are needed to provide insight into biological mechanisms. However, these efforts have been hampered by a lack of knowledge about the specific causal variant(s) involved. In this study, we prioritized variants in terms of the likelihood they account for the reported associations. We employed targeted capture of the CHRNA5\CHRNA3\CHRNB4, CHRNB3\CHRNA6, and EGLN2\CYP2A6 loci and flanking regions followed by next-generation deep sequencing (mean coverage 78×) to capture genomic variation in 363 individuals. We performed single locus tests to determine if any single variant accounts for the association, and examined if sets of (rare) variants that overlapped with biologically meaningful annotations account for the associations. In total, we investigated 963 variants, of which 71.1% were rare (minor allele frequency < 0.01), 6.02% were insertion/deletions, and 51.7% were catalogued in dbSNP141. The single variant results showed that no variant fully accounts for the association in any region. In the variant set results, CHRNB4 accounts for most of the signal with significant sets consisting of directly damaging variants. CHRNA6 explains most of the signal in the CHRNB3\CHRNA6 locus with significant sets indicating a regulatory role for CHRNA6. Significant sets in CYP2A6 involved directly damaging variants while the significant variant sets suggested a regulatory role for EGLN2. We found that multiple variants implicating multiple processes explain the signal. Some variants can be prioritized for functional follow-up. © The Author 2015. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco.

FastMap: Fast eQTL mapping in homozygous populations

Author: Andrew B. Nobel
Andrey A. Shabalin
Daniel M. Gatti
Fred A. Wright
Ivan Rusyn
Tieu-Chong Lam
Publication venue: Oxford University Press
Publication date: 01/01/2009

Motivation: Gene expression Quantitative Trait Locus (eQTL) mapping measures the association between transcript expression and genotype in order to find genomic locations likely to regulate transcript expression. The availability of both gene expression and high-density genotype data has improved our ability to perform eQTL mapping in inbred mouse and other homozygous populations. However, existing eQTL mapping software does not scale well when the number of transcripts and markers are on the order of 10^5 and 10^5–10^6, respectively.

Crossref