119 research outputs found
Integrated Enrichment Analysis of Variants and Pathways in Genome-Wide Association Studies Indicates Central Role for IL-2 Signaling Genes in Type 1 Diabetes, and Cytokine Signaling Genes in Crohn's Disease
Pathway analyses of genome-wide association studies aggregate information over sets of related genes, such as genes in common pathways, to identify gene sets that are enriched for variants associated with disease. We develop a model-based approach to pathway analysis, and apply this approach to data from the Wellcome Trust Case Control Consortium (WTCCC) studies. Our method offers several benefits over existing approaches. First, our method not only interrogates pathways for enrichment of disease associations, but also estimates the level of enrichment, which yields a coherent way to promote variants in enriched pathways, enhancing discovery of genes underlying disease. Second, our approach allows for multiple enriched pathways, a feature that leads to novel findings in two diseases where the major histocompatibility complex (MHC) is a major determinant of disease susceptibility. Third, by modeling disease as the combined effect of multiple markers, our method automatically accounts for linkage disequilibrium among variants. Interrogation of pathways from eight pathway databases yields strong support for enriched pathways, indicating links between Crohn's disease (CD) and cytokine-driven networks that modulate immune responses; between rheumatoid arthritis (RA) and "Measles" pathway genes involved in immune responses triggered by measles infection; and between type 1 diabetes (T1D) and IL2-mediated signaling genes. Prioritizing variants in these enriched pathways yields many additional putative disease associations compared to analyses without enrichment. For CD and RA, 7 of 8 additional non-MHC associations are corroborated by other studies, providing validation for our approach. For T1D, prioritization of IL-2 signaling genes yields strong evidence for 7 additional non-MHC candidate disease loci, as well as suggestive evidence for several more.
Of the 7 strongest associations, 4 are validated by other studies, and 3 (near IL-2 signaling genes RAF1, MAPK14, and FYN) constitute novel putative T1D loci for further study.
ebnm: An R Package for Solving the Empirical Bayes Normal Means Problem Using a Variety of Prior Families
The empirical Bayes normal means (EBNM) model is important to many areas of
statistics, including (but not limited to) multiple testing, wavelet denoising,
multiple linear regression, and matrix factorization. There are several
existing software packages that can fit EBNM models under different prior
assumptions and using different algorithms; however, the differences across
interfaces complicate direct comparisons. Further, a number of important prior
assumptions do not yet have implementations. Motivated by these issues, we
developed the R package ebnm, which provides a unified interface for
efficiently fitting EBNM models using a variety of prior assumptions, including
nonparametric approaches. In some cases, we incorporated existing
implementations into ebnm; in others, we implemented new fitting procedures
with a focus on speed and numerical stability. To demonstrate the capabilities
of the unified interface, we compare results using different prior assumptions
in two extended examples: the shrinkage estimation of baseball statistics; and
the matrix factorization of genetics data (via the new R package flashier). In
summary, ebnm is a convenient and comprehensive package for performing EBNM
analyses under a wide range of prior assumptions.
Comment: 43 pages, 19 figures
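To make the EBNM model in the abstract above concrete, here is a minimal sketch in Python. It is not the ebnm package's interface or fitting procedure. It assumes the simplest possible setting: a zero-mean normal prior and a single known standard error shared by all observations. The function name `ebnm_normal` and the example data are hypothetical.

```python
from statistics import fmean


def ebnm_normal(x, s):
    """Empirical Bayes normal means with a zero-mean normal prior.

    Assumed model (illustrative only):
        x[i] ~ N(theta[i], s^2), with s known and common to all i
        theta[i] ~ N(0, sigma2), with sigma2 estimated from the data
    """
    # Marginally x[i] ~ N(0, sigma2 + s^2), so the maximum-likelihood
    # estimate of sigma2 is mean(x^2) - s^2, truncated at zero.
    mean_square = fmean([xi * xi for xi in x])
    sigma2 = max(0.0, mean_square - s * s)

    # The posterior mean of theta[i] shrinks x[i] toward zero by the
    # factor sigma2 / (sigma2 + s^2).
    shrink = sigma2 / (sigma2 + s * s)
    posterior_mean = [shrink * xi for xi in x]
    return sigma2, posterior_mean


# Hypothetical observations, each with standard error 1.
observations = [2.0, -2.0, 4.0, -4.0]
sigma2_hat, estimates = ebnm_normal(observations, s=1.0)
print(sigma2_hat, estimates)
```

The "empirical" step is the estimation of the prior variance from the observations themselves. Richer prior families, such as the nonparametric ones the abstract mentions, replace the closed-form estimate with a numerical optimization.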
A flexible empirical Bayes approach to multivariate multiple regression, and its improved accuracy in predicting multi-tissue gene expression from genotypes
Predicting phenotypes from genotypes is a fundamental task in quantitative genetics. With technological advances, it is now possible to measure multiple phenotypes in large samples. Multiple phenotypes can share their genetic component; therefore, modeling these phenotypes jointly may improve prediction accuracy by leveraging effects that are shared across phenotypes. However, effects can be shared across phenotypes in a variety of ways, so computationally efficient statistical methods are needed that can accurately and flexibly capture patterns of effect sharing. Here, we describe new Bayesian multivariate, multiple regression methods that, by using flexible priors, are able to model and adapt to different patterns of effect sharing and specificity across phenotypes. Simulation results show that these new methods are fast and improve prediction accuracy compared with existing methods in a wide range of settings where effects are shared. Further, in settings where effects are not shared, our methods still perform competitively with state-of-the-art methods. In real data analyses of expression data in the Genotype-Tissue Expression (GTEx) project, our methods improve prediction performance on average for all tissues, with the greatest gains in tissues where effects are strongly shared, and in the tissues with smaller sample sizes. While we use gene expression prediction to illustrate our methods, the methods are generally applicable to any multi-phenotype applications, including prediction of polygenic scores and breeding values. Thus, our methods have the potential to provide improvements across fields and organisms.
Replication and discovery of musculoskeletal QTLs in LG/J and SM/J advanced intercross lines
AR056280 awarded to DAB and AL. AIHC supported by IMS and Elphinstone Scholarship from the University of Aberdeen. GRV supported by Medical Research Scotland (Vac-929-2016). Peer reviewed.