
Improved Imputation of Common and Uncommon Single Nucleotide Polymorphisms (SNPs) with a New Reference Set

Author: Amy Hutchinson
Ann W. Hsing
Brian E. Henderson
Charles C. Chung
Christopher M. Haiman
Daniel Stram
Demetrius Albanes
Jarmo Virtamo
Jennifer L. Stone
Joshua Sampson
Kevin Jacobs
Lauren R. Teras
Margaret Tucker
Mark P. Purdue
Meredith Yeager
Michael A. Eberle
Nilanjan Chatterjee
Phil Taylor
Sonja I. Berndt
Stephen Chanock
Susan M. Gabstur
W. Ryan Diver
Xiang Deng
Zhaoming Wang
Publication date: 07/11/2011

Statistical imputation of genotype data is an important technique for analysis of genome-wide association studies (GWAS). We have built a reference dataset to improve imputation accuracy for studies of individuals of primarily European descent using genotype data from the Hap1, Omni1, and Omni2.5 human SNP arrays (Illumina). Our dataset contains 2.5-3.1 million variants for 930 European, 157 Asian, and 162 African/African-American individuals. Imputation accuracy of European data from Hap660 or OmniExpress array content, measured by the proportion of variants imputed with R^2 > 0.8, improved by 34%, 23% and 12% for variants with MAF of 3%, 5% and 10%, respectively, compared to imputation using publicly available data from the 1000 Genomes and International HapMap projects. The improved accuracy with the use of the new dataset could increase the power for GWAS by as much as 8% relative to genotyping all variants. This reference dataset is available to the scientific community through the NCBI dbGaP portal. Future versions will include additional genotype data as well as non-European populations.
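The accuracy measure named in this abstract, the proportion of variants imputed with R^2 above 0.8, can be illustrated with a minimal sketch. The function name and the example values below are hypothetical and are not taken from the study.

```python
def proportion_well_imputed(r2_values, threshold=0.8):
    """Return the fraction of variants whose imputation R^2 exceeds the threshold."""
    if not r2_values:
        raise ValueError("need at least one variant")
    return sum(r2 > threshold for r2 in r2_values) / len(r2_values)


# Hypothetical per-variant R^2 values (illustration only, not study data).
example_r2 = [0.95, 0.85, 0.60, 0.81, 0.40]
share = proportion_well_imputed(example_r2)
print(f"{share:.0%} of variants imputed with R^2 > 0.8")
```

In practice the same calculation would be run separately within each minor allele frequency (MAF) bin, since the abstract reports the improvement per MAF level.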

Crossref

Nature Precedings

The Population Genetic Signature of Polygenic Local Adaptation

Author: Berg Jeremy J.
Coop Graham
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

Adaptation in response to selection on polygenic phenotypes may occur via subtle allele frequency shifts at many loci. Current population genomic techniques are not well posed to identify such signals. In the past decade, detailed knowledge about the specific loci underlying polygenic traits has begun to emerge from genome-wide association studies (GWAS). Here we combine this knowledge from GWAS with robust population genetic modeling to identify traits that may have been influenced by local adaptation. We exploit the fact that GWAS provide an estimate of the additive effect size of many loci to estimate the mean additive genetic value for a given phenotype across many populations as simple weighted sums of allele frequencies. We first describe a general model of neutral genetic value drift for an arbitrary number of populations with an arbitrary relatedness structure. Based on this model we develop methods for detecting unusually strong correlations between genetic values and specific environmental variables, as well as a generalization of Q_{ST}/F_{ST} comparisons to test for over-dispersion of genetic values among populations. Finally we lay out a framework to identify the individual populations or groups of populations that contribute to the signal of overdispersion. These tests have considerably greater power than their single locus equivalents due to the fact that they look for positive covariance between like effect alleles, and also significantly outperform methods that do not account for population structure. We apply our tests to the Human Genome Diversity Panel (HGDP) dataset using GWAS data for height, skin pigmentation, type 2 diabetes, body mass index, and two inflammatory bowel disease datasets. This analysis uncovers a number of putative signals of local adaptation, and we discuss the biological interpretation and caveats of these results.
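The abstract describes estimating a population's mean additive genetic value as a weighted sum of allele frequencies, with GWAS effect sizes as the weights. A minimal sketch follows. The factor of 2 (for diploid individuals), the function name, and all numbers are assumptions made for illustration and are not stated in the abstract.

```python
def genetic_values(effect_sizes, allele_freqs_by_population):
    """Estimate one genetic value per population.

    Each value is 2 * sum(effect_size * allele_frequency) over loci.
    The factor 2 assumes diploid individuals (an assumption of this sketch).
    """
    values = []
    for freqs in allele_freqs_by_population:
        if len(freqs) != len(effect_sizes):
            raise ValueError("each population needs one frequency per locus")
        values.append(2.0 * sum(a * p for a, p in zip(effect_sizes, freqs)))
    return values


# Hypothetical effect sizes for three loci and frequencies in two populations.
effects = [0.5, -0.25, 0.1]
freqs_by_population = [[0.2, 0.4, 0.5], [0.6, 0.4, 0.1]]
values = genetic_values(effects, freqs_by_population)
print(values)
```

The tests described in the abstract then ask whether the spread of such values across populations exceeds what neutral drift would produce. This sketch covers only the weighted-sum step.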

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

The Francis Crick Institute

Recommended from our members

GenEpi: gene-based epistasis discovery using machine learning.

Author: Alzheimer’s Disease Neuroimaging Initiative
Chang Yu-Chuan
Chen Chien-Yu
Giacomini Kathleen M
Hong Ming-Yi
Hsieh Ping-Han
Oyang Yen-Jen
Tung Yi-An
Wu June-Tai
Yee Sook Wah
Publication venue: eScholarship, University of California
Publication date: 01/02/2020

Background: Genome-wide association studies (GWAS) provide a powerful means to identify associations between genetic variants and phenotypes. However, GWAS techniques for detecting epistasis, the interactions between genetic variants associated with phenotypes, are still limited. We believe that developing an efficient and effective GWAS method to detect epistasis will be key to discovering sophisticated pathogenesis, which is especially important for complex diseases such as Alzheimer's disease (AD). Results: This study presents GenEpi, a computational package to uncover epistasis associated with phenotypes by the proposed machine learning approach. GenEpi identifies both within-gene and cross-gene epistasis through a two-stage modeling workflow. In both stages, GenEpi adopts two-element combinatorial encoding when producing features and constructs the prediction models by L1-regularized regression with stability selection. The simulated data showed that GenEpi outperforms other widely-used methods on detecting the ground-truth epistasis. For real data, this study uses AD as an example to reveal the capability of GenEpi in finding disease-related variants and variant interactions that show both biological meaning and predictive power. Conclusions: The results on simulation data and AD demonstrated that GenEpi has the ability to detect the epistasis associated with phenotypes effectively and efficiently. The released package can be generalized to greatly facilitate the studies of many complex diseases in the near future.

eScholarship - University of California

Association Signals Unveiled by a Comprehensive Gene Set Enrichment Analysis of Dental Caries Genome-Wide Association Studies

Author: Cuenco Karen T.
Feingold Eleanor
Jia Peilin
Marazita Mary L.
Wang Kai
Wang Lily
Wang Quan
Zeng Zhen
Zhao Zhongming
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

Gene set-based analysis of genome-wide association study (GWAS) data has recently emerged as a useful approach to examine the joint effects of multiple risk loci in complex human diseases or phenotypes. Dental caries is a common, chronic, and complex disease leading to a decrease in quality of life worldwide. In this study, we applied the approaches of gene set enrichment analysis to a major dental caries GWAS dataset, which consists of 537 cases and 605 controls. Using four complementary gene set analysis methods, we analyzed 1331 Gene Ontology (GO) terms collected from the Molecular Signatures Database (MSigDB). Setting the false discovery rate (FDR) threshold at 0.05, we identified 13 significantly associated GO terms. Additionally, 17 terms were further included as marginally associated because they were top ranked by each method, although their FDR was higher than 0.05. In total, we identified 30 promising GO terms, including 'Sphingoid metabolic process,' 'Ubiquitin protein ligase activity,' 'Regulation of cytokine secretion,' and 'Ceramide metabolic process.' These GO terms encompass broad functions that potentially interact and contribute to the oral immune response related to caries development, which have not been reported in the standard single marker based analysis. Collectively, our gene set enrichment analysis provided complementary insights into the molecular mechanisms and polygenic interactions in dental caries, revealing promising association signals that could not be detected through single marker analysis of GWAS data. © 2013 Wang et al.
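The abstract applies an FDR threshold of 0.05 to gene set p-values but does not say which FDR procedure was used. As an illustration only, the sketch below uses the Benjamini-Hochberg step-up procedure, which is one common choice and an assumption here. The p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices of hypotheses rejected at FDR level alpha.

    Benjamini-Hochberg step-up: find the largest rank k such that the
    k-th smallest p-value is <= alpha * k / m, then reject the k smallest.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical p-values for five gene sets (illustration only).
example_p = [0.001, 0.045, 0.012, 0.2, 0.025]
significant = benjamini_hochberg(example_p)
print(significant)
```

With 1331 GO terms, as in the study, the per-rank cutoffs become much stricter than in this five-term example, because each cutoff scales with rank divided by the number of tests.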

Crossref

Directory of Open Access Journals

PubMed Central

University of Miami: Scholarship@Miami

D-Scholarship@Pitt

The Francis Crick Institute

Genome-wide screening for DNA variants associated with reading and language traits

Author: Bates Timothy C
Brandler William M
DeFries John C
Evans David M
Fisher Simon E
Francks Clyde
Gialluisi Alessandro
Luciano Michelle
Monaco Anthony P
Newbury Dianne F
Olson Richard K
Paracchini Silvia
Pennington Bruce F
Scerri Thomas S
Simpson Nuala H
Smith Shelley D
Stein John F
Talcott Joel B
The SLI Consortium
Wilcutt Erik G
Publication venue: 'Wiley'
Publication date: 15/09/2014

This research was funded by: Max Planck Society, the University of St Andrews - Grant Number: 018696, US National Institutes of Health - Grant Number: P50 HD027802, Wellcome Trust - Grant Number: 090532/Z/09/Z, and Medical Research Council Hub Grant - Grant Number: G0900747 91070

Reading and language abilities are heritable traits that are likely to share some genetic influences with each other. To identify pleiotropic genetic variants affecting these traits, we first performed a genome-wide association scan (GWAS) meta-analysis using three richly characterized datasets comprising individuals with histories of reading or language problems, and their siblings. GWAS was performed in a total of 1862 participants using the first principal component computed from several quantitative measures of reading- and language-related abilities, both before and after adjustment for performance IQ. We identified novel suggestive associations at the SNPs rs59197085 and rs5995177 (uncorrected P ≈ 10^{-7} for each SNP), located respectively at the CCDC136/FLNC and RBFOX2 genes. Each of these SNPs then showed evidence for effects across multiple reading and language traits in univariate association testing against the individual traits. FLNC encodes a structural protein involved in cytoskeleton remodelling, while RBFOX2 is an important regulator of alternative splicing in neurons. The CCDC136/FLNC locus showed association with a comparable reading/language measure in an independent sample of 6434 participants from the general population, although involving distinct alleles of the associated SNP. Our datasets will form an important part of on-going international efforts to identify genes contributing to reading and language skills.

University of St. Andrews - Pure

St Andrews Research Repository

Accurate Genomic Prediction Of Human Height

Author: Avery Steven G.
Campos Gustavo de los
Hsu Stephen D. H.
Lello Louis
Tellier Laurent
Vazquez Ana
Publication date: 07/10/2017

We construct genomic predictors for heritable and extremely complex human quantitative traits (height, heel bone density, and educational attainment) using modern methods in high dimensional statistics (i.e., machine learning). Replication tests show that these predictors capture, respectively, ~40, 20, and 9 percent of total variance for the three traits. For example, predicted heights correlate ~0.65 with actual height; actual heights of most individuals in validation samples are within a few cm of the prediction. The variance captured for height is comparable to the estimated SNP heritability from GCTA (GREML) analysis, and seems to be close to its asymptotic value (i.e., as sample size goes to infinity), suggesting that we have captured most of the heritability for the SNPs used. Thus, our results resolve the common SNP portion of the "missing heritability" problem -- i.e., the gap between prediction R-squared and SNP heritability. The ~20k activated SNPs in our height predictor reveal the genetic architecture of human height, at least for common SNPs. Our primary dataset is the UK Biobank cohort, comprised of almost 500k individual genotypes with multiple phenotypes. We also use other datasets and SNPs found in earlier GWAS for out-of-sample validation of our results.
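The abstract quotes two figures for height: a correlation of about 0.65 between predicted and actual height, and about 40 percent of variance captured. These agree if variance captured is taken as the squared correlation, which is the assumption behind this short check. The function name is hypothetical.

```python
def variance_explained(correlation):
    """Return the share of variance captured, taken as the squared correlation."""
    return correlation ** 2


# Correlation between predicted and actual height quoted in the abstract.
height_r2 = variance_explained(0.65)
print(f"Variance captured for height: {height_r2:.1%}")
```

The result, roughly 42 percent, is consistent with the "~40" percent quoted for height.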

arXiv.org e-Print Archive

Crossref

Sex-specific glioma genome-wide association study identifies new risk locus at 3p21.31 in females, and finds sex-differences in risk at 8q24.21

Author: Rubin Joshua B.
Publication venue: Digital Commons@Becker
Publication date: 01/01/2018

Digital Commons@Becker