
    REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants

    Supplemental Data: Supplemental Data include one figure and five tables and can be found with this article online at http://dx.doi.org/10.1016/j.ajhg.2016.08.016.
    Web Resources: ClinVar, https://www.ncbi.nlm.nih.gov/clinvar/; dbNSFP, https://sites.google.com/site/jpopgen/dbNSFP; Human Gene Mutation Database, http://www.hgmd.cf.ac.uk/; REVEL, https://sites.google.com/site/revelgenomics/; SwissVar, http://swissvar.expasy.org/
    The vast majority of coding variants are rare, and assessment of the contribution of rare variants to complex traits is hampered by low statistical power and limited functional data. Improved methods for predicting the pathogenicity of rare coding variants are needed to facilitate the discovery of disease variants from exome sequencing studies. We developed REVEL (rare exome variant ensemble learner), an ensemble method for predicting the pathogenicity of missense variants on the basis of individual tools: MutPred, FATHMM, VEST, PolyPhen, SIFT, PROVEAN, MutationAssessor, MutationTaster, LRT, GERP, SiPhy, phyloP, and phastCons. REVEL was trained with recently discovered pathogenic and rare neutral missense variants, excluding those previously used to train its constituent tools. When applied to two independent test sets, REVEL had the best overall performance (p < 10^−12) as compared to any individual tool and seven ensemble methods: MetaSVM, MetaLR, KGGSeq, Condel, CADD, DANN, and Eigen. Importantly, REVEL also had the best performance for distinguishing pathogenic from rare neutral variants with allele frequencies <0.5%. Compared with the other ensemble methods, the area under the receiver operating characteristic curve (AUC) for REVEL was 0.046–0.182 higher in an independent test set of 935 recent SwissVar disease variants and 123,935 putatively neutral exome sequencing variants, and 0.027–0.143 higher in an independent test set of 1,953 pathogenic and 2,406 benign variants recently reported in ClinVar. We provide pre-computed REVEL scores for all possible human missense variants to facilitate the identification of pathogenic variants in the sea of rare variants discovered as sequencing studies expand in scale.
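The AUC differences quoted in the REVEL abstract can be read as differences in the probability that a randomly chosen pathogenic variant receives a higher score than a randomly chosen neutral one. The following is a minimal sketch of that computation, not the authors' evaluation code. The function name `auc` and all scores below are hypothetical values chosen for illustration only.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (pathogenic, neutral) pairs in which the
    pathogenic variant scores higher; ties count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores (not taken from the paper).
ensemble_pathogenic = [0.9, 0.8, 0.4]
single_tool_pathogenic = [0.9, 0.5, 0.2]
neutral = [0.7, 0.3, 0.2, 0.1]

ensemble_auc = auc(ensemble_pathogenic, neutral)
single_tool_auc = auc(single_tool_pathogenic, neutral)
auc_gain = ensemble_auc - single_tool_auc
```

The pairwise loop is quadratic in the number of variants, which is fine for a toy example; for test sets as large as those described above, a rank-based formulation would be the usual choice.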

    Tissue-specific impacts of aging and genetics on gene expression patterns in humans.

    Age is the primary risk factor for many common human diseases. Here, we quantify the relative contributions of genetics and aging to gene expression patterns across 27 tissues from 948 humans. We show that the predictive power of expression quantitative trait loci is impacted by age in many tissues. Jointly modelling the contributions of age and genetics to transcript level variation, we find expression heritability (h²) is consistent among tissues while the contribution of aging varies by >20-fold with [Formula: see text] in 5 tissues. We find that while the force of purifying selection is stronger on genes expressed early versus late in life (Medawar's hypothesis), several highly proliferative tissues exhibit the opposite pattern. These non-Medawarian tissues exhibit high rates of cancer and age-of-expression-associated somatic mutations. In contrast, genes under genetic control are under relaxed constraint. Together, we demonstrate the distinct roles of aging and genetics on expression phenotypes.

    FIRE: functional inference of genetic variants that regulate gene expression

    MOTIVATION: Interpreting genetic variation in noncoding regions of the genome is an important challenge for personal genome analysis. One mechanism by which noncoding single nucleotide variants (SNVs) influence downstream phenotypes is through the regulation of gene expression. Methods to predict whether or not individual SNVs are likely to regulate gene expression would aid interpretation of variants of unknown significance identified in whole-genome sequencing studies. RESULTS: We developed FIRE (Functional Inference of Regulators of Expression), a tool to score both noncoding and coding SNVs based on their potential to regulate the expression levels of nearby genes. FIRE consists of 23 random forests trained to recognize SNVs in cis-expression quantitative trait loci (cis-eQTLs) using a set of 92 genomic annotations as predictive features. FIRE scores discriminate cis-eQTL SNVs from non-eQTL SNVs in the training set with a cross-validated area under the receiver operating characteristic curve (AUC) of 0.807, and discriminate cis-eQTL SNVs shared across six populations of different ancestry from non-eQTL SNVs with an AUC of 0.939. FIRE scores are also predictive of cis-eQTL SNVs across a variety of tissue types. AVAILABILITY AND IMPLEMENTATION: FIRE scores for genome-wide SNVs in hg19/GRCh37 are available for download at https://sites.google.com/site/fireregulatoryvariation/. CONTACT: [email protected]. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

    Identification of Susceptibility Loci for Cutaneous Squamous Cell Carcinoma

    We report a genome-wide association study (GWAS) of cutaneous squamous cell carcinoma (SCC) conducted among non-Hispanic white (NHW) members of the Kaiser Permanente Northern California (KPNC) health care system. The study includes a genome-wide screen of 61,457 members (6,891 cases and 54,566 controls) genotyped on the Affymetrix Axiom European array and a replication phase involving an independent set of 6,410 additional members (810 cases and 5,600 controls). Combined analysis of screening and replication phases identified ten loci containing single-nucleotide polymorphisms (SNPs) with P-values < 5×10^−8. Six loci contain genes in the pigmentation pathway; SNPs at these loci appear to modulate SCC risk independently of the pigmentation phenotypes. Another locus contains HLA class II genes studied in relation to elevated SCC risk following immunosuppression. SNPs at the remaining three loci include an intronic SNP in FOXP1 at locus 3p13, an intergenic SNP at 3q28 near TP63, and an intergenic SNP at 9p22 near BNC2. These findings provide insights into the genetic factors accounting for inherited SCC susceptibility.
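The P-value cutoff of 5×10^−8 in the SCC abstract is the conventional genome-wide significance threshold, commonly motivated as a Bonferroni correction of 0.05 for roughly one million independent tests. The sketch below shows how such a filter might be applied; the function name, SNP labels, and P-values are hypothetical and are not taken from the study.

```python
import math

# Conventional genome-wide significance threshold quoted in the abstract.
GENOME_WIDE_ALPHA = 5e-8

# Common rationale (an assumption here, not stated in the abstract):
# Bonferroni correction of 0.05 across about one million independent tests.
bonferroni_alpha = 0.05 / 1_000_000


def genome_wide_significant(p_values, alpha=GENOME_WIDE_ALPHA):
    """Return only the SNPs whose P-value is strictly below alpha."""
    return {snp: p for snp, p in p_values.items() if p < alpha}


# Hypothetical association results, for illustration only.
example_p_values = {"snp_a": 3e-9, "snp_b": 4e-7, "snp_c": 5e-8}
hits = genome_wide_significant(example_p_values)
```

The comparison is strict, so a SNP sitting exactly at the threshold is not reported, matching the "<" in the abstract.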