    The landscape of tolerated genetic variation in humans and primates

    Personalized genome sequencing has revealed millions of genetic differences between individuals, but our understanding of their clinical relevance remains largely incomplete. To systematically decipher the effects of human genetic variants, we obtained whole-genome sequencing data for 809 individuals from 233 primate species and identified 4.3 million common protein-altering variants with orthologs in humans. We show that these variants can be inferred to have nondeleterious effects in humans based on their presence at high allele frequencies in other primate populations. We use this resource to classify 6% of all possible human protein-altering variants as likely benign and impute the pathogenicity of the remaining 94% of variants with deep learning, achieving state-of-the-art accuracy for diagnosing pathogenic variants in patients with genetic diseases.

    De novo and inherited variants in coding and regulatory regions in genetic cardiomyopathies

    BACKGROUND: Cardiomyopathies are a leading cause of progressive heart failure and sudden cardiac death; however, their genetic aetiology remains poorly understood. We hypothesised that variants in noncoding regulatory regions and oligogenic inheritance mechanisms may help close the diagnostic gap. METHODS: We first analysed whole-genome sequencing data of 143 parent-offspring trios from the Genomics England 100,000 Genomes Project. We used gene panel testing and a phenotype-based, variant prioritisation framework called Exomiser to identify candidate genes in trios. To assess the contribution of noncoding de novo variants (DNVs) to cardiomyopathies, we intersected DNVs with open chromatin sequences from single-cell ATAC-seq data of cardiomyocytes. We also performed a case-control analysis in an exome-negative cohort, including 843 probands and 19,467 controls, to assess the association between noncoding variants in known cardiomyopathy genes and disease. RESULTS: In the trio analysis, a definite or probable genetic diagnosis was identified in 21 probands according to the American College of Medical Genetics guidelines. We identified novel DNVs in diagnostic-grade genes (RYR2, TNNT2, PTPN11, MYH7, LZTR1, NKX2-5), and five cases harbouring a combination of prioritised variants, suggesting that oligogenic inheritance and genetic modifiers contribute to cardiomyopathies. Phenotype-based ranking of candidate genes identified in noncoding DNV analysis revealed JPH2 as the top candidate. Moreover, a case-control analysis revealed an enrichment of rare noncoding variants in regulatory elements of cardiomyopathy genes (p = 0.035, OR = 1.43, 95% CI = 1.095-1.767) versus controls. Of the 25 variants associated with disease (p < 0.5), 23 are novel and nine are predicted to disrupt transcription factor binding motifs. CONCLUSION: Our results highlight complex genetic mechanisms in cardiomyopathies and reveal novel genes for future investigations.

    Working toward precision medicine: Predicting phenotypes from exomes in the Critical Assessment of Genome Interpretation (CAGI) challenges

    Precision medicine aims to predict a patient's disease risk and best therapeutic options by using that individual's genetic sequencing data. The Critical Assessment of Genome Interpretation (CAGI) is a community experiment consisting of genotype–phenotype prediction challenges; participants build models, undergo assessment, and share key findings. For CAGI 4, three challenges involved using exome-sequencing data: Crohn's disease, bipolar disorder, and warfarin dosing. Previous CAGI challenges included prior versions of the Crohn's disease challenge. Here, we discuss the range of techniques used for phenotype prediction as well as the methods used for assessing predictive models. Additionally, we outline some of the difficulties associated with making predictions and evaluating them. The lessons learned from the exome challenges can be applied to both research and clinical efforts to improve phenotype prediction from genotype. In addition, these challenges serve as a vehicle for sharing clinical and research exome data in a secure manner with scientists who have a broad range of expertise, contributing to a collaborative effort to advance our understanding of genotype–phenotype relationships.

    Identification of constrained sequence elements across 243 primate genomes

    Noncoding DNA is central to our understanding of human gene regulation and complex diseases, and measuring the evolutionary sequence constraint can establish the functional relevance of putative regulatory elements in the human genome. Identifying the genomic elements that have become constrained specifically in primates has remained largely elusive due to the faster evolution of noncoding DNA compared to protein-coding DNA, the relatively short timescales separating primate species, and the previously limited availability of whole genome sequences. Here, we construct a whole genome alignment of 239 species, representing nearly half of all extant species in the primate order. Using this resource, we identified human regulatory elements under selective constraint across primates and other mammals at a 5% false discovery rate. We detect 111,318 DNase I hypersensitivity sites and 267,410 transcription factor binding sites that are constrained specifically in primates but not across other placental mammals and validate their cis-regulatory effects on gene expression. These regulatory elements are enriched for human genetic variants affecting gene expression and complex traits and diseases. Our results highlight the important role of recent evolution in regulatory sequence elements differentiating primates, including humans, from other placental mammals.