20 research outputs found

    Determinants of penetrance and variable expressivity in monogenic metabolic conditions across 77,184 exomes

    Hundreds of thousands of genetic variants have been reported to cause severe monogenic diseases, but the probability that a variant carrier develops the disease (termed penetrance) is unknown for virtually all of them. Additionally, the clinical utility of common polygenic variation remains uncertain. Using exome sequencing from 77,184 adult individuals (38,618 multi-ancestral individuals from a type 2 diabetes case-control study and 38,566 participants from the UK Biobank, for whom genotype array data were also available), we apply clinical standard-of-care gene variant curation for eight monogenic metabolic conditions. Rare variants causing monogenic diabetes and dyslipidemias display effect sizes significantly larger than the top 1% of the corresponding polygenic scores. Nevertheless, penetrance estimates for monogenic variant carriers average 60% or lower for most conditions. We assess epidemiologic and genetic factors contributing to risk prediction in monogenic variant carriers, demonstrating that inclusion of polygenic variation significantly improves biomarker estimation for two monogenic dyslipidemias.

    The landscape of tolerated genetic variation in humans and primates


    Determinants of penetrance and variable expressivity in monogenic metabolic conditions across 77,184 exomes

    The penetrance of variants in monogenic disease and the clinical utility of common polygenic variation have not been well explored on a large scale. Here, the authors use exome sequencing data from 77,184 individuals to generate penetrance estimates and assess the utility of polygenic variation in risk prediction for carriers of monogenic variants.

    The landscape of tolerated genetic variation in humans and primates.

    Personalized genome sequencing has revealed millions of genetic differences between individuals, but our understanding of their clinical relevance remains largely incomplete. To systematically decipher the effects of human genetic variants, we obtained whole-genome sequencing data for 809 individuals from 233 primate species and identified 4.3 million common protein-altering variants with orthologs in humans. We show that these variants can be inferred to have nondeleterious effects in humans based on their presence at high allele frequencies in other primate populations. We use this resource to classify 6% of all possible human protein-altering variants as likely benign and impute the pathogenicity of the remaining 94% of variants with deep learning, achieving state-of-the-art accuracy for diagnosing pathogenic variants in patients with genetic diseases.

    A novel DDB2 mutation causes defective recognition of UV-induced DNA damages and prevalent equine squamous cell carcinoma

    Squamous cell carcinoma (SCC) occurs frequently in the human Xeroderma Pigmentosum (XP) syndrome, which is characterized by deficient UV-damage repair. SCC is the most common equine ocular cancer, and the only associated genetic risk factor is a UV-damage repair protein. Specifically, a missense mutation in horse DDB2 (T338M) was strongly associated with both limbal SCC and third eyelid SCC in three breeds of horses (Haflinger, Belgian, and Rocky Mountain Horses) and was hypothesized to impair binding to UV-damaged DNA. Here, we investigate the capacity of the DDB2-T338M mutant to recognize UV lesions in vitro and in vivo, together with the human XP mutants DDB2-R273H and -K244E. We show that recombinant DDB2-T338M assembles with DDB1 but fails to show any detectable binding to DNA substrates with or without UV lesions, due to a potential structural disruption of the rigid DNA recognition β-loop. Consistently, we demonstrate that cellular DDB2-T338M is defective in its recruitment to focally irradiated DNA damage and in its access to chromatin. Thus, we provide direct functional evidence indicating that DDB2-T338M recapitulates the molecular defects of human XP mutants and is the causal loss-of-function allele that gives rise to equine ocular SCCs. Our findings shed new light on the mechanism of DNA recognition by UV-DDB and on the initiation of ocular malignancy.

    Variant interpretation using population databases: Lessons from gnomAD

    Reference population databases are an essential tool in variant and gene interpretation. Their use guides the identification of pathogenic variants amidst the sea of benign variation present in every human genome, and supports the discovery of new disease–gene relationships. The Genome Aggregation Database (gnomAD) is currently the largest and most widely used publicly available collection of population variation from harmonized sequencing data. The data is available through the online gnomAD browser (https://gnomad.broadinstitute.org/), which enables rapid and intuitive variant analysis. This review provides guidance on the content of the gnomAD browser, and its usage for variant and gene interpretation. We introduce key features including allele frequency, per‐base expression levels, constraint scores, and variant co‐occurrence, alongside guidance on how to use these in analysis, with a focus on the interpretation of candidate variants and novel genes in rare disease. Reference population databases are critical in the interpretation of genomic variation for diagnosing rare disease, and support the discovery of new disease–gene relationships. This review provides guidance for using the Genome Aggregation Database (gnomAD) browser and key features like allele frequency, per‐base expression levels, constraint scores, and variant co‐occurrence, for variant and gene interpretation in clinical and research analysis.

    Recommendations for clinical interpretation of variants found in non-coding regions of the genome.

    Background: The majority of clinical genetic testing focuses almost exclusively on regions of the genome that directly encode proteins. The important role of variants in non-coding regions in penetrant disease is, however, increasingly being demonstrated, and the use of whole genome sequencing in clinical diagnostic settings is rising across a large range of genetic disorders. Despite this, there is no existing guidance on how current guidelines designed primarily for variants in protein-coding regions should be adapted for variants identified in other genomic contexts. Methods: We convened a panel of nine clinical and research scientists with wide-ranging expertise in clinical variant interpretation, with specific experience in variants within non-coding regions. This panel discussed and refined an initial draft of the guidelines, which were then extensively tested and reviewed by external groups. Results: We discuss considerations specifically for variants in non-coding regions of the genome. We outline how to define candidate regulatory elements, highlight examples of mechanisms through which non-coding region variants can lead to penetrant monogenic disease, and outline how existing guidelines can be adapted for the interpretation of these variants. Conclusions: These recommendations aim to increase the number and range of non-coding region variants that can be clinically interpreted, which, together with a compatible phenotype, can lead to new diagnoses and catalyse the discovery of novel disease mechanisms.

    Transcript expression-aware annotation improves rare variant interpretation

    The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD), we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the ‘proportion expressed across transcripts’, which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype Tissue Expression (GTEx) project and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have similar effect sizes to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases. Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies.