Determinants of penetrance and variable expressivity in monogenic metabolic conditions across 77,184 exomes
Penetrance of variants in monogenic disease and clinical utility of common polygenic variation have not been well explored on a large scale. Here, the authors use exome sequencing data from 77,184 individuals to generate penetrance estimates and assess the utility of polygenic variation in risk prediction of monogenic variants.
Additional Evidence for DDB2 T338M as a Genetic Risk Factor for Ocular Squamous Cell Carcinoma in Horses.
Squamous cell carcinoma (SCC) is the most common periocular cancer in horses and the second most common tumor of the horse overall. A missense mutation in damage-specific DNA-binding protein 2 (DDB2, c.1012 C>T, p.Thr338Met) was previously found to be strongly associated with ocular SCC in Haflinger and Belgian horses, explaining 76% of cases across both breeds. To determine if this same variant in DDB2 contributes to risk for ocular SCC in the Arabian, Appaloosa, and Percheron breeds and to determine if the variant contributes to risk for oral or urogenital SCC, histologically confirmed SCC cases were genotyped for the DDB2 variant and associations were investigated. Horses with urogenital SCC that were heterozygous for the DDB2 risk allele were identified in the Appaloosa breed, but a significant association between the DDB2 variant and SCC occurring at any location in this breed was not detected. The risk allele was not identified in Arabians, and no Percherons were homozygous for the risk allele. High-throughput sequencing data from six Haflingers were analyzed to ascertain if any other variant from the previously associated 483 kb locus on ECA12 was more concordant with the SCC phenotype than the DDB2 variant. Sixty polymorphisms were prioritized for evaluation, and no other variant from this locus explained the genetic risk better than the DDB2 allele (P = 3.39 × 10^-17, n = 118). These data provide further support of the DDB2 variant contributing to risk for ocular SCC, specifically in the Haflinger and Belgian breeds.
Transcript expression-aware annotation improves rare variant interpretation
The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD), we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the ‘proportion expressed across transcripts’, which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype Tissue Expression (GTEx) project and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have similar effect sizes to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases.
Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies.
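The abstract above names the 'proportion expressed across transcripts' metric but does not give its formula. The following is a minimal sketch of one plausible reading for a single tissue: the share of a gene's total transcript expression that comes from transcripts on which the variant carries a given annotation. The function name `proportion_expressed`, the transcript IDs, and the expression values are all hypothetical and are not taken from the paper or from any released tool.

```python
from typing import Dict, Optional, Set


def proportion_expressed(
    transcript_expression: Dict[str, float],
    annotated_transcripts: Set[str],
) -> Optional[float]:
    """Share of a gene's expression carried by transcripts hit by the variant.

    transcript_expression: expression level per transcript of one gene in one
        tissue (hypothetical values, e.g. TPM).
    annotated_transcripts: transcripts on which the variant has the annotation
        of interest (e.g. pLoF).
    Returns None when the gene is not expressed in this tissue.
    """
    total = sum(transcript_expression.values())
    if total == 0:
        return None
    affected = sum(
        level
        for transcript, level in transcript_expression.items()
        if transcript in annotated_transcripts
    )
    return affected / total


# Hypothetical gene with one dominant and one minor isoform.
expression = {"T1": 9.0, "T2": 1.0}

# A pLoF variant that only falls on the minor isoform scores low.
minor_only = proportion_expressed(expression, {"T2"})

# A pLoF variant on both isoforms scores 1.0.
both = proportion_expressed(expression, {"T1", "T2"})
```

Under this reading, a variant annotated as pLoF only on a weakly expressed isoform gets a low score, which is the kind of variant the abstract describes filtering out.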
The mutational constraint spectrum quantified from variation in 141,456 humans
Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.
Unique Capabilities of Genome Sequencing for Rare Disease Diagnosis
Background: Causal variants underlying rare disorders may remain elusive even after expansive gene panels or exome sequencing (ES). Clinicians and researchers may then turn to genome sequencing (GS), though the added value of this technique and its optimal use remain poorly defined. We therefore investigated the advantages of GS within a phenotypically diverse cohort. Methods: GS was performed for 744 individuals with rare disease who were genetically undiagnosed. Analysis included review of single nucleotide, indel, structural, and mitochondrial variants. Results: We successfully solved 218/744 (29.3%) cases using GS, with most solves involving established disease genes (157/218, 72.0%). Of all solved cases, 148 (67.9%) had previously had non-diagnostic ES. We systematically evaluated the 218 causal variants for features requiring GS to identify and 61/218 (28.0%) met these criteria, representing 8.2% of the entire cohort. These included small structural variants (13), copy neutral inversions and complex rearrangements (8), tandem repeat expansions (6), deep intronic variants (15), and coding variants that may be more easily found using GS related to uniformity of coverage (19). Conclusion: We describe the diagnostic yield of GS in a large and diverse cohort, illustrating several types of pathogenic variation eluding ES or other techniques. Our results reveal a higher diagnostic yield of GS, supporting the utility of a genome-first approach, with consideration of GS as a secondary or tertiary test when higher-resolution structural variant analysis is needed or there is a strong clinical suspicion for a condition and prior targeted genetic testing has been negative.
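The percentages reported in this abstract follow directly from the stated counts. A short check, using only numbers given in the abstract (the variable names are illustrative):

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * numerator / denominator, 1)


cohort = 744              # individuals sequenced
solved = 218              # cases solved by GS
established_genes = 157   # solves in established disease genes
prior_es = 148            # solved cases with previous non-diagnostic ES

# Counts per category of variant judged to require GS, as listed in the abstract.
gs_required_by_type = {
    "small structural variants": 13,
    "copy neutral inversions and complex rearrangements": 8,
    "tandem repeat expansions": 6,
    "deep intronic variants": 15,
    "coding variants aided by uniform coverage": 19,
}
gs_required = sum(gs_required_by_type.values())

yield_overall = pct(solved, cohort)
share_established = pct(established_genes, solved)
share_prior_es = pct(prior_es, solved)
share_gs_required = pct(gs_required, solved)
cohort_gs_required = pct(gs_required, cohort)
```

The five category counts sum to the 61 variants the abstract reports as requiring GS, and each ratio reproduces the published percentage.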
Author Correction: The mutational constraint spectrum quantified from variation in 141,456 humans (Nature, (2020), 581, 7809, (434-443), 10.1038/s41586-020-2308-7)
10.1038/s41586-020-03174-8; Nature 590, 784
Determinants of penetrance and variable expressivity in monogenic metabolic conditions across 77,184 exomes
Hundreds of thousands of genetic variants have been reported to cause severe monogenic diseases, but the probability that a variant carrier develops the disease (termed penetrance) is unknown for virtually all of them. Additionally, the clinical utility of common polygenic variation remains uncertain. Using exome sequencing from 77,184 adult individuals (38,618 multi-ancestral individuals from a type 2 diabetes case-control study and 38,566 participants from the UK Biobank, for whom genotype array data were also available), we apply clinical standard-of-care gene variant curation for eight monogenic metabolic conditions. Rare variants causing monogenic diabetes and dyslipidemias display effect sizes significantly larger than the top 1% of the corresponding polygenic scores. Nevertheless, penetrance estimates for monogenic variant carriers average 60% or lower for most conditions. We assess epidemiologic and genetic factors contributing to risk prediction in monogenic variant carriers, demonstrating that inclusion of polygenic variation significantly improves biomarker estimation for two monogenic dyslipidemias.
Author Correction: Transcript expression-aware annotation improves rare variant interpretation
In this Article, author Marquis P. Vawter was missing from the Genome Aggregation Database Consortium list. They are associated with the affiliation: ‘Department of Psychiatry & Human Behavior, University of California Irvine, Irvine, CA, USA’, and contributed to the generation of the primary data incorporated into the gnomAD resource. The original Article has been corrected online