
    Ultra-Rare Genetic Variation in the Epilepsies : A Whole-Exome Sequencing Study of 17,606 Individuals

    Sequencing-based studies have identified novel risk genes associated with severe epilepsies and revealed an excess of rare deleterious variation in less-severe forms of epilepsy. To identify the shared and distinct ultra-rare genetic risk factors for different types of epilepsies, we performed a whole-exome sequencing (WES) analysis of 9,170 epilepsy-affected individuals and 8,436 controls of European ancestry. We focused on three phenotypic groups: severe developmental and epileptic encephalopathies (DEEs), genetic generalized epilepsy (GGE), and non-acquired focal epilepsy (NAFE). We observed that compared to controls, individuals with any type of epilepsy carried an excess of ultra-rare, deleterious variants in constrained genes and in genes previously associated with epilepsy; we saw the strongest enrichment in individuals with DEEs and the least strong in individuals with NAFE. Moreover, we found that inhibitory GABA(A) receptor genes were enriched for missense variants across all three classes of epilepsy, whereas no enrichment was seen in excitatory receptor genes. The larger gene groups for the GABAergic pathway or cation channels also showed a significant mutational burden in DEEs and GGE. Although no single gene surpassed exome-wide significance among individuals with GGE or NAFE, highly constrained genes and genes encoding ion channels were among the lead associations; such genes included CACNA1G, EEF1A2, and GABRG2 for GGE and LGI1, TRIM3, and GABRG2 for NAFE. Our study, the largest epilepsy WES study to date, confirms a convergence in the genetics of severe and less-severe epilepsies associated with ultra-rare coding variation, and it highlights a ubiquitous role for GABAergic inhibition in epilepsy etiology.

    Transcript expression-aware annotation improves rare variant interpretation.

    The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD)¹, we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the 'proportion expressed across transcripts', which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype-Tissue Expression (GTEx) project² and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have similar effect sizes to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases. Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies.

    The ExAC browser: displaying reference data information from over 60 000 exomes

    Worldwide, hundreds of thousands of humans have had their genomes or exomes sequenced, and access to the resulting data sets can provide valuable information for variant interpretation and understanding gene function. Here, we present a lightweight, flexible browser framework to display large population datasets of genetic variation. We demonstrate its use for exome sequence data from 60 706 individuals in the Exome Aggregation Consortium (ExAC). The ExAC browser provides gene- and transcript-centric displays of variation, a critical view for clinical applications. Additionally, we provide a variant display, which includes population frequency and functional annotation data as well as short read support for the called variant. This browser is open-source, freely available at http://exac.broadinstitute.org, and has already been used extensively by clinical laboratories worldwide.

    The mutational constraint spectrum quantified from variation in 141,456 humans

    Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes critical for an organism’s function will be depleted for such variants in natural populations, while non-essential genes will tolerate their accumulation. However, predicted loss-of-function (pLoF) variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes¹. Here, we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence pLoF variants in this cohort after filtering for sequencing and annotation artifacts. Using an improved human mutation rate model, we classify human protein-coding genes along a spectrum representing tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve gene discovery power for both common and rare diseases.