22 research outputs found
Autism Spectrum Disorder Genetics: Diverse Genes with Diverse Clinical Outcomes
The last several years have seen unprecedented advances in deciphering the genetic etiology of autism spectrum disorders (ASDs). Heritability studies have repeatedly affirmed a contribution of genetic factors to the overall disease risk. Technical breakthroughs have enabled the search for these genetic factors via genome-wide surveys of a spectrum of potential sequence variations, from common single-nucleotide polymorphisms to essentially private chromosomal abnormalities. Studies of copy-number variation have identified significant roles for both recurrent and nonrecurrent large dosage imbalances, although they have rarely revealed the individual genes responsible. More recently, discoveries of rare point mutations and characterization of balanced chromosomal abnormalities have pinpointed individual ASD genes of relatively strong effect, including both loci with strong a priori biological relevance and those that would have otherwise been unsuspected as high-priority biological targets. Evidence has also emerged for association with many common variants, each adding a small individual contribution to ASD risk. These findings collectively provide compelling empirical data that the genetic basis of ASD is highly heterogeneous, with hundreds of genes capable of conferring varying degrees of risk, depending on their nature and the predisposing genetic alteration. Moreover, many genes that have been implicated in ASD also appear to be risk factors for related neurodevelopmental disorders, as well as for a spectrum of psychiatric phenotypes. While some ASD genes have evident functional significance, like synaptic proteins such as the SHANKs, neuroligins, and neurexins, as well as fragile X mental retardation–associated proteins, ASD genes have also been discovered that do not present a clear mechanism of specific neurodevelopmental dysfunction, such as regulators of chromatin modification and global gene expression.
In sum, progress from genetic studies to date has been remarkable and increasingly rapid, but the interactive impact of strong-effect genetic lesions coupled with weak-effect common polymorphisms has not yet led to a unified understanding of ASD pathogenesis or explained its highly variable clinical expression. With an increasingly firm genetic foundation, the coming years will hopefully see equally rapid advances in elucidating the functional consequences of ASD genes and their interactions with environmental/experiential factors, supporting the development of rational interventions.
The effect of LRRK2 loss-of-function variants in humans
Analysis of large genomic datasets, including gnomAD, reveals that partial LRRK2 loss of function is not strongly associated with diseases, serving as an example of how human genetics can be leveraged for target validation in drug discovery. Human genetic variants predicted to cause loss-of-function of protein-coding genes (pLoF variants) provide natural in vivo models of human gene inactivation and can be valuable indicators of gene function and the potential toxicity of therapeutic inhibitors targeting these genes(1,2). Gain-of-kinase-function variants in LRRK2 are known to significantly increase the risk of Parkinson's disease(3,4), suggesting that inhibition of LRRK2 kinase activity is a promising therapeutic strategy. While preclinical studies in model organisms have raised some on-target toxicity concerns(5-8), the biological consequences of LRRK2 inhibition have not been well characterized in humans. Here, we systematically analyze pLoF variants in LRRK2 observed across 141,456 individuals sequenced in the Genome Aggregation Database (gnomAD)(9), 49,960 exome-sequenced individuals from the UK Biobank and over 4 million participants in the 23andMe genotyped dataset. After stringent variant curation, we identify 1,455 individuals with high-confidence pLoF variants in LRRK2. Experimental validation of three variants, combined with previous work(10), confirmed reduced protein levels in 82.5% of our cohort. We show that heterozygous pLoF variants in LRRK2 reduce LRRK2 protein levels but that these are not strongly associated with any specific phenotype or disease state. Our results demonstrate the value of large-scale genomic databases and phenotyping of human loss-of-function carriers for target validation in drug discovery.
Correction: Insights into the genetic epidemiology of Crohn's and rare diseases in the Ashkenazi Jewish population
The data in the S2 Data File does not display correctly. Please view the correct S2 Data File below.
Analysis of protein-coding genetic variation in 60,706 humans
Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. We describe the aggregation and analysis of high-quality exome (protein-coding region) sequence data for 60,706 individuals of diverse ethnicities generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation, identifying 3,230 genes with near-complete depletion of truncating variants, 72% of which have no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human “knockout” variants in protein-coding genes.
Analysis of protein-coding genetic variation in 60,706 humans
Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.
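The filtering of candidate disease-causing variants mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the ExAC authors' pipeline: the function name `filter_rare_candidates`, the variant tuples, the allele frequencies, and the 0.1% cutoff are all hypothetical, and treating a variant that is absent from the reference set as having frequency zero is an assumption made here for simplicity.

```python
def filter_rare_candidates(candidates, reference_af, max_af=0.001):
    """Keep candidate variants that are rare in a reference population.

    candidates: list of (chrom, pos, ref, alt) tuples from a patient.
    reference_af: dict mapping the same tuples to reference allele frequency.
    max_af: hypothetical cutoff; a suitable value depends on the disease.
    Variants missing from the reference are treated as frequency 0.0
    (an assumption of this sketch).
    """
    return [v for v in candidates if reference_af.get(v, 0.0) <= max_af]


# Illustrative, made-up data.
candidates = [
    ("1", 1000, "A", "G"),
    ("2", 2000, "C", "T"),
    ("3", 3000, "G", "A"),
]
reference_af = {
    ("1", 1000, "A", "G"): 0.05,     # common in the reference: filtered out
    ("2", 2000, "C", "T"): 0.00002,  # rare: kept
}
kept = filter_rare_candidates(candidates, reference_af)
```

Here only the first variant is dropped, because it is too common in the reference population to plausibly cause a rare, severe disorder.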
ClinVar data parsing
This software repository provides a pipeline for converting raw ClinVar data files into analysis-friendly tab-delimited tables, and also provides these tables for the most recent ClinVar release. Separate tables are generated for genome builds GRCh37 and GRCh38 as well as for mono-allelic variants and complex multi-allelic variants. Additionally, the tables are augmented with allele frequencies from the ExAC and gnomAD datasets as these are often consulted when analyzing ClinVar variants. Overall, this work provides ClinVar data in a format that is easier to work with and can be directly loaded into a variety of popular analysis tools such as R, Python pandas, and SQL databases.
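As a rough illustration of loading a tab-delimited table into an SQL database, the sketch below uses only the Python standard library. The column names and rows are hypothetical placeholders and are not taken from the repository's actual output; the helper `load_tsv_into_sqlite` is likewise invented for this example.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows; the real tables' columns may differ.
SAMPLE_TSV = (
    "chrom\tpos\tref\talt\tclinical_significance\n"
    "1\t12345\tA\tG\tPathogenic\n"
    "2\t67890\tC\tT\tBenign\n"
    "X\t13579\tG\tA\tPathogenic\n"
)


def load_tsv_into_sqlite(handle, conn, table="clinvar"):
    """Load a tab-delimited table with a header row into a SQLite table."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)  # first row holds the column names
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()
    return header


conn = sqlite3.connect(":memory:")
header = load_tsv_into_sqlite(io.StringIO(SAMPLE_TSV), conn)
pathogenic_count = conn.execute(
    "SELECT COUNT(*) FROM clinvar WHERE clinical_significance = ?",
    ("Pathogenic",),
).fetchone()[0]
```

With a real file, `io.StringIO(SAMPLE_TSV)` would be replaced by an open file handle. All columns are stored as text here for simplicity; numeric columns such as allele frequencies would need explicit types in practice.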
Reassessment of Mendelian gene pathogenicity using 7,855 cardiomyopathy cases and 60,706 reference samples
Purpose: The accurate interpretation of variation in Mendelian disease genes has lagged behind data generation as sequencing has become increasingly accessible. Ongoing large sequencing efforts present huge interpretive challenges, but they also provide an invaluable opportunity to characterize the spectrum and importance of rare variation. Methods: We analyzed sequence data from 7,855 clinical cardiomyopathy cases and 60,706 Exome Aggregation Consortium (ExAC) reference samples to obtain a better understanding of genetic variation in a representative autosomal dominant disorder. Results: We found that in some genes previously reported as important causes of a given cardiomyopathy, rare variation is not clinically informative because there is an unacceptably high likelihood of false-positive interpretation. By contrast, in other genes, we find that diagnostic laboratories may be overly conservative when assessing variant pathogenicity. Conclusions: We outline improved analytical approaches that evaluate which genes and variant classes are interpretable and propose that these will increase the clinical utility of testing across a range of Mendelian diseases. Genet Med 19(2), 192–203.
Effect of predicted protein-truncating genetic variants on the human transcriptome
Accurate prediction of the functional effect of genetic variation is critical for clinical genome interpretation. We systematically characterized the transcriptome effects of protein-truncating variants, a class of variants expected to have profound effects on gene function, using data from the Genotype-Tissue Expression (GTEx) and Geuvadis projects. We quantitated tissue-specific and positional effects on nonsense-mediated transcript decay and present an improved predictive model for this decay. We directly measured the effect of variants both proximal and distal to splice junctions. Furthermore, we found that robustness to heterozygous gene inactivation is not due to dosage compensation. Our results illustrate the value of transcriptome data in the functional interpretation of genetic variants.
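For context on positional effects, the sketch below implements the widely cited "50-nucleotide rule" baseline for nonsense-mediated decay, not the improved predictive model the abstract refers to. Under this rule, a premature stop codon is predicted to escape decay if it lies in the last exon or within roughly 50 nucleotides upstream of the last exon-exon junction. The function name, the coordinate convention (0-based position of the stop codon in the spliced transcript), and the exact handling of the boundary are assumptions of this example.

```python
def predicted_to_escape_nmd(ptc_pos, exon_lengths, threshold=50):
    """Apply the classic 50-nucleotide rule (a baseline heuristic only).

    ptc_pos: 0-based position of the premature stop codon in spliced
             transcript coordinates (assumed convention).
    exon_lengths: exon lengths in transcript order.
    Returns True if the transcript is predicted to escape
    nonsense-mediated decay, False if decay is predicted.
    """
    if len(exon_lengths) < 2:
        return True  # no exon-exon junction, so the rule predicts escape
    last_junction = sum(exon_lengths[:-1])  # start of the last exon
    if ptc_pos >= last_junction:
        return True  # stop codon lies in the last exon
    # Escape only if the stop codon is close to the last junction.
    return last_junction - ptc_pos <= threshold


exons = [200, 150, 300]  # hypothetical transcript; last junction at 350
early_stop = predicted_to_escape_nmd(100, exons)      # far upstream
near_junction = predicted_to_escape_nmd(320, exons)   # 30 nt upstream
last_exon = predicted_to_escape_nmd(400, exons)       # inside last exon
```

In this made-up transcript only the stop codon at position 100 is predicted to trigger decay; the other two are predicted to escape.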
Transcript expression-aware annotation improves rare variant interpretation
The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD)(1), we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the ‘proportion expressed across transcripts’, which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype Tissue Expression (GTEx) project(2) and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have similar effect sizes to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases.
Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies.
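One plausible reading of a "proportion expressed across transcripts" style metric is sketched below in Python. The abstract does not give the formula, so this is an assumption: for each tissue, the expression of the transcripts on which the variant carries a given annotation is divided by the summed expression of all transcripts of the gene, and the per-tissue values are then averaged. The function name, transcript identifiers, and expression values are hypothetical.

```python
def proportion_expressed(transcript_expression, annotated_transcripts):
    """Sketch of an expression-weighted annotation score (assumed formula).

    transcript_expression: dict of tissue -> dict of transcript -> expression
                           (for example, TPM values).
    annotated_transcripts: set of transcripts on which the variant has the
                           annotation of interest (for example, pLoF).
    Returns (per-tissue proportions, mean across tissues or None).
    """
    per_tissue = {}
    for tissue, expression in transcript_expression.items():
        total = sum(expression.values())
        if total == 0:
            continue  # gene not expressed in this tissue; proportion undefined
        annotated = sum(
            value for transcript, value in expression.items()
            if transcript in annotated_transcripts
        )
        per_tissue[tissue] = annotated / total
    mean = sum(per_tissue.values()) / len(per_tissue) if per_tissue else None
    return per_tissue, mean


# Made-up expression values for a gene with two transcripts.
expression = {
    "brain": {"T1": 8.0, "T2": 2.0},
    "liver": {"T1": 1.0, "T2": 3.0},
    "muscle": {"T1": 0.0, "T2": 0.0},  # skipped: gene not expressed
}
per_tissue, mean_score = proportion_expressed(expression, {"T2"})
```

A variant that is pLoF only on the weakly expressed transcript `T2` gets a low score in brain and a higher one in liver, which is the kind of signal an expression-aware filter could use to down-weight variants in rarely used exons.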