23 research outputs found

    Landscape of multi-nucleotide variants in 125,748 human exomes and 15,708 genomes

    Multi-nucleotide variants (MNVs), defined as two or more nearby variants existing on the same haplotype in an individual, are a clinically and biologically important class of genetic variation. However, existing tools typically do not accurately classify MNVs, and understanding of their mutational origins remains limited. Here, we systematically survey MNVs in 125,748 whole exomes and 15,708 whole genomes from the Genome Aggregation Database (gnomAD). We identify 1,792,248 MNVs across the genome with constituent variants falling within 2 bp distance of one another, including 18,756 variants with a novel combined effect on protein sequence. Finally, we estimate the relative impact of known mutational mechanisms - CpG deamination, replication error by polymerase zeta, and polymerase slippage at repeat junctions - on the generation of MNVs. Our results demonstrate the value of haplotype-aware variant annotation, and refine our understanding of genome-wide mutational mechanisms of MNVs. Multi-nucleotide variants (MNV) are genetic variants in close proximity of each other on the same haplotype whose functional impact is difficult to predict if they reside in the same codon. Here, Wang et al. use the gnomAD dataset to assemble a catalogue of MNVs and estimate their global mutation rate.

    Analysis of protein-coding genetic variation in 60,706 humans

    Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. We describe the aggregation and analysis of high-quality exome (protein-coding region) sequence data for 60,706 individuals of diverse ethnicities generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of truncating variants, with 72% having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human “knockout” variants in protein-coding genes.

    Analysis of protein-coding genetic variation in 60,706 humans

    Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

    Landscape of multi-nucleotide variants in 125,748 human exomes and 15,708 genomes

    Multi-nucleotide variants (MNVs), defined as two or more nearby variants existing on the same haplotype in an individual, are a clinically and biologically important class of genetic variation. However, existing tools typically do not accurately classify MNVs, and understanding of their mutational origins remains limited. Here, we systematically survey MNVs in 125,748 whole exomes and 15,708 whole genomes from the Genome Aggregation Database (gnomAD). We identify 1,792,248 MNVs across the genome with constituent variants falling within 2 bp distance of one another, including 18,756 variants with a novel combined effect on protein sequence. Finally, we estimate the relative impact of known mutational mechanisms - CpG deamination, replication error by polymerase zeta, and polymerase slippage at repeat junctions - on the generation of MNVs. Our results demonstrate the value of haplotype-aware variant annotation, and refine our understanding of genome-wide mutational mechanisms of MNVs.

    ASC-1 Is a Cell Cycle Regulator Associated with Severe and Mild Forms of Myopathy.

    OBJECTIVE: Recently, the ASC-1 complex has been identified as a mechanistic link between amyotrophic lateral sclerosis and spinal muscular atrophy (SMA), and 3 mutations of the ASC-1 gene TRIP4 have been associated with SMA or congenital myopathy. Our goal was to define ASC-1 neuromuscular function and the phenotypical spectrum associated with TRIP4 mutations. METHODS: Clinical, molecular, histological, and magnetic resonance imaging studies were made in 5 families with 7 novel TRIP4 mutations. Fluorescence activated cell sorting and Western blot were performed in patient-derived fibroblasts and muscles and in Trip4 knocked-down C2C12 cells. RESULTS: All mutations caused ASC-1 protein depletion. The clinical phenotype was purely myopathic, ranging from lethal neonatal to mild ambulatory adult patients. It included early onset axial and proximal weakness, scoliosis, rigid spine, dysmorphic facies, cutaneous involvement, respiratory failure, and in the older cases, dilated cardiomyopathy. Muscle biopsies showed multiminicores, nemaline rods, cytoplasmic bodies, caps, central nuclei, rimmed fibers, and/or mild endomysial fibrosis. ASC-1 depletion in C2C12 and in patient-derived fibroblasts and muscles caused accelerated proliferation, altered expression of cell cycle proteins, and/or shortening of the G0/G1 cell cycle phase leading to cell size reduction. INTERPRETATION: Our results expand the phenotypical and molecular spectrum of TRIP4-associated disease to include mild adult forms with or without cardiomyopathy, associate ASC-1 depletion with isolated primary muscle involvement, and establish TRIP4 as a causative gene for several congenital muscle diseases, including nemaline, core, centronuclear, and cytoplasmic-body myopathies. They also identify ASC-1 as a novel cell cycle regulator with a key role in cell proliferation, and underline transcriptional coregulation defects as a novel pathophysiological mechanism. ANN NEUROL 2019

    Systematic evaluation of genome sequencing for the diagnostic assessment of autism spectrum disorder and fetal structural anomalies.

    Short-read genome sequencing (GS) holds the promise of becoming the primary diagnostic approach for the assessment of autism spectrum disorder (ASD) and fetal structural anomalies (FSAs). However, few studies have comprehensively evaluated its performance against current standard-of-care diagnostic tests: karyotype, chromosomal microarray (CMA), and exome sequencing (ES). To assess the clinical utility of GS, we compared its diagnostic yield against these three tests in 1,612 quartet families including an individual with ASD and in 295 prenatal families. Our GS analytic framework identified a diagnostic variant in 7.8% of ASD probands, almost 2-fold more than CMA (4.3%) and 3-fold more than ES (2.7%). However, when we systematically captured copy-number variants (CNVs) from the exome data, the diagnostic yield of ES (7.4%) was brought much closer to, but did not surpass, GS. Similarly, we estimated that GS could achieve an overall diagnostic yield of 46.1% in unselected FSAs, representing a 17.2% increased yield over karyotype, 14.1% over CMA, and 4.1% over ES with CNV calling or 36.1% increase without CNV discovery. Overall, GS provided an added diagnostic yield of 0.4% and 0.8% beyond the combination of all three standard-of-care tests in ASD and FSAs, respectively. This corresponded to nine GS unique diagnostic variants, including sequence variants in exons not captured by ES, structural variants (SVs) inaccessible to existing standard-of-care tests, and SVs where the resolution of GS changed variant classification. Overall, this large-scale evaluation demonstrated that GS significantly outperforms each individual standard-of-care test while also outperforming the combination of all three tests, thus warranting consideration as the first-tier diagnostic approach for the assessment of ASD and FSAs.

    Genome-wide analysis of Structural Variants in Parkinson's Disease.

    OBJECTIVE: Identification of genetic risk factors for Parkinson's disease (PD) has to date been primarily limited to the study of single nucleotide variants, which only represent a small fraction of the genetic variation in the human genome. Consequently, causal variants for most PD risk are not known. Here we focused on structural variants (SVs), which represent a major source of genetic variation in the human genome. We aimed to discover SVs associated with PD risk by performing the first large-scale characterization of SVs in PD. METHODS: We leveraged a recently developed computational pipeline to detect and genotype SVs from 7,772 Illumina short-read whole genome sequencing samples. Using this set of SV variants, we performed a genome-wide association study using 2,585 cases and 2,779 controls and identified SVs associated with PD risk. Furthermore, to validate the presence of these variants, we generated a subset of matched whole-genome long-read sequencing data. RESULTS: We genotyped and tested 3,154 common SVs, representing over 412 million nucleotides of previously uncatalogued genetic variation. Using long-read sequencing data, we validated the presence of three novel deletion SVs that are associated with risk of PD from our initial association analysis, including a 2 kb intronic deletion within the gene LRRN4. INTERPRETATION: We identify three SVs associated with genetic risk of PD. This study represents the most comprehensive assessment of the contribution of SVs to the genetic risk of PD to date.

    The mutational constraint spectrum quantified from variation in 141,456 humans

    Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.

    De novo TRIM8 variants impair its protein localization to nuclear bodies and cause developmental delay, epilepsy, and focal segmental glomerulosclerosis

    Focal segmental glomerulosclerosis (FSGS) is the main pathology underlying steroid-resistant nephrotic syndrome (SRNS) and a leading cause of chronic kidney disease. Monogenic forms of pediatric SRNS are predominantly caused by recessive mutations, while the contribution of de novo variants (DNVs) to this trait is poorly understood. Using exome sequencing (ES) in a proband with FSGS/SRNS, developmental delay, and epilepsy, we discovered a nonsense DNV in TRIM8, which encodes the E3 ubiquitin ligase tripartite motif containing 8. To establish whether TRIM8 variants represent a cause of FSGS, we aggregated exome/genome-sequencing data for 2,501 pediatric FSGS/SRNS-affected individuals and 48,556 control subjects, detecting eight heterozygous TRIM8 truncating variants in affected subjects but none in control subjects (p = 3.28 × 10^-11). In all six cases with available parental DNA, we demonstrated de novo inheritance (p = 2.21 × 10^-15). Reverse phenotyping revealed neurodevelopmental disease in all eight families. We next analyzed ES from 9,067 individuals with epilepsy, yielding three additional families with truncating TRIM8 variants. Clinical review revealed FSGS in all. All TRIM8 variants cause protein truncation clustering within the last exon between residues 390 and 487 of the 551 amino acid protein, indicating a correlation between this syndrome and loss of the TRIM8 C-terminal region. Wild-type TRIM8 overexpressed in immortalized human podocytes and neuronal cells localized to nuclear bodies, while constructs harboring patient-specific variants mislocalized diffusely to the nucleoplasm. Co-localization studies demonstrated that Gemini and Cajal bodies frequently abut a TRIM8 nuclear body. Truncating TRIM8 DNVs cause a neuro-renal syndrome via aberrant TRIM8 localization, implicating nuclear bodies in FSGS and developmental brain disease.