53 research outputs found

    Differences in 5'untranslated regions highlight the importance of translational regulation of dosage sensitive genes

    Background: Untranslated regions (UTRs) are important mediators of post-transcriptional regulation. The length of UTRs and the composition of regulatory elements within them are known to vary substantially across genes, but little is known about the reasons for this variation in humans. Here, we set out to determine whether this variation, specifically in 5'UTRs, correlates with gene dosage sensitivity. Results: We investigate 5'UTR length, the number of alternative transcription start sites (TSSs), the potential for alternative splicing, the number and type of upstream open reading frames (uORFs) and the propensity of 5'UTRs to form secondary structures. We explore how these elements vary by gene tolerance to loss-of-function (LoF; using the LOEUF metric), and in genes where changes in dosage are known to cause disease. We show that LOEUF correlates with 5'UTR length and complexity. Genes that are most intolerant to LoF have longer 5'UTRs, greater TSS diversity, and more upstream regulatory elements than their LoF-tolerant counterparts. We show that these differences are evident in disease gene-sets, but not in recessive developmental disorder genes where LoF of a single allele is tolerated. Conclusions: Our results confirm the importance of post-transcriptional regulation through 5'UTRs in tight regulation of mRNA and protein levels, particularly for genes where changes in dosage are deleterious and lead to disease. Finally, to support gene-based investigation we release a web-based browser tool, VuTR, that supports exploration of the composition of individual 5'UTRs and the impact of genetic variation within them

    Sequential targeted exome sequencing of 1001 patients affected by unexplained limb-girdle weakness

    Several hundred genetic muscle diseases have been described, all of which are rare. Their clinical and genetic heterogeneity means that a genetic diagnosis is challenging. We established an international consortium, MYO-SEQ, to aid the work-ups of muscle disease patients and to better understand disease etiology. Exome sequencing was applied to 1001 undiagnosed patients recruited from more than 40 neuromuscular disease referral centers; standardized phenotypic information was collected for each patient. Exomes were examined for variants in 429 genes associated with muscle conditions. We identified suspected pathogenic variants in 52% of patients across 87 genes. We detected 401 novel variants, 116 of which were recurrent. Variants in CAPN3, DYSF, ANO5, DMD, RYR1, TTN, COL6A2, and SGCA collectively accounted for over half of the solved cases; while variants in newer disease genes, such as BVES and POGLUT1, were also found. The remaining well-characterized unsolved patients (48%) need further investigation. Using our unique infrastructure, we developed a pathway to expedite muscle disease diagnoses. Our data suggest that exome sequencing should be used for pathogenic variant detection in patients with suspected genetic muscle diseases, focusing first on the most common disease genes described here, and subsequently in rarer and newly characterized disease genes

    Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel

    A major use of the 1000 Genomes Project (1000GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data that can take advantage of single-nucleotide polymorphism (SNP) microarray genotypes on the same samples. First the SNP array data are phased to build a backbone (or 'scaffold') of haplotypes across each chromosome. We then phase the sequence data 'onto' this haplotype scaffold. This approach can take advantage of relatedness between sequenced and non-sequenced samples to improve accuracy. We use this method to create a new 1000GP haplotype reference set for use by the human genetic community. Using a set of validation genotypes at SNP and bi-allelic indels we show that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low-frequency variants. © 2014 Macmillan Publishers Limited. All rights reserved

    Comprehensive Analysis of Genetic Ancestry and Its Molecular Correlates in Cancer

    We evaluated ancestry effects on mutation rates, DNA methylation, and mRNA and miRNA expression among 10,678 patients across 33 cancer types from The Cancer Genome Atlas. We demonstrated that cancer subtypes and ancestry-related technical artifacts are important confounders that have been insufficiently accounted for. Once accounted for, ancestry-associated differences spanned all molecular features and hundreds of genes. Biologically significant differences were usually tissue specific but not specific to cancer. However, admixture and pathway analyses suggested some of these differences are causally related to cancer. Specific findings included increased FBXW7 mutations in patients of African origin, decreased VHL and PBRM1 mutations in renal cancer patients of African origin, and decreased immune activity in bladder cancer patients of East Asian origin

    De novo variants in the RNU4-2 snRNA cause a frequent neurodevelopmental syndrome

    Around 60% of individuals with neurodevelopmental disorders (NDD) remain undiagnosed after comprehensive genetic testing, primarily of protein-coding genes [1]. Large genome-sequenced cohorts are improving our ability to discover new diagnoses in the non-coding genome. Here, we identify the non-coding RNA RNU4-2 as a syndromic NDD gene. RNU4-2 encodes the U4 small nuclear RNA (snRNA), which is a critical component of the U4/U6.U5 tri-snRNP complex of the major spliceosome [2]. We identify an 18 bp region of RNU4-2 mapping to two structural elements in the U4/U6 snRNA duplex (the T-loop and Stem III) that is severely depleted of variation in the general population, but in which we identify heterozygous variants in 115 individuals with NDD. Most individuals (77.4%) have the same highly recurrent single base insertion (n.64_65insT). In 54 individuals where it could be determined, the de novo variants were all on the maternal allele. We demonstrate that RNU4-2 is highly expressed in the developing human brain, in contrast to RNU4-1 and other U4 homologs. Using RNA-sequencing, we show how 5' splice site usage is systematically disrupted in individuals with RNU4-2 variants, consistent with the known role of this region during spliceosome activation. Finally, we estimate that variants in this 18 bp region explain 0.4% of individuals with NDD. This work underscores the importance of non-coding genes in rare disorders and will provide a diagnosis to thousands of individuals with NDD worldwide

    Determinants of penetrance and variable expressivity in monogenic metabolic conditions across 77,184 exomes

    Hundreds of thousands of genetic variants have been reported to cause severe monogenic diseases, but the probability that a variant carrier develops the disease (termed penetrance) is unknown for virtually all of them. Additionally, the clinical utility of common polygenetic variation remains uncertain. Using exome sequencing from 77,184 adult individuals (38,618 multi-ancestral individuals from a type 2 diabetes case-control study and 38,566 participants from the UK Biobank, for whom genotype array data were also available), we apply clinical standard-of-care gene variant curation for eight monogenic metabolic conditions. Rare variants causing monogenic diabetes and dyslipidemias display effect sizes significantly larger than the top 1% of the corresponding polygenic scores. Nevertheless, penetrance estimates for monogenic variant carriers average 60% or lower for most conditions. We assess epidemiologic and genetic factors contributing to risk prediction in monogenic variant carriers, demonstrating that inclusion of polygenic variation significantly improves biomarker estimation for two monogenic dyslipidemias

    Recessive DES cardio/myopathy without myofibrillar aggregates: intronic splice variant silences one allele leaving only missense L190P-desmin

    We establish autosomal recessive DES variants p.(Leu190Pro) and a deep intronic splice variant causing inclusion of a frameshift-inducing artificial exon/intronic fragment, as the likely cause of myopathy with cardiac involvement in female siblings. Both sisters presented in their twenties with slowly progressive limb girdle weakness, severe systolic dysfunction, and progressive, severe respiratory weakness. Desmin is an intermediate filament protein typically associated with autosomal dominant myofibrillar myopathy with cardiac involvement. However, a few rare cases of autosomal recessive desminopathy are reported. In this family, a paternal missense p.(Leu190Pro) variant was viewed as unlikely to be causative of autosomal dominant desminopathy, as the father and brothers carrying this variant were clinically unaffected. Clinical fit with a DES-related myopathy encouraged closer scrutiny of all DES variants, identifying a maternal deep intronic variant within intron-7, predicted to create a cryptic splice site, which segregated with disease. RNA sequencing and studies of muscle cDNA confirmed the deep intronic variant caused aberrant splicing of an artificial exon/intronic fragment into maternal DES mRNA transcripts, encoding a premature termination codon, and potently activating nonsense-mediated decay (92% paternal DES transcripts, 8% maternal). Western blot showed 60-75% reduction in desmin levels, likely composed only of missense p.(Leu190Pro) desmin. Biopsy showed fibre size variation with increased central nuclei. Electron microscopy showed extensive myofibrillar disarray, duplication of the basal lamina, but no inclusions or aggregates. This study expands the phenotypic spectrum of recessive DES cardio/myopathy, and emphasizes the continuing importance of muscle biopsy for functional genomics pursuit of 'tricky' variants in neuromuscular conditions.
    Lisa G. Riley, Leigh B. Waddell, Roula Ghaoui, Frances J. Evesson, Beryl B. Cummings, Samantha J. Bryen, Himanshu Joshi, Min-Xia Wang, Susan Brammah, Leonard Kritharides, Alastair Corbett, Daniel G. MacArthur, Sandra T. Coope