56 research outputs found

    Long noncoding RNAs are rarely translated in two human cell lines

    Data from the Encyclopedia of DNA Elements (ENCODE) project show over 9640 human genome loci classified as long noncoding RNAs (lncRNAs), yet only ∼100 have been deeply characterized to determine their role in the cell. To measure the protein-coding output from these RNAs, we jointly analyzed two recent data sets produced in the ENCODE project: tandem mass spectrometry (MS/MS) data mapping expressed peptides to their encoding genomic loci, and RNA-seq data generated by ENCODE in long polyA+ and polyA− fractions in the cell lines K562 and GM12878. We used the machine-learning algorithm RuleFit3 to regress the peptide data against RNA expression data. The most important covariate for predicting translation was, surprisingly, the Cytosol polyA− fraction in both cell lines. LncRNAs are ∼13-fold less likely to produce detectable peptides than similar mRNAs, indicating that ∼92% of GENCODE v7 lncRNAs are not translated in these two ENCODE cell lines. Intersecting 9640 lncRNA loci with 79,333 peptides yielded 85 unique peptides matching 69 lncRNAs. Most cases were due to a coding transcript misannotated as lncRNA. Two exceptions were an unprocessed pseudogene and a bona fide lncRNA gene, both with open reading frames (ORFs) compromised by upstream stop codons. All potentially translatable lncRNA ORFs had only a single peptide match, indicating low protein abundance and/or false-positive peptide matches. We conclude that with very few exceptions, ribosomes are able to distinguish coding from noncoding transcripts and, hence, that ectopic translation and cryptic mRNAs are rare in the human lncRNAome

    Binding to SMN2 pre-mRNA-protein complex elicits specificity for small molecule splicing modifiers

    Small molecule splicing modifiers have been previously described that target the general splicing machinery and thus have low specificity for individual genes. Several potent molecules correcting the splicing deficit of the SMN2 (survival of motor neuron 2) gene have been identified and these molecules are moving towards a potential therapy for spinal muscular atrophy (SMA). Here by using a combination of RNA splicing, transcription, and protein chemistry techniques, we show that these molecules directly bind to two distinct sites of the SMN2 pre-mRNA, thereby stabilizing a yet unidentified ribonucleoprotein (RNP) complex that is critical to the specificity of these small molecules for SMN2 over other genes. In addition to the therapeutic potential of these molecules for treatment of SMA, our work has wide-ranging implications in understanding how small molecules can interact with specific quaternary RNA structures

    Classification and function of small open reading frames

    Small open reading frames (smORFs) of 100 codons or fewer are usually - if arbitrarily - excluded from proteome annotations. Despite this, the genomes of many metazoans, including humans, contain millions of smORFs, some of which fulfil key physiological functions. Recently, the transcriptome of Drosophila melanogaster was shown to contain thousands of smORFs of different classes that actively undergo translation, which produces peptides of mostly unknown function. Here, we present a comprehensive analysis of smORFs in flies, mice and humans. We propose the existence of several functional classes of smORFs, ranging from inert DNA sequences to transcribed and translated cis-regulators of translation and peptides with a propensity to function as regulators of membrane-associated proteins, or as components of ancient protein complexes in the cytoplasm. We suggest that the different smORF classes could represent steps in gene, peptide and protein evolution. Our analysis introduces a distinction between different peptide-coding classes of smORFs in animal genomes, and highlights the role of model organisms for the study of small peptide biology in the context of development, physiology and human disease

    A Flexible K-12 Weather Data Collection and Education Program

    The Nebraska Earth Science Education Network (NESEN) is an organization within the University of Nebraska-Lincoln whose objectives are to: 1) promote and enhance K-12 earth science education in Nebraska, 2) improve teacher knowledge and understanding so that students become better informed about the complexities of environmental and natural resources issues and 3) enhance the transfer of earth science information to the K-12 teaching community (Gosselin, Mohlman, Mesarch & Meyer, 1996; Gosselin et al., 1999). To achieve this last objective, NESEN developed the Students and Teachers Exchanging Data, Information and Ideas (STEDII) program with support from the National Aeronautics and Space Administration (NASA) and the Department of Energy’s National Institute for Global and Environmental Change (NIGEC). The initial focus of STEDII was to use the collection of weather data as a mechanism to promote the sharing of data and information between eight schools involved in an electronic communication project funded by NASA (Gosselin et al., 1999). The topic of weather was chosen because students experience weather every day, weather is relevant to students’ lives in an agriculturally based state (Williams, 1992), weather is quite variable in Nebraska (NebraskaLand, 1996) and weather is part of most school systems’ curriculum. The STEDII project has provided students and teachers with basic weather instrumentation, instruction on how to use these instruments, lessons on weather topics and a website through which schools can share data by submitting and retrieving measurements from a centralized database.