216 research outputs found

    INFO-RNA—a server for fast inverse RNA folding satisfying sequence constraints

    INFO-RNA is a new web server for designing RNA sequences that fold into a user-given secondary structure. Furthermore, constraints on the sequence can be specified, e.g. one can restrict sequence positions to a fixed nucleotide or to a set of nucleotides. Moreover, the user can allow violations of the constraints at some positions, which can be advantageous in complicated cases.
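
The sequence constraints described in this abstract can be illustrated with a minimal sketch. It assumes that nucleotide sets are written as IUPAC codes and that "allowing violations" means tolerating a bounded number of mismatching positions. The function names and the `max_violations` parameter are hypothetical and are not taken from INFO-RNA itself.

```python
# Illustrative only: these helpers are hypothetical and not part of INFO-RNA.
# Assumption: each constraint position is an IUPAC code naming the allowed
# nucleotides (a single letter for a fixed nucleotide, an ambiguity code for a set).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU",
    "K": "GU", "M": "AC", "B": "CGU", "D": "AGU",
    "H": "ACU", "V": "ACG", "N": "ACGU",
}


def constraint_violations(sequence, constraint):
    """Return the 0-based positions where the sequence breaks the constraint."""
    if len(sequence) != len(constraint):
        raise ValueError("sequence and constraint must have the same length")
    return [
        i
        for i, (nt, code) in enumerate(zip(sequence, constraint))
        if nt not in IUPAC[code]
    ]


def satisfies(sequence, constraint, max_violations=0):
    """True if the number of violated positions does not exceed max_violations."""
    return len(constraint_violations(sequence, constraint)) <= max_violations


if __name__ == "__main__":
    # Position 5 requires G but the design has C, so one violation is reported.
    print(constraint_violations("GGAUCC", "NNRYNG"))
    print(satisfies("GGAUCC", "NNRYNG", max_violations=1))
```

A design tool could use such a check as a filter on candidate sequences; how INFO-RNA actually handles constraints internally is not stated in the abstract.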

    RNPomics: Defining the ncRNA transcriptome by cDNA library generation from ribonucleo-protein particles

    Up to 450 000 non-coding RNAs (ncRNAs) have been predicted to be transcribed from the human genome. However, it still has to be elucidated which of these transcripts represent functional ncRNAs. Since all functional ncRNAs in Eukarya form ribonucleo-protein particles (RNPs), we generated specialized cDNA libraries from size-fractionated RNPs and validated the presence of selected ncRNAs within RNPs by glycerol gradient centrifugation. As a proof of concept, we applied the RNP method to human HeLa cells and total mouse brain, and subjected the cDNA libraries generated from the two model systems to deep-sequencing. Bioinformatic analysis of cDNA sequences revealed several hundred ncRNP candidates. The ncRNA candidates were mainly located in intergenic as well as intronic regions of the genome, with a significant overrepresentation of intron-derived ncRNA sequences. Additionally, a number of ncRNAs mapped to repetitive sequences. Thus, our RNP approach provides an efficient way to identify new functional small ncRNA candidates involved in RNP formation.

    Identification of small non-coding RNAs from mitochondria and chloroplasts

    Small non-protein-coding RNAs (ncRNAs) have been identified in a wide spectrum of organisms ranging from bacteria to humans. In eukarya, systematic searches for ncRNAs have so far been restricted to the nuclear or cytosolic compartments of cells. Whether or not small stable non-coding RNA species also exist in cell organelles, in addition to tRNAs or ribosomal RNAs, is unknown. We have thus generated cDNA libraries from size-selected mammalian mitochondrial RNA and plant chloroplast RNA and searched for small ncRNA species in these two types of DNA-containing cell organelles. In total, we have identified 18 novel candidates for organellar ncRNAs in these two cellular compartments and confirmed expression of six of them by northern blot analysis or RNase A protection assays. Most candidate ncRNA genes map to intergenic regions of the organellar genomes. As found previously in bacteria, the presumptive ancestors of present-day chloroplasts and mitochondria, we also observed examples of antisense ncRNAs that potentially could target organelle-encoded mRNAs. The structural features of the identified ncRNAs as well as their possible cellular functions are discussed. The absence from our libraries of abundant small RNA species that are not encoded by the organellar genomes suggests that the import of RNAs into cell organelles is of very limited significance or does not occur at all

    ADAR2-mediated editing of RNA substrates in the nucleolus is inhibited by C/D small nucleolar RNAs

    Posttranscriptional, site-specific adenosine to inosine (A-to-I) base conversions, designated as RNA editing, play significant roles in generating diversity of gene expression. However, little is known about how and in which cellular compartments RNA editing is controlled. Interestingly, the two enzymes that catalyze RNA editing, adenosine deaminases that act on RNA (ADAR) 1 and 2, have recently been demonstrated to dynamically associate with the nucleolus. Moreover, we have identified a brain-specific small RNA, termed MBII-52, which was predicted to function as a nucleolar C/D RNA, thereby targeting an A-to-I editing site (C-site) within the 5-HT2C serotonin receptor pre-mRNA for 2′-O-methylation. Through the subcellular targeting of minigenes that contain natural editing sites, we show that ADAR2- but not ADAR1-mediated RNA editing occurs in the nucleolus. We also demonstrate that MBII-52 forms a bona fide small nucleolar ribonucleoprotein particle that specifically decreases the efficiency of RNA editing by ADAR2 at the targeted C-site. Our data are consistent with a model in which C/D small nucleolar RNA might play a role in the regulation of RNA editing

    Revealing stable processing products from ribosome-associated small RNAs by deep-sequencing data analysis

    The exploration of the non-protein-coding RNA (ncRNA) transcriptome is currently focused on profiling of microRNA expression and detection of novel ncRNA transcription units. However, recent studies suggest that RNA processing can be a multi-layer process leading to the generation of ncRNAs of diverse functions from a single primary transcript. To date, no methodology has been presented to distinguish stable functional RNA species from rapidly degraded side products of nucleases. Thus, the correct assessment of widespread RNA processing events is one of the major obstacles in transcriptome research. Here, we present a novel automated computational pipeline, named APART, providing a complete workflow for the reliable detection of RNA processing products from next-generation-sequencing data. The major features include efficient handling of non-unique reads, detection of novel stable ncRNA transcripts and processing products, and annotation of known transcripts based on multiple sources of information. To disclose the potential of APART, we have analyzed a cDNA library derived from small ribosome-associated RNAs in Saccharomyces cerevisiae. By employing the APART pipeline, we were able to detect and confirm by independent experimental methods multiple novel stable RNA molecules differentially processed from well-known ncRNAs, like rRNAs, tRNAs or snoRNAs, in a stress-dependent manner.

    Combined experimental and computational approach to identify non-protein-coding RNAs in the deep-branching eukaryote Giardia intestinalis

    Non-protein-coding RNAs represent a large proportion of transcribed sequences in eukaryotes. These RNAs often function in large RNA–protein complexes, which are catalysts in various RNA-processing pathways. As RNA processing has become an increasingly important area of research, numerous non-messenger RNAs have been uncovered in all the model eukaryotic organisms. However, knowledge on RNA processing in deep-branching eukaryotes is still limited. This study focuses on the identification of non-protein-coding RNAs from the diplomonad parasite Giardia intestinalis, showing that a combined experimental and computational search strategy is a fast method of screening reduced or compact genomes. The analysis of our Giardia cDNA library has uncovered 31 novel candidates, including C/D-box and H/ACA box snoRNAs, as well as an unusual transcript of RNase P, and double-stranded RNAs. Subsequent computational analysis has revealed additional putative C/D-box snoRNAs. Our results will lead towards a future understanding of RNA metabolism in the deep-branching eukaryote Giardia, as more ncRNAs are characterized

    Lightweight comparison of RNAs based on exact sequence–structure matches

    Motivation: Specific functions of ribonucleic acid (RNA) molecules are often associated with different motifs in the RNA structure. The key feature that forms such an RNA motif is the combination of sequence and structure properties. In this article, we introduce a new RNA sequence–structure comparison method which maintains exact matching substructures. Existing common substructures are treated as whole units while variability is allowed between such structural motifs.

    Expression and Processing of a Small Nucleolar RNA from the Epstein-Barr Virus Genome

    Small nucleolar RNAs (snoRNAs) are localized within the nucleolus, a sub-nuclear compartment, in which they guide ribosomal or spliceosomal RNA modifications, respectively. Up until now, snoRNAs have only been identified in eukaryal and archaeal genomes, but are notably absent in bacteria. By screening B lymphocytes for expression of non-coding RNAs (ncRNAs) induced by the Epstein-Barr virus (EBV), we here report, for the first time, the identification of a snoRNA gene within a viral genome, designated as v-snoRNA1. This genetic element displays all hallmark sequence motifs of a canonical C/D box snoRNA, namely C/C′- as well as D/D′-boxes. The nucleolar localization of v-snoRNA1 was verified by in situ hybridisation of EBV-infected cells. We also confirmed binding of the three canonical snoRNA proteins, fibrillarin, Nop56 and Nop58, to v-snoRNA1. The C-box motif of v-snoRNA1 was shown to be crucial for the stability of the viral snoRNA; its selective deletion in the viral genome led to a complete down-regulation of v-snoRNA1 expression levels within EBV-infected B cells. We further provide evidence that v-snoRNA1 might serve as a miRNA-like precursor, which is processed into 24 nt sized RNA species, designated as v-snoRNA124pp. A potential target site of v-snoRNA124pp was identified within the 3′-UTR of BALF5 mRNA which encodes the viral DNA polymerase. V-snoRNA1 was found to be expressed in all investigated EBV-positive cell lines, including lymphoblastoid cell lines (LCL). Interestingly, induction of the lytic cycle markedly up-regulated expression levels of v-snoRNA1 up to 30-fold. By a computational approach, we identified a v-snoRNA1 homolog in the rhesus lymphocryptovirus genome. This evolutionary conservation suggests an important role of v-snoRNA1 during γ-herpesvirus infection

    A framework for automated enrichment of functionally significant inverted repeats in whole genomes

    Background: RNA transcripts from genomic sequences showing dyad symmetry typically adopt hairpin-like, cloverleaf, or similar structures that act as recognition sites for proteins. Such structures often are the precursors of non-coding RNA (ncRNA) sequences like microRNA (miRNA) and small-interfering RNA (siRNA) that have recently garnered more functional significance than in the past. Genomic DNA contains hundreds of thousands of such inverted repeats (IRs) with varying degrees of symmetry. But by collecting statistically significant information from a known set of ncRNA, we can sort these IRs into those that are likely to be functional. Results: A novel method was developed to scan genomic DNA for partially symmetric inverted repeats and the resulting set was further refined to match miRNA precursors (pre-miRNA) with respect to their density of symmetry, statistical probability of the symmetry, length of stems in the predicted hairpin secondary structure, and the GC content of the stems. This method was applied to the Arabidopsis thaliana genome and validated against the set of 190 known Arabidopsis pre-miRNA in the miRBase database. A preliminary scan for IRs identified 186 of the known pre-miRNA but with 714700 pre-miRNA candidates. This large number of IRs was further refined to 483908 candidates with 183 pre-miRNA identified and further still to 165371 candidates with 171 pre-miRNA identified (i.e. with 90% of the known pre-miRNA retained). Conclusions: 165371 candidates for potentially functional miRNA is still too large a set to warrant wet lab analyses, such as northern blotting, on all of them. Hence additional filters are needed to further refine the number of candidates while still retaining most of the known miRNA. These include detection of promoters and terminators, homology analyses, location of candidate relative to coding regions, and better secondary structure prediction algorithms. The software developed is designed to easily accommodate such additional filters with minimal experience in Perl.
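
The trade-off reported in this abstract, fewer candidates at the cost of losing some known pre-miRNA, can be checked with a few lines of arithmetic. The sketch below uses only the counts given above; the stage labels and the function name are hypothetical, chosen for illustration.

```python
# Counts are taken from the abstract; the stage labels are illustrative names.
KNOWN_PRE_MIRNA = 190

STAGES = [
    ("preliminary IR scan", 714700, 186),
    ("first refinement", 483908, 183),
    ("second refinement", 165371, 171),
]


def summarize(stages, known_total):
    """For each filter stage, compute the fraction of known pre-miRNA retained
    and the number of candidates per recovered known pre-miRNA."""
    rows = []
    for name, candidates, recovered in stages:
        rows.append(
            {
                "stage": name,
                "retained": recovered / known_total,
                "candidates_per_hit": candidates / recovered,
            }
        )
    return rows


if __name__ == "__main__":
    for row in summarize(STAGES, KNOWN_PRE_MIRNA):
        print(
            f"{row['stage']}: {row['retained']:.1%} retained, "
            f"{row['candidates_per_hit']:.0f} candidates per known pre-miRNA"
        )
```

The final stage retains 171 of 190 known pre-miRNA, which matches the 90% figure stated in the abstract, while each recovered known pre-miRNA is still accompanied by roughly a thousand other candidates.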

    Transcriptome annotation using tandem SAGE tags

    Analysis of several million expressed gene signatures (tags) revealed an increasing number of different sequences, largely exceeding that of annotated genes in mammalian genomes. Serial analysis of gene expression (SAGE) can reveal new Poly(A) RNAs transcribed from previously unrecognized chromosomal regions. However, conventional SAGE tags are too short to unambiguously identify unique sites in large genomes. Here, we design a novel strategy with tags anchored on two different restriction sites of cDNAs. New transcripts are then tentatively defined by the two SAGE tags in tandem and by the spanning sequence read on the genome between these tagged sites. Having developed a new algorithm to locate these tag-delimited genomic sequences (TDGS), we first validated its capacity to recognize known genes and its ability to reveal new transcripts with two SAGE libraries built in parallel from a single RNA sample. Our algorithm proves fast enough to apply this strategy at a large scale. We then collected and processed the complete sets of human SAGE tags to predict yet unknown transcripts. A cross-validation with tiling array data shows that 47% of these TDGS overlap transcriptionally active regions. Our method provides a new and complementary approach for complex transcriptome annotation.
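
The idea of a tag-delimited genomic sequence can be sketched as follows. This is not the authors' algorithm, which the abstract does not describe; it is a naive illustration that assumes exact tag matches on a single strand and pairs each upstream tag hit with the nearest downstream hit of the second tag. The function name, the example tags, and the `max_span` cutoff are hypothetical.

```python
# Naive illustration, not the published TDGS algorithm.
# Assumptions: exact matches, forward strand only, nearest downstream partner.
def tag_delimited_spans(genome, tag_a, tag_b, max_span=10000):
    """Return (start, end) half-open intervals that begin with tag_a and end
    with the nearest downstream tag_b, no longer than max_span bases."""
    spans = []
    start = genome.find(tag_a)
    while start != -1:
        # Look for the second tag only after the end of the first one.
        hit_b = genome.find(tag_b, start + len(tag_a))
        if hit_b != -1:
            end = hit_b + len(tag_b)
            if end - start <= max_span:
                spans.append((start, end))
        # Continue with the next occurrence of the first tag.
        start = genome.find(tag_a, start + 1)
    return spans


if __name__ == "__main__":
    genome = "TTCATGAAAACCCCGATCGGTT"
    for start, end in tag_delimited_spans(genome, "CATGAAAA", "GATCGG"):
        print(start, end, genome[start:end])
```

Requiring both tags within a bounded distance is what makes the pair more specific than either short tag alone; a realistic implementation would also need an index over the genome and handling of both strands.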