    NAPP: the Nucleic Acid Phylogenetic Profile Database

    Nucleic acid phylogenetic profiling (NAPP) classifies coding and non-coding sequences in a genome according to their pattern of conservation across other genomes. This procedure efficiently distinguishes clusters of functional non-coding elements in bacteria, particularly small RNAs and cis-regulatory RNAs, from other conserved sequences. In contrast to other non-coding RNA detection pipelines, NAPP does not require the presence of conserved RNA secondary structure and is therefore likely to identify previously undetected RNA genes or elements. Furthermore, as NAPP clusters contain both coding and non-coding sequences with similar occurrence profiles, they can be analyzed from a functional perspective. We recently improved the NAPP pipeline and applied it to a collection of 949 bacterial and 68 archaeal species. The database and web interface available at http://napp.u-psud.fr/ enable detailed analysis of NAPP clusters enriched in non-coding RNAs, graphical display of phylogenetic profiles, visualization of predicted RNAs in their genome context and extraction of predicted RNAs for use with genome browsers or other software.
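
    The idea of grouping sequences by their occurrence profile can be illustrated with a minimal sketch. It assumes binary presence/absence profiles across a handful of genomes and uses a simple greedy grouping by Hamming distance; the element names, the profiles and the `max_distance` threshold are all hypothetical, and this is not the clustering procedure used by the NAPP pipeline itself.

```python
def hamming(p, q):
    """Number of genomes in which two presence/absence profiles differ."""
    return sum(a != b for a, b in zip(p, q))


def cluster_profiles(profiles, max_distance):
    """Greedy grouping: each element joins the first cluster whose seed
    profile lies within max_distance, otherwise it starts a new cluster."""
    clusters = []  # list of (seed_profile, [member names])
    for name, profile in profiles.items():
        for seed, members in clusters:
            if hamming(seed, profile) <= max_distance:
                members.append(name)
                break
        else:
            clusters.append((profile, [name]))
    return [members for _, members in clusters]


# Hypothetical profiles: 1 = conserved in that genome, 0 = absent.
profiles = {
    "geneA":   (1, 1, 0, 0, 1),
    "sRNA1":   (1, 1, 0, 0, 1),
    "geneB":   (0, 0, 1, 1, 0),
    "cisReg1": (0, 1, 1, 1, 0),
    "geneC":   (1, 1, 1, 1, 1),
}
clusters = cluster_profiles(profiles, max_distance=1)
print(clusters)
```

    In this toy example the non-coding elements end up in the same clusters as coding genes with similar profiles, which is what allows a cluster to be read from a functional perspective.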

    Computing expectation values for RNA motifs using discrete convolutions

    BACKGROUND: Computational biologists use Expectation values (E-values) to estimate the number of solutions that can be expected by chance during a database scan. Here we focus on computing Expectation values for RNA motifs defined by single-strand and helix lod-score profiles with variable helix spans. Such E-values cannot be computed assuming a normal score distribution and their estimation previously required lengthy simulations. RESULTS: We introduce discrete convolutions as an accurate and fast means of estimating score distributions of lod-score profiles. This method provides excellent score estimations for all single-strand or helical elements tested and also applies to the combination of elements into larger, complex motifs. Further, the estimated distributions remain accurate even when pseudocounts are introduced into the lod-score profiles. Estimated score distributions are then easily converted into E-values. CONCLUSION: A good agreement was observed between computed E-values and simulations for a number of complete RNA motifs. This method is now implemented in the ERPIN software, but it can be applied as well to any search procedure based on ungapped profiles with statistically independent columns.
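
    The convolution idea can be sketched for the simplest case of an ungapped profile with independent columns. The sketch below is not ERPIN's implementation: it assumes lod scores already scaled to integers, a uniform background, and an E-value taken as the tail probability multiplied by a hypothetical number of scanned positions (`n_positions`). Helix columns would be handled the same way with base pairs as symbols.

```python
from collections import defaultdict


def column_distribution(scores, background):
    """Score distribution of one profile column under background frequencies."""
    dist = defaultdict(float)
    for symbol, score in scores.items():
        dist[score] += background[symbol]
    return dict(dist)


def convolve(dist_a, dist_b):
    """Distribution of the sum of two independent discrete scores."""
    out = defaultdict(float)
    for score_a, prob_a in dist_a.items():
        for score_b, prob_b in dist_b.items():
            out[score_a + score_b] += prob_a * prob_b
    return dict(out)


def profile_distribution(profile, background):
    """Convolve the column distributions to get the full-profile distribution."""
    total = {0: 1.0}
    for column in profile:
        total = convolve(total, column_distribution(column, background))
    return total


def e_value(dist, cutoff, n_positions):
    """Expected number of chance hits scoring at least `cutoff`."""
    tail = sum(prob for score, prob in dist.items() if score >= cutoff)
    return tail * n_positions


# Hypothetical two-column profile with integer-scaled lod scores.
background = {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25}
profile = [
    {"A": 2, "C": -1, "G": -1, "U": -1},
    {"A": 2, "C": -1, "G": -1, "U": -1},
]
dist = profile_distribution(profile, background)
expected_hits = e_value(dist, cutoff=1, n_positions=1000)
```

    Because each convolution only combines the distinct scores seen so far, the cost grows with the score range rather than with the number of possible sequences, which is why no simulation is needed.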

    Conservation of alternative polyadenylation patterns in mammalian genes

    BACKGROUND: Alternative polyadenylation is a widespread mechanism contributing to transcript diversity in eukaryotes. Over half of mammalian genes are alternatively polyadenylated. Our understanding of poly(A) site evolution is limited by the lack of a reliable identification of conserved, equivalent poly(A) sites among species. We introduce here a working definition of conserved poly(A) sites as sites that are both (i) properly aligned in human and mouse orthologous 3' untranslated regions (UTRs) and (ii) supported by EST or cDNA data in both species. RESULTS: We identified about 4800 such conserved poly(A) sites covering one third of the orthologous gene set studied. Characteristics of conserved poly(A) sites such as processing efficiency and tissue specificity were analyzed. Conserved sites show a higher processing efficiency but no difference in tissue distribution when compared to non-conserved sites. In general, alternative poly(A) sites are species-specific and involve minor, non-conserved sites that are unlikely to play essential roles. However, there are about 500 genes with conserved tandem poly(A) sites. A significant fraction of these conserved tandems display a conserved arrangement of major/minor sites in their 3' UTR, suggesting that these alternative 3' ends may be under selection. CONCLUSION: This analysis allows us to identify potential functional alternative poly(A) sites and provides clues on the selective mechanisms at play in the appearance of multiple poly(A) sites and their maintenance in the 3' UTRs of genes.

    Differential Repression of Alternative Transcripts: A Screen for miRNA Targets

    Alternative polyadenylation sites produce transcript isoforms with 3′ untranslated regions (UTRs) of different lengths. If a microRNA (miRNA) target is present in the UTR, then only those target-containing isoforms should be sensitive to control by a cognate miRNA. We carried out a systematic examination of 3′ UTRs containing multiple poly(A) sites and putative miRNA targets. Based on expressed sequence tag (EST) counts and EST library information, we observed that levels of isoforms containing targets for miR-1 or miR-124, two miRNAs causing downregulation of transcript levels, were reduced in tissues expressing the corresponding miRNA. This analysis was repeated for all conserved 7-mers in 3′ UTRs, resulting in a selection of 312 motifs. We show that this set is significantly enriched in known miRNA targets and mRNA-destabilizing elements, which validates our initial hypothesis. We scanned the human genome for possible cognate miRNAs and identified phylogenetically conserved precursors matching our motifs. This analysis can help identify target-miRNA couples that went undetected in previous screens, but it may also reveal targets for other types of regulatory factors
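
    The premise that only target-containing isoforms should respond to a miRNA can be made concrete with a small sketch. It assumes a 3′ UTR given as a DNA string, poly(A) sites given as 0-based exclusive end coordinates, and an arbitrary 7-mer chosen for illustration; the sequence, the coordinates and the function name are hypothetical and do not come from the study.

```python
def isoforms_with_motif(utr, polya_sites, motif):
    """Map each poly(A) site to True if the isoform ending there contains the motif.

    Each isoform is the UTR prefix up to (but not including) the site coordinate.
    """
    return {site: motif in utr[:site] for site in sorted(polya_sites)}


# Hypothetical 22-nt UTR with a 7-mer located at positions 3-9.
utr = "GGGACATTCCGGGAAATTTCCC"
motif = "ACATTCC"
result = isoforms_with_motif(utr, polya_sites=[8, 22], motif=motif)
print(result)
```

    Here the short isoform ends inside the motif and therefore lacks a complete site, while the long isoform carries it; comparing the relative levels of such isoforms across tissues is the kind of contrast the screen relies on.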

    Nonsense-Mediated Decay Restricts LncRNA Levels in Yeast Unless Blocked by Double-Stranded RNA Structure

    Antisense long non-coding (aslnc)RNAs represent a substantial part of eukaryotic transcriptomes that are, in yeast, controlled by the Xrn1 exonuclease. Nonsense-Mediated Decay (NMD) destabilizes the Xrn1-sensitive aslncRNAs (XUTs), but what determines their sensitivity remains unclear. We report that 3′ single-stranded (3′-ss) extension mediates XUT degradation by NMD, assisted by the Mtr4 and Dbp2 helicases. Single-gene investigation, genome-wide RNA analyses, and double-stranded (ds)RNA mapping revealed that 3′-ss extensions discriminate the NMD-targeted XUTs from stable lncRNAs. Ribosome profiling showed that XUTs are translated, locking them for NMD activity. Interestingly, mutants of the Mtr4 and Dbp2 helicases accumulated XUTs, suggesting that dsRNA unwinding is a critical step for degradation. Indeed, expression of anticomplementary transcripts protects cryptic intergenic lncRNAs from NMD. Our results indicate that aslncRNAs form dsRNAs that are only translated and targeted to NMD if dissociated by Mtr4 and Dbp2. We propose that NMD buffers genome expression by discarding pervasive regulatory transcripts.