934 research outputs found
Premature terminator analysis sheds light on a hidden world of bacterial transcriptional attenuation
NAPP: the Nucleic Acid Phylogenetic Profile Database
Nucleic acid phylogenetic profiling (NAPP) classifies coding and non-coding sequences in a genome according to their pattern of conservation across other genomes. This procedure efficiently distinguishes clusters of functional non-coding elements in bacteria, particularly small RNAs and cis-regulatory RNAs, from other conserved sequences. In contrast to other non-coding RNA detection pipelines, NAPP does not require the presence of conserved RNA secondary structure and therefore is likely to identify previously undetected RNA genes or elements. Furthermore, as NAPP clusters contain both coding and non-coding sequences with similar occurrence profiles, they can be analyzed from a functional perspective. We recently improved the NAPP pipeline and applied it to a collection of 949 bacterial and 68 archaeal species. The database and web interface available at http://napp.u-psud.fr/ enable detailed analysis of NAPP clusters enriched in non-coding RNAs, graphical display of phylogenetic profiles, visualization of predicted RNAs in their genome context and extraction of predicted RNAs for use with genome browsers or other software
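The idea of grouping sequences by "similar occurrence profiles" can be illustrated with a small sketch. It assumes each element is described by a binary presence/absence vector across a fixed list of genomes and uses a simple greedy grouping by Hamming distance. The element names, the `max_dist` threshold, and the grouping rule are hypothetical choices for illustration and are not taken from the NAPP pipeline.

```python
def hamming(profile_a, profile_b):
    """Number of genomes in which two presence/absence profiles differ."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


def group_by_profile(profiles, max_dist):
    """Greedy grouping: each element joins the first cluster whose seed
    profile is within max_dist, otherwise it seeds a new cluster.
    This is an illustrative stand-in, not the NAPP clustering method."""
    seeds = []      # seed profile of each cluster
    clusters = []   # element names in each cluster
    for name, profile in profiles.items():
        for i, seed in enumerate(seeds):
            if hamming(profile, seed) <= max_dist:
                clusters[i].append(name)
                break
        else:
            seeds.append(profile)
            clusters.append([name])
    return clusters


# Hypothetical profiles: 1 = conserved in that genome, 0 = absent.
example_profiles = {
    "geneA": (1, 1, 0, 1),
    "sRNA1": (1, 1, 0, 1),
    "geneB": (0, 0, 1, 0),
    "sRNA2": (1, 1, 0, 0),
}
print(group_by_profile(example_profiles, max_dist=1))
```

A cluster that mixes coding and non-coding elements, as in this toy input, is the kind of group the abstract describes as open to functional interpretation.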
Computing expectation values for RNA motifs using discrete convolutions
BACKGROUND: Computational biologists use Expectation values (E-values) to estimate the number of solutions that can be expected by chance during a database scan. Here we focus on computing Expectation values for RNA motifs defined by single-strand and helix lod-score profiles with variable helix spans. Such E-values cannot be computed assuming a normal score distribution and their estimation previously required lengthy simulations. RESULTS: We introduce discrete convolutions as an accurate and fast means of estimating score distributions of lod-score profiles. This method provides excellent score estimations for all single-strand or helical elements tested and also applies to the combination of elements into larger, complex motifs. Further, the estimated distributions remain accurate even when pseudocounts are introduced into the lod-score profiles. Estimated score distributions are then easily converted into E-values. CONCLUSION: A good agreement was observed between computed E-values and simulations for a number of complete RNA motifs. This method is now implemented in the ERPIN software, but it can be applied as well to any search procedure based on ungapped profiles with statistically independent columns
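For an ungapped profile with statistically independent columns, the distribution of the total score is the convolution of the per-column score distributions. The sketch below shows that principle only. It assumes lod scores have already been discretized to integers, a fixed background base composition, and single-strand columns only (no helix pairs or variable spans). The function names and the `n_positions` parameter are hypothetical and do not reflect the ERPIN implementation.

```python
from collections import defaultdict


def column_distribution(column_scores, background):
    """Probability of each integer score at one profile column,
    given background base frequencies."""
    dist = defaultdict(float)
    for base, score in column_scores.items():
        dist[score] += background[base]
    return dict(dist)


def convolve(dist_a, dist_b):
    """Discrete convolution: distribution of the sum of two independent scores."""
    out = defaultdict(float)
    for score_a, prob_a in dist_a.items():
        for score_b, prob_b in dist_b.items():
            out[score_a + score_b] += prob_a * prob_b
    return dict(out)


def profile_distribution(profile, background):
    """Score distribution of a whole profile, built column by column."""
    total = {0: 1.0}  # a zero-column profile scores 0 with certainty
    for column_scores in profile:
        total = convolve(total, column_distribution(column_scores, background))
    return total


def e_value(dist, threshold, n_positions):
    """Expected number of chance hits scoring at least threshold
    among n_positions scanned positions (assumed interpretation)."""
    tail = sum(prob for score, prob in dist.items() if score >= threshold)
    return tail * n_positions


# Hypothetical two-column profile with integer (pre-discretized) lod scores.
background = {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25}
profile = [
    {"A": 2, "C": -1, "G": -1, "U": -1},
    {"A": -1, "C": -1, "G": 2, "U": 0},
]
dist = profile_distribution(profile, background)
print(sorted(dist.items()))
print(e_value(dist, threshold=2, n_positions=1000))
```

Because each convolution step only touches the distinct scores seen so far, the cost grows with the score range rather than with the number of possible sequences, which is why this approach can replace lengthy simulations.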
Conservation of alternative polyadenylation patterns in mammalian genes
BACKGROUND: Alternative polyadenylation is a widespread mechanism contributing to transcript diversity in eukaryotes. Over half of mammalian genes are alternatively polyadenylated. Our understanding of poly(A) site evolution is limited by the lack of reliable identification of conserved, equivalent poly(A) sites among species. We introduce here a working definition of conserved poly(A) sites as sites that are both (i) properly aligned in human and mouse orthologous 3' untranslated regions (UTRs) and (ii) supported by EST or cDNA data in both species. RESULTS: We identified about 4800 such conserved poly(A) sites covering one third of the orthologous gene set studied. Characteristics of conserved poly(A) sites such as processing efficiency and tissue-specificity were analyzed. Conserved sites show a higher processing efficiency but no difference in tissue distribution when compared to non-conserved sites. In general, alternative poly(A) sites are species-specific and involve minor, non-conserved sites that are unlikely to play essential roles. However, there are about 500 genes with conserved tandem poly(A) sites. A significant fraction of these conserved tandems display a conserved arrangement of major/minor sites in their 3' UTR, suggesting that these alternative 3' ends may be under selection. CONCLUSION: This analysis allows us to identify potential functional alternative poly(A) sites and provides clues on the selective mechanisms at play in the appearance of multiple poly(A) sites and their maintenance in the 3' UTRs of genes
Erratum: The Non-Coding RNA Journal Club: Highlights on Recent Papers-4. Non-Coding RNA 2016, 2, 9.
Please note that in the published editorial [1], affiliations 1 and 8 contained errors. [...]
Differential Repression of Alternative Transcripts: A Screen for miRNA Targets
Alternative polyadenylation sites produce transcript isoforms with 3′ untranslated regions (UTRs) of different lengths. If a microRNA (miRNA) target is present in the UTR, then only those target-containing isoforms should be sensitive to control by a cognate miRNA. We carried out a systematic examination of 3′ UTRs containing multiple poly(A) sites and putative miRNA targets. Based on expressed sequence tag (EST) counts and EST library information, we observed that levels of isoforms containing targets for miR-1 or miR-124, two miRNAs causing downregulation of transcript levels, were reduced in tissues expressing the corresponding miRNA. This analysis was repeated for all conserved 7-mers in 3′ UTRs, resulting in a selection of 312 motifs. We show that this set is significantly enriched in known miRNA targets and mRNA-destabilizing elements, which validates our initial hypothesis. We scanned the human genome for possible cognate miRNAs and identified phylogenetically conserved precursors matching our motifs. This analysis can help identify target-miRNA couples that went undetected in previous screens, but it may also reveal targets for other types of regulatory factors
Beyond the 3′ end: experimental validation of extended transcript isoforms
High-throughput EST and full-length cDNA sequencing have revealed extensive variations at the 3′ ends of mammalian transcripts. Whether all of these changes are biologically meaningful has been the subject of controversy, as such results may in part reflect transcription or polyadenylation leakage. We selected here a set of tandem poly(A) sites predicted from EST/cDNA sequence analysis that (i) are conserved between human and mouse, (ii) produce alternative 3′ isoforms with unusual size features and (iii) are not documented in current genome databases, and we submitted these sites to experimental validation in mouse tissues. Out of 86 tested poly(A) sites from 44 genes, 84 were individually confirmed using a specially devised RT-PCR strategy. We then focused on validating the exon structure between distant tandem poly(A) sites separated by over 3 kb, and between stop codons and alternative poly(A) sites located at 4.5 kb or more, using a long-distance RT-PCR strategy. In most cases, long transcripts spanning the whole poly(A)–poly(A) or stop-poly(A) distance were detected, confirming that tandem sites were part of the same transcription unit. Given the apparent conservation of these long alternative 3′ ends, different regulatory functions can be foreseen, depending on the location where transcription starts
Nonsense-Mediated Decay Restricts LncRNA Levels in Yeast Unless Blocked by Double-Stranded RNA Structure
Antisense long non-coding (aslnc)RNAs represent a substantial part of eukaryotic transcriptomes that are, in yeast, controlled by the Xrn1 exonuclease. Nonsense-Mediated Decay (NMD) destabilizes the Xrn1-sensitive aslncRNAs (XUTs), but what determines their sensitivity remains unclear. We report that 3′ single-stranded (3′-ss) extension mediates XUT degradation by NMD, assisted by the Mtr4 and Dbp2 helicases. Single-gene investigation, genome-wide RNA analyses, and double-stranded (ds)RNA mapping revealed that 3′-ss extensions discriminate the NMD-targeted XUTs from stable lncRNAs. Ribosome profiling showed that XUTs are translated, locking them for NMD activity. Interestingly, mutants of the Mtr4 and Dbp2 helicases accumulated XUTs, suggesting that dsRNA unwinding is a critical step for degradation. Indeed, expression of anticomplementary transcripts protects cryptic intergenic lncRNAs from NMD. Our results indicate that aslncRNAs form dsRNAs that are only translated and targeted to NMD if dissociated by Mtr4 and Dbp2. We propose that NMD buffers genome expression by discarding pervasive regulatory transcripts