11 research outputs found

    The PETfold and PETcofold web servers for intra- and intermolecular structures of multiple RNA sequences

    The function of non-coding RNA genes largely depends on their secondary structure and the interaction with other molecules. Thus, an accurate prediction of secondary structure and RNA–RNA interaction is essential for the understanding of biological roles and pathways associated with a specific RNA gene. We present web servers to analyze multiple RNA sequences for common RNA structure and for RNA interaction sites. The web servers are based on the recent PET (Probabilistic Evolutionary and Thermodynamic) models PETfold and PETcofold, but add user friendly features ranging from a graphical layer to interactive usage of the predictors. Additionally, the web servers provide direct access to annotated RNA alignments, such as the Rfam 10.0 database and multiple alignments of 16 vertebrate genomes with human. The web servers are freely available at: http://rth.dk/resources/petfold

    RNA secondary structure prediction from multi-aligned sequences

    It is well accepted that the secondary structures of most functional non-coding RNAs (ncRNAs) are closely related to their functions and are conserved during evolution. Hence, prediction of conserved secondary structures from evolutionarily related sequences is an important task in RNA bioinformatics; the methods are useful not only for further functional analyses of ncRNAs but also for improving the accuracy of secondary structure predictions and for finding novel functional RNAs in the genome. In this review, I focus on common secondary structure prediction from a given aligned RNA sequence, in which one secondary structure whose length is equal to that of the input alignment is predicted. I systematically review and classify existing tools and algorithms for the problem, by examining the information employed in the tools and by adopting a unified viewpoint based on maximum expected gain (MEG) estimators. I believe that this classification will allow a deeper understanding of each tool and provide users with useful information for selecting tools for common secondary structure predictions. Comment: A preprint of an invited review manuscript that will be published in a chapter of the book 'Methods in Molecular Biology'. Note that this version of the manuscript may differ from the published version.
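For readers unfamiliar with the term, an MEG estimator is usually written in the following general form. This is a sketch in generic notation, not necessarily the notation of the review: the symbols A (the input alignment), Y (the space of candidate common structures), θ (a hidden structure drawn from the posterior) and G (the gain function) are all assumed names.

```latex
\hat{y} \;=\; \operatorname*{arg\,max}_{y \in Y} \; \mathbb{E}_{\theta \mid A}\bigl[ G(\theta, y) \bigr]
        \;=\; \operatorname*{arg\,max}_{y \in Y} \; \sum_{\theta} G(\theta, y)\, p(\theta \mid A)
```

Under this view, tools differ mainly in which probability distribution p(θ | A) they use and in how the gain G rewards correctly predicted base pairs.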

    Incorporating phylogenetic-based covarying mutations into RNAalifold for RNA consensus structure prediction

    Background: RNAalifold, a popular computational method for RNA consensus structure prediction, incorporates covarying mutations into a thermodynamic model to fold the aligned RNA sequences. When quantifying covariance, it evaluates conserved signals of two aligned columns with base-pairing rules. This scoring scheme performs better than some other approaches, such as mutual information. However, it ignores the phylogenetic history of the aligned sequences, which is an important criterion for evaluating the level of sequence covariance. Results: In this article, to improve the accuracy of consensus structure folding, we propose a novel approach named PhyloRNAalifold. It incorporates the number of covarying mutations on the phylogenetic tree of the aligned sequences into the covariance scoring of RNAalifold. The benchmarking results show that the new scoring scheme of PhyloRNAalifold can improve the consensus structure detection of RNAalifold. Conclusion: Incorporating additional phylogenetic information of aligned sequences into the covariance scoring of RNAalifold can improve its performance in consensus structure folding. This improvement is correlated with alignment characteristics, such as pairwise identity and the number of sequences in the alignment.
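The two kinds of column-pair signal contrasted in this abstract (mutual information versus base-pairing rules) can be illustrated with a minimal sketch. This is not RNAalifold's or PhyloRNAalifold's actual scoring function; the function names, the toy alignment, and the choice to count G-U wobble pairs as compatible are assumptions made for illustration only.

```python
from collections import Counter
from math import log2

# Assumption: canonical Watson-Crick pairs plus G-U wobble count as compatible.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def column(alignment, i):
    """Return column i of an alignment given as equal-length strings."""
    return [seq[i] for seq in alignment]


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


def pair_compatibility(col_i, col_j):
    """Fraction of sequences that can pair, and number of distinct pair types."""
    observed = [(a, b) for a, b in zip(col_i, col_j) if (a, b) in PAIRS]
    return len(observed) / len(col_i), len(set(observed))


# Toy alignment: columns 0 and 4 covary while staying base-pair compatible.
alignment = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]
mi_covarying = mutual_information(column(alignment, 0), column(alignment, 4))
mi_conserved = mutual_information(column(alignment, 1), column(alignment, 2))
compat = pair_compatibility(column(alignment, 0), column(alignment, 4))
```

Mutual information rewards any statistical dependence between columns, whereas the pairing-rule check only counts changes that preserve a valid base pair. Neither measure looks at the phylogenetic tree, which is the gap the abstract says PhyloRNAalifold addresses.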

    LaRA 2: parallel and vectorized program for sequence–structure alignment of RNA sequences

    Background The function of non-coding RNA sequences is largely determined by their spatial conformation, namely the secondary structure of the molecule, formed by Watson–Crick interactions between nucleotides. Hence, modern RNA alignment algorithms routinely take structural information into account. In order to discover yet unknown RNA families and infer their possible functions, the structural alignment of RNAs is an essential task. This task demands a lot of computational resources, especially for aligning many long sequences, and it therefore requires efficient algorithms that utilize modern hardware when available. A subset of the secondary structures contains overlapping interactions (called pseudoknots), which add additional complexity to the problem and are often ignored in available software. Results We present the SeqAn-based software LaRA 2 that is significantly faster than comparable software for accurate pairwise and multiple alignments of structured RNA sequences. In contrast to other programs our approach can handle arbitrary pseudoknots. As an improved re-implementation of the LaRA tool for structural alignments, LaRA 2 uses multi-threading and vectorization for parallel execution and a new heuristic for computing a lower boundary of the solution. Our algorithmic improvements yield a program that is up to 130 times faster than the previous version. Conclusions With LaRA 2 we provide a tool to analyse large sets of RNA secondary structures in relatively short time, based on structural alignment. The produced alignments can be used to derive structural motifs for the search in genomic databases

    Transcripts with in silico predicted RNA structure are enriched everywhere in the mouse brain

    BACKGROUND: Post-transcriptional control of gene expression is mostly conducted by specific elements in untranslated regions (UTRs) of mRNAs, in collaboration with specific binding proteins and RNAs. In several well characterized cases, these RNA elements are known to form stable secondary structures. RNA secondary structures also may have major functional implications for long noncoding RNAs (lncRNAs). Recent transcriptional data has indicated the importance of lncRNAs in brain development and function. However, no methodical efforts to investigate this have been undertaken. Here, we aim to systematically analyze the potential for RNA structure in brain-expressed transcripts. RESULTS: By comprehensive spatial expression analysis of the adult mouse in situ hybridization data of the Allen Mouse Brain Atlas, we show that transcripts (coding as well as non-coding) associated with in silico predicted structured probes are highly and significantly enriched in almost all analyzed brain regions. Functional implications of these RNA structures and their role in the brain are discussed in detail along with specific examples. We observe that mRNAs with a structure prediction in their UTRs are enriched for binding, transport and localization gene ontology categories. In addition, after manual examination we observe agreement between RNA binding protein interaction sites near the 3’ UTR structures and correlated expression patterns. CONCLUSIONS: Our results show a potential use for RNA structures in expressed coding as well as noncoding transcripts in the adult mouse brain, and describe the role of structured RNAs in the context of intracellular signaling pathways and regulatory networks. Based on this data we hypothesize that RNA structure is widely involved in transcriptional and translational regulatory mechanisms in the brain and ultimately plays a role in brain function

    LncRNAs in CONDBITs perspectives, from genetics towards theranostics

    LncRNAs (long noncoding RNAs) are a novel group of ncRNAs that have been found to be pervasively transcribed in the genome and are characterized as endogenous cellular RNAs consisting of more than 200 nucleotides. They are classified by function, transcript length, relation to protein-coding genes and other functional DNA elements, and subcellular localization. Theranostics is a novel field of medicine that combines specifically targeted biomolecules with molecular-based tests. As a novel finding in the field of molecular medicine, lncRNAs are indispensable tools in theranostics-based medicine, allowing specific targeting of molecular pathways for diagnostics and therapeutics. LncRNAs may act as signals, decoys, guides, and scaffolds in their natural capacities. LncRNA expression is controlled by transcriptional and epigenetic factors and processes. LncRNAs also relate to detracting biological programs. Here we review lncRNAs in disorders and diseases thoroughly from the CONDBIT perspectives, i.e.: cardiology, oncology, neurology and neuroscience, dermatology, molecular biology and bioinformatics, immunology, and technologies (related to “-omics”, transcriptomics, and “nano”, nanotechnology). We describe the lncRNA biomarkers that are abundant from the cardiovascular, neurodegenerative, dermatological, and immunological perspectives. However, as cancer is the most widely studied disease, more biomarkers are available for this particular case. There are abundant cancer-associated lncRNAs. The most frequently studied lncRNA molecules in cancer are HOTAIR, MALAT1, LincRNA-p21, H19, GAS5, ANRIL, MEG3, XIST, and HULC. LncRNAs in cancer diagnosis and monitoring include, e.g., H19 and AA174084 (gastric), HULC (hepatocellular), and PCA3 (prostate). Prognostic lncRNAs include, e.g., HOTAIR and NKILA (breast), MEG3 (meningioma), NBAT-1 (neuroblastoma), and SCHLAP1 (prostate). LncRNAs predicting therapeutic responsiveness include, e.g., CCAT1 (colorectal) and HOTAIR (ovarian). Thus, it is concluded that the CONDBIT perspective is useful for describing the encouraging outlook of this transcriptomics-based medicinal approach.

    How the initiating ribosome copes with ppGpp to translate mRNAs

    During host colonization, bacteria use the alarmones (p)ppGpp to reshape their proteome by acting pleiotropically on DNA, RNA, and protein synthesis. Here, we elucidate how the initiating ribosome senses the cellular pool of guanosine nucleotides and regulates the progression towards protein synthesis. Our results show that the affinity of guanosine triphosphate (GTP) and the inhibitory concentration of ppGpp for the 30S-bound initiation factor IF2 vary depending on the programmed mRNA. The TufA mRNA enhanced GTP affinity for 30S complexes, resulting in improved ppGpp tolerance and allowing efficient protein synthesis. Conversely, the InfA mRNA allowed ppGpp to compete with GTP for IF2, thus stalling 30S complexes. Structural modeling and biochemical analysis of the TufA mRNA unveiled a structured enhancer of translation initiation (SETI) composed of two consecutive hairpins proximal to the translation initiation region (TIR) that largely account for ppGpp tolerance under physiological concentrations of guanosine nucleotides. Furthermore, our results show that the mechanism enhancing ppGpp tolerance is not restricted to the TufA mRNA, as similar ppGpp tolerance was found for the SETI-containing Rnr mRNA. Finally, we show that IF2 can use pppGpp to promote the formation of 30S initiation complexes (ICs), albeit requiring higher factor concentration and resulting in slower transitions to translation elongation. Altogether, our data unveil a novel regulatory mechanism at the onset of protein synthesis that tolerates physiological concentrations of ppGpp and that bacteria can exploit to modulate their proteome as a function of the nutritional shift happening during stringent response and infection. Russian Foundation for Basic Research. Peer reviewed.

    Computational Methods for Comparative Non-coding RNA Analysis: from Secondary Structures to Tertiary Structures

    Unlike messenger RNAs (mRNAs), whose information is encoded in the primary sequences, the cellular roles of non-coding RNAs (ncRNAs) originate from their structures. Therefore, studying structural conservation in ncRNAs is important for an in-depth understanding of their functions. In the past years, many computational methods have been proposed to analyze the common structural patterns in ncRNAs using comparative methods. However, RNA structural comparison is not a trivial task, and the existing approaches still have numerous issues in efficiency and accuracy. In this dissertation, we introduce a suite of novel computational tools that extend the classic models for ncRNA secondary and tertiary structure comparisons. For RNA secondary structure analysis, we first developed a computational tool, named PhyloRNAalifold, to integrate phylogenetic information into consensus structural folding. The underlying idea of this algorithm is that the importance of a co-varying mutation should be determined by its position on the phylogenetic tree. By assigning high scores to the critical covariances, the prediction of RNA secondary structure can be made more accurate. Besides structure prediction, we also developed a computational tool, named ProbeAlign, to improve the efficiency of genome-wide ncRNA screening by using high-throughput RNA structural probing data. It treats the chemical reactivities embedded in the probing information as pairing attributes of the search targets. This approach avoids the time-consuming base pair matching in secondary structure alignment. The application of ProbeAlign to the FragSeq datasets shows its capability for genome-wide ncRNA analysis. For RNA tertiary structure analysis, we first developed a computational tool, named STAR3D, to find the global conservation in RNA 3D structures. STAR3D aims at finding the consensus of stacks by using 2D topology and 3D geometry together. Then, the loop regions can be ordered and aligned according to their relative positions in the consensus. This stack-guided alignment method brings the divide-and-conquer strategy to RNA 3D structural alignment, which has improved its efficiency dramatically. Furthermore, we have clustered all loop regions in non-redundant RNA 3D structures to detect plausible RNA structural motifs de novo. The computational pipeline, named RNAMSC, was extended to handle large-scale PDB datasets, and solid downstream analysis was performed to ensure that the clustering results are valid and easy to apply in further research. The final results contain many interesting variations of known motifs, such as the GNAA tetraloop, kink-turn, sarcin-ricin and T-loop motifs. We also discovered novel functional motifs that are conserved in a wide range of ncRNAs, including ribosomal RNA, sgRNA, SRP RNA, the GlmS riboswitch and the twister ribozyme.