44,357 research outputs found

Study of RNA Secondary Structure Prediction Algorithms

Author: Yu Lisa
Publication venue: SJSU ScholarWorks
Publication date: 01/01/2006

Dynamic programming algorithms such as the Nussinov algorithm and the Zuker algorithm define criteria for searching for the most stable RNA secondary structures. A Stochastic Context-Free Grammar (SCFG) predicts the most probable RNA secondary structure using a context-free grammar and a defined set of probabilities for each grammar rule. These algorithms form the basis for using computer programs to predict RNA secondary structures without pseudoknots. In this report, we review these RNA secondary structure prediction algorithms and present our own software implementations of them. The Nussinov algorithm is easy to understand, but our results show that it is overly simplified and cannot produce the most accurate result. The SCFG algorithm may be powerful, but its results are also inaccurate because accurate probabilities are not available for each grammar rule. Zuker's minimum free energy method incorporates far more biological knowledge in its energy definitions, so its predictions are much better than those of the other two algorithms. Our implementations use both recursive and non-recursive function calls. Recursion is easy to understand but introduces significant overhead. We were able to rearrange the function calls to eliminate the recursion. The non-recursive form allows us to parallelize the most computationally intensive part of the calculation. By abstracting a secondary structure to a tree representation and a string representation, we compared our predictions with results from experimental measurement or non-conventional general-purpose computational methods, and with results from popular packages such as MFOLD. Our results also illustrate the limitations of these algorithms. These limitations clearly demonstrate that more biological and chemical knowledge of RNA needs to be incorporated into RNA secondary structure prediction algorithms.
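As a rough illustration of the Nussinov-style dynamic programming mentioned in the abstract above, the following is a minimal sketch of base-pair maximization. It is not the report's implementation: the function name `nussinov`, the `min_loop` parameter (minimum number of unpaired bases in a hairpin), and the choice to allow G-U wobble pairs are all assumptions made for this example. The table is filled iteratively rather than recursively, and the traceback uses an explicit stack.

```python
# Assumed pairing rules for this sketch: Watson-Crick pairs plus G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov(seq, min_loop=3):
    """Return (max_pairs, dot_bracket) under simple base-pair maximization."""
    n = len(seq)
    if n == 0:
        return 0, ""
    # dp[i][j] = maximum number of nested pairs in seq[i..j]; 0 for short spans.
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i + 1][j]  # case: base i stays unpaired
            for k in range(i + min_loop + 1, j + 1):  # case: i pairs with k
                if (seq[i], seq[k]) in PAIRS:
                    left = dp[i + 1][k - 1]
                    right = dp[k + 1][j] if k < j else 0
                    best = max(best, 1 + left + right)
            dp[i][j] = best

    # Iterative traceback to recover one optimal structure.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # too short to contain a pair
        if dp[i][j] == dp[i + 1][j]:
            stack.append((i + 1, j))
            continue
        for k in range(i + min_loop + 1, j + 1):
            if (seq[i], seq[k]) in PAIRS:
                left = dp[i + 1][k - 1]
                right = dp[k + 1][j] if k < j else 0
                if dp[i][j] == 1 + left + right:
                    structure[i], structure[k] = "(", ")"
                    stack.append((i + 1, k - 1))
                    if k < j:
                        stack.append((k + 1, j))
                    break
    return dp[0][n - 1], "".join(structure)


print(nussinov("GGGAAACCC"))  # (3, '(((...)))')
```

Counting pairs in this way ignores stacking and loop energies, which is consistent with the abstract's remark that the Nussinov criterion is overly simplified compared with minimum free energy methods.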

SJSU ScholarWorks

Computational Approaches for RNA Structure Prediction with Dynamic Programming and Deep Neural Networks

Author: Deng Dezhong
Publication venue: Oregon State University

Our goal is to build a system that models RNA sequences and reveals their structural information by using efficient dynamic programming algorithms and deep learning approaches. We aim to 1) achieve linear-time RNA secondary structure prediction based on existing minimum free energy (MFE) models; 2) utilize deep neural networks to learn high-level features directly from RNA sequences, without looking at any indirect information from MFE models, in order to predict RNA secondary structures directly; and 3) investigate RNA structure visualization approaches. Here we organize our line of research all the way from a novel annotated dataset to systematic RNA secondary structure prediction using deep learning, including bpRNA (an RNA structure annotation tool with its generated, large-scale RNA meta-database bpRNA-1m), LinearFold (a linear-time dynamic programming algorithm for RNA secondary structure prediction), DeepSloop (a deep learning approach that learns complex rules to detect stem-loop-forming RNA sequences), and DeepStructure (an end-to-end RNA secondary structure prediction approach via deep neural networks). We also present bpRNA-Visual for RNA structure visualization purposes.

ScholarsArchive@OSU

ShapeSorter: a fully probabilistic method for detecting conserved RNA structure features supported by SHAPE evidence

Authors: Meyer Irmtraud M.; Tsybulskyi Volodymyr
Publication date: 01/01/2022

There is an increased interest in the determination of RNA structures in vivo as it is now possible to probe them in a high-throughput manner, e.g. using SHAPE protocols. By now, there exists a range of computational methods that integrate experimental SHAPE-probing evidence into computational RNA secondary structure prediction. The state of the art in this field is currently provided by computational methods that employ the minimum-free energy strategy for predicting RNA secondary structures with SHAPE-probing evidence. These methods, however, rely on the assumption that transcripts in vivo fold into the thermodynamically most stable configuration and ignore evolutionary evidence for conserved RNA structure features. We here present a new computational method, ShapeSorter, that predicts RNA structure features without employing the thermodynamic strategy. Instead, ShapeSorter employs a fully probabilistic framework to identify RNA structure features that are supported by evolutionary and SHAPE-probing evidence. Our method can capture RNA structure heterogeneity, pseudo-knotted RNA structures as well as transient and mutually exclusive RNA structure features. Moreover, it estimates P-values for the predicted RNA structure features, which allows for easy filtering and ranking. We investigate the merits of our method in a comprehensive performance benchmark and conclude that ShapeSorter performs significantly better at predicting base-pairs than the existing state-of-the-art methods.

Institutional Repository of the Freie Universität Berlin

PubMed Central

MDC Repository

RNA SECONDARY STRUCTURE PREDICTION TOOL

Author: Mali Meenakshee
Publication venue: SJSU ScholarWorks
Publication date: 01/04/2011

Ribonucleic Acid (RNA) is one of the major macromolecules essential to all forms of life. Apart from its important role in protein synthesis, it performs several other important functions, such as gene regulation, catalysis of biochemical reactions and modification of other RNAs. In some viruses, RNA serves as the carrier of genetic information instead of DNA. RNA is an interesting subject of research in the scientific community and has led to important biological discoveries. One of the major problems researchers are trying to solve is the RNA structure prediction problem. It has been found that the structure of RNA is evolutionarily conserved and can help to determine the functions it serves. In this project, I will develop a tool to predict the secondary structure of RNA using simulated annealing. The aim of this project is to understand the simulated annealing algorithm in detail and implement it to find solutions to the RNA secondary structure problem. The results will be compared with those of the well-known tool Mfold, developed by Michael Zuker, which uses the minimum free energy approach.

SJSU ScholarWorks

Computing the Partition Function for Kinetically Trapped RNA Secondary Structures

Authors: Clote Peter; Lorenz William A.
Publication venue: Public Library of Science
Publication date: 01/01/2011

An RNA secondary structure is locally optimal if there is no lower energy structure that can be obtained by the addition or removal of a single base pair, where energy is defined according to the widely accepted Turner nearest neighbor model. Locally optimal structures form kinetic traps, since any evolution away from a locally optimal structure must involve energetically unfavorable folding steps. Here, we present a novel, efficient algorithm to compute the partition function over all locally optimal secondary structures of a given RNA sequence. Our software, RNAlocopt, runs in time and space. Additionally, RNAlocopt samples a user-specified number of structures from the Boltzmann subensemble of all locally optimal structures. We apply RNAlocopt to show that (1) the number of locally optimal structures is far smaller than the total number of structures; indeed, the number of locally optimal structures is approximately equal to the square root of the number of all structures, (2) the structural diversity of this subensemble may be either similar to or quite different from the structural diversity of the entire Boltzmann ensemble, a situation that depends on the type of input RNA, (3) the (modified) maximum expected accuracy structure, computed by taking into account base pairing frequencies of locally optimal structures, is a more accurate prediction of the native structure than other current thermodynamics-based methods. The software RNAlocopt constitutes a technical breakthrough in our study of the folding landscape for RNA secondary structures. For the first time, locally optimal structures (kinetic traps in the Turner energy model) can be rapidly generated for long RNA sequences, which was previously impossible with methods that involved exhaustive enumeration. Use of locally optimal structures leads to state-of-the-art secondary structure prediction, as benchmarked against methods involving the computation of minimum free energy and of maximum expected accuracy.
Web server and source code available at http://bioinformatics.bc.edu/clotelab/RNAlocopt/
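The definition of local optimality given in the abstract above can be sketched as a direct neighborhood check. This is only an illustration of the definition, not the RNAlocopt algorithm: the helper names (`can_add`, `neighbors`, `is_locally_optimal`), the `min_loop` constraint, and especially `toy_energy` (which scores -1 per base pair instead of using the Turner nearest neighbor model) are assumptions introduced here.

```python
from itertools import combinations

# Assumed pairing rules for this sketch: Watson-Crick pairs plus G-U wobble.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_add(seq, pairs, i, j, min_loop=3):
    """True if pair (i, j), i < j, can be added without conflict or crossing."""
    if j - i <= min_loop or (seq[i], seq[j]) not in CANONICAL:
        return False
    for p, q in pairs:
        if len({i, j, p, q}) < 4:  # one of the bases is already paired
            return False
        if p < i < q < j or i < p < j < q:  # would create a pseudoknot
            return False
    return True


def neighbors(seq, pairs):
    """Yield every structure that differs from `pairs` by one base pair."""
    for pair in pairs:
        yield pairs - {pair}  # removal of a single base pair
    for i, j in combinations(range(len(seq)), 2):
        if can_add(seq, pairs, i, j):
            yield pairs | {(i, j)}  # addition of a single base pair


def toy_energy(seq, pairs):
    """Hypothetical stand-in for a real energy model: -1 per base pair."""
    return -len(pairs)


def is_locally_optimal(seq, pairs, energy=toy_energy):
    """True if no single-pair neighbor has strictly lower energy."""
    pairs = frozenset(pairs)
    e0 = energy(seq, pairs)
    return all(energy(seq, nb) >= e0 for nb in neighbors(seq, pairs))


seq = "GGGAAACCC"
print(is_locally_optimal(seq, {(0, 8), (1, 7), (2, 6)}))  # True
print(is_locally_optimal(seq, {(0, 8)}))                  # False
```

Under the toy energy, removing a pair never lowers the energy, so local optimality reduces to saturation (no further pair can be added). A realistic energy function can be passed through the `energy` parameter, in which case removals may also matter.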

CiteSeerX

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

e-RNA: a collection of web servers for comparative RNA structure prediction and visualisation

Authors: Lai D.; Meyer I.M.
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2014

e-RNA offers a free and open-access collection of five published RNA sequence analysis tools, each solving specific problems not readily addressed by other available tools. Given multiple sequence alignments, Transat detects all conserved helices, including those expected in a final structure, but also transient, alternative and pseudo-knotted helices. RNA-Decoder uses unique evolutionary models to detect conserved RNA secondary structure in alignments which may be partly protein-coding. SimulFold simultaneously co-estimates the potentially pseudo-knotted conserved structure, alignment and phylogenetic tree for a set of homologous input sequences. CoFold predicts the minimum-free energy structure for an input sequence while taking the effects of co-transcriptional folding into account, thereby greatly improving the prediction accuracy for long sequences. R-chie is a program to visualise RNA secondary structures as arc diagrams, allowing for easy comparison and analysis of conserved base-pairs and quantitative features. The web site server dispatches user jobs to a cluster, where up to 100 jobs can be processed in parallel. Upon job completion, users can retrieve their results via a bookmarked or emailed link. e-RNA is located at http://www.e-rna.org

CiteSeerX

PubMed Central

MDC Repository

Mathematical and biological modelling of RNA secondary structure and its effects on gene expression.

Authors: Hughes TA; McElwaine JN
Publication venue: Comput Math Methods Med
Publication date: 13/07/2017

Secondary structures within the 5' untranslated regions of messenger RNAs can have profound effects on the efficiency of translation of their messages and thereby on gene expression. Consequently, they can act as important regulatory motifs in both physiological and pathological settings. Current approaches to predicting the secondary structure of these RNA sequences find the structure with the global-minimum free energy. However, since RNA folds progressively from the 5' end when synthesised or released from the translational machinery, this may not be the most probable structure. We discuss secondary structure prediction based on local-minimisation of free energy with thermodynamic fluctuations as nucleotides are added to the 3' end and show that these can result in different secondary structures. We also discuss approaches for studying the extent of the translational inhibition specified by structures within the 5' untranslated region.

Apollo (Cambridge)

Investigating the concept of accessibility for predicting novel RNA-RNA interactions

Authors: Meyer I.M.; Reißer S.
Publication venue: Cold Spring Harbor Laboratory
Publication date: 03/06/2021

State-of-the-art methods for predicting novel trans RNA-RNA interactions use the so-called accessibility as a key concept. It estimates whether a region in a given RNA sequence is accessible for forming trans interactions, using a thermodynamic model which quantifies its secondary structure features. RNA-RNA interactions are then predicted by finding the minimum free energy base pairing between the two transcripts, taking into account the accessibility as an energy penalty. We investigated the underlying assumptions of this approach using the two methods RNAPLEX and INTARNA on two datasets, containing sRNA-mRNA and snoRNA-rRNA interactions, respectively. We find that (1) known trans RNA-RNA interactions frequently overlap regions containing RNA structure features, (2) the estimated accessibility reflects sRNA structures fairly well, but often disagrees with structures of longer transcripts, (3) the prediction performance of RNA-RNA interaction prediction methods is independent of the quality of the estimated accessibility profiles, and (4) one important overall effect of accessibility profiles is to prevent the thermodynamic model from predicting interactions that are too long. Based on our findings, we conclude that the accessibility concept, as used in the minimum free energy approach to predicting novel RNA-RNA interactions, has conceptual limitations, and we discuss potential ways of improving the field in the future.

MDC Repository

RNA secondary structure prediction using a combined method of thermodynamics and kinetics

Author: Pan Minmin
Publication venue: Georgia Institute of Technology
Publication date: 07/07/2011

RNA is now widely acknowledged to play an important role in information transfer, as a structural component, in gene regulation, and in other functions. The secondary structure of RNA is key to understanding structure-function relationships. Computational prediction of RNA secondary structure not only provides possible structures but also elucidates the mechanism of RNA folding. Conventional prediction programs are either derived from an evolutionary perspective or aim to achieve minimum free energy. In vivo, RNA folds during transcription, which indicates that the native RNA structure results from both thermodynamics and kinetics. In this thesis, I first review the current leading kinetic folding programs and demonstrate that these programs are not able to predict secondary structure accurately. I then propose a new sequential folding program called GTkinetics. Given an RNA sequence, GTkinetics predicts a secondary structure and a series of RNA folding trajectories. It treats the RNA as a growing chain and adds stable local structures sequentially. It uses a Z-score to evaluate the stability of local structures, which makes it possible to locate native local structures with high confidence. Because GTkinetics captures all stable local structures, it produces some false positives, which prevent the native structure from forming as the chain grows. This suggests a refolding model to melt the false-positive hairpins, probable intermediate structures, and to fold the RNA into a new structure with reliable long-range helices. By analyzing the suboptimal ensemble along the folding pathway, I suggest a refolding mechanism with which one can evaluate whether or not refolding takes place. As another way to favor local structures over long-distance structures, I introduce a distance penalty function into the free energy calculation.
I use a sigmoidal function to compute the energy penalty according to the distance in the primary sequence between the two nucleotides of a base pair. For both the training dataset and the test dataset, the distance function improves the prediction to some extent. In order to characterize the differences between local and long-range helices, I carried out an analysis of standardized local nucleotide composition and base pair composition for the two groups. The results show that adenine accumulates on the 5' side of local structures, but not on that of long-range helices. GU base pairs occur significantly more frequently in local helices than in long-range helices. These results indicate that the mechanisms that form local and long-range helices are different, and that this difference is encoded in the sequence itself. Based on all the results, I draw conclusions and suggest future directions to enhance the current sequential folding program.
MS. Committee Chair: Stephen Harvey; Committee Member: Heitsch, Christine; Committee Member: Hud, Nick; Committee Member: Wartell, Roger; Committee Member: Weitz, Joshu
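The thesis abstract above mentions a sigmoidal distance penalty but does not give its form or parameters. The sketch below shows one generic logistic shape such a penalty could take. The function name `distance_penalty` and the parameters `max_penalty`, `midpoint`, and `steepness`, along with their default values, are hypothetical placeholders and are not taken from the thesis.

```python
import math


def distance_penalty(i, j, max_penalty=1.0, midpoint=100.0, steepness=0.05):
    """Logistic penalty that grows with the sequence distance |j - i|.

    All parameter values are illustrative assumptions:
      max_penalty: asymptotic penalty for very distant pairs
      midpoint:    distance at which the penalty is half of max_penalty
      steepness:   how sharply the penalty rises around the midpoint
    """
    distance = abs(j - i)
    return max_penalty / (1.0 + math.exp(-steepness * (distance - midpoint)))


def penalized_energy(base_energy, i, j, **kwargs):
    """Add the distance penalty to a pair's (externally supplied) free energy."""
    return base_energy + distance_penalty(i, j, **kwargs)


# Nearby pairs are barely penalized; distant pairs approach max_penalty.
for d in (10, 100, 300):
    print(d, round(distance_penalty(0, d), 4))
```

With a penalty of this shape, short-range pairs keep almost their full stabilizing energy while long-range pairs become less favorable, which matches the stated intent of favoring local structures over long-distance ones.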

Scholarly Materials And Research @ Georgia Tech

Sequence-structure relations of pseudoknot RNA

Authors: Huang Fenix WD; Li Linda YM; Reidys Christian M
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: The analysis of sequence-structure relations of RNA is based on a specific notion and folding of RNA structure. The notion of coarse-grained structure employed here is that of canonical RNA pseudoknot contact-structures with at most two mutually crossing bonds (3-noncrossing). These structures are folded by a novel ab initio prediction algorithm, cross, capable of searching all 3-noncrossing RNA structures. The algorithm outputs the minimum free energy structure. Results: After giving some background on RNA pseudoknot structures and providing an outline of the folding algorithm being employed, we present in this paper various statistical results on the mapping from RNA sequences into 3-noncrossing RNA pseudoknot structures. We study properties such as the fraction of pseudoknot structures, the dominant pseudoknot shapes, neutral walks, neutral neighbors and local connectivity. We then put our results into the context of molecular evolution of RNA. Conclusion: Our results imply that, in analogy to RNA secondary structures, 3-noncrossing pseudoknot RNA represents a molecular phenotype that is well suited for molecular and in particular neutral evolution. We can conclude that extended, percolating neutral networks of pseudoknot RNA exist.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central