84,799 research outputs found

    Prediction of RNA 3D Structure using Parallel Algorithm

    Get PDF
    RNA structures are generally organized hierarchically organized. Prediction of RNA structure is a big challenge for the researchers. Many prediction algorithms have been develop for RNA secondary structure prediction. But prediction of tertiary structure is still a big area of research. Approaches for RNA tertiary structure prediction are broadly categorized into two methods viz. de novo methods and template-based prediction method. The De novo are useful only for the small sequences of the nucleotides. For the sequences with large number of nucleotides template based modelling technique is more useful. These template based method are very common in protein structure prediction but it is not explored in RNA tertiary structure prediction. This paper presents a novel methodology for prediction of RNA tertiary structure using the bank ok known structure (template). First, the RNA secondary structure is analysed on the basis of free nucleotides available region wise which may contribute in the prediction of tertiary structure. Next similar region structure templates are searched in PDB metadata. These templates are then used to for the tertiary structure. As all the templates stored in PDB are predicted structures with minimum negative energy, the resultant tertiary structure is also consequently has minimum negative energy. The proposed method is able to predict even large ribosomal RNA structures. The experimental results have shown for the predicted structures on the basis of percentage similarity as well as time required for prediction

    e-RNA: a collection of web-servers for the prediction and visualisation of RNA secondary structure and their functional features

    Get PDF
    e-RNA is a collection of web-servers for the prediction and visualisation of RNA secondary structures and their functional features, including in particular RNA–RNA interactions. In this updated version, we have added novel tools for RNA secondary structure prediction and have significantly updated the visualisation functionality. The new method CoBold can identify transient RNA structure features and their potential functional effects on a known RNA structure during co-transcriptional structure formation. New tool ShapeSorter can predict evolutionarily conserved RNA secondary structure features while simultaneously taking experimental SHAPE probing evidence into account. The web-server R-Chie which visualises RNA secondary structure information in terms of arc diagrams, can now be used to also visualise and intuitively compare RNA–RNA, RNA–DNA and DNA–DNA interactions alongside multiple sequence alignments and quantitative information. The prediction generated by any method in e-RNA can be readily visualised on the web-server. For completed tasks, users can download their results and readily visualise them later on with R-Chie without having to re-run the predictions. e-RNA can be found at http://www.e-rna.org

    A Predictive Model for Secondary RNA Structure Using Graph Theory and a Neural Network

    Get PDF
    Background: Determining the secondary structure of RNA from the primary structure is a challenging computational problem. A number of algorithms have been developed to predict the secondary structure from the primary structure. It is agreed that there is still room for improvement in each of these approaches. In this work we build a predictive model for secondary RNA structure using a graph-theoretic tree representation of secondary RNA structure. We model the bonding of two RNA secondary structures to form a larger secondary structure with a graph operation we call merge. We consider all combinatorial possibilities using all possible tree inputs, both those that are RNA-like in structure and those that are not. The resulting data from each tree merge operation is represented by a vector. We use these vectors as input values for a neural network and train the network to recognize a tree as RNA-like or not, based on the merge data vector. The network estimates the probability of a tree being RNA-like.Results: The network correctly assigned a high probability of RNA-likeness to trees previously identified as RNA-like and a low probability of RNA-likeness to those classified as not RNA-like. We then used the neural network to predict the RNA-likeness of the unclassified trees.Conclusions: There are a number of secondary RNA structure prediction algorithms available online. These programs are based on finding the secondary structure with the lowest total free energy. In this work, we create a predictive tool for secondary RNA structures using graph-theoretic values as input for a neural network. The use of a graph operation to theoretically describe the bonding of secondary RNA is novel and is an entirely different approach to the prediction of secondary RNA structures. Our method correctly predicted trees to be RNA-like or not RNA-like for all known cases. In addition, our results convey a measure of likelihood that a tree is RNA-like or not RNA-like. Given that the majority of secondary RNA folding algorithms return more than one possible outcome, our method provides a means of determining the best or most likely structures among all of the possible outcomes

    A predictive model for secondary RNA structure using graph theory and a neural network

    Get PDF
    Background: Determining the secondary structure of RNA from the primary structure is a challenging computational problem. A number of algorithms have been developed to predict the secondary structure from the primary structure. It is agreed that there is still room for improvement in each of these approaches. In this work we build a predictive model for secondary RNA structure using a graph-theoretic tree representation of secondary RNA structure. We model the bonding of two RNA secondary structures to form a larger secondary structure with a graph operation we call merge. We consider all combinatorial possibilities using all possible tree inputs, both those that are RNA-like in structure and those that are not. The resulting data from each tree merge operation is represented by a vector. We use these vectors as input values for a neural network and train the network to recognize a tree as RNA-like or not, based on the merge data vector. The network estimates the probability of a tree being RNA-like.Results: The network correctly assigned a high probability of RNA-likeness to trees previously identified as RNA-like and a low probability of RNA-likeness to those classified as not RNA-like. We then used the neural network to predict the RNA-likeness of the unclassified trees.Conclusions: There are a number of secondary RNA structure prediction algorithms available online. These programs are based on finding the secondary structure with the lowest total free energy. In this work, we create a predictive tool for secondary RNA structures using graph-theoretic values as input for a neural network. The use of a graph operation to theoretically describe the bonding of secondary RNA is novel and is an entirely different approach to the prediction of secondary RNA structures. Our method correctly predicted trees to be RNA-like or not RNA-like for all known cases. In addition, our results convey a measure of likelihood that a tree is RNA-like or not RNA-like. Given that the majority of secondary RNA folding algorithms return more than one possible outcome, our method provides a means of determining the best or most likely structures among all of the possible outcomes

    A New Method of RNA Secondary Structure Prediction Based on Convolutional Neural Network and Dynamic Programming

    Get PDF
    In recent years, obtaining RNA secondary structure information has played an important role in RNA and gene function research. Although some RNA secondary structures can be gained experimentally, in most cases, efficient, and accurate computational methods are still needed to predict RNA secondary structure. Current RNA secondary structure prediction methods are mainly based on the minimum free energy algorithm, which finds the optimal folding state of RNA in vivo using an iterative method to meet the minimum energy or other constraints. However, due to the complexity of biotic environment, a true RNA structure always keeps the balance of biological potential energy status, rather than the optimal folding status that meets the minimum energy. For short sequence RNA its equilibrium energy status for the RNA folding organism is close to the minimum free energy status; therefore, the minimum free energy algorithm for predicting RNA secondary structure has higher accuracy. Nevertheless, in a longer sequence RNA, constant folding causes its biopotential energy balance to deviate far from the minimum free energy status. This deviation is because of its complex structure and results in a serious decline in the prediction accuracy of its secondary structure. In this paper, we propose a novel RNA secondary structure prediction algorithm using a convolutional neural network model combined with a dynamic programming method to improve the accuracy with large-scale RNA sequence and structure data. We analyze current experimental RNA sequences and structure data to construct a deep convolutional network model, and then we extract implicit features of an effective classification from large-scale data to predict the pairing probability of each base in an RNA sequence. For the obtained probabilities of RNA sequence base pairing, an enhanced dynamic programming method is applied to obtain the optimal RNA secondary structure. Results indicate that our proposed method is superior to the common RNA secondary structure prediction algorithms in predicting three benchmark RNA families. Based on the characteristics of deep learning algorithm, it can be inferred that the method proposed in this paper has a 30% higher prediction success rate when compared with other algorithms, which will be needed as the amount of real RNA structure data increases in the future

    e-RNA: a collection of web-servers for the prediction and visualisation of RNA secondary structure and their functional features

    Get PDF
    e-RNA is a collection of web-servers for the prediction and visualisation of RNA secondary structures and their functional features, including in particular RNA-RNA interactions. In this updated version, we have added novel tools for RNA secondary structure prediction and have significantly updated the visualisation functionality. The new method CoBold can identify transient RNA structure features and their potential functional effects on a known RNA structure during co-transcriptional structure formation. New tool ShapeSorter can predict evolutionarily conserved RNA secondary structure features while simultaneously taking experimental SHAPE probing evidence into account. The web-server R-Chie which visualises RNA secondary structure information in terms of arc diagrams, can now be used to also visualise and intuitively compare RNA-RNA, RNA-DNA and DNA-DNA interactions alongside multiple sequence alignments and quantitative information. The prediction generated by any method in e-RNA can be readily visualised on the web-server. For completed tasks, users can download their results and readily visualise them later on with R-Chie without having to re-run the predictions. e-RNA can be found at http://www.e-rna.org

    DotKnot: pseudoknot prediction using the probability dot plot under a refined energy model

    Get PDF
    RNA pseudoknots are functional structure elements with key roles in viral and cellular processes. Prediction of a pseudoknotted minimum free energy structure is an NP-complete problem. Practical algorithms for RNA structure prediction including restricted classes of pseudoknots suffer from high runtime and poor accuracy for longer sequences. A heuristic approach is to search for promising pseudoknot candidates in a sequence and verify those. Afterwards, the detected pseudoknots can be further analysed using bioinformatics or laboratory techniques. We present a novel pseudoknot detection method called DotKnot that extracts stem regions from the secondary structure probability dot plot and assembles pseudoknot candidates in a constructive fashion. We evaluate pseudoknot free energies using novel parameters, which have recently become available. We show that the conventional probability dot plot makes a wide class of pseudoknots including those with bulged stems manageable in an explicit fashion. The energy parameters now become the limiting factor in pseudoknot prediction. DotKnot is an efficient method for long sequences, which finds pseudoknots with higher accuracy compared to other known prediction algorithms. DotKnot is accessible as a web server at http://dotknot.csse.uwa.edu.au

    SimulFold: Simultaneously Inferring RNA Structures Including Pseudoknots, Alignments, and Trees Using a Bayesian MCMC Framework

    Get PDF
    Computational methods for predicting evolutionarily conserved rather than thermodynamic RNA structures have recently attracted increased interest. These methods are indispensable not only for elucidating the regulatory roles of known RNA transcripts, but also for predicting RNA genes. It has been notoriously difficult to devise them to make the best use of the available data and to predict high-quality RNA structures that may also contain pseudoknots. We introduce a novel theoretical framework for co-estimating an RNA secondary structure including pseudoknots, a multiple sequence alignment, and an evolutionary tree, given several RNA input sequences. We also present an implementation of the framework in a new computer program, called SimulFold, which employs a Bayesian Markov chain Monte Carlo method to sample from the joint posterior distribution of RNA structures, alignments, and trees. We use the new framework to predict RNA structures, and comprehensively evaluate the quality of our predictions by comparing our results to those of several other programs. We also present preliminary data that show SimulFold's potential as an alignment and phylogeny prediction method. SimulFold overcomes many conceptual limitations that current RNA structure prediction methods face, introduces several new theoretical techniques, and generates high-quality predictions of conserved RNA structures that may include pseudoknots. It is thus likely to have a strong impact, both on the field of RNA structure prediction and on a wide range of data analyses

    TurboFold: Iterative probabilistic estimation of secondary structures for multiple RNA sequences

    Get PDF
    Abstract Background The prediction of secondary structure, i.e. the set of canonical base pairs between nucleotides, is a first step in developing an understanding of the function of an RNA sequence. The most accurate computational methods predict conserved structures for a set of homologous RNA sequences. These methods usually suffer from high computational complexity. In this paper, TurboFold, a novel and efficient method for secondary structure prediction for multiple RNA sequences, is presented. Results TurboFold takes, as input, a set of homologous RNA sequences and outputs estimates of the base pairing probabilities for each sequence. The base pairing probabilities for a sequence are estimated by combining intrinsic information, derived from the sequence itself via the nearest neighbor thermodynamic model, with extrinsic information, derived from the other sequences in the input set. For a given sequence, the extrinsic information is computed by using pairwise-sequence-alignment-based probabilities for co-incidence with each of the other sequences, along with estimated base pairing probabilities, from the previous iteration, for the other sequences. The extrinsic information is introduced as free energy modifications for base pairing in a partition function computation based on the nearest neighbor thermodynamic model. This process yields updated estimates of base pairing probability. The updated base pairing probabilities in turn are used to recompute extrinsic information, resulting in the overall iterative estimation procedure that defines TurboFold. TurboFold is benchmarked on a number of ncRNA datasets and compared against alternative secondary structure prediction methods. The iterative procedure in TurboFold is shown to improve estimates of base pairing probability with each iteration, though only small gains are obtained beyond three iterations. Secondary structures composed of base pairs with estimated probabilities higher than a significance threshold are shown to be more accurate for TurboFold than for alternative methods that estimate base pairing probabilities. TurboFold-MEA, which uses base pairing probabilities from TurboFold in a maximum expected accuracy algorithm for secondary structure prediction, has accuracy comparable to the best performing secondary structure prediction methods. The computational and memory requirements for TurboFold are modest and, in terms of sequence length and number of sequences, scale much more favorably than joint alignment and folding algorithms. Conclusions TurboFold is an iterative probabilistic method for predicting secondary structures for multiple RNA sequences that efficiently and accurately combines the information from the comparative analysis between sequences with the thermodynamic folding model. Unlike most other multi-sequence structure prediction methods, TurboFold does not enforce strict commonality of structures and is therefore useful for predicting structures for homologous sequences that have diverged significantly. TurboFold can be downloaded as part of the RNAstructure package at http://rna.urmc.rochester.edu.</p

    PARTS: Probabilistic Alignment for RNA joinT Secondary structure prediction

    Get PDF
    A novel method is presented for joint prediction of alignment and common secondary structures of two RNA sequences. The joint consideration of common secondary structures and alignment is accomplished by structural alignment over a search space defined by the newly introduced motif called matched helical regions. The matched helical region formulation generalizes previously employed constraints for structural alignment and thereby better accommodates the structural variability within RNA families. A probabilistic model based on pseudo free energies obtained from precomputed base pairing and alignment probabilities is utilized for scoring structural alignments. Maximum a posteriori (MAP) common secondary structures, sequence alignment and joint posterior probabilities of base pairing are obtained from the model via a dynamic programming algorithm called PARTS. The advantage of the more general structural alignment of PARTS is seen in secondary structure predictions for the RNase P family. For this family, the PARTS MAP predictions of secondary structures and alignment perform significantly better than prior methods that utilize a more restrictive structural alignment model. For the tRNA and 5S rRNA families, the richer structural alignment model of PARTS does not offer a benefit and the method therefore performs comparably with existing alternatives. For all RNA families studied, the posterior probability estimates obtained from PARTS offer an improvement over posterior probability estimates from a single sequence prediction. When considering the base pairings predicted over a threshold value of confidence, the combination of sensitivity and positive predictive value is superior for PARTS than for the single sequence prediction. PARTS source code is available for download under the GNU public license at http://rna.urmc.rochester.edu
    corecore