76,808 research outputs found
Fast prediction of RNA-RNA interaction
<p>Abstract</p> <p>Background</p> <p>Regulatory antisense RNAs are a class of ncRNAs that regulate gene expression by prohibiting the translation of an mRNA by establishing stable interactions with a target sequence. There is great demand for efficient computational methods to predict the specific interaction between an ncRNA and its target mRNA(s). There are a number of algorithms in the literature which can predict a variety of such interactions - unfortunately at a very high computational cost. Although some existing target prediction approaches are much faster, they are specialized for interactions with a single binding site.</p> <p>Methods</p> <p>In this paper we present a novel algorithm to accurately predict the minimum free energy structure of RNA-RNA interaction under the most general type of interactions studied in the literature. Moreover, we introduce a fast heuristic method to predict the specific (multiple) binding sites of two interacting RNAs.</p> <p>Results</p> <p>We verify the performance of our algorithms for joint structure and binding site prediction on a set of known interacting RNA pairs. Experimental results show our algorithms are highly accurate and outperform all competitive approaches.</p
An Efficient Algorithm for Upper Bound on the Partition Function of Nucleic Acids
It has been shown that minimum free energy structure for RNAs and RNA-RNA
interaction is often incorrect due to inaccuracies in the energy parameters and
inherent limitations of the energy model. In contrast, ensemble based
quantities such as melting temperature and equilibrium concentrations can be
more reliably predicted. Even structure prediction by sampling from the
ensemble and clustering those structures by Sfold [7] has proven to be more
reliable than minimum free energy structure prediction. The main obstacle for
ensemble based approaches is the computational complexity of the partition
function and base pairing probabilities. For instance, the space complexity of
the partition function for RNA-RNA interaction is and the time
complexity is which are prohibitively large [4,12]. Our goal in this
paper is to give a fast algorithm, based on sparse folding, to calculate an
upper bound on the partition function. Our work is based on the recent
algorithm of Hazan and Jaakkola [10]. The space complexity of our algorithm is
the same as that of sparse folding algorithms, and the time complexity of our
algorithm is for single RNA and for RNA-RNA
interaction in practice, in which is the running time of sparse folding
and () is a sequence dependent parameter
A comprehensive comparison of general RNA-RNA interaction prediction methods
RNA-RNA interactions are fast emerging as a major functional component in many newly discovered non-coding RNAs. Basepairing is believed to be a major contributor to the stability of these intermolecular interactions, much like intramolecular basepairs formed in RNA secondary structure. As such, using algorithms similar to those for predicting RNA secondary structure, computational methods have been recently developed for the prediction of RNA-RNA interactions.We provide the first comprehensive comparison comprising 14 methods that predict general intermolecular basepairs. To evaluate these, we compile an extensive data set of 54 experimentally confirmed fungal snoRNA-rRNA interactions and 102 bacterial sRNA-mRNA interactions. We test the performance accuracy of all methods, evaluating the effects of tool settings, sequence length, and multiple sequence alignment usage and quality.Our results show that-unlike for RNA secondary structure prediction-the overall best performing tools are non-comparative energy-based tools utilizing accessibility information that predict short interactions on this data set. Furthermore, we find that maintaining high accuracy across biologically different data sets and increasing input lengths remains a huge challenge, causing implications for de novo transcriptome-wide searches. Finally, we make our interaction data set publicly available for future development and benchmarking efforts
LinearCoFold and LinearCoPartition: Linear-Time Algorithms for Secondary Structure Prediction of Interacting RNA molecules
Many ncRNAs function through RNA-RNA interactions. Fast and reliable RNA
structure prediction with consideration of RNA-RNA interaction is useful. Some
existing tools are less accurate due to omitting the competing of
intermolecular and intramolecular base pairs, or focus more on predicting the
binding region rather than predicting the complete secondary structure of two
interacting strands. Vienna RNAcofold, which reduces the problem into the
classical single sequence folding by concatenating two strands, scales in cubic
time against the combined sequence length, and is slow for long sequences. To
address these issues, we present LinearCoFold, which predicts the complete
minimum free energy structure of two strands in linear runtime, and
LinearCoPartition, which calculates the cofolding partition function and base
pairing probabilities in linear runtime. LinearCoFold and LinearCoPartition
follows the concatenation strategy of RNAcofold, but are orders of magnitude
faster than RNAcofold. For example, on a sequence pair with combined length of
26,190 nt, LinearCoFold is 86.8x faster than RNAcofold MFE mode (0.6 minutes
vs. 52.1 minutes), and LinearCoPartition is 642.3x faster than RNAcofold
partition function mode (1.8 minutes vs. 1156.2 minutes). Different from the
local algorithms, LinearCoFold and LinearCoPartition are global cofolding
algorithms without restriction on base pair length. Surprisingly, LinearCoFold
and LinearCoPartition's predictions have higher PPV and sensitivity of
intermolecular base pairs. Furthermore, we apply LinearCoFold to predict the
RNA-RNA interaction between SARS-CoV-2 gRNA and human U4 snRNA, which has been
experimentally studied, and observe that LinearCoFold's prediction correlates
better to the wet lab results
Topology of RNA-RNA interaction structures
The topological filtration of interacting RNA complexes is studied and the
role is analyzed of certain diagrams called irreducible shadows, which form
suitable building blocks for more general structures. We prove that for two
interacting RNAs, called interaction structures, there exist for fixed genus
only finitely many irreducible shadows. This implies that for fixed genus there
are only finitely many classes of interaction structures. In particular the
simplest case of genus zero already provides the formalism for certain types of
structures that occur in nature and are not covered by other filtrations. This
case of genus zero interaction structures is already of practical interest, is
studied here in detail and found to be expressed by a multiple context-free
grammar extending the usual one for RNA secondary structures. We show that in
time and space complexity, this grammar for genus zero
interaction structures provides not only minimum free energy solutions but also
the complete partition function and base pairing probabilities.Comment: 40 pages 15 figure
- …