Sparsification of RNA structure prediction including pseudoknots
Background: Although many RNA molecules contain pseudoknots, computational prediction of pseudoknotted RNA structure is still in its infancy due to the high running time and space consumption implied by the dynamic programming formulations of the problem. Results: In this paper, we introduce sparsification to significantly speed up the dynamic programming approaches for pseudoknotted RNA structure prediction, while also lowering their space requirements. Although sparsification has been applied to a number of RNA-related structure prediction problems in the past few years, we provide the first application of sparsification to pseudoknotted RNA structure prediction specifically and to handling gapped fragments more generally, which has a much more complex recursive structure than other problems to which sparsification has been applied. We analyse how to sparsify four pseudoknot structure prediction algorithms, among them the most general method available (the Rivas-Eddy algorithm) and the fastest one (the Reeder-Giegerich algorithm). In all algorithms the number of "candidate" substructures to be considered is reduced. Conclusions: Our experimental results on the sparsified Reeder-Giegerich algorithm suggest a linear speedup over the unsparsified implementation.
An Efficient Algorithm for Upper Bound on the Partition Function of Nucleic Acids
It has been shown that the minimum free energy structure for RNAs and RNA-RNA
interactions is often incorrect due to inaccuracies in the energy parameters and
inherent limitations of the energy model. In contrast, ensemble-based
quantities such as melting temperature and equilibrium concentrations can be
more reliably predicted. Even structure prediction by sampling from the
ensemble and clustering those structures by Sfold [7] has proven to be more
reliable than minimum free energy structure prediction. The main obstacle for
ensemble-based approaches is the computational complexity of the partition
function and base pairing probabilities. For instance, the space complexity of
the partition function for RNA-RNA interaction is and the time
complexity is which are prohibitively large [4,12]. Our goal in this
paper is to give a fast algorithm, based on sparse folding, to calculate an
upper bound on the partition function. Our work is based on the recent
algorithm of Hazan and Jaakkola [10]. The space complexity of our algorithm is
the same as that of sparse folding algorithms, and the time complexity of our
algorithm is for single RNA and for RNA-RNA
interaction in practice, in which is the running time of sparse folding
and () is a sequence dependent parameter
On the combinatorics of sparsification
Background: We study the sparsification of dynamic programming folding
algorithms of RNA structures. Sparsification applies to the mfe-folding of RNA
structures and can lead to a significant reduction of time complexity. Results:
We analyze the sparsification of a particular decomposition rule, ,
that splits an interval for RNA secondary and pseudoknot structures of fixed
topological genus. Essential for quantifying the sparsification is the size of
its so called candidate set. We present a combinatorial framework which allows
by means of probabilities of irreducible substructures to obtain the expected
size of the set of -candidates. We compute these expectations for
arc-based energy models via energy-filtered generating functions (GF) for RNA
secondary structures as well as RNA pseudoknot structures. For RNA secondary
structures we also consider a simplified loop-energy model. This combinatorial
analysis is then compared to the expected number of -candidates
obtained from folding mfe-structures. In case of the mfe-folding of RNA
secondary structures with a simplified loop energy model our results imply that
sparsification provides a reduction of time complexity by a constant factor of
91% (theory) versus a 96% reduction (experiment). For the "full" loop-energy
model there is a reduction of 98% (experiment).
Comment: 27 pages, 12 figures
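To make the role of the candidate set in the abstract above concrete, here is a minimal Python sketch of candidate-based sparsification. It rests on loud assumptions: it uses simple base-pair maximization (a Nussinov-style model) instead of the arc-based or loop-based energy models named in the abstract, and the function names, the `PAIRS` table and the `min_loop` parameter are hypothetical illustrations, not taken from the paper. The split rule only tries split points where the left part is a candidate, meaning a closed substructure that scores strictly better than every other decomposition of the same interval.

```python
from typing import List, Tuple

# Assumption: canonical and wobble pairs only; not taken from the paper.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a: str, b: str) -> bool:
    return (a, b) in PAIRS


def max_pairs_dense(seq: str, min_loop: int = 3) -> int:
    """Unsparsified O(n^3) base-pair maximization over half-open intervals [i, j)."""
    n = len(seq)
    E = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n + 1):
            best = E[i + 1][j]  # position i stays unpaired
            # i pairs with k-1; the rest [k, j) is folded independently
            for k in range(i + min_loop + 2, j + 1):
                if can_pair(seq[i], seq[k - 1]):
                    best = max(best, 1 + E[i + 1][k - 1] + E[k][j])
            E[i][j] = best
    return E[0][n]


def max_pairs_sparse(seq: str, min_loop: int = 3) -> Tuple[int, int]:
    """Same optimum, but splits are tried only at candidate end points.

    Returns (optimal number of pairs, total number of candidates stored).
    """
    n = len(seq)
    E = [[0] * (n + 1) for _ in range(n + 1)]
    total_candidates = 0
    for i in range(n - 1, -1, -1):
        # (k, score): the closed structure on [i, k) beats all other
        # decompositions of [i, k)
        cands: List[Tuple[int, int]] = []
        for j in range(i + 1, n + 1):
            best = E[i + 1][j]  # position i stays unpaired
            for k, closed_score in cands:  # split only at candidates
                best = max(best, closed_score + E[k][j])
            if j - 1 - i > min_loop and can_pair(seq[i], seq[j - 1]):
                closed = 1 + E[i + 1][j - 1]  # i pairs with j-1
                if closed > best:  # strictly better: new candidate
                    cands.append((j, closed))
                    best = closed
            E[i][j] = best
        total_candidates += len(cands)
    return E[0][n], total_candidates
```

Calling `max_pairs_sparse("GGGAAACCC")` returns the optimal pair count together with the number of stored candidates. Comparing that count with the number of admissible (i, k) pairs gives a rough feel for the kind of reduction the abstract quantifies, although the percentages reported there refer to its own energy models, not to this toy scoring.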
Computational analysis of noncoding RNAs
Noncoding RNAs have emerged as key players in the cell. Understanding their surprisingly diverse range of functions is challenging for experimental and computational biology. Here, we review computational methods to analyze noncoding RNAs. The topics covered include basic and advanced techniques to predict RNA structures, annotation of noncoding RNAs in genomic data, mining RNA-seq data for novel transcripts and prediction of transcript structures, computational aspects of microRNAs, and database resources.
Funding: Austrian Science Fund (Schrodinger Fellowship J2966-B12); German Research Foundation (grant WI 3628/1-1 to SW); National Institutes of Health (U.S.) (NIH award 1RC1CA147187