457 research outputs found

    Impact Of The Energy Model On The Complexity Of RNA Folding With Pseudoknots

    Predicting the folding of an RNA sequence while allowing general pseudoknots (PK) consists in finding a minimal free-energy matching of its $n$ positions. Assuming independently contributing base pairs, the problem can be solved in $\Theta(n^3)$ time using a variant of maximum weighted matching. By contrast, the problem was previously proven NP-hard in the more realistic nearest-neighbor energy model. In this work, we consider an intermediate model, called the stacking-pairs energy model. We extend a result by Lyngsø, showing that RNA folding with PK is NP-hard within a large class of parametrizations of the model. We also show the approximability of the problem by giving a practical $\Theta(n^3)$ algorithm that achieves at least a 5-approximation for any parametrization of the stacking model. This contrasts with the nearest-neighbor version of the problem, which we prove cannot be approximated within any positive ratio unless $P = NP$.
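As a rough illustration of the base-pair model described in this abstract, the sketch below treats folding with arbitrary pseudoknots as a maximum-weight matching over sequence positions. It is not the authors' algorithm: for brevity it uses an exhaustive dynamic program over subsets of positions (exponential time, small inputs only) as a stand-in for the cubic-time matching algorithm, and the `PAIR_WEIGHT` values are hypothetical stabilities (negated energies), not measured parameters. No minimum loop length is enforced.

```python
from functools import lru_cache

# Hypothetical per-pair stabilities (larger = more stable); not real energy parameters.
PAIR_WEIGHT = {
    ("G", "C"): 3, ("C", "G"): 3,
    ("A", "U"): 2, ("U", "A"): 2,
    ("G", "U"): 1, ("U", "G"): 1,
}


def best_matching(seq):
    """Return (total_weight, pairs) for a maximum-weight matching of positions.

    Crossing pairs (pseudoknots) are allowed; each position pairs at most once.
    """
    n = len(seq)

    @lru_cache(maxsize=None)
    def solve(mask):
        # Bit i of mask is set while position i is still available.
        if mask == 0:
            return 0, ()
        i = (mask & -mask).bit_length() - 1  # lowest available position
        rest = mask & ~(1 << i)
        best = solve(rest)  # option 1: leave position i unpaired
        for j in range(i + 1, n):
            if (rest >> j) & 1:
                w = PAIR_WEIGHT.get((seq[i], seq[j]))
                if w is not None:
                    sub_w, sub_pairs = solve(rest & ~(1 << j))
                    if w + sub_w > best[0]:
                        best = (w + sub_w, ((i, j),) + sub_pairs)
        return best

    return solve((1 << n) - 1)
```

For the toy sequence `GACU`, the best matching pairs positions (0, 2) and (1, 3), which cross each other. A pseudoknot-free folding algorithm could not return that structure.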

    TT2NE: A novel algorithm to predict RNA secondary structures with pseudoknots

    We present TT2NE, a new algorithm to predict RNA secondary structures with pseudoknots. The method is based on a classification of RNA structures according to their topological genus. TT2NE is guaranteed to find the minimum free energy structure irrespective of pseudoknot topology. This capability comes at the expense of the maximum sequence length that can be treated, but comparison with state-of-the-art algorithms shows that TT2NE is a very powerful tool within its limits. Analysis of TT2NE's wrong predictions sheds light on the need to study how steric constraints limit the range of pseudoknotted structures that can be formed from a given sequence. An implementation of TT2NE on a public server can be found at http://ipht.cea.fr/rna/tt2ne.php

    On the combinatorics of sparsification

    Background: We study the sparsification of dynamic programming folding algorithms of RNA structures. Sparsification applies to the mfe-folding of RNA structures and can lead to a significant reduction of time complexity. Results: We analyze the sparsification of a particular decomposition rule, $\Lambda^*$, that splits an interval for RNA secondary and pseudoknot structures of fixed topological genus. Essential for quantifying the sparsification is the size of its so-called candidate set. We present a combinatorial framework that uses the probabilities of irreducible substructures to obtain the expected size of the set of $\Lambda^*$-candidates. We compute these expectations for arc-based energy models via energy-filtered generating functions (GF) for RNA secondary structures as well as RNA pseudoknot structures. For RNA secondary structures we also consider a simplified loop-energy model. This combinatorial analysis is then compared to the expected number of $\Lambda^*$-candidates obtained from folding mfe-structures. In the case of the mfe-folding of RNA secondary structures with a simplified loop-energy model, our results imply that sparsification provides a reduction of time complexity by a constant factor of 91% (theory) versus a 96% reduction (experiment). For the "full" loop-energy model there is a reduction of 98% (experiment). Comment: 27 pages, 12 figures

    An Efficient Algorithm for Upper Bound on the Partition Function of Nucleic Acids

    It has been shown that the minimum free energy structure for RNAs and RNA-RNA interaction is often incorrect due to inaccuracies in the energy parameters and inherent limitations of the energy model. In contrast, ensemble-based quantities such as melting temperature and equilibrium concentrations can be more reliably predicted. Even structure prediction by sampling from the ensemble and clustering those structures by Sfold [7] has proven to be more reliable than minimum free energy structure prediction. The main obstacle for ensemble-based approaches is the computational complexity of the partition function and base pairing probabilities. For instance, the space complexity of the partition function for RNA-RNA interaction is $O(n^4)$ and the time complexity is $O(n^6)$, which are prohibitively large [4,12]. Our goal in this paper is to give a fast algorithm, based on sparse folding, to calculate an upper bound on the partition function. Our work is based on the recent algorithm of Hazan and Jaakkola [10]. The space complexity of our algorithm is the same as that of sparse folding algorithms, and the time complexity of our algorithm is $O(MFE(n)\ell)$ for single RNA and $O(MFE(m, n)\ell)$ for RNA-RNA interaction in practice, in which $MFE$ is the running time of sparse folding and $\ell \leq n$ ($\ell \leq n + m$) is a sequence-dependent parameter.

    Ab initio RNA folding

    RNA molecules are essential cellular machines performing a wide variety of functions for which a specific three-dimensional structure is required. Over the last several years, experimental determination of RNA structures through X-ray crystallography and NMR seems to have reached a plateau in the number of structures resolved each year, but as more and more RNA sequences are discovered, the need for structure prediction tools to complement experimental data is strong. Theoretical approaches to RNA folding have been developed since the late nineties, when the first algorithms for secondary structure prediction appeared. Over the last 10 years a number of prediction methods for 3D structures have been developed, first based on bioinformatics and data mining, and more recently based on a coarse-grained physical representation of the systems. In this review we present the challenges of RNA structure prediction and the main ideas behind bioinformatic and physics-based approaches. We focus on the description of the more recent physics-based phenomenological models and on how they are built to include the specificity of the interactions of RNA bases, whose role is critical in folding. Through examples from different models, we point out the strengths of physics-based approaches, which are able not only to predict equilibrium structures but also to investigate dynamical and thermodynamical behavior, and the open challenges of including more of the key interactions ruling RNA folding. Comment: 28 pages, 18 figures

    Review of algorithms for RNA secondary structure prediction with pseudoknots

    Pseudoknots are structures formed by the base pairing of an RNA secondary loop structure with a complementary base that lies somewhere outside the loop. The resulting structure plays a vital role in cell structure rigidity, regulation of protein synthesis, and the structural organization of RNA complexes. Deciphering RNA folding patterns would begin to unravel some of the mysteries surrounding the cell and its functions and open a new world to scientists. Many algorithms have been written in this quest to predict RNA's secondary structure, but not many have been very successful. In this thesis, some of these algorithms are discussed and considered for their strengths and weaknesses. First, those algorithms that exclude pseudoknots and other more complex structures are presented. The later algorithms are those that attempt to include some of the more complex structures in their calculations. In the end, all the algorithms are taken into consideration and their strengths and weaknesses compared so as to find a path for future work. By using the strengths found in this variety of algorithms and avoiding some of the pitfalls encountered by others, new algorithms that are more successful in deciphering RNA secondary structure will hopefully be developed in the future.
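The abstract does not name the pseudoknot-free algorithms it reviews. As one well-known example of that class, the sketch below implements Nussinov-style base-pair maximization, a cubic-time dynamic program over nested (non-crossing) pairs. The function name, the allowed pair set, and the `min_loop` default of 3 unpaired bases are assumptions made for illustration, not details taken from the thesis.

```python
# Canonical and wobble pairs; an illustrative choice, not taken from the thesis.
ALLOWED = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs, with at least min_loop
    unpaired bases between the two ends of any pair."""
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j] (inclusive)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j is unpaired
            # case 2: position j pairs with some k, splitting the interval
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in ALLOWED:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]
```

Because the recursion only combines disjoint or nested intervals, it can never produce crossing pairs, which is exactly the restriction that pseudoknot-aware algorithms relax.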

    Monovalent ions modulate the flux through multiple folding pathways of an RNA pseudoknot

    The functions of RNA pseudoknots (PKs), which are minimal tertiary structural motifs and an integral part of several ribozymes and ribonucleoprotein complexes, are determined by their structure, stability and dynamics. Therefore, it is important to elucidate the general principles governing their thermodynamics/folding mechanisms. Here, we combine experiments and simulations to examine the folding/unfolding pathways of the VPK pseudoknot, a variant of the Mouse Mammary Tumor Virus (MMTV) PK involved in ribosomal frameshifting. Fluorescent nucleotide analogs (2-aminopurine and pyrrolocytidine) placed at different stem/loop positions in the PK, and laser temperature-jump approaches serve as local probes allowing us to monitor the order of assembly of VPK with two helices with different intrinsic stabilities. The experiments and molecular simulations show that at 50 mM KCl the dominant folding pathway populates only the more stable partially folded hairpin. As the salt concentration is increased a parallel folding pathway emerges, involving the less stable hairpin structure as an alternate intermediate. Notably, the flux between the pathways is modulated by the ionic strength. The findings support the principle that the order of PK structure formation is determined by the relative stabilities of the hairpins, which can be altered by sequence variations or salt concentrations. Our study not only unambiguously demonstrates that PK folds by parallel pathways, but also establishes that quantitative description of RNA self-assembly requires a synergistic combination of experiments and simulations. Comment: Supporting Information included

    Computational analysis of noncoding RNAs

    Noncoding RNAs have emerged as key players in the cell. Understanding their surprisingly diverse range of functions is challenging for experimental and computational biology. Here, we review computational methods to analyze noncoding RNAs. The topics covered include basic and advanced techniques to predict RNA structures, annotation of noncoding RNAs in genomic data, mining RNA-seq data for novel transcripts and prediction of transcript structures, computational aspects of microRNAs, and database resources. Austrian Science Fund (Schrodinger Fellowship J2966-B12); German Research Foundation (grant WI 3628/1-1 to SW); National Institutes of Health (U.S.) (NIH award 1RC1CA147187