Impact Of The Energy Model On The Complexity Of RNA Folding With Pseudoknots
Predicting the folding of an RNA sequence, while allowing general pseudoknots (PK), consists in finding a minimal free-energy matching of its positions. Assuming independently contributing base-pairs, the problem can be solved in polynomial time using a variant of maximum weighted matching. By contrast, the problem was previously proven NP-hard in the more realistic nearest-neighbor energy model. In this work, we consider an intermediate model, called the stacking-pairs energy model. We extend a result by Lyngsø, showing that RNA folding with PK is NP-hard within a large class of parametrizations of the model. We also show that the problem is approximable, by giving a practical polynomial-time algorithm with a guaranteed approximation ratio for any parametrization of the stacking model. This contrasts with the nearest-neighbor version of the problem, which we prove cannot be approximated within any positive ratio by a polynomial-time algorithm, unless P = NP.
Ab initio RNA folding
RNA molecules are essential cellular machines performing a wide variety of
functions for which a specific three-dimensional structure is required. Over
the last several years, experimental determination of RNA structures through
X-ray crystallography and NMR seems to have reached a plateau in the number of
structures resolved each year, but as more and more RNA sequences are being
discovered, the need for structure prediction tools to complement experimental data
is strong. Theoretical approaches to RNA folding have been developed since the
late nineties when the first algorithms for secondary structure prediction
appeared. Over the last 10 years a number of prediction methods for 3D
structures have been developed, first based on bioinformatics and data-mining,
and more recently based on a coarse-grained physical representation of the
systems. In this review we are going to present the challenges of RNA structure
prediction and the main ideas behind bioinformatic approaches and physics-based
approaches. We will focus on the description of the more recent physics-based
phenomenological models and on how they are built to include the specificity of
the interactions of RNA bases, whose role is critical in folding. Through
examples from different models, we will point out the strengths of
physics-based approaches, which are able not only to predict equilibrium
structures, but also to investigate dynamical and thermodynamical behavior, and
the open challenges to include more key interactions ruling RNA folding.
Comment: 28 pages, 18 figures
Design, implementation and evaluation of a practical pseudoknot folding algorithm based on thermodynamics
BACKGROUND: The general problem of RNA secondary structure prediction under the widely used thermodynamic model is known to be NP-complete when the structures considered include arbitrary pseudoknots. For restricted classes of pseudoknots, several polynomial time algorithms have been designed, where the O(n^6) time and O(n^4) space algorithm by Rivas and Eddy is currently the best available program. RESULTS: We introduce the class of canonical simple recursive pseudoknots and present an algorithm that requires O(n^4) time and O(n^2) space to predict the energetically optimal structure of an RNA sequence, possibly containing such pseudoknots. Evaluation against a large collection of known pseudoknotted structures shows the adequacy of the canonization approach and our algorithm. CONCLUSIONS: RNA pseudoknots of medium size can now be predicted reliably as well as efficiently by the new algorithm.
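To make the notion of a pseudoknot concrete: a structure given as a set of base pairs is pseudoknotted when two pairs (i, j) and (k, l) cross, i.e. i < k < j < l. The following minimal Python sketch checks this condition; the function names and the toy structures are hypothetical illustrations and are not taken from the paper.

```python
from itertools import combinations


def crossing_pairs(pairs):
    """Return every couple of base pairs (i, j), (k, l) with i < k < j < l."""
    # Normalise each pair to (left, right) and sort by left index, so that
    # in every combination the first pair starts no later than the second.
    norm = sorted((min(a, b), max(a, b)) for a, b in pairs)
    return [(p, q) for p, q in combinations(norm, 2)
            if p[0] < q[0] < p[1] < q[1]]


def has_pseudoknot(pairs):
    """True if at least two base pairs cross."""
    return bool(crossing_pairs(pairs))


# Toy structures (0-based positions), invented for illustration only.
hairpin = [(0, 9), (1, 8), (2, 7)]           # nested pairs only
h_type = [(0, 6), (1, 5), (3, 10), (4, 9)]   # two interleaved stems

print(has_pseudoknot(hairpin))
print(has_pseudoknot(h_type))
```

The check is quadratic in the number of base pairs, which is enough for inspecting a predicted structure; it says nothing about how to find an optimal pseudoknotted structure.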
Pseudoknots in a Homopolymer
After a discussion of the definition and number of pseudoknots, we reconsider
the self-attracting homopolymer paying particular attention to the scaling of
the number of pseudoknots at different temperature regimes in two and three
dimensions. Although the total number of pseudoknots is extensive at all
temperatures, we find that the number of pseudoknots forming between the two
halves of the chain diverges logarithmically at (in both dimensions) and below
(in 2d only) the theta-temperature. We later introduce a simple model that is
sensitive to pseudoknot formation during collapse. The resulting phase diagram
involves swollen, branched and collapsed homopolymer phases with transitions
between each pair.
Comment: submitted to PR
Target prediction and a statistical sampling algorithm for RNA-RNA interaction
It has been proven that the accessibility of the target sites has a critical
influence on miRNA and siRNA. In this paper, we present a program, rip2.0, that
reports not only the energetically most favorable target sites based on the
hybrid-probability, but also statistically sampled structures that illustrate the
statistical characterization and representation of the Boltzmann ensemble of
RNA-RNA interaction structures. The outputs are retrieved via backtracing an
improved dynamic programming solution for the partition function based on the
approach of Huang et al. (Bioinformatics). The time and space
algorithm is implemented in C (available from
http://www.combinatorics.cn/cbpc/rip2.html).
Comment: 7 pages, 10 figures
Algorithms for RNA secondary structure analysis : prediction of pseudoknots and the consensus shapes approach
Reeder J. Algorithms for RNA secondary structure analysis : prediction of pseudoknots and the consensus shapes approach. Bielefeld (Germany): Bielefeld University; 2007.
Our understanding of the role of RNA has undergone a major change in the last decade. RNA was once believed to be a mere carrier of information and a structural component of the ribosomal machinery; with the advent of the genomic age, it has become clear that RNAs play a much more active role. RNAs can act as regulators and can have catalytic activity - roles previously only attributed to proteins. There is still much speculation in the scientific community as to what extent RNAs are responsible for the complexity in higher organisms which can hardly be explained with only proteins as regulators.
In order to investigate the roles of RNA, it is therefore necessary to search for new classes of RNA. For those and already known classes, analyses of their presence in different species of the tree of life will provide further insight into the evolution of biomolecules and especially RNAs. Since RNA function often follows its structure, computer programs for RNA structure prediction are an integral part of this procedure. The secondary structure of RNA - the level of base pairing - strongly determines the tertiary structure. As the latter is computationally intractable and experimentally expensive to obtain, secondary structure analysis has become an accepted substitute. In this thesis, I present two new algorithms (and a few variations thereof) for the prediction of RNA secondary structures.
The first algorithm addresses the problem of predicting a secondary structure from a single sequence including RNA pseudoknots. Pseudoknots have been shown to be functionally relevant in many RNA mediated processes. However, pseudoknots are excluded from considerations by state-of-the-art RNA folding programs for reasons of computational complexity. While folding a sequence of length n into unknotted structures requires O(n^3) time and O(n^2) space, finding the best structure including arbitrary pseudoknots has been proven to be NP-complete. Nevertheless, I demonstrate in this work that certain types of pseudoknots can be included in the folding process with only a moderate increase of computational cost.
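The O(n^3) time and O(n^2) space bound for unknotted structures can be illustrated with a classic Nussinov-style dynamic program. The sketch below is a simplification under stated assumptions: it maximises the number of base pairs instead of minimising free energy under the thermodynamic model used in the thesis, and the function name, the set of allowed pairs, and the minimum hairpin loop size of 3 are illustrative choices.

```python
def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in seq.

    Nussinov-style base-pair maximisation: a simplified stand-in for
    energy-based folding, with the same O(n^3) time / O(n^2) space shape.
    """
    # Watson-Crick pairs plus G-U wobble pairs (an assumption of this sketch).
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs within seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]               # case 1: j is unpaired
            for k in range(i, j - min_loop):     # case 2: j pairs with k
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAAUCC"))
```

The three nested loops over span, i, and k give the cubic running time, and the table `best` gives the quadratic space. Allowing crossing pairs breaks the decomposition into independent sub-intervals that this recursion relies on.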
In analogy to protein-coding RNA, where a conserved encoded protein hints at a similar metabolic function, structural conservation in RNA may give clues to RNA function and help in finding RNA genes. However, structure conservation is more complex to deal with computationally than sequence conservation. The method considered to be at least conceptually the ideal approach in this situation is the Sankoff algorithm. It simultaneously aligns two sequences and predicts a common secondary structure. Unfortunately, it is computationally rather expensive - O(n^6) time and O(n^4) space for two sequences, and for more than two sequences it becomes exponential in the number of sequences. Therefore, several heuristic implementations emerged in the last decade trying to make the Sankoff approach practical by introducing pragmatic restrictions on the search space.
In this thesis, I propose to redefine the consensus structure prediction problem in a way that does not imply a multiple sequence alignment step. For a family of RNA sequences, my method explicitly and independently enumerates the near-optimal abstract shape space and predicts an abstract shape as the consensus for all sequences. For each sequence, it delivers the thermodynamically best structure which has this shape. The technique of abstract shapes analysis is employed here for a synoptic view of the suboptimal folding space. As the shape space is much smaller than the structure space, and identification of common shapes can be done in linear time (in the number of shapes considered), the method is essentially linear in the number of sequences. Evaluations show that the new method compares favorably with available alternatives
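The consensus step described above can be sketched as follows, assuming the near-optimal shape space of each sequence has already been enumerated by some other tool. The input format (one dictionary per sequence, mapping a shape string to the energy of the best structure with that shape), the function name, and the ranking by summed energy are assumptions of this sketch, not details confirmed by the abstract.

```python
def consensus_shape(shape_tables):
    """Pick the shape shared by all sequences with the lowest summed energy.

    shape_tables: list of dicts, one per sequence, mapping an abstract
    shape string to the free energy of the best structure of that shape.
    Returns (shape, per-sequence energies), or None if no shape is shared.
    """
    if not shape_tables:
        return None
    # Intersect the shape sets: one pass over the sequences.
    common = set(shape_tables[0])
    for table in shape_tables[1:]:
        common &= set(table)
    if not common:
        return None
    # Rank shared shapes by summed energy (ties broken by shape string).
    best = min(common, key=lambda s: (sum(t[s] for t in shape_tables), s))
    return best, [t[best] for t in shape_tables]


# Hypothetical shape tables for three sequences (energies in kcal/mol).
tables = [
    {"[]": -5.0, "[][]": -4.2, "[[][]]": -3.0},
    {"[][]": -6.1, "[]": -2.0},
    {"[][]": -3.3, "[]": -4.5, "[[][]]": -5.0},
]
print(consensus_shape(tables))
```

Each sequence's table is visited once for the intersection and once per shared shape for scoring, so the work grows linearly with the number of sequences for a fixed number of shapes.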
DotKnot: pseudoknot prediction using the probability dot plot under a refined energy model
RNA pseudoknots are functional structure elements with key roles in viral and cellular processes. Prediction of a pseudoknotted minimum free energy structure is an NP-complete problem. Practical algorithms for RNA structure prediction including restricted classes of pseudoknots suffer from high runtime and poor accuracy for longer sequences. A heuristic approach is to search for promising pseudoknot candidates in a sequence and verify those. Afterwards, the detected pseudoknots can be further analysed using bioinformatics or laboratory techniques. We present a novel pseudoknot detection method called DotKnot that extracts stem regions from the secondary structure probability dot plot and assembles pseudoknot candidates in a constructive fashion. We evaluate pseudoknot free energies using novel parameters, which have recently become available. We show that the conventional probability dot plot makes a wide class of pseudoknots including those with bulged stems manageable in an explicit fashion. The energy parameters now become the limiting factor in pseudoknot prediction. DotKnot is an efficient method for long sequences, which finds pseudoknots with higher accuracy compared to other known prediction algorithms. DotKnot is accessible as a web server at http://dotknot.csse.uwa.edu.au
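As a rough illustration of extracting stem regions from a probability dot plot, the sketch below groups base pairs whose probability exceeds a threshold into runs of directly stacked pairs (i, j), (i+1, j-1), and so on. It is not DotKnot's implementation: the input dictionary, the threshold, the minimum stem length, and the function name are all hypothetical, and bulged stems, which the abstract says DotKnot handles, are ignored here.

```python
def extract_stems(pair_prob, threshold=0.1, min_len=3):
    """Group high-probability base pairs into perfectly stacked stems.

    pair_prob: dict mapping (i, j) with i < j to a pairing probability.
    Returns a list of stems, each a list of pairs from outermost inward.
    """
    strong = {pair for pair, prob in pair_prob.items() if prob >= threshold}
    stems = []
    for i, j in sorted(strong):
        # Only start a stem at its outermost pair.
        if (i - 1, j + 1) in strong:
            continue
        stem = []
        while i < j and (i, j) in strong:
            stem.append((i, j))
            i, j = i + 1, j - 1
        if len(stem) >= min_len:
            stems.append(stem)
    return stems


# Invented dot-plot entries for illustration only.
probs = {(0, 10): 0.9, (1, 9): 0.8, (2, 8): 0.7, (3, 7): 0.05,
         (4, 15): 0.6, (5, 14): 0.5}
print(extract_stems(probs))
```

Candidate pseudoknots could then be assembled from couples of stems whose pairs cross, which is where an energy evaluation of each candidate would come in.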
Encoding folding paths of RNA switches
RNA co-transcriptional folding has long been suspected to play an active role
in helping proper native folding of ribozymes and structured regulatory motifs
in mRNA untranslated regions. Yet, the underlying mechanisms and coding
requirements for efficient co-transcriptional folding remain unclear.
Traditional approaches have intrinsic limitations to dissect RNA folding paths,
as they rely on sequence mutations or circular permutations that typically
perturb both RNA folding paths and equilibrium structures. Here, we show that
exploiting sequence symmetries instead of mutations can circumvent this problem
by essentially decoupling folding paths from equilibrium structures of designed
RNA sequences. Using bistable RNA switches with symmetrical helices conserved
under sequence reversal, we demonstrate experimentally that native and
transiently formed helices can guide efficient co-transcriptional folding into
either long-lived structure of these RNA switches. Their folding path is
controlled by the order of helix nucleations and subsequent exchanges during
transcription, and may also be redirected by transient antisense interactions.
Hence, transient intra- and intermolecular base pair interactions can
effectively regulate the folding of nascent RNA molecules into different native
structures, provided limited coding requirements, as discussed from an
information theory perspective. This constitutive coupling between RNA
synthesis and RNA folding regulation may have enabled the early emergence of
autonomous RNA-based regulation networks.
Comment: 9 pages, 6 figures
Force-induced denaturation of RNA
We describe quantitatively an RNA molecule under the influence of an external
force exerted at its two ends as in a typical single-molecule experiment. Our
calculation incorporates the interactions between nucleotides by using the
experimentally-determined free energy rules for RNA secondary structure and
models the polymeric properties of the exterior single-stranded regions
explicitly as elastic freely-jointed chains. We find that in spite of
complicated secondary structures, force-extension curves are typically smooth
in quasi-equilibrium. We identify and characterize two
sequence/structure-dependent mechanisms that, in addition to the
sequence-independent entropic elasticity of the exterior single-stranded
regions, are responsible for the smoothness. These involve compensation between
different structural elements on which the external force acts simultaneously,
and contribution of suboptimal structures, respectively. We estimate how many
features a force-extension curve recorded in non-equilibrium, where the pulling
proceeds faster than rearrangements in the secondary structure of the molecule,
could show in principle. Our software is available to the public through a
'RNA-pulling server'.
Comment: final version (with a few minor changes) as will be published in Biophysical Journal
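For the single-stranded regions modeled as freely-jointed chains, the standard textbook force-extension relation is the Langevin function: the relative extension is coth(x) - 1/x with x = f b / (k_B T), where f is the force and b the Kuhn length. The sketch below evaluates it in Python; the function name and the default Kuhn length and temperature are illustrative assumptions and are not parameters taken from the paper.

```python
import math

K_B = 0.01380649  # Boltzmann constant in pN*nm/K


def fjc_relative_extension(force_pN, kuhn_length_nm=1.5, temperature_K=298.15):
    """Relative extension (end-to-end distance / contour length) of a
    freely-jointed chain under a stretching force, via the Langevin function.

    The default Kuhn length and temperature are placeholder values.
    """
    if force_pN < 0:
        raise ValueError("force must be non-negative")
    x = force_pN * kuhn_length_nm / (K_B * temperature_K)
    if x < 1e-3:
        # Series expansion avoids cancellation between coth(x) and 1/x.
        return x / 3.0 - x ** 3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x


for force in (0.0, 1.0, 10.0, 100.0):
    print(force, fjc_relative_extension(force))
```

The curve rises linearly at low force and saturates smoothly towards full extension, which is the sequence-independent entropic elasticity the abstract refers to; the structure-dependent contributions are not modeled here.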