196 research outputs found
Prediction of RNA pseudoknots by Monte Carlo simulations
In this paper we consider the problem of RNA folding with pseudoknots. We use
a graphical representation in which the secondary structures are described by
planar diagrams. Pseudoknots are identified as non-planar diagrams. We analyze
the non-planar topologies of RNA structures and propose a classification of RNA
pseudoknots according to the minimal genus of the surface on which the RNA
structure can be embedded. This classification provides a simple and natural
way to tackle the problem of RNA folding prediction in presence of pseudoknots.
Based on that approach, we describe a Monte Carlo algorithm for the prediction
of pseudoknots in an RNA molecule. Comment: 22 pages, 14 figures
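
The genus classification mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's Monte Carlo algorithm; it only computes the genus of a given base-pair diagram using the standard Euler characteristic relation for a diagram whose backbone is collapsed to a single vertex: g = (k + 1 - r) / 2, where k is the number of arcs and r the number of boundary components. The function name `genus` and the input convention (a list of `(i, j)` position pairs, each position paired at most once) are assumptions made for this example.

```python
def genus(pairs):
    """Genus of an RNA diagram given as base pairs (i, j).

    Illustrative sketch: unpaired positions do not affect the genus,
    so only the paired endpoints are kept and relabelled 0..2k-1.
    """
    endpoints = sorted(p for pair in pairs for p in pair)
    if len(set(endpoints)) != len(endpoints):
        raise ValueError("each position may be paired at most once")
    m = len(endpoints)
    if m == 0:
        return 0  # no arcs: planar by definition

    rank = {pos: idx for idx, pos in enumerate(endpoints)}
    partner = [0] * m
    for i, j in pairs:
        partner[rank[i]] = rank[j]
        partner[rank[j]] = rank[i]

    # Boundary components are the cycles of "jump across the arc,
    # then step to the next endpoint along the (closed) backbone".
    seen = [False] * m
    boundaries = 0
    for start in range(m):
        if seen[start]:
            continue
        boundaries += 1
        x = start
        while not seen[x]:
            seen[x] = True
            x = (partner[x] + 1) % m

    return (len(pairs) + 1 - boundaries) // 2
```

Under this convention, nested or disjoint arcs give genus 0 (planar diagrams), while two crossing arcs, the simplest pseudoknot, give genus 1.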
Shapes of topological RNA structures
A topological RNA structure is derived from a diagram and its shape is
obtained by collapsing the stacks of the structure into single arcs and by
removing any arcs of length one. Shapes contain key topological information,
and for fixed topological genus there exist only finitely many such shapes. We
shall express topological RNA structures as unicellular maps, i.e. graphs
together with a cyclic ordering of their half-edges. In this paper we prove a
bijection of shapes of topological RNA structures. We furthermore derive a
linear time algorithm generating shapes of fixed topological genus. We derive
explicit expressions for the coefficients of the generating polynomial of these
shapes and the generating function of RNA structures of genus $g$. Furthermore
we outline how shapes can be used in order to extract essential information of
RNA structure databases. Comment: 27 pages, 11 figures, 2 tables. arXiv admin note: text overlap with
arXiv:1304.739
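
The shape reduction described in this abstract (collapse stacks into single arcs, remove arcs of length one) can be sketched as follows. This is not the linear-time generation algorithm of the paper. The sketch additionally assumes that unpaired positions are dropped and that the two reductions are repeated until nothing changes; both conventions, and the function name `shape`, are assumptions made for this example.

```python
def shape(pairs):
    """Reduce a set of base pairs (i, j) to a shape.

    Assumed conventions (not confirmed by the abstract):
      * unpaired positions are discarded by relabelling endpoints,
      * 1-arc removal and stack collapsing are iterated to a fixed point.
    """
    arcs = {(min(i, j), max(i, j)) for i, j in pairs}
    while True:
        # Relabel endpoints as 0..2k-1, which drops unpaired positions.
        endpoints = sorted(p for arc in arcs for p in arc)
        rank = {pos: idx for idx, pos in enumerate(endpoints)}
        arcs = {(rank[i], rank[j]) for i, j in arcs}

        # Remove arcs of length one (adjacent endpoints).
        short = {(i, j) for i, j in arcs if j == i + 1}
        if short:
            arcs -= short
            continue

        # Collapse stacks: drop every arc that sits directly inside
        # a parallel arc, keeping only the outermost arc of each stack.
        inner = {(i, j) for i, j in arcs if (i - 1, j + 1) in arcs}
        if inner:
            arcs -= inner
            continue

        return sorted(arcs)
```

With these conventions, a plain hairpin stem reduces to the empty shape, whereas an H-type pseudoknot built from two stacks reduces to two crossing arcs.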
Thermodynamic Analysis of Interacting Nucleic Acid Strands
Motivated by the analysis of natural and engineered DNA and RNA systems, we present the first algorithm for calculating the partition function of an unpseudoknotted complex of multiple interacting nucleic acid strands. This dynamic program is based on a rigorous extension of secondary structure models to the multistranded case, addressing representation and distinguishability issues that do not arise for single-stranded structures. We then derive the form of the partition function for a fixed volume containing a dilute solution of nucleic acid complexes. This expression can be evaluated explicitly for small numbers of strands, allowing the calculation of the equilibrium population distribution for each species of complex. Alternatively, for large systems (e.g., a test tube), we show that the unique complex concentrations corresponding to thermodynamic equilibrium can be obtained by solving a convex programming problem. Partition function and concentration information can then be used to calculate equilibrium base-pairing observables. The underlying physics and mathematical formulation of these problems lead to an interesting blend of approaches, including ideas from graph theory, group theory, dynamic programming, combinatorics, convex optimization, and Lagrange duality.
On the combinatorics of sparsification
Background: We study the sparsification of dynamic programming folding
algorithms of RNA structures. Sparsification applies to the mfe-folding of RNA
structures and can lead to a significant reduction of time complexity. Results:
We analyze the sparsification of a particular decomposition rule
that splits an interval for RNA secondary and pseudoknot structures of fixed
topological genus. Essential for quantifying the sparsification is the size of
its so called candidate set. We present a combinatorial framework which allows
by means of probabilities of irreducible substructures to obtain the expected
size of the set of candidates. We compute these expectations for
arc-based energy models via energy-filtered generating functions (GF) for RNA
secondary structures as well as RNA pseudoknot structures. For RNA secondary
structures we also consider a simplified loop-energy model. This combinatorial
analysis is then compared to the expected number of candidates
obtained from folding mfe-structures. In case of the mfe-folding of RNA
secondary structures with a simplified loop energy model our results imply that
sparsification provides a reduction of time complexity by a constant factor of
91% (theory) versus a 96% reduction (experiment). For the "full" loop-energy
model there is a reduction of 98% (experiment). Comment: 27 pages, 12 figures
Topology of RNA-RNA interaction structures
The topological filtration of interacting RNA complexes is studied and the
role is analyzed of certain diagrams called irreducible shadows, which form
suitable building blocks for more general structures. We prove that for two
interacting RNAs, called interaction structures, there exist for fixed genus
only finitely many irreducible shadows. This implies that for fixed genus there
are only finitely many classes of interaction structures. In particular the
simplest case of genus zero already provides the formalism for certain types of
structures that occur in nature and are not covered by other filtrations. This
case of genus zero interaction structures is already of practical interest, is
studied here in detail and found to be expressed by a multiple context-free
grammar extending the usual one for RNA secondary structures. We show that in
polynomial time and space complexity, this grammar for genus zero
interaction structures provides not only minimum free energy solutions but also
the complete partition function and base pairing probabilities. Comment: 40 pages, 15 figures