314 research outputs found

    On the combinatorics of sparsification

    Background: We study the sparsification of dynamic programming folding algorithms of RNA structures. Sparsification applies to the mfe-folding of RNA structures and can lead to a significant reduction of time complexity. Results: We analyze the sparsification of a particular decomposition rule, $\Lambda^*$, that splits an interval for RNA secondary and pseudoknot structures of fixed topological genus. Essential for quantifying the sparsification is the size of its so-called candidate set. We present a combinatorial framework that uses the probabilities of irreducible substructures to obtain the expected size of the set of $\Lambda^*$-candidates. We compute these expectations for arc-based energy models via energy-filtered generating functions (GF) for RNA secondary structures as well as RNA pseudoknot structures. For RNA secondary structures we also consider a simplified loop-energy model. This combinatorial analysis is then compared to the expected number of $\Lambda^*$-candidates obtained from folding mfe-structures. In the case of the mfe-folding of RNA secondary structures with a simplified loop-energy model, our results imply that sparsification provides a reduction of time complexity by a constant factor of 91% (theory) versus a 96% reduction (experiment). For the "full" loop-energy model there is a reduction of 98% (experiment). Comment: 27 pages, 12 figures

    On the uniform generation of modular diagrams

    In this paper we present an algorithm that generates $k$-noncrossing, $\sigma$-modular diagrams with uniform probability. A diagram is a labeled graph of degree $\le 1$ over $n$ vertices drawn in a horizontal line with arcs $(i,j)$ in the upper half-plane. A $k$-crossing in a diagram is a set of $k$ distinct arcs $(i_1, j_1), (i_2, j_2), \ldots, (i_k, j_k)$ with the property $i_1 < i_2 < \ldots < i_k < j_1 < j_2 < \ldots < j_k$. A diagram without any $k$-crossings is called a $k$-noncrossing diagram, and a stack of length $\sigma$ is a maximal sequence $((i,j), (i+1,j-1), \dots, (i+(\sigma-1), j-(\sigma-1)))$. A diagram is $\sigma$-modular if any arc is contained in a stack of length at least $\sigma$. Our algorithm generates, after $O(n^k)$ preprocessing time, $k$-noncrossing, $\sigma$-modular diagrams in $O(n)$ time and space complexity. Comment: 21 pages, 7 figures
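
    The definitions in this abstract can be checked directly on small examples. The following is a minimal sketch, not the paper's generation algorithm: it only tests whether a given arc list contains a k-crossing and whether it is sigma-modular. The function names (`has_k_crossing`, `stacks`, `is_sigma_modular`) and the representation of a diagram as a list of `(i, j)` pairs with `i < j` are assumptions made for illustration, and the crossing test is brute force over all k-subsets of arcs.

```python
from itertools import combinations


def has_k_crossing(arcs, k):
    """Return True if some k arcs satisfy i_1 < ... < i_k < j_1 < ... < j_k.

    Brute force over all k-subsets; intended for small examples only.
    Assumes every vertex lies on at most one arc (degree <= 1).
    """
    for combo in combinations(sorted(arcs), k):
        ends = [j for _, j in combo]
        last_start = combo[-1][0]
        # Starts are already increasing because the arcs are sorted.
        if last_start < ends[0] and all(a < b for a, b in zip(ends, ends[1:])):
            return True
    return False


def stacks(arcs):
    """Split the arcs into maximal stacks ((i,j), (i+1,j-1), ...)."""
    arc_set = set(arcs)
    result = []
    for (i, j) in sorted(arc_set):
        if (i - 1, j + 1) in arc_set:
            continue  # not the outermost arc of its stack
        stack = [(i, j)]
        while (stack[-1][0] + 1, stack[-1][1] - 1) in arc_set:
            stack.append((stack[-1][0] + 1, stack[-1][1] - 1))
        result.append(stack)
    return result


def is_sigma_modular(arcs, sigma):
    """Every arc must lie in a stack of length at least sigma."""
    return all(len(s) >= sigma for s in stacks(arcs))


# Hypothetical example diagram: two stacks of length 2 that cross each other.
example = [(1, 10), (2, 9), (4, 12), (5, 11)]
print(has_k_crossing(example, 2), has_k_crossing(example, 3))
print(is_sigma_modular(example, 2), is_sigma_modular(example, 3))
```

    In this example the arcs (1,10) and (4,12) form a 2-crossing, no three arcs are mutually crossing, and both stacks have length exactly 2, so the diagram is 3-noncrossing and 2-modular but not 3-modular.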

    Shapes of topological RNA structures

    A topological RNA structure is derived from a diagram and its shape is obtained by collapsing the stacks of the structure into single arcs and by removing any arcs of length one. Shapes contain key topological information, and for fixed topological genus there exist only finitely many such shapes. We shall express topological RNA structures as unicellular maps, i.e. graphs together with a cyclic ordering of their half-edges. In this paper we prove a bijection of shapes of topological RNA structures. We furthermore derive a linear time algorithm generating shapes of fixed topological genus. We derive explicit expressions for the coefficients of the generating polynomial of these shapes and the generating function of RNA structures of genus $g$. Furthermore we outline how shapes can be used in order to extract essential information of RNA structure databases. Comment: 27 pages, 11 figures, 2 tables. arXiv admin note: text overlap with arXiv:1304.739
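
    The first sentence of this abstract can be read as a simple projection on arc lists. The sketch below follows that sentence literally and nothing more: each stack is replaced by its outermost arc, and arcs of length one are then dropped. The function name `shape_arcs` is hypothetical, the original vertex labels are kept, and any further normalization the paper may apply (for instance to unpaired vertices or relabeling) is deliberately not modeled here.

```python
def shape_arcs(arcs):
    """Collapse each stack to its outermost arc, then drop arcs (i, i+1).

    A literal reading of the abstract's description; the paper's precise
    definition of a shape may include further steps not modeled here.
    """
    arc_set = set(arcs)
    # An arc is outermost in its stack if (i-1, j+1) is not also an arc.
    outermost = [(i, j) for (i, j) in sorted(arc_set)
                 if (i - 1, j + 1) not in arc_set]
    # Remove arcs of length one, i.e. arcs joining adjacent vertices.
    return [(i, j) for (i, j) in outermost if j - i > 1]


# Hypothetical example: one stack of length 2 plus three arcs of length one.
structure = [(1, 10), (2, 9), (3, 4), (6, 7), (12, 13)]
print(shape_arcs(structure))
```

    Here the stack ((1,10), (2,9)) collapses to the single arc (1,10), and the arcs (3,4), (6,7) and (12,13) are removed because they have length one.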

    Topology of RNA-RNA interaction structures

    The topological filtration of interacting RNA complexes is studied, and the role of certain diagrams called irreducible shadows, which form suitable building blocks for more general structures, is analyzed. We prove that for two interacting RNAs, called interaction structures, there exist for fixed genus only finitely many irreducible shadows. This implies that for fixed genus there are only finitely many classes of interaction structures. In particular, the simplest case of genus zero already provides the formalism for certain types of structures that occur in nature and are not covered by other filtrations. This case of genus zero interaction structures is already of practical interest; it is studied here in detail and found to be expressed by a multiple context-free grammar extending the usual one for RNA secondary structures. We show that in $O(n^6)$ time and $O(n^4)$ space complexity, this grammar for genus zero interaction structures provides not only minimum free energy solutions but also the complete partition function and base pairing probabilities. Comment: 40 pages, 15 figures