2,117 research outputs found

    Combinatorics of locally optimal RNA secondary structures

    It is a classical result of Stein and Waterman that the asymptotic number of RNA secondary structures is $1.104366 \cdot n^{-3/2} \cdot 2.618034^n$. Motivated by the kinetics of RNA secondary structure formation, we are interested in determining the asymptotic number of secondary structures that are locally optimal with respect to a particular energy model. In the Nussinov energy model, where each base pair contributes -1 towards the energy of the structure, locally optimal structures are exactly the saturated structures, for which we have previously shown that asymptotically there are $1.07427 \cdot n^{-3/2} \cdot 2.35467^n$ saturated structures for a sequence of length $n$. In this paper, we consider the base stacking energy model, a mild variant of the Nussinov model, where each stacked base pair contributes -1 toward the energy of the structure. Locally optimal structures with respect to the base stacking energy model are exactly those secondary structures whose stems cannot be extended. Such structures were first considered by Evers and Giegerich, who described a dynamic programming algorithm to enumerate all locally optimal structures. In this paper, we apply methods from enumerative combinatorics to compute the asymptotic number of such structures. Additionally, we consider analogous combinatorial problems for secondary structures with annotated single-stranded, stacking nucleotides (dangles). Comment: 27 pages
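The Stein and Waterman asymptotic quoted in this abstract can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the usual convention that every hairpin loop contains at least one unpaired base, and the function names are hypothetical. It counts structures by conditioning on whether the last nucleotide is unpaired or paired with some position `j`.

```python
def secondary_structure_counts(n_max):
    """Return [S(0), ..., S(n_max)], where S(n) counts secondary structures
    on n nucleotides (assumption: hairpin loops have >= 1 unpaired base)."""
    counts = [1]  # S(0) = 1: the empty structure
    for n in range(1, n_max + 1):
        # Case 1: nucleotide n is unpaired.
        total = counts[n - 1]
        # Case 2: nucleotide n pairs with j, leaving j-1 bases to the left
        # and n-j-1 >= 1 bases enclosed by the pair.
        for j in range(1, n - 1):
            total += counts[j - 1] * counts[n - j - 1]
        counts.append(total)
    return counts


def stein_waterman_estimate(n):
    """Asymptotic estimate quoted in the abstract."""
    return 1.104366 * n ** -1.5 * 2.618034 ** n


if __name__ == "__main__":
    counts = secondary_structure_counts(200)
    for n in (10, 50, 200):
        # The ratio should approach 1 as n grows.
        print(n, counts[n] / stein_waterman_estimate(n))
```

The recurrence runs in quadratic time, which is enough to see the ratio of exact count to estimate drift toward 1.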

    Counting, generating and sampling tree alignments

    Pairwise ordered tree alignments are combinatorial objects that appear in RNA secondary structure comparison. However, the usual representation of tree alignments as supertrees is ambiguous, i.e. two distinct supertrees may induce identical sets of matches between identical pairs of trees. This ambiguity is uninformative, and detrimental to any probabilistic analysis. In this work, we consider tree alignments up to equivalence. Our first result is a precise asymptotic enumeration of tree alignments, obtained from a context-free grammar by means of basic analytic combinatorics. Our second result focuses on alignments between two given ordered trees $S$ and $T$. By refining our grammar to align specific trees, we obtain a decomposition scheme for the space of alignments, and use it to design an efficient dynamic programming algorithm for sampling alignments under the Gibbs-Boltzmann probability distribution. This generalizes existing tree alignment algorithms, and opens the door for a probabilistic analysis of the space of suboptimal RNA secondary structure alignments. Comment: ALCOB - 3rd International Conference on Algorithms for Computational Biology - 2016, Jun 2016, Trujillo, Spain. 201

    Geometric combinatorics and computational molecular biology: branching polytopes for RNA sequences

    Questions in computational molecular biology generate various discrete optimization problems, such as DNA sequence alignment and RNA secondary structure prediction. However, the optimal solutions are fundamentally dependent on the parameters used in the objective functions. The goal of a parametric analysis is to elucidate such dependencies, especially as they pertain to the accuracy and robustness of the optimal solutions. Techniques from geometric combinatorics, including polytopes and their normal fans, have been used previously to give parametric analyses of simple models for DNA sequence alignment and RNA branching configurations. Here, we present a new computational framework, and proof-of-principle results, which give the first complete parametric analysis of the branching portion of the nearest neighbor thermodynamic model for secondary structure prediction for real RNA sequences. Comment: 17 pages, 8 figures

    On the uniform generation of modular diagrams

    In this paper we present an algorithm that generates $k$-noncrossing, $\sigma$-modular diagrams with uniform probability. A diagram is a labeled graph of degree $\le 1$ over $n$ vertices drawn in a horizontal line with arcs $(i,j)$ in the upper half-plane. A $k$-crossing in a diagram is a set of $k$ distinct arcs $(i_1, j_1), (i_2, j_2), \ldots, (i_k, j_k)$ with the property $i_1 < i_2 < \ldots < i_k < j_1 < j_2 < \ldots < j_k$. A diagram without any $k$-crossings is called a $k$-noncrossing diagram, and a stack of length $\sigma$ is a maximal sequence $((i,j),(i+1,j-1),\dots,(i+(\sigma-1),j-(\sigma-1)))$. A diagram is $\sigma$-modular if every arc is contained in a stack of length at least $\sigma$. After $O(n^k)$ preprocessing time, our algorithm generates $k$-noncrossing, $\sigma$-modular diagrams in $O(n)$ time and space. Comment: 21 pages, 7 figures
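The two definitions in this abstract can be made concrete with a brute-force checker. This is only an illustrative sketch of the definitions, not the paper's uniform generation algorithm; the function names are hypothetical, and arcs are assumed to be given as pairs `(i, j)` with `i < j` and no vertex shared between arcs.

```python
from itertools import combinations


def has_k_crossing(arcs, k):
    """Return True if some k arcs satisfy i1 < ... < ik < j1 < ... < jk.
    Brute force over all k-subsets, so only suitable for small diagrams."""
    ordered = sorted(arcs)  # sorted by left endpoint
    for subset in combinations(ordered, k):
        starts = [i for i, _ in subset]
        ends = [j for _, j in subset]
        # Left endpoints are already strictly increasing (degree <= 1),
        # so check ik < j1 and that right endpoints increase as well.
        if starts[-1] < ends[0] and all(a < b for a, b in zip(ends, ends[1:])):
            return True
    return False


def stack_length(arc, arc_set):
    """Length of the maximal stack containing the given arc."""
    i, j = arc
    length = 1
    a, b = i - 1, j + 1  # extend outward
    while (a, b) in arc_set:
        length += 1
        a, b = a - 1, b + 1
    a, b = i + 1, j - 1  # extend inward
    while (a, b) in arc_set:
        length += 1
        a, b = a + 1, b - 1
    return length


def is_sigma_modular(arcs, sigma):
    """Return True if every arc lies in a stack of length >= sigma."""
    arc_set = set(arcs)
    return all(stack_length(arc, arc_set) >= sigma for arc in arc_set)
```

For example, the arcs `[(1, 10), (2, 9), (4, 7), (5, 6)]` form two stacks of length 2 with no crossing arcs, so the diagram is 2-noncrossing and 2-modular but not 3-modular.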