Combinatorics of locally optimal RNA secondary structures
It is a classical result of Stein and Waterman that the asymptotic number of
RNA secondary structures is $1.104366 \cdot n^{-3/2} \cdot 2.618034^n$.
Motivated by the kinetics of RNA secondary structure formation, we are
interested in determining the asymptotic number of secondary structures that
are locally optimal, with respect to a particular energy model. In the Nussinov
energy model, where each base pair contributes -1 towards the energy of the
structure, locally optimal structures are exactly the saturated structures, for
which we have previously shown that asymptotically, there are
$1.07427 \cdot n^{-3/2} \cdot 2.35467^n$ many saturated structures for a sequence of length
$n$. In this paper, we consider the base stacking energy model, a mild variant
of the Nussinov model, where each stacked base pair contributes -1 toward the
energy of the structure. Locally optimal structures with respect to the base
stacking energy model are exactly those secondary structures whose stems
cannot be extended. Such structures were first considered by Evers and
Giegerich, who described a dynamic programming algorithm to enumerate all
locally optimal structures. In this paper, we apply methods from enumerative
combinatorics to compute the asymptotic number of such structures.
Additionally, we consider analogous combinatorial problems for secondary
structures with annotated single-stranded, stacking nucleotides (dangles).
Comment: 27 pages
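The Stein and Waterman count mentioned in the abstract above can be reproduced numerically with a short recurrence. The sketch below is illustrative only: it assumes that a hairpin loop must contain at least one unpaired base (the hypothetical parameter `min_loop`), and it counts all secondary structures, not the locally optimal ones studied in the paper. The function names are not taken from the paper.

```python
import math


def count_secondary_structures(n_max, min_loop=1):
    """Return [S(0), ..., S(n_max)], where S(n) counts secondary structures
    on n positions with at least `min_loop` unpaired bases inside every
    hairpin loop (an assumption of this sketch)."""
    s = [1] * (n_max + 1)  # S(0) = 1: the empty structure
    for n in range(1, n_max + 1):
        # Case 1: position n is unpaired.
        total = s[n - 1]
        # Case 2: position n pairs with k, leaving k-1 positions to the left
        # and n-k-1 >= min_loop positions enclosed by the pair (k, n).
        for k in range(1, n - min_loop):
            total += s[k - 1] * s[n - k - 1]
        s[n] = total
    return s


def growth_ratio(counts, n):
    """Ratio S(n) / S(n-1), which approaches the exponential growth rate."""
    return counts[n] / counts[n - 1]


if __name__ == "__main__":
    counts = count_secondary_structures(200)
    print(counts[:10])
    # For min_loop = 1 the growth rate is (3 + sqrt(5)) / 2.
    print(growth_ratio(counts, 200), (3 + math.sqrt(5)) / 2)
```

The ratio converges slowly because of the polynomial factor $n^{-3/2}$ in the asymptotic formula, so it stays slightly below the limiting growth rate even for moderately large $n$.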
Counting, generating and sampling tree alignments
Pairwise ordered tree alignments are combinatorial objects that appear in RNA
secondary structure comparison. However, the usual representation of tree
alignments as supertrees is ambiguous, i.e. two distinct supertrees may induce
identical sets of matches between identical pairs of trees. This ambiguity is
uninformative, and detrimental to any probabilistic analysis. In this work, we
consider tree alignments up to equivalence. Our first result is a precise
asymptotic enumeration of tree alignments, obtained from a context-free grammar
by means of basic analytic combinatorics. Our second result focuses on
alignments between two given ordered trees. By refining our grammar
to align specific trees, we obtain a decomposition scheme for the space of
alignments, and use it to design an efficient dynamic programming algorithm for
sampling alignments under the Gibbs-Boltzmann probability distribution. This
generalizes existing tree alignment algorithms, and opens the door for a
probabilistic analysis of the space of suboptimal RNA secondary structure
alignments.
Comment: ALCOB - 3rd International Conference on Algorithms for Computational
Biology - 2016, Jun 2016, Trujillo, Spain. 201
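The idea of sampling alignments under a Gibbs-Boltzmann distribution by dynamic programming can be illustrated on a much simpler object. The sketch below swaps in pairwise sequence alignment (not tree alignment) and is not the algorithm of the paper: it fills a partition-function table and then performs a stochastic traceback. The unit costs, the `temperature` parameter, and all function names are assumptions made for illustration.

```python
import math
import random


def _weights(temperature, mismatch, gap):
    """Boltzmann factors exp(-cost / temperature) for the two penalties."""
    return math.exp(-mismatch / temperature), math.exp(-gap / temperature)


def partition_table(a, b, temperature=1.0, mismatch=1.0, gap=1.0):
    """Z[i][j] = sum of Boltzmann weights over alignments of a[:i] and b[:j]."""
    w_mis, w_gap = _weights(temperature, mismatch, gap)
    z = [[0.0] * (len(b) + 1) for _ in range(len(a) + 1)]
    z[0][0] = 1.0
    for i in range(len(a) + 1):
        for j in range(len(b) + 1):
            if i == 0 and j == 0:
                continue
            total = 0.0
            if i > 0 and j > 0:  # aligned column (match costs nothing)
                total += z[i - 1][j - 1] * (1.0 if a[i - 1] == b[j - 1] else w_mis)
            if i > 0:  # a[i-1] against a gap
                total += z[i - 1][j] * w_gap
            if j > 0:  # gap against b[j-1]
                total += z[i][j - 1] * w_gap
            z[i][j] = total
    return z


def sample_alignment(a, b, rng, temperature=1.0, mismatch=1.0, gap=1.0):
    """Stochastic traceback: each step is chosen with probability
    proportional to its contribution to Z[i][j]."""
    w_mis, w_gap = _weights(temperature, mismatch, gap)
    z = partition_table(a, b, temperature, mismatch, gap)
    i, j = len(a), len(b)
    columns = []
    while i > 0 or j > 0:
        choices = []
        if i > 0 and j > 0:
            w = 1.0 if a[i - 1] == b[j - 1] else w_mis
            choices.append((z[i - 1][j - 1] * w, (a[i - 1], b[j - 1]), i - 1, j - 1))
        if i > 0:
            choices.append((z[i - 1][j] * w_gap, (a[i - 1], "-"), i - 1, j))
        if j > 0:
            choices.append((z[i][j - 1] * w_gap, ("-", b[j - 1]), i, j - 1))
        r = rng.random() * z[i][j]
        for weight, column, ni, nj in choices:
            r -= weight
            if r < 0:
                break  # on rounding error the last choice is kept
        columns.append(column)
        i, j = ni, nj
    columns.reverse()
    return columns


if __name__ == "__main__":
    print(sample_alignment("GACU", "GCU", random.Random(0)))
```

Note that this naive scheme treats a deletion followed by an insertion and the reverse order as two different alignments, although they induce the same set of matches. This is the kind of ambiguity that the abstract describes as detrimental to probabilistic analysis.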
Geometric combinatorics and computational molecular biology: branching polytopes for RNA sequences
Questions in computational molecular biology generate various discrete
optimization problems, such as DNA sequence alignment and RNA secondary
structure prediction. However, the optimal solutions are fundamentally
dependent on the parameters used in the objective functions. The goal of a
parametric analysis is to elucidate such dependencies, especially as they
pertain to the accuracy and robustness of the optimal solutions. Techniques
from geometric combinatorics, including polytopes and their normal fans, have
been used previously to give parametric analyses of simple models for DNA
sequence alignment and RNA branching configurations. Here, we present a new
computational framework, and proof-of-principle results, which give the first
complete parametric analysis of the branching portion of the nearest neighbor
thermodynamic model for secondary structure prediction for real RNA sequences.
Comment: 17 pages, 8 figures
On the uniform generation of modular diagrams
In this paper we present an algorithm that generates $k$-noncrossing,
$\sigma$-modular diagrams with uniform probability. A diagram is a labeled
graph of degree $\le 1$ over $n$ vertices drawn in a horizontal line with arcs
$(i,j)$ in the upper half-plane. A $k$-crossing in a diagram is a set of $k$
distinct arcs $(i_1,j_1), (i_2,j_2), \ldots, (i_k,j_k)$ with the property
$i_1 < i_2 < \ldots < i_k < j_1 < j_2 < \ldots < j_k$. A diagram without any
$k$-crossings is called a $k$-noncrossing diagram and a stack of length
$\sigma$ is a maximal sequence
$((i,j), (i+1,j-1), \ldots, (i+(\sigma-1), j-(\sigma-1)))$. A diagram is
$\sigma$-modular if any arc is contained in a stack of length at least
$\sigma$. Our algorithm generates, after $O(n^k)$ preprocessing time,
$k$-noncrossing, $\sigma$-modular diagrams in $O(n)$ time and space
complexity.
Comment: 21 pages, 7 figures
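The two defining properties of these diagrams can be checked directly on small examples. The sketch below uses the standard definitions: a $k$-crossing is a set of $k$ arcs $(i_1,j_1),\ldots,(i_k,j_k)$ with $i_1 < \ldots < i_k < j_1 < \ldots < j_k$, and a diagram is $\sigma$-modular if every arc lies in a stack of length at least $\sigma$. It is a brute-force checker for illustration, not the uniform generation algorithm of the paper. Arcs are assumed to be tuples `(i, j)` with `i < j` and each vertex used at most once, and the function names are hypothetical.

```python
from itertools import combinations


def is_k_noncrossing(arcs, k):
    """Return True if no k arcs are mutually crossing (intended for k >= 2).

    Brute force over all k-subsets, so only suitable for small diagrams."""
    ordered = sorted(arcs)  # sorted by left endpoint
    for subset in combinations(ordered, k):
        starts = [i for i, _ in subset]
        ends = [j for _, j in subset]
        ends_increasing = all(x < y for x, y in zip(ends, ends[1:]))
        # Left endpoints are already increasing; a k-crossing also needs the
        # last left endpoint before the first right endpoint.
        if starts[-1] < ends[0] and ends_increasing:
            return False
    return True


def stack_length(arc, arc_set):
    """Length of the maximal stack (i,j), (i+1,j-1), ... containing `arc`."""
    i, j = arc
    length = 1
    a, b = i - 1, j + 1
    while (a, b) in arc_set:  # extend outward
        length += 1
        a, b = a - 1, b + 1
    a, b = i + 1, j - 1
    while (a, b) in arc_set:  # extend inward
        length += 1
        a, b = a + 1, b - 1
    return length


def is_sigma_modular(arcs, sigma):
    """Return True if every arc lies in a stack of length at least sigma."""
    arc_set = set(arcs)
    return all(stack_length(arc, arc_set) >= sigma for arc in arc_set)


if __name__ == "__main__":
    nested = [(1, 10), (2, 9), (3, 8)]
    crossing = [(1, 3), (2, 4)]
    print(is_k_noncrossing(nested, 2), is_sigma_modular(nested, 3))
    print(is_k_noncrossing(crossing, 2), is_k_noncrossing(crossing, 3))
```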