Geometric combinatorics and computational molecular biology: branching polytopes for RNA sequences
Questions in computational molecular biology generate various discrete
optimization problems, such as DNA sequence alignment and RNA secondary
structure prediction. However, the optimal solutions are fundamentally
dependent on the parameters used in the objective functions. The goal of a
parametric analysis is to elucidate such dependencies, especially as they
pertain to the accuracy and robustness of the optimal solutions. Techniques
from geometric combinatorics, including polytopes and their normal fans, have
been used previously to give parametric analyses of simple models for DNA
sequence alignment and RNA branching configurations. Here, we present a new
computational framework, and proof-of-principle results, which give the first
complete parametric analysis of the branching portion of the nearest neighbor
thermodynamic model for secondary structure prediction for real RNA sequences.
Comment: 17 pages, 8 figures
Combinatorial analysis of interacting RNA molecules
Recently several minimum free energy (MFE) folding algorithms for predicting
the joint structure of two interacting RNA molecules have been proposed. Their
folding targets are interaction structures that can be represented as diagrams
with two backbones drawn horizontally on top of each other such that (1)
intramolecular and intermolecular bonds are noncrossing and (2) there is no
"zig-zag" configuration. This paper studies joint structures with arc-length at
least four in which both interior and exterior stack-lengths are at least two
(no isolated arcs). The key idea in this paper is to consider a new type of
shape, based on which joint structures can be derived via symbolic enumeration.
Our results imply simple asymptotic formulas for the number of joint structures
with surprisingly small exponential growth rates. They are of interest in the
context of designing prediction algorithms for RNA-RNA interactions.
Comment: 22 pages, 15 figures
Shapes of interacting RNA complexes
Shapes of interacting RNA complexes are studied using a filtration via their
topological genus. A shape of an RNA complex is obtained by (iteratively)
collapsing stacks and eliminating hairpin loops. This shape-projection
preserves the topological core of the RNA complex and for fixed topological
genus there are only finitely many such shapes. Our main result is a new
bijection that relates the shapes of RNA complexes with shapes of RNA
structures. This allows one to compute the shape polynomial of RNA complexes via the
shape polynomial of RNA structures. We furthermore present a linear time
uniform sampling algorithm for shapes of RNA complexes of fixed topological
genus.
Comment: 38 pages, 24 figures
On the combinatorics of sparsification
Background: We study the sparsification of dynamic programming folding
algorithms of RNA structures. Sparsification applies to the mfe-folding of RNA
structures and can lead to a significant reduction of time complexity. Results:
We analyze the sparsification of a particular decomposition rule, $\Lambda^*$,
that splits an interval for RNA secondary and pseudoknot structures of fixed
topological genus. Essential for quantifying the sparsification is the size of
its so-called candidate set. We present a combinatorial framework that yields,
by means of probabilities of irreducible substructures, the expected
size of the set of $\Lambda^*$-candidates. We compute these expectations for
arc-based energy models via energy-filtered generating functions (GF) for RNA
secondary structures as well as RNA pseudoknot structures. For RNA secondary
structures we also consider a simplified loop-energy model. This combinatorial
analysis is then compared to the expected number of $\Lambda^*$-candidates
obtained from folding mfe-structures. In case of the mfe-folding of RNA
secondary structures with a simplified loop energy model our results imply that
sparsification provides a reduction of time complexity by a constant factor of
91% (theory) versus a 96% reduction (experiment). For the "full" loop-energy
model there is a reduction of 98% (experiment).
Comment: 27 pages, 12 figures
Combinatorics of locally optimal RNA secondary structures
It is a classical result of Stein and Waterman that the asymptotic number of
RNA secondary structures is $1.104366 \cdot n^{-3/2} \cdot 2.618034^n$.
Motivated by the kinetics of RNA secondary structure formation, we are
interested in determining the asymptotic number of secondary structures that
are locally optimal, with respect to a particular energy model. In the Nussinov
energy model, where each base pair contributes -1 towards the energy of the
structure, locally optimal structures are exactly the saturated structures, for
which we have previously shown that asymptotically, there are
$1.07427 \cdot n^{-3/2} \cdot 2.35467^n$ many saturated structures for a sequence of length
$n$. In this paper, we consider the base stacking energy model, a mild variant
of the Nussinov model, where each stacked base pair contributes -1 toward the
energy of the structure. Locally optimal structures with respect to the base
stacking energy model are exactly those secondary structures whose stems
cannot be extended. Such structures were first considered by Evers and
Giegerich, who described a dynamic programming algorithm to enumerate all
locally optimal structures. In this paper, we apply methods from enumerative
combinatorics to compute the asymptotic number of such structures.
Additionally, we consider analogous combinatorial problems for secondary
structures with annotated single-stranded, stacking nucleotides (dangles).
Comment: 27 pages
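
The Stein and Waterman count mentioned in this abstract can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the usual convention that paired bases must enclose at least one unpaired base, and the function names are hypothetical. It counts all secondary structures on n bases with the standard recurrence and compares the growth of the counts with the known exponential rate (3 + sqrt(5))/2, approximately 2.618034.

```python
from math import sqrt


def count_secondary_structures(n_max):
    """Return [S(0), ..., S(n_max)], where S(n) is the number of secondary
    structures on n bases (assumption: a pair (i, j) needs j - i >= 2)."""
    s = [1] * (n_max + 1)  # S(0) = S(1) = S(2) = 1 (only the empty structure)
    for n in range(2, n_max + 1):
        total = s[n - 1]  # base n is unpaired
        for i in range(1, n - 1):  # base n pairs with base i, i = 1..n-2
            # s[i - 1]: structures to the left of i
            # s[n - i - 1]: structures strictly between i and n
            total += s[i - 1] * s[n - i - 1]
        s[n] = total
    return s


def stein_waterman_estimate(n):
    """Asymptotic estimate 1.104366 * n^(-3/2) * 2.618034^n."""
    return 1.104366 * n ** -1.5 * 2.618034 ** n


if __name__ == "__main__":
    counts = count_secondary_structures(200)
    print(counts[:9])
    print(counts[200] / counts[199], (3 + sqrt(5)) / 2)
    print(counts[200] / stein_waterman_estimate(200))
```

The ratio of consecutive counts approaches the growth rate only slowly, because of the polynomial factor n^(-3/2), so modest disagreement at small n is expected.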