The Discrepancy of the Lex-Least De Bruijn Sequence
We answer the following question of R. L. Graham: What is the discrepancy of
the lexicographically-least binary de Bruijn sequence? Here, "discrepancy"
refers to the maximum (absolute) difference between the number of ones and the
number of zeros in any initial segment of the sequence. We show that the answer
is .
Comment: 11 pages, 0 figures
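The definition of discrepancy above can be checked numerically for small orders. The following is a minimal sketch, not the paper's method: it assumes the lex-least sequence is taken in its cyclic form of length 2^n and builds it with the standard Lyndon-word concatenation (FKM) construction. The function names lex_least_de_bruijn and discrepancy are illustrative.

```python
def lex_least_de_bruijn(n):
    """Return the lex-least binary de Bruijn sequence of order n as a list of bits.

    Assumption: cyclic form of length 2**n, built by concatenating, in
    lexicographic order, the Lyndon words whose length divides n (FKM).
    """
    a = [0] * (n + 1)
    seq = []

    def db(t, p):
        if t > n:
            # a[1..p] is a Lyndon word; keep it only if its length divides n.
            if n % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, 2):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return seq


def discrepancy(bits):
    """Maximum |#ones - #zeros| over all initial segments of bits."""
    balance = 0
    worst = 0
    for b in bits:
        balance += 1 if b == 1 else -1
        worst = max(worst, abs(balance))
    return worst


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, discrepancy(lex_least_de_bruijn(n)))
```

For n = 3 the sequence is 00010111, whose initial segment 000 already gives a discrepancy of 3.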
RNAprofiling 2.0: Enhanced cluster analysis of structural ensembles
Understanding the base pairing of an RNA sequence provides insight into its
molecular structure. By mining suboptimal sampling data, RNAprofiling 1.0
identifies the dominant helices in low-energy secondary structures as features,
organizes them into profiles which partition the Boltzmann sample, and
highlights key similarities/differences among the most informative, i.e.
selected, profiles in a graphical format. Version 2.0 enhances every step of
this approach. First, the featured substructures are expanded from helices to
stems. Second, profile selection includes low-frequency pairings similar to
featured ones. In conjunction, these updates extend the utility of the method
to sequences up to length 600, as evaluated over a sizable dataset. Third,
relationships are visualized in a decision tree which highlights the most
important structural differences. Finally, this cluster analysis is made
accessible to experimental researchers in a portable format as an interactive
webpage, permitting a much greater understanding of trade-offs among different
possible base pairing combinations.
Comment: 9 pages, 2 figures; supplement 6 pages, 3 figures, 1 table
Geometric combinatorics and computational molecular biology: branching polytopes for RNA sequences
Questions in computational molecular biology generate various discrete
optimization problems, such as DNA sequence alignment and RNA secondary
structure prediction. However, the optimal solutions are fundamentally
dependent on the parameters used in the objective functions. The goal of a
parametric analysis is to elucidate such dependencies, especially as they
pertain to the accuracy and robustness of the optimal solutions. Techniques
from geometric combinatorics, including polytopes and their normal fans, have
been used previously to give parametric analyses of simple models for DNA
sequence alignment and RNA branching configurations. Here, we present a new
computational framework, and proof-of-principle results, which give the first
complete parametric analysis of the branching portion of the nearest neighbor
thermodynamic model for secondary structure prediction for real RNA sequences.
Comment: 17 pages, 8 figures
Meander Graphs
We consider a Markov chain Monte Carlo approach to the uniform sampling of meanders. Combinatorially, a meander is formed by two noncrossing perfect matchings, above and below the same endpoints, which form a single closed loop. We prove that meanders are connected under appropriate pairs of balanced local moves, one operating on and the other on . We also prove that the subset of meanders with a fixed is connected under a suitable local move operating on an appropriately defined meandric triple in . We provide diameter bounds under such moves, tight up to a (worst case) factor of two. The mixing times of the Markov chains remain open.
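The combinatorial definition of a meander can be made concrete with a short check. This is a minimal sketch under assumed conventions that do not come from the paper: endpoints are labeled 0, 1, ..., 2n-1 from left to right, each matching is a list of index pairs, and the names partner_map, is_noncrossing, count_loops and is_meander are hypothetical. It only tests the definition; it does not implement the local moves or the Markov chain.

```python
def partner_map(matching):
    """Map each endpoint to its partner in a perfect matching given as pairs."""
    partner = {}
    for a, b in matching:
        partner[a] = b
        partner[b] = a
    return partner


def is_noncrossing(matching):
    """True if no two arcs (a, b) and (c, d) interleave as a < c < b < d."""
    arcs = [tuple(sorted(pair)) for pair in matching]
    for a, b in arcs:
        for c, d in arcs:
            if a < c < b < d:
                return False
    return True


def count_loops(top, bottom):
    """Count the closed loops formed by alternating top and bottom arcs."""
    up, down = partner_map(top), partner_map(bottom)
    seen = set()
    loops = 0
    for start in up:
        if start in seen:
            continue
        loops += 1
        p = start
        while p not in seen:
            seen.add(p)
            q = up[p]      # follow the arc above the line
            seen.add(q)
            p = down[q]    # then the arc below the line
    return loops


def is_meander(top, bottom):
    """True if top and bottom are noncrossing perfect matchings on the same
    endpoints whose union is a single closed loop."""
    up, down = partner_map(top), partner_map(bottom)
    same_points = set(up) == set(down) and len(up) == 2 * len(top) == 2 * len(bottom)
    return (same_points
            and is_noncrossing(top)
            and is_noncrossing(bottom)
            and count_loops(top, bottom) == 1)
```

For example, with four endpoints, top arcs (0, 1), (2, 3) and bottom arcs (0, 3), (1, 2) form a single loop, whereas using (0, 1), (2, 3) both above and below yields two separate loops.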
Large Deviations for Random Trees
We consider large random trees under Gibbs distributions and prove a Large
Deviation Principle (LDP) for the distribution of degrees of vertices of the
tree. The LDP rate function is given explicitly. An immediate consequence is a
Law of Large Numbers for the distribution of vertex degrees in a large random
tree. Our motivation for this study comes from the analysis of RNA secondary
structures.
Comment: 10 pages