Enumeration of RNA structures by Matrix Models
We enumerate the number of RNA contact structures according to their genus,
i.e. the topological character of their pseudoknots. By using a recently
proposed matrix model formulation for the RNA folding problem, we obtain exact
results for the simple case of an RNA molecule with an infinitely flexible
backbone, in which any arbitrary pair of bases is allowed. We analyze the
distribution of the genus of pseudoknots as a function of the total number of
nucleotides along the phosphate-sugar backbone. Comment: RevTeX, 4 pages, 2 figures
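To make the genus classification concrete, the following is a minimal brute-force sketch, not the matrix-model calculation of the paper. It assumes every base is paired (unpaired bases do not change the genus), that the backbone is closed into a circle, and that the genus follows from Euler's relation 2 - 2g = 1 - n + F for n pairings and F faces of the corresponding fat graph. The function names `perfect_matchings`, `genus`, and `genus_counts` are illustrative only.

```python
from collections import Counter


def perfect_matchings(points):
    """Yield every way to pair up the given list of points."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in perfect_matchings(remaining):
            yield [(first, partner)] + tail


def genus(pairs):
    """Genus of a pairing of the points 0..2n-1 placed on a closed backbone.

    Faces are the cycles of the map x -> partner(x) + 1 (mod 2n);
    Euler's relation then gives g = (n + 1 - faces) / 2.
    """
    n = len(pairs)
    size = 2 * n
    partner = {}
    for a, b in pairs:
        partner[a] = b
        partner[b] = a
    seen = set()
    faces = 0
    for start in range(size):
        if start in seen:
            continue
        faces += 1
        x = start
        while x not in seen:
            seen.add(x)
            x = (partner[x] + 1) % size
    return (n + 1 - faces) // 2


def genus_counts(n):
    """Histogram {genus: number of pairings} over all pairings of 2n bases."""
    points = list(range(2 * n))
    return dict(Counter(genus(m) for m in perfect_matchings(points)))


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, genus_counts(n))
```

Brute-force enumeration grows as (2n - 1)!!, so this is only practical for a handful of pairings; it serves as a cross-check for small cases rather than a substitute for exact results.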
Similarity-Detection and Localization
The detection of similarities between long DNA and protein sequences is
studied using concepts of statistical physics. It is shown that mutual
similarities can be detected by sequence alignment methods only if their amount
exceeds a threshold value. The onset of detection is a continuous phase
transition which can be viewed as a localization-delocalization transition. The
``fidelity'' of the alignment is the order parameter of that transition; it
leads to criteria for the selection of optimal alignment parameters. Comment: 4 pages including 4 figures (308kb post-script file)
Model for Folding and Aggregation in RNA Secondary Structures
We study the statistical mechanics of RNA secondary structures designed to
have an attraction between two different types of structures as a model system
for heteropolymer aggregation. The competition between the branching entropy of
the secondary structure and the energy gained by pairing drives the RNA to
undergo a `temperature independent' second-order phase transition from a molten
to an aggregated phase. The aggregated phase thus obtained has a
macroscopically large number of contacts between different RNAs. The partition
function scaling exponent for this phase is \theta ~ 1/2 and the crossover
exponent of the phase transition is \nu ~ 5/3. The relevance of these
calculations to the aggregation of biological molecules is discussed. Comment: Revtex, 4 pages; 3 Figures; Final published version
Quantification of the differences between quenched and annealed averaging for RNA secondary structures
The analytical study of disordered systems is usually difficult due to the
necessity to perform a quenched average over the disorder. Thus, one may resort
to the easier annealed ensemble as an approximation to the quenched system. In
the study of RNA secondary structures, we explicitly quantify the deviation of
this approximation from the quenched ensemble by looking at the correlations
between neighboring bases. This quantified deviation then allows us to propose
a constrained annealed ensemble which predicts physical quantities much closer
to the results of the quenched ensemble without becoming technically
intractable. Comment: 9 pages, 14 figures, submitted to Phys. Rev.
A New Simulated Annealing Algorithm for the Multiple Sequence Alignment Problem: The approach of Polymers in a Random Media
We propose a probabilistic algorithm to solve the Multiple Sequence
Alignment problem. The algorithm is a Simulated Annealing (SA) scheme that exploits
the representation of the Multiple Alignment between D sequences as a
directed polymer in D dimensions. Within this representation we can easily
track the evolution in the configuration space of the alignment through local
moves of low computational cost. At variance with other probabilistic
algorithms proposed to solve this problem, our approach allows for the creation
and deletion of gaps without extra computational cost. The algorithm was tested
by aligning proteins from the kinase family. When D=3 the results are consistent
with those obtained using a complete algorithm. For D>3, where the complete
algorithm fails, we show that our algorithm still converges to reasonable
alignments. Moreover, we study the space of solutions obtained and show that
depending on the number of sequences aligned the solutions are organized in
different ways, suggesting a possible source of errors for progressive
algorithms. Comment: 7 pages and 11 figures
Nature of the glassy phase of RNA secondary structure
We characterize the low temperature phase of a simple model for RNA secondary
structures by determining the typical energy scale E(l) of excitations
involving l bases. At zero temperature, we find a scaling law E(l) \sim
l^\theta with \theta \approx 0.23, and this same scaling holds at low enough
temperatures. Above a critical temperature, there is a different phase
characterized by a relatively flat free energy landscape resembling that of a
homopolymer with a scaling exponent \theta=1. These results strengthen the
evidence in favour of the existence of a glass phase at low temperatures. Comment: 7 pages, 1 figure
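As a point of reference for the homopolymer-like phase mentioned above, here is a small sketch of the standard recursion that counts secondary structures (nested, non-crossing pairings) when every pairing is treated identically. It is an illustration under stated assumptions, not the model or the code of the paper: the name `count_secondary_structures` and the `min_loop` parameter (minimum number of bases enclosed by a pair) are hypothetical choices made for this example.

```python
def count_secondary_structures(n, min_loop=0):
    """Return [N(0), ..., N(n)], the number of secondary structures on L bases.

    Recursion: the last base is either unpaired, or paired with base k,
    which splits the chain into k bases to the left of the pair and
    L - 2 - k bases enclosed by it (at least min_loop of them).
    """
    counts = [1] * (n + 1)
    for length in range(1, n + 1):
        total = counts[length - 1]  # last base left unpaired
        for k in range(0, length - 1 - min_loop):
            total += counts[k] * counts[length - 2 - k]
        counts[length] = total
    return counts


if __name__ == "__main__":
    print(count_secondary_structures(6))
    print(count_secondary_structures(6, min_loop=1))
```

With `min_loop=0` the counts are the Motzkin numbers; replacing the unit weight of each pair by a Boltzmann factor would turn the same recursion into a partition function for a uniform-pairing chain.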
Exact solution of the Bernoulli matching model of sequence alignment
Through a series of exact mappings we reinterpret the Bernoulli model of
sequence alignment in terms of the discrete-time totally asymmetric exclusion
process with backward sequential update and step function initial condition.
Using earlier results from the Bethe ansatz we obtain analytically the exact
distribution of the length of the longest common subsequence of two sequences
of finite lengths. Asymptotic analysis adapted from random matrix theory
allows us to derive the thermodynamic limit directly from the finite-size
result. Comment: 13 pages, 4 figures
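The quantity studied here can be illustrated with the textbook dynamic program for the longest common subsequence. The sketch below is not the exact solution of the paper; it only shows the recursion, assuming the usual reading of the Bernoulli matching model in which the match indicators between positions are replaced by independent Bernoulli variables with probability p. The function names and the `rng` argument are illustrative.

```python
import random


def longest_match_path(match):
    """Length of the longest increasing chain of matches in a 0/1 matrix."""
    rows = len(match)
    cols = len(match[0]) if rows else 0
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            table[i][j] = max(
                table[i - 1][j],
                table[i][j - 1],
                table[i - 1][j - 1] + match[i - 1][j - 1],
            )
    return table[rows][cols]


def lcs_length(a, b):
    """Longest common subsequence length of two sequences."""
    match = [[1 if x == y else 0 for y in b] for x in a]
    return longest_match_path(match)


def bernoulli_matching_length(rows, cols, p, rng):
    """Same recursion, with independent Bernoulli(p) match indicators."""
    match = [[1 if rng.random() < p else 0 for _ in range(cols)]
             for _ in range(rows)]
    return longest_match_path(match)


if __name__ == "__main__":
    print(lcs_length("AGGTAB", "GXTXAYB"))
    print(bernoulli_matching_length(50, 50, 0.25, random.Random(0)))
```

For real sequences the match indicators are correlated (they derive from the letters of two strings), which is exactly what the Bernoulli approximation discards.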
An O(n^3)-Time Algorithm for Tree Edit Distance
The {\em edit distance} between two ordered trees with vertex labels is the
minimum cost of transforming one tree into the other by a sequence of
elementary operations consisting of deleting and relabeling existing nodes, as
well as inserting new nodes. In this paper, we present a worst-case
$O(n^3)$-time algorithm for this problem, improving the previous best
$O(n^3\log n)$-time algorithm~\cite{Klein}. Our result requires a novel
adaptive strategy for deciding how a dynamic program divides into subproblems
(which is interesting in its own right), together with a deeper understanding
of the previous algorithms for the problem. We also prove the optimality of our
algorithm among the family of \emph{decomposition strategy} algorithms--which
also includes the previous fastest algorithms--by tightening the known lower
bound of $\Omega(n^2\log^2 n)$~\cite{Touzet} to $\Omega(n^3)$, matching our
algorithm's running time. Furthermore, we obtain matching upper and lower
bounds of $\Theta(n m^2 (1 + \log\frac{n}{m}))$ when the two trees have
different sizes $m$ and~$n$, where $m < n$. Comment: 10 pages, 5 figures, 5 .tex files where TED.tex is the main one
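The edit distance defined above can be sketched with the basic forest recursion on which decomposition-strategy algorithms are built. This is deliberately not the adaptive algorithm of the paper: it always decomposes at the rightmost root and simply memoizes subproblems, so it makes no claim to the stated running time. Unit costs for deletion, insertion, and relabeling are an assumption, as is the tree encoding `(label, (child, ...))` and every function name.

```python
from functools import lru_cache


def size(forest):
    """Number of nodes in an ordered forest given as a tuple of trees."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Unit-cost edit distance between two ordered forests."""
    if not f:
        return size(g)  # insert every node of g
    if not g:
        return size(f)  # delete every node of f
    (v_label, v_children), (w_label, w_children) = f[-1], g[-1]
    # Delete the rightmost root of f; its children take its place.
    delete = 1 + forest_distance(f[:-1] + v_children, g)
    # Insert the rightmost root of g.
    insert = 1 + forest_distance(f, g[:-1] + w_children)
    # Match the two rightmost roots, relabeling if the labels differ.
    relabel = (
        forest_distance(v_children, w_children)
        + forest_distance(f[:-1], g[:-1])
        + (v_label != w_label)
    )
    return min(delete, insert, relabel)


def tree_edit_distance(t1, t2):
    """Edit distance between two trees encoded as (label, children)."""
    return forest_distance((t1,), (t2,))


def leaf(label):
    return (label, ())


if __name__ == "__main__":
    t1 = ("f", (("d", (leaf("a"), ("c", (leaf("b"),)))), leaf("e")))
    t2 = ("f", (("c", (("d", (leaf("a"), leaf("b"))),)), leaf("e")))
    print(tree_edit_distance(t1, t2))
```

Trees are encoded as nested tuples so that forests are hashable and can be cached directly; a production implementation would index subforests by integer ranges instead of copying tuples.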