Evolutionary Dynamics and Optimization: Neutral Networks as Model-Landscapes for RNA Secondary-Structure Folding-Landscapes
We view the folding of RNA sequences as a map that assigns a pattern of base
pairings, known as the secondary structure, to each sequence. The preimages of
this map can be constructed as random graphs (i.e. the neutral networks
associated with the structure). By interpreting the secondary structure as biological
information we can formulate the so-called Error Threshold of Shapes as an
extension of Eigen et al.'s concept of an error threshold in the single-peak
landscape. Analogous to the approach of Derrida & Peliti for a [...] of the population
on the neutral network. On the one hand this model of a single shape landscape
allows the derivation of analytical results; on the other hand the concept
makes it possible to study various scenarios by means of simulations, e.g. the
interaction of two different networks. It turns out that the intersection of
two sets of compatible sequences (with respect to the pair of secondary
structures) plays a key role in the search for "fitter" secondary structures.
Comment: 20 pages, uuencoded compressed postscript-file, Proc. of ECAL '95
conference, to appear, email: chris @ imb-jena.d
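The notion of sequences compatible with a pair of secondary structures can be illustrated with a minimal Python sketch. It is not the authors' construction: it assumes structures written in dot-bracket notation, assumes Watson-Crick and GU wobble pairs as the allowed base pairs, and enumerates all sequences by brute force, which is only feasible for very short lengths. The function names are hypothetical.

```python
from itertools import product

# Assumption: allowed base pairs are Watson-Crick pairs plus the GU wobble pair.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def pair_table(structure):
    """Return the list of paired positions (i, j) of a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def is_compatible(sequence, structure):
    """True if every pair required by the structure is an allowed base pair."""
    if len(sequence) != len(structure):
        return False
    return all((sequence[i], sequence[j]) in ALLOWED_PAIRS
               for i, j in pair_table(structure))


def compatible_intersection(struct1, struct2, alphabet="ACGU"):
    """Brute-force list of sequences compatible with both structures."""
    if len(struct1) != len(struct2):
        raise ValueError("structures must have equal length")
    n = len(struct1)
    return ["".join(s) for s in product(alphabet, repeat=n)
            if is_compatible(s, struct1) and is_compatible(s, struct2)]
```

For example, `compatible_intersection("(...).", ".(...)")` checks all 4096 sequences of length 6 and keeps those that can form both the pair between positions 1 and 5 and the pair between positions 2 and 6.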
Protein secondary structure: Entropy, correlations and prediction
Is protein secondary structure primarily determined by local interactions
between residues closely spaced along the amino acid backbone, or by non-local
tertiary interactions? To answer this question we have measured the entropy
densities of primary structure and secondary structure sequences, and the local
inter-sequence mutual information density. We find that the important
inter-sequence interactions are short ranged, that correlations between
neighboring amino acids are essentially uninformative, and that only 1/4 of the
total information needed to determine the secondary structure is available from
local inter-sequence correlations. Since the remaining information must come
from non-local interactions, this observation supports the view that
most proteins fold via a cooperative process where secondary and
tertiary structure form concurrently. To provide a more direct comparison to
existing secondary structure prediction methods, we construct a simple hidden
Markov model (HMM) of the sequences. This HMM achieves a prediction accuracy
comparable to other single sequence secondary structure prediction algorithms,
and can extract almost all of the inter-sequence mutual information. This
suggests that these algorithms are almost optimal, and that we should not
expect a dramatic improvement in prediction accuracy. However, local
correlations between secondary and primary structure are probably of
under-appreciated importance in many tertiary structure prediction methods,
such as threading.
Comment: 8 pages, 5 figures
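As a rough illustration of the kind of quantity involved, the sketch below computes a plug-in estimate of the mutual information, in bits, between two aligned symbol sequences, for instance residues and their secondary structure labels at the same positions. This is a simplification and not the paper's estimator: the abstract refers to entropy densities and a local inter-sequence mutual information density, which account for context along the chain. The function name and the toy labels are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired observations."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of (x, y) co-occurrences
    count_x = Counter(xs)          # marginal counts of x
    count_y = Counter(ys)          # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi
```

With toy data, `mutual_information("AAGG", "HHEE")` gives one bit because each residue fully determines its label, while `mutual_information("AGAG", "HHEE")` gives zero because the two are independent in the sample. Plug-in estimates are biased upward for small samples, so real use would need a correction or far more data.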
Parametrized Stochastic Grammars for RNA Secondary Structure Prediction
We propose a two-level stochastic context-free grammar (SCFG) architecture
for parametrized stochastic modeling of a family of RNA sequences, including
their secondary structure. A stochastic model of this type can be used for
maximum a posteriori estimation of the secondary structure of any new sequence
in the family. The proposed SCFG architecture models RNA subsequences
comprising paired bases as stochastically weighted Dyck-language words, i.e.,
as weighted balanced-parenthesis expressions. The length of each run of
unpaired bases, forming a loop or a bulge, is taken to have a phase-type
distribution: that of the hitting time in a finite-state Markov chain. Without
loss of generality, each such Markov chain can be taken to have a bounded
complexity. The scheme yields an overall family SCFG with a manageable number
of parameters.
Comment: 5 pages, submitted to the 2007 Information Theory and Applications
Workshop (ITA 2007)
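The phase-type length distribution mentioned above can be sketched as follows, assuming a discrete-time chain. The probability that the hitting time equals k is alpha T^(k-1) t, where alpha is the initial distribution over transient states, T is the sub-stochastic transition matrix among them, and t holds the per-state probabilities of moving to the absorbing state. The function name and parameterization are illustrative assumptions, not the paper's notation.

```python
def phase_type_pmf(alpha, T, k_max):
    """Return [P(time = 1), ..., P(time = k_max)] for a discrete phase-type law.

    alpha: initial probabilities over the transient states.
    T:     sub-stochastic transition matrix among transient states
           (the row deficit is the probability of absorption).
    """
    n = len(alpha)
    exit_probs = [1.0 - sum(row) for row in T]
    state = list(alpha)  # probability of being in each transient state
    pmf = []
    for _ in range(k_max):
        # Absorb on this step from whichever transient state we occupy.
        pmf.append(sum(state[i] * exit_probs[i] for i in range(n)))
        # Otherwise advance one step within the transient states.
        state = [sum(state[i] * T[i][j] for i in range(n)) for j in range(n)]
    return pmf
```

A single state with self-loop probability 0.5 recovers a geometric loop-length distribution, and a two-state chain that always moves from the first state to the second and then exits gives a loop of length exactly 2. Chains with more states can shape the distribution further while keeping the parameter count small.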
A model for the force stretching double-stranded chain molecules
We modify and extend the recently developed statistical mechanical model for
predicting the thermodynamic properties of chain molecules having noncovalent
double-stranded conformations, as in RNA or ssDNA, and β-sheets in
proteins, by including a constant stretching force applied at one end of the
molecule, as in a typical single-molecule experiment. The conformations of double-stranded
regions of the chain are calculated based on a polymer graph-theoretic approach
[S-J. Chen and K. A. Dill, J. Chem. Phys. 109, 4602 (1998)], while the
unpaired single-stranded regions are treated as self-avoiding walks. Sequence
dependence and excluded volume interaction are taken into account explicitly.
Two classes of conformations, hairpin and RNA secondary structure, are explored.
For the hairpin conformations, all possible end-to-end distances corresponding
to the different types of double-stranded regions are enumerated exactly. For
the RNA secondary structure conformations, a new recursive formula
incorporating the secondary structure and end-to-end distribution has been
derived. Using the model, we investigate the extension-force curves, contact
and population distributions, and re-entering phenomena. We find
that force stretching of homogeneous chains differs markedly between hairpin
and secondary structure conformations: the unfolding of hairpins is two-state,
while the unfolding of secondary structures is one-state. In addition,
re-entering transitions are present only in hairpin conformations and are not
observed in secondary structure conformations.
Comment: 19 pages, 28 figures
RNA secondary structure prediction from multi-aligned sequences
It has been well accepted that the RNA secondary structures of most
functional non-coding RNAs (ncRNAs) are closely related to their functions and
are conserved during evolution. Hence, prediction of conserved secondary
structures from evolutionarily related sequences is one important task in RNA
bioinformatics; the methods are useful not only to further functional analyses
of ncRNAs but also to improve the accuracy of secondary structure predictions
and to find novel functional RNAs from the genome. In this review, I focus on
common secondary structure prediction from a given aligned RNA sequence, in
which one secondary structure whose length is equal to that of the input
alignment is predicted. I systematically review and classify existing tools and
algorithms for the problem, by utilizing the information employed in the tools
and by adopting a unified viewpoint based on maximum expected gain (MEG)
estimators. I believe that this classification will allow a deeper
understanding of each tool and provide users with useful information for
selecting tools for common secondary structure predictions.
Comment: A preprint of an invited review manuscript that will be published in
a chapter of the book `Methods in Molecular Biology'. Note that this version
of the manuscript may differ from the published version
