    Evolutionary Dynamics and Optimization: Neutral Networks as Model-Landscapes for RNA Secondary-Structure Folding-Landscapes

    We view the folding of RNA sequences as a map that assigns a pattern of base pairings, known as a secondary structure, to each sequence. The preimages of this map can be constructed as random graphs (i.e. the neutral networks associated with the structure s). By interpreting the secondary structure as biological information we can formulate the so-called Error Threshold of Shapes as an extension of Eigen et al.'s concept of an error threshold in the single-peak landscape. Analogous to the approach of Derrida & Peliti for a of the population on the neutral network. On the one hand, this model of a single-shape landscape allows the derivation of analytical results; on the other hand, the concept makes it possible to study various scenarios by means of simulations, e.g. the interaction of two different networks. It turns out that the intersection of two sets of compatible sequences (with respect to the pair of secondary structures) plays a key role in the search for "fitter" secondary structures. Comment: 20 pages, uuencoded compressed postscript-file, Proc. of ECAL '95 conference, to appear; email: chris @ imb-jena.d
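
    The notion of compatible sequences used in this abstract can be illustrated with a minimal sketch. It assumes dot-bracket notation for secondary structures and Watson-Crick plus G-U wobble pairs as the allowed pairing rule; the abstract fixes neither, and the function names are hypothetical.

```python
# Assumption: allowed pairs are Watson-Crick plus G-U wobble (not stated in the abstract).
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def base_pairs(structure):
    """Return the (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def is_compatible(sequence, structure):
    """A sequence is compatible if every paired position holds an allowed base pair."""
    if len(sequence) != len(structure):
        return False
    return all((sequence[i], sequence[j]) in ALLOWED_PAIRS
               for i, j in base_pairs(structure))


def in_intersection(sequence, structure_a, structure_b):
    """True if the sequence is compatible with both structures."""
    return is_compatible(sequence, structure_a) and is_compatible(sequence, structure_b)
```

    Compatibility only requires that the sequence could form the structure, not that it actually folds into it, which is why one sequence can be compatible with two different structures.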

    Protein secondary structure: Entropy, correlations and prediction

    Is protein secondary structure primarily determined by local interactions between residues closely spaced along the amino acid backbone, or by non-local tertiary interactions? To answer this question we have measured the entropy densities of primary structure and secondary structure sequences, and the local inter-sequence mutual information density. We find that the important inter-sequence interactions are short ranged, that correlations between neighboring amino acids are essentially uninformative, and that only 1/4 of the total information needed to determine the secondary structure is available from local inter-sequence correlations. Since the remaining information must come from non-local interactions, this observation supports the view that most proteins fold via a cooperative process where secondary and tertiary structure form concurrently. To provide a more direct comparison to existing secondary structure prediction methods, we construct a simple hidden Markov model (HMM) of the sequences. This HMM achieves a prediction accuracy comparable to other single-sequence secondary structure prediction algorithms, and can extract almost all of the inter-sequence mutual information. This suggests that these algorithms are almost optimal, and that we should not expect a dramatic improvement in prediction accuracy. However, local correlations between secondary and primary structure are probably of under-appreciated importance in many tertiary structure prediction methods, such as threading. Comment: 8 pages, 5 figures
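
    As a rough illustration of the quantities named in this abstract, the sketch below estimates Shannon entropy and the mutual information between two aligned symbol sequences (for example residues and secondary structure labels). It is a plain plug-in estimator at a single position offset, not the paper's entropy-density method, and it is biased for small samples; the function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Plug-in Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two aligned, equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must be aligned and of equal length")
    joint = list(zip(xs, ys))
    return entropy(xs) + entropy(ys) - entropy(joint)
```

    For instance, `mutual_information("AABB", "HHEE")` is 1 bit because each residue symbol fully determines the label, while `mutual_information("ABAB", "HHEE")` is 0 because the two sequences are statistically independent in this toy sample.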

    Parametrized Stochastic Grammars for RNA Secondary Structure Prediction

    We propose a two-level stochastic context-free grammar (SCFG) architecture for parametrized stochastic modeling of a family of RNA sequences, including their secondary structure. A stochastic model of this type can be used for maximum a posteriori estimation of the secondary structure of any new sequence in the family. The proposed SCFG architecture models RNA subsequences comprising paired bases as stochastically weighted Dyck-language words, i.e., as weighted balanced-parenthesis expressions. The length of each run of unpaired bases, forming a loop or a bulge, is taken to have a phase-type distribution: that of the hitting time in a finite-state Markov chain. Without loss of generality, each such Markov chain can be taken to have a bounded complexity. The scheme yields an overall family SCFG with a manageable number of parameters. Comment: 5 pages, submitted to the 2007 Information Theory and Applications Workshop (ITA 2007)
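
    The phase-type distribution mentioned in this abstract can be sketched as follows: the probability that a finite-state Markov chain is absorbed after exactly k steps. This is a generic discrete phase-type computation under assumed inputs (an initial distribution `alpha` over transient states and a sub-stochastic transition matrix `T`), with run lengths counted from 1; it is not the paper's parametrization.

```python
def phase_type_pmf(alpha, T, k_max):
    """Return [P(hitting time = k) for k = 1..k_max].

    alpha: initial probabilities over the transient states.
    T:     transition probabilities among transient states; each row may sum
           to less than 1, and the deficit is the per-step absorption probability.
    """
    n = len(alpha)
    exit_prob = [1.0 - sum(row) for row in T]
    state = list(alpha)  # probability of sitting in each transient state
    pmf = []
    for _ in range(k_max):
        # Absorb on this step from the current transient distribution.
        pmf.append(sum(state[i] * exit_prob[i] for i in range(n)))
        # Otherwise move one step within the transient states.
        state = [sum(state[i] * T[i][j] for i in range(n)) for j in range(n)]
    return pmf
```

    A single state with self-loop probability 0.5 gives a geometric run length, and a two-state chain that always moves from the first state to the second and then exits gives a run of exactly two, which shows how a small chain can encode quite different length profiles.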

    A model for the force stretching double-stranded chain molecules

    We modify and extend the recently developed statistical mechanical model for predicting the thermodynamic properties of chain molecules having noncovalent double-stranded conformations, as in RNA or ssDNA, and β-sheets in proteins, by including a constant stretching force applied at one end of the molecule, as in a typical single-molecule experiment. The conformations of double-stranded regions of the chain are calculated based on the polymer graph-theoretic approach [S-J. Chen and K. A. Dill, J. Chem. Phys. 109, 4602 (1998)], while the unpaired single-stranded regions are treated as self-avoiding walks. Sequence dependence and excluded volume interaction are taken into account explicitly. Two classes of conformations, hairpin and RNA secondary structure, are explored. For the hairpin conformations, all possible end-to-end distances corresponding to the different types of double-stranded regions are enumerated exactly. For the RNA secondary structure conformations, a new recursive formula incorporating the secondary structure and end-to-end distribution has been derived. Using the model, we investigate the extension-force curves, contact and population distributions, and re-entering phenomena. We find that force stretching of homogeneous chains differs markedly between hairpin and secondary structure conformations: the unfolding of hairpins is two-state, while the unfolding of the latter is one-state. In addition, re-entering transitions are present only in hairpin conformations and are not observed in secondary structure conformations. Comment: 19 pages, 28 figures

    RNA secondary structure prediction from multi-aligned sequences

    It is well accepted that the RNA secondary structures of most functional non-coding RNAs (ncRNAs) are closely related to their functions and are conserved during evolution. Hence, prediction of conserved secondary structures from evolutionarily related sequences is an important task in RNA bioinformatics; the methods are useful not only to further functional analyses of ncRNAs but also to improve the accuracy of secondary structure predictions and to find novel functional RNAs in the genome. In this review, I focus on common secondary structure prediction from a given aligned RNA sequence, in which one secondary structure whose length is equal to that of the input alignment is predicted. I systematically review and classify existing tools and algorithms for the problem, by utilizing the information employed in the tools and by adopting a unified viewpoint based on maximum expected gain (MEG) estimators. I believe that this classification will allow a deeper understanding of each tool and provide users with useful information for selecting tools for common secondary structure predictions. Comment: A preprint of an invited review manuscript that will be published in a chapter of the book `Methods in Molecular Biology'. Note that this version of the manuscript may differ from the published version