5,328 research outputs found

    Maximum expected accuracy structural neighbors of an RNA secondary structure

    BACKGROUND: Since RNA molecules regulate genes and control alternative splicing by allostery, it is important to develop algorithms to predict RNA conformational switches. Some tools, such as paRNAss, RNAshapes and RNAbor, can be used to predict potential conformational switches; nevertheless, no existing tool can accurately detect general (i.e., not family-specific) entire riboswitches (both aptamer and expression platform). Thus, the development of additional algorithms to detect conformational switches seems important, especially since the difference in free energy between the two metastable secondary structures may be as large as 15-20 kcal/mol. It has recently emerged that RNA secondary structure can be more accurately predicted by computing the maximum expected accuracy (MEA) structure, rather than the minimum free energy (MFE) structure. RESULTS: Given an arbitrary RNA secondary structure S₀ for an RNA nucleotide sequence a = a₁,...,aₙ, we say that another secondary structure S of a is a k-neighbor of S₀ if the base pair distance between S₀ and S is k. In this paper, we prove that the Boltzmann probability of all k-neighbors of the minimum free energy structure S₀ can be approximated with accuracy ε and confidence 1 - p, simultaneously for all 0 ≤ k < K, by a relative frequency count over N sampled structures, provided that N > N(ε,p,K) = Φ⁻¹(p/2K)² / (4ε²), where Φ(z) is the cumulative distribution function (CDF) for the standard normal distribution. We go on to describe the algorithm RNAborMEA, which for an arbitrary initial structure S₀ and for all values 0 ≤ k < K, computes the secondary structure MEA(k), having maximum expected accuracy over all k-neighbors of S₀. Computation time is O(n³ · K²), and memory requirements are O(n² · K). We analyze a sample TPP riboswitch, and apply our algorithm to the class of purine riboswitches. CONCLUSIONS: The approximation of RNAbor by sampling, with a rigorous bound on accuracy, together with the computation of maximum expected accuracy k-neighbors by RNAborMEA, provides additional tools for conformational switch detection. Results from RNAborMEA are quite distinct from those of other tools, such as RNAbor, RNAshapes and paRNAss, and hence may provide orthogonal information when looking for suboptimal structures or conformational switches. Source code for RNAborMEA can be downloaded from http://sourceforge.net/projects/rnabormea/ or http://bioinformatics.bc.edu/clotelab/RNAborMEA/
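
    The sample-size expression N(ε,p,K) = Φ⁻¹(p/2K)² / (4ε²) quoted in the abstract above can be evaluated numerically. The following is a minimal sketch, not code from RNAborMEA: the function name `sample_size_bound` and the example parameter values are hypothetical, and reading the result as a threshold on the number of sampled structures is an assumption based on the abstract's wording.

```python
import math
from statistics import NormalDist


def sample_size_bound(eps: float, p: float, K: int) -> float:
    """Evaluate N(eps, p, K) = Phi^{-1}(p / (2K))^2 / (4 * eps^2).

    eps : target accuracy (must be > 0)
    p   : failure probability, so the confidence is 1 - p (0 < p < 1)
    K   : number of distance classes k = 0, ..., K - 1 (must be >= 1)
    """
    if eps <= 0 or not (0 < p < 1) or K < 1:
        raise ValueError("require eps > 0, 0 < p < 1 and K >= 1")
    # Inverse CDF (quantile) of the standard normal distribution.
    z = NormalDist().inv_cdf(p / (2 * K))
    # z is negative for small p / (2K); squaring removes the sign.
    return z * z / (4 * eps * eps)


if __name__ == "__main__":
    # Hypothetical example values, chosen only for illustration.
    bound = sample_size_bound(eps=0.01, p=0.05, K=50)
    print("samples needed (rounded up):", math.ceil(bound))
```

    Because ε enters as 1/ε², halving the target accuracy quadruples the value, whereas K and p enter only through the normal quantile, so the value grows slowly as K increases or p decreases.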

    Statistical properties of neutral evolution

    Neutral evolution is the simplest model of molecular evolution and thus it is the most amenable to a comprehensive theoretical investigation. In this paper, we characterize the statistical properties of neutral evolution of proteins under the requirement that the native state remains thermodynamically stable, and compare them with those of Kimura's model of neutral evolution. Our study is based on the Structurally Constrained Neutral (SCN) model, which we recently proposed. We show that, in the SCN model, the substitution rate decreases as longer time intervals are considered, and fluctuates strongly from one branch of the evolutionary tree to another, leading to non-Poissonian statistics for the substitution process. Such strong fluctuations are also due to the fact that neutral substitution rates for individual residues are strongly correlated for most residue pairs. Interestingly, structurally conserved residues, characterized by a much lower than average substitution rate, are also much less correlated with other residues and evolve in a much more regular way. Our results could improve methods aimed at distinguishing between neutral and adaptive substitutions, as well as methods for computing the expected number of substitutions that have occurred since the divergence of two protein sequences. Comment: 17 pages, 11 figures

    DISPLAR: an accurate method for predicting DNA-binding sites on protein surfaces

    Structural and physical properties of DNA provide important constraints on the binding sites formed on surfaces of DNA-targeting proteins. Characteristics of such binding sites may form the basis for predicting DNA-binding sites from the structures of proteins alone. Such an approach has been successfully developed for predicting protein–protein interfaces. Here this approach is adapted for predicting DNA-binding sites. We used a representative set of 264 protein–DNA complexes from the Protein Data Bank to analyze characteristics and to train and test a neural network predictor of DNA-binding sites. The input to the predictor consisted of PSI-BLAST sequence profiles and solvent accessibilities of each surface residue and 14 of its closest neighboring residues. Predicted DNA-contacting residues cover 60% of actual DNA-contacting residues and have an accuracy of 76%. This method significantly outperforms previous attempts at DNA-binding site prediction. Its application to the prion protein yielded a DNA-binding site that is consistent with recent NMR chemical shift perturbation data, suggesting that it can complement experimental techniques in characterizing protein–DNA interfaces.

    Computational investigations of structure probing experiments for RNA structure prediction

    Ribonucleic acid (RNA) transcripts, and in particular non-coding RNAs, play fundamental roles in cellular metabolism, as they are involved in protein synthesis, catalysis, and regulation of gene expression. In some cases, an RNA’s biological function depends mostly on a specific active conformation, making the identification of this single stable structure crucial for identifying the role of the RNA and the relationships between its mutations and diseases. On the other hand, RNAs are often found in a dynamic equilibrium of multiple interconverting conformations, which is necessary to regulate their functional activity. In these cases it becomes fundamental to gain knowledge of the RNA’s structural ensemble, in order to fully determine its mechanism of action. The current structure determination techniques, both for single-state models, such as X-ray crystallography, and for multi-state models, such as nuclear magnetic resonance and single-molecule methods, despite proving accurate and reliable in many cases, are extremely slow and costly. In contrast, chemical probing is a class of experimental techniques that provide structural information at single-nucleotide resolution at significantly lower cost in terms of time and required infrastructure. In particular, selective 2′-hydroxyl acylation analyzed via primer extension (SHAPE) has proved a valid chemical mapping technique to probe RNA structure even in vivo. This thesis reports a systematic investigation of chemical probing experiments based on two different approaches. The first approach, presented in Chapter 2, relies on machine-learning techniques to optimize a model for mapping experimental data into structural information. The model also relies on co-evolutionary data, in the form of direct coupling analysis (DCA) couplings. The inclusion of this kind of data is chosen in the same spirit of reducing the costs of structure probing, as co-evolutionary analysis relies only on sequencing techniques. The resulting model is proposed as a candidate standard tool for prediction of RNA secondary structure, and some insight into the mechanism of chemical probing is gained by interpreting its features. Importantly, this work has been developed with a view to building a framework for future refinement and improvement. In this spirit, all the data and scripts used are available at https://github.com/bussilab/shape-dca-data, and the model can be easily retrained and adapted to incorporate arbitrary experimental information. As the interpretation of the model features suggests the possible emergence of cooperative effects involving RNA nucleotides interacting with SHAPE reagents, a second approach based on Molecular Dynamics simulations is proposed to investigate this hypothesis. The results, along with an originally developed methodology to analyse Molecular Dynamics simulations at variable number of particles, are presented in Chapter 3.