
    Inferring stabilizing mutations from protein phylogenies : application to influenza hemagglutinin

    One selection pressure shaping sequence evolution is the requirement that a protein fold with sufficient stability to perform its biological functions. We present a conceptual framework that explains how this requirement causes the probability that a particular amino acid mutation is fixed during evolution to depend on its effect on protein stability. We mathematically formalize this framework to develop a Bayesian approach for inferring the stability effects of individual mutations from homologous protein sequences of known phylogeny. This approach predicts published, experimentally measured mutational stability effects (ΔΔG values) with an accuracy that exceeds both a state-of-the-art physicochemical modeling program and the sequence-based consensus approach. As a further test, we use our phylogenetic inference approach to predict stabilizing mutations to influenza hemagglutinin. We introduce these mutations into a temperature-sensitive influenza virus with a defect in its hemagglutinin gene and experimentally demonstrate that some of the mutations allow the virus to grow at higher temperatures. Our work therefore describes a powerful new approach for predicting stabilizing mutations that can be successfully applied even to large, complex proteins such as hemagglutinin. This approach also makes a mathematical link between phylogenetics and experimentally measurable protein properties, potentially paving the way for more accurate analyses of molecular evolution.

    DDGun: An untrained method for the prediction of protein stability changes upon single and multiple point variations

    Background: Predicting the effect of single point variations on protein stability is a crucial step toward understanding the relationship between protein structure and function. To this end, several methods have been developed to predict changes in the Gibbs free energy of unfolding (ΔΔG) between wild-type and variant proteins, using sequence and structure information. Most of the available methods, however, do not exhibit the anti-symmetric prediction property, which guarantees that the predicted ΔΔG value for a variation is the exact opposite of that predicted for the reverse variation, i.e., ΔΔG(A → B) = -ΔΔG(B → A), where A and B are amino acids. Results: Here we introduce simple anti-symmetric features, based on evolutionary information, which are combined to define an untrained method, DDGun (DDG untrained). DDGun is a simple approach based on evolutionary information that predicts the ΔΔG for single and multiple variations from sequence and structure information (DDGun3D). Our method achieves remarkable performance without any training on the experimental datasets, reaching Pearson correlation coefficients between predicted and measured ΔΔG values of ~0.5 and ~0.4 for single and multiple site variations, respectively. Surprisingly, DDGun's performance is comparable with that of state-of-the-art methods. DDGun also naturally predicts multiple site variations, thereby defining a benchmark method for both single site and multiple site predictors. DDGun is anti-symmetric by construction, predicting the ΔΔG of a reciprocal variation as almost equal (depending on the sequence profile) to -ΔΔG of the direct variation. This is a valuable property that is missing in the majority of the methods. Conclusions: Evolutionary information alone, combined in an untrained method, can achieve remarkably high performance in the prediction of ΔΔG upon protein mutation. Non-trained approaches like DDGun represent a valid benchmark both for scoring the predictive power of the individual features and for assessing the learning capability of supervised methods.
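
The anti-symmetry property described in this abstract can be illustrated with a minimal sketch. This is not DDGun's scoring function: the function name `antisymmetric_score` and the small per-residue scale are assumptions made purely for illustration. The point is only that any score built as a difference between a per-residue value for the variant and for the wild type satisfies score(A → B) = -score(B → A) by construction. Because this toy score ignores the sequence profile, the equality here is exact, not approximate.

```python
import math

# Illustrative per-residue values (a few Kyte-Doolittle hydropathy entries).
# The choice of scale is an assumption for this sketch, not DDGun's feature set.
SCALE = {"A": 1.8, "G": -0.4, "L": 3.8, "I": 4.5, "V": 4.2, "D": -3.5, "K": -3.9}


def antisymmetric_score(wild_type: str, variant: str, scale=SCALE) -> float:
    """Hypothetical difference-based feature for the substitution wild_type -> variant.

    Defined as scale[variant] - scale[wild_type], so swapping the two
    residues flips the sign of the result.
    """
    return scale[variant] - scale[wild_type]


if __name__ == "__main__":
    forward = antisymmetric_score("A", "L")  # A -> L
    reverse = antisymmetric_score("L", "A")  # L -> A
    # The reverse substitution has the opposite sign and the same magnitude.
    print(forward, reverse, math.isclose(forward, -reverse))
```

A difference of per-residue terms is the simplest way to obtain the property. A predictor that learns separate, unconstrained outputs for A → B and B → A has no such guarantee.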

    Understanding Stability of Protein-Protein Complexes

    For all living organisms, macromolecular interactions facilitate most of their natural functions. Alterations to macromolecular structures through mutations can affect the stability of their interactions, which may lead to unfavourable phenotypes and disease. Presented here are a number of computational methods aimed at uncovering the principles behind complex stability, as described by binding affinity and dissociation rate constants. Several factors are known to govern the stability of protein-protein interactions; however, no one factor dominates, and it is the synergistic effect of a number of contributions that determines the affinity and stability of a complex. The characterization of complex stability can thus be presented as a two-fold problem: modelling the individual factors, and modelling the synergistic effect of their combination. Using machine learning as a central framework, empirical functions are designed for estimating affinity, dissociation rates and the effects of mutations on these properties. The performance of all models is in turn benchmarked on experimental data available from the literature and carefully curated datasets. Firstly, a wild-type binding free energy prediction model is designed, composed of a diverse set of stability descriptors, which account for flexibility and conformational changes undergone by the complex in question. Similarly, models for estimating the effects of mutations on binding affinity are also designed and benchmarked in a community-wide blind trial. Emphasis here is on the detection of a small subset of mutations that are able to enhance the stability of two de novo protein drugs targeting the flu virus hemagglutinin. Probing further the determinants of stability, a set of descriptors that link hotspot residues with the off-rate of a complex are designed and applied to models predicting changes in off-rate upon mutation. Finally, the relationship between the distribution of hotspots at protein interfaces and the rate of dissociation of such interfaces is investigated.