159 research outputs found

    Knowledge-based energy functions for computational studies of proteins

    This chapter discusses the theoretical framework and methods for developing knowledge-based potential functions essential for protein structure prediction, protein-protein interaction, and protein sequence design. We discuss in some detail the Miyazawa-Jernigan contact statistical potential, distance-dependent statistical potentials, and geometric statistical potentials. We also describe a geometric model for developing both linear and non-linear potential functions by optimization. Applications of knowledge-based potential functions in protein-decoy discrimination, in protein-protein interactions, and in protein design are then described. Finally, several issues of knowledge-based potential functions are discussed. Comment: 57 pages, 6 figures. To be published in a book by Springe
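As a rough illustration of what a contact statistical potential computes, the sketch below applies a generic inverse-Boltzmann rule, energy = -ln(observed / expected), to counts of residue pairs in contact. It is a simplified stand-in, not the Miyazawa-Jernigan procedure itself: the composition-based reference state, the pseudocount, and the name `contact_potential` are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations_with_replacement


def contact_potential(contacts, pseudocount=1.0):
    """Return {(a, b): energy in kT units} from observed residue contacts.

    contacts: list of (residue_type, residue_type) pairs seen in contact.
    Reference state (an assumption): pairs form at random according to how
    often each residue type appears among all contacting residues.
    """
    # Treat (a, b) and (b, a) as the same unordered pair.
    pair_counts = Counter(tuple(sorted(pair)) for pair in contacts)
    type_counts = Counter(res for pair in contacts for res in pair)
    total_pairs = len(contacts)
    total_residues = sum(type_counts.values())
    types = sorted(type_counts)
    n_pair_types = len(types) * (len(types) + 1) // 2

    potential = {}
    for a, b in combinations_with_replacement(types, 2):
        freq_a = type_counts[a] / total_residues
        freq_b = type_counts[b] / total_residues
        # Unordered mixed pairs can arise two ways (a-b or b-a).
        expected = freq_a * freq_b if a == b else 2 * freq_a * freq_b
        # The pseudocount keeps unseen pairs from producing log(0).
        observed = (pair_counts[(a, b)] + pseudocount) / (
            total_pairs + pseudocount * n_pair_types
        )
        potential[(a, b)] = -math.log(observed / expected)
    return potential


# Toy data: like residues contact each other far more often than unlike ones.
toy_contacts = [("L", "L")] * 8 + [("K", "K")] * 8 + [("L", "K")] * 2
for pair, energy in sorted(contact_potential(toy_contacts).items()):
    print(pair, round(energy, 3))
```

Pairs seen more often than the reference state predicts receive negative (favorable) energies, and under-represented pairs receive positive ones. The choice of reference state strongly affects the resulting values.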

    Inferring stabilizing mutations from protein phylogenies : application to influenza hemagglutinin

    One selection pressure shaping sequence evolution is the requirement that a protein fold with sufficient stability to perform its biological functions. We present a conceptual framework that explains how this requirement causes the probability that a particular amino acid mutation is fixed during evolution to depend on its effect on protein stability. We mathematically formalize this framework to develop a Bayesian approach for inferring the stability effects of individual mutations from homologous protein sequences of known phylogeny. This approach is able to predict published experimentally measured mutational stability effects (ΔΔG values) with an accuracy that exceeds both a state-of-the-art physicochemical modeling program and the sequence-based consensus approach. As a further test, we use our phylogenetic inference approach to predict stabilizing mutations to influenza hemagglutinin. We introduce these mutations into a temperature-sensitive influenza virus with a defect in its hemagglutinin gene and experimentally demonstrate that some of the mutations allow the virus to grow at higher temperatures. Our work therefore describes a powerful new approach for predicting stabilizing mutations that can be successfully applied even to large, complex proteins such as hemagglutinin. This approach also makes a mathematical link between phylogenetics and experimentally measurable protein properties, potentially paving the way for more accurate analyses of molecular evolution

    Equivalent glycemic load (EGL): a method for quantifying the glycemic responses elicited by low carbohydrate foods

    BACKGROUND: Glycemic load (GL) is used to quantify the glycemic impact of high-carbohydrate (CHO) foods, but cannot be used for low-CHO foods. Therefore, we evaluated the accuracy of equivalent-glycemic-load (EGL), a measure of the glycemic impact of low-CHO foods defined as the amount of CHO from white-bread (WB) with the same glycemic impact as one serving of food. METHODS: Several randomized, cross-over trials were performed by a contract research organization using overnight-fasted healthy subjects drawn from a pool of 63 recruited from the general population by newspaper advertisement. Incremental blood-glucose response area-under-the-curve (AUC) elicited by 0, 5, 10, 20, 35 and 50 g CHO portions of WB (WB-CHO) and 3, 5, 10 and 20 g glucose were measured. EGL values of the different doses of glucose and WB and 4 low-CHO foods were determined as: EGL = (F-B)/M, where F is AUC after food and B is y-intercept and M slope of the regression of AUC on grams WB-CHO. The dose-response curves of WB and glucose were used to derive an equation to estimate GL from EGL, and the resulting values compared to GL calculated from the glucose dose-response curve. The accuracy of EGL was assessed by comparing the GL (estimated from EGL) values of the 4 doses of oral-glucose with the amounts actually consumed. RESULTS: Over 0–50 g WB-CHO (n = 10), the dose-response curve was non-linear, but over the range 0–20 g the curve was indistinguishable from linear, with AUC after 0, 5, 10 and 20 g WB-CHO, 10 ± 1, 28 ± 2, 58 ± 5 and 100 ± 6 mmol × min/L, differing significantly from each other (n = 48). The difference between GL values estimated from EGL and those calculated from the dose-response curve was 0 g (95% confidence-interval, ± 0.5 g). The difference between the GL values of the 4 doses of glucose estimated from EGL, and the amounts of glucose actually consumed was 0.2 g (95% confidence-interval, ± 1 g). 
    CONCLUSION: EGL, a measure of the glycemic impact of low-carbohydrate foods, is valid across the range of 0–20 g CHO, accurate to within 1 g, and at least sensitive enough to detect a glycemic response equivalent to that produced by 3 g oral-glucose in 10 subjects
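The formula EGL = (F-B)/M can be worked through with a short sketch. It fits an ordinary least-squares line to the four mean AUC values the abstract reports for 0, 5, 10 and 20 g WB-CHO, then inverts that line for a test food. Fitting to the published means rather than to per-subject data is an assumption made here for illustration, and the food AUC of 40 mmol × min/L is a hypothetical input, not a value from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def equivalent_glycemic_load(auc_food, intercept, slope):
    """EGL = (F - B) / M: grams of WB-CHO giving the same AUC as the food."""
    return (auc_food - intercept) / slope


# Mean AUC (mmol x min/L) after each white-bread dose, as reported above.
WB_CHO_GRAMS = [0, 5, 10, 20]
MEAN_AUC = [10, 28, 58, 100]

M, B = fit_line(WB_CHO_GRAMS, MEAN_AUC)  # slope M, y-intercept B

hypothetical_food_auc = 40.0  # assumed value, for illustration only
egl = equivalent_glycemic_load(hypothetical_food_auc, B, M)
print(f"slope M = {M:.3f}, intercept B = {B:.3f}, EGL = {egl:.2f} g")
```

Because the abstract reports linearity only over 0–20 g WB-CHO, an EGL computed this way would only be meaningful for foods whose response falls within that range.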

    The Energy Computation Paradox and ab initio Protein Folding

    The routine prediction of three-dimensional protein structure from sequence remains a challenge in computational biochemistry. It has been intuited that calculated energies from physics-based scoring functions are able to distinguish native from nonnative folds based on previous performance with small proteins and that conformational sampling is the fundamental bottleneck to successful folding. We demonstrate that as protein size increases, errors in the computed energies become a significant problem. We show, by using error probability density functions, that physics-based scores contain significant systematic and random errors relative to accurate reference energies. These errors propagate throughout an entire protein and distort its energy landscape to such an extent that modern scoring functions should have little chance of success in finding the free energy minima of large proteins. Nonetheless, by understanding errors in physics-based score functions, they can be reduced in a post-hoc manner, improving accuracy in energy computation and fold discrimination

    Mechanism of Protein Kinetic Stabilization by Engineered Disulfide Crosslinks

    The impact of disulfide bonds on protein stability goes beyond simple equilibrium thermodynamics effects associated with the conformational entropy of the unfolded state. Indeed, disulfide crosslinks may play a role in the prevention of dysfunctional association and strongly affect the rates of irreversible enzyme inactivation, highly relevant in biotechnological applications. While these kinetic-stability effects remain poorly understood, by analogy with proposed mechanisms for processes of protein aggregation and fibrillogenesis, we propose that they may be determined by the properties of sparsely-populated, partially-unfolded intermediates. Here we report the successful design, on the basis of high temperature molecular-dynamics simulations, of six thermodynamically and kinetically stabilized variants of phytase from Citrobacter braakii (a biotechnologically important enzyme) with one, two or three engineered disulfides. Activity measurements and 3D crystal structure determination demonstrate that the engineered crosslinks do not cause dramatic alterations in the native structure. The inactivation kinetics for all the variants displays a strongly non-Arrhenius temperature dependence, with the time-scale for the irreversible denaturation process reaching a minimum at a given temperature within the range of the denaturation transition. We show this striking feature to be a signature of a key role played by a partially unfolded, intermediate state/ensemble. Energetic and mutational analyses confirm that the intermediate is highly unfolded (akin to a proposed critical intermediate in the misfolding of the prion protein), a result that explains the observed kinetic stabilization. 
    Our results provide a rationale for the kinetic-stability consequences of disulfide-crosslink engineering and an experimental methodology to arrive at energetic/structural descriptions of the sparsely populated and elusive intermediates that play key roles in irreversible protein denaturation. This work was supported by grants BIO2009-09562, CSD2009-00088 from the Spanish Ministry of Science and Innovation, and FEDER Funds (JMS-R)

    Local Alignment Refinement Using Structural Assessment

    Homology modeling is the most commonly used technique to build a three-dimensional model for a protein sequence. It relies heavily on the quality of the sequence alignment between the protein to be modeled and related proteins with a known three-dimensional structure. Alignment quality can be assessed according to the physico-chemical properties of the three-dimensional models it produces

    A Novel Side-Chain Orientation Dependent Potential Derived from Random-Walk Reference State for Protein Fold Selection and Structure Prediction

    An accurate potential function is essential to attack protein folding and structure prediction problems. The key to developing efficient knowledge-based potential functions is to design reference states that can appropriately counteract generic interactions. However, the reference states of many knowledge-based distance-dependent atomic potential functions were derived from non-interacting particles such as an ideal gas, which ignores the inherent sequence connectivity and entropic elasticity of proteins. First, we developed a new pair-wise distance-dependent, atomic statistical potential function (RW), using an ideal random-walk chain as the reference state, which was optimized on CASP models and then benchmarked on nine structural decoy sets. Second, we incorporated a new side-chain orientation-dependent energy term into RW (RWplus) and found that the side-chain packing orientation specificity can further improve the decoy recognition ability of the statistical potential. RW and RWplus demonstrate a significantly better ability than the best performing pair-wise distance-dependent atomic potential functions in both native and near-native model selections. They have higher energy-RMSD and energy-TM-score correlations compared with other potentials of the same type in real-life structure assembly decoys. When benchmarked against a comprehensive list of publicly available potentials, RW and RWplus show comparable performance to the state-of-the-art scoring functions, including those combining terms from multiple resources. These data demonstrate the usefulness of the random-walk chain as a reference state that correctly accounts for the sequence connectivity and entropic elasticity of proteins. The potentials may be useful in structure recognition and protein folding simulations. The RW and RWplus potentials, as well as the newly generated I-TASSER decoys, are freely available at http://zhanglab.ccmb.med.umich.edu/RW

    A Search for Energy Minimized Sequences of Proteins

    In this paper, we present numerical evidence that supports the notion of minimization in the sequence space of proteins for a target conformation. We use the conformations of real proteins in the Protein Data Bank (PDB) and present computationally efficient methods to identify the sequences with minimum energy. We use an edge-weighted connectivity graph for ranking the residue sites with a reduced amino acid alphabet and then use continuous optimization to obtain the energy-minimizing sequences. Our methods enable the computation of a lower bound as well as a tight upper bound for the energy of a given conformation. We validate our results by using three different inter-residue energy matrices for five proteins from the PDB, and by comparing our energy-minimizing sequences with 80 million diverse sequences that are generated based on different considerations in each case. When we submitted some of our chosen energy-minimizing sequences to the Basic Local Alignment Search Tool (BLAST), we obtained some sequences from the non-redundant protein sequence database that are similar to ours with an E-value of the order of 10^-7. In summary, we conclude that proteins show a trend towards minimizing energy in the sequence space but do not seem to adopt the global energy-minimizing sequence. The reason for this could be either that the existing energy matrices are not able to accurately represent the inter-residue interactions in the context of the protein environment or that Nature does not push the optimization in the sequence space further once the protein is able to perform its function

    Mechanistic Insight into the Reactivation of BCAII Enzyme from Denatured and Molten Globule States by Eukaryotic Ribosomes and Domain V rRNAs

    In all life forms, decoding of messenger RNA into a polypeptide chain is accomplished by the ribosome. Several protein chaperones are known to bind at the exit of the ribosomal tunnel to ensure proper folding of the nascent chain by inhibiting its premature folding in the densely crowded environment of the cell. However, accumulating evidence suggests that the ribosome may play a chaperone role in protein folding events in vitro. Ribosome-mediated folding of denatured proteins by prokaryotic ribosomes has been studied extensively. The RNA-assisted chaperone activity of the prokaryotic ribosome has been attributed to domain V, a span of 23S rRNA at the intersubunit side of the large subunit encompassing the Peptidyl Transferase Centre. Evidently, this functional property of the ribosome is unrelated to nascent-chain protein folding at the exit of the ribosomal tunnel. Here, we examine whether this unique function is conserved in Leishmania donovani, a species from a primitive kinetoplastid group of eukaryotes, where the ribosome structure possesses distinct additional features and appears markedly different compared to other higher eukaryotic ribosomes. Bovine Carbonic Anhydrase II (BCAII) enzyme was used as the model protein. Our results show that domain V of the large subunit rRNA of Leishmania ribosomes preserves chaperone activity, suggesting that ribosome-mediated protein folding is, indeed, a conserved phenomenon. Further, we investigated the mechanism underpinning the ribosome-assisted protein reactivation process. Interestingly, the surface plasmon resonance binding analyses show that rRNA guides productive folding by directly interacting with molten globule-like states of the protein. In contrast, native protein shows no notable affinity for the rRNA. Thus, our study not only confirms the conserved, RNA-mediated chaperoning role of the ribosome but also provides crucial insight into the mechanism of the process