    Improving Decoy Databases for Protein Folding Algorithms

    Predicting protein structures and simulating protein folding motions are two of the most important problems in computational biology today. Modern folding simulation methods rely on a scoring function that attempts to distinguish the native structure (the most energetically stable 3D structure) from one or more non-native structures. Decoy databases are collections of non-native structures that are widely used to test and verify these scoring functions. We present a method to evaluate and improve the quality of decoy databases by adding novel structures and/or removing redundant structures. We test our approach on 13 different decoy databases of varying size and type and show significant improvement across a variety of metrics. The greatest improvement comes from the addition of novel structures, indicating that our improved databases contain more informative structures that are more likely to fool scoring functions. We also test our improved databases on a popular modern scoring function. We show that they contain a greater number of native-like structures than the original databases, thereby producing a more rigorous database for testing scoring functions. This work can aid the development and testing of better scoring functions, which, in turn, will improve the quality of protein folding simulations.

    Knowledge-based energy functions for computational studies of proteins

    This chapter discusses the theoretical framework and methods for developing knowledge-based potential functions essential for protein structure prediction, protein-protein interaction, and protein sequence design. We discuss in some detail the Miyazawa-Jernigan contact statistical potential, distance-dependent statistical potentials, and geometric statistical potentials. We also describe a geometric model for developing both linear and non-linear potential functions by optimization. Applications of knowledge-based potential functions in protein-decoy discrimination, in protein-protein interactions, and in protein design are then described. Finally, several open issues with knowledge-based potential functions are discussed.
    Comment: 57 pages, 6 figures. To be published in a book by Springe

    Empirical Potential Function for Simplified Protein Models: Combining Contact and Local Sequence-Structure Descriptors

    An effective potential function is critical for protein structure prediction and folding simulation. Simplified protein models, such as those requiring only Cα or backbone atoms, are attractive because they enable efficient search of the conformational space. We show that residue-specific reduced discrete-state models can represent the backbone conformations of proteins with small RMSD values. However, no potential functions exist that are designed for such simplified protein models. In this study, we develop optimal potential functions by combining contact interaction descriptors and local sequence-structure descriptors. The form of the potential function is a weighted linear sum of all descriptors, and the optimal weight coefficients are obtained through optimization using both native and decoy structures. The performance of the potential function in discriminating native protein structures from decoys is evaluated using several benchmark decoy sets. Our potential functions, which require only backbone atoms or Cα atoms, perform comparably to or better than several residue-based potential functions that require additional coordinates of side chain centers or coordinates of all side chain atoms. By reducing the residue alphabet to size 5 for the local structure-sequence relationship, the performance of the potential function can be further improved. Our results also suggest that local sequence-structure correlation may play an important role in reducing the entropic cost of protein folding.
    Comment: 20 pages, 5 figures, 4 tables. In press, Protein
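To make the "weighted linear sum of all descriptors" concrete, the following is a minimal sketch, not the authors' implementation. It uses only contact descriptors and leaves out the local sequence-structure descriptors. The 8 Å cutoff, the minimum sequence separation, the two-letter residue alphabet (`H`/`P`), and the weight values are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def contact_counts(coords, residue_types, cutoff=8.0, min_separation=3):
    """Count contacts per unordered residue-type pair from C-alpha coordinates.

    cutoff and min_separation are illustrative assumptions, not values
    taken from the abstract.
    """
    counts = Counter()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                pair = tuple(sorted((residue_types[i], residue_types[j])))
                counts[pair] += 1
    return counts

def linear_potential(descriptors, weights):
    """Weighted linear sum E = sum_k w_k * d_k; unknown descriptors weigh 0."""
    return sum(weights.get(key, 0.0) * value for key, value in descriptors.items())

# Toy five-residue chain (hypothetical coordinates in angstroms).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
          (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
types = ["H", "P", "H", "P", "H"]
weights = {("H", "H"): -1.0, ("H", "P"): 0.2, ("P", "P"): 0.1}  # hypothetical

counts = contact_counts(coords, types)
energy = linear_potential(counts, weights)
print(dict(counts), energy)
```

In a real setting the weights would come from the optimization over native and decoy structures that the abstract mentions, rather than being set by hand.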

    Protein folding using contact maps

    We develop the idea of using dynamics in the space of contact maps as a computational approach to the protein folding problem. We first introduce two important technical ingredients: the reconstruction of a three-dimensional conformation from a contact map, and Monte Carlo dynamics in contact map space. We then discuss two approximations to the free energy of the contact maps and a method to derive energy parameters based on perceptron learning. Finally, we present results, first for predictions based on threading and then for energy minimization of crambin and of a set of 6 immunoglobulins. The main result is that the two simple approximations we studied for the free energy are not suitable for protein folding. Perspectives are discussed in the last section.
    Comment: 29 pages, 10 figures
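The abstract gives no details of the perceptron learning step, so the sketch below shows only the general idea under stated assumptions: each structure is summarized by a feature vector (here, hypothetical counts of three contact types), the energy is linear in those features, and the weights are updated whenever a decoy fails to score above the native structure. The function names, feature values, and learning rate are illustrative.

```python
def energy(weights, features):
    """Linear energy: dot product of weights and feature counts."""
    return sum(w * x for w, x in zip(weights, features))

def train_perceptron(native_decoy_pairs, n_features, epochs=100, lr=1.0):
    """Learn weights so that energy(native) < energy(decoy) for every pair.

    Classic perceptron rule on the difference vector (decoy - native):
    whenever w . diff <= 0, move w toward diff.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        errors = 0
        for native, decoy in native_decoy_pairs:
            diff = [d - n for d, n in zip(decoy, native)]
            if energy(w, diff) <= 0:
                w = [wi + lr * xi for wi, xi in zip(w, diff)]
                errors += 1
        if errors == 0:  # every decoy now scores above its native
            break
    return w

# Hypothetical contact counts per structure: [H-H, H-P, P-P].
native = [5, 2, 1]
decoys = [[2, 4, 2], [3, 3, 2], [1, 5, 2]]
pairs = [(native, d) for d in decoys]

weights = train_perceptron(pairs, n_features=3)
print(weights, energy(weights, native), [energy(weights, d) for d in decoys])
```

The perceptron converges only if some weight vector separates natives from decoys. A failure to converge is therefore itself informative about whether a given energy approximation is adequate.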

    Using the Unfolded State as the Reference State Improves the Performance of Statistical Potentials

    Distance-dependent statistical potentials are an important class of energy functions extensively used in modeling protein structures and energetics. These potentials are obtained by statistically analyzing the proximity of atoms in all combinatorial amino-acid pairs in proteins with known structures. In model evaluation, the value of a reference state is usually subtracted from the statistical potential for better selectivity. An ideal reference state should include the general chemical properties of polypeptide chains so that only the unique factors stabilizing the native structures are retained after calibrating against the reference state. However, reference states available as of this writing rarely model specific chemical constraints of peptide bonds and therefore poorly reflect the behavior of polypeptide chains. In this work, we propose a statistical potential based on the unfolded state ensemble (SPOUSE), where the reference state is summarized from the unfolded state ensembles of proteins produced according to the statistical coil model. Due to its better representation of the features of polypeptides, SPOUSE outperforms three of the most widely used distance-dependent potentials not only in native conformation identification, but also in the selection of close-to-native models and in correlation coefficients between energy and model error. Furthermore, SPOUSE shows promise for further improvement through integration with orientation-dependent side-chain potentials.
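For orientation, distance-dependent statistical potentials are commonly written in the inverse-Boltzmann form below. This is the generic textbook form, not the specific SPOUSE definition, which the abstract does not give. The symbols are assumptions for illustration: $P^{\mathrm{obs}}_{ab}(r)$ is the observed probability of finding atom types $a$ and $b$ at distance $r$ in known structures, and $P^{\mathrm{ref}}_{ab}(r)$ is the corresponding probability in the reference state.

```latex
% Pair potential: observed statistics calibrated against the reference state
E_{ab}(r) = -k_B T \,\ln \frac{P^{\mathrm{obs}}_{ab}(r)}{P^{\mathrm{ref}}_{ab}(r)}
          = \left[-k_B T \ln P^{\mathrm{obs}}_{ab}(r)\right]
          - \left[-k_B T \ln P^{\mathrm{ref}}_{ab}(r)\right]

% Score of a model: sum over all atom pairs (i, j) with types a_i, a_j
E_{\mathrm{model}} = \sum_{i<j} E_{a_i a_j}(r_{ij})
```

The second form of the first equation makes the "subtraction of a reference state" explicit. Under this reading, choosing the unfolded-state ensemble as the reference amounts to choosing how $P^{\mathrm{ref}}_{ab}(r)$ is estimated.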

    A pairwise residue contact area-based mean force potential for discrimination of native protein structure

    Background: An energy function that can distinguish a correct protein fold from incorrect ones is very important for protein structure prediction and protein folding. Knowledge-based mean force potentials are among the most popular types of interaction function for protein threading. They are derived from statistical analyses of interacting groups in experimentally determined protein structures. These potentials are developed at the atom or the amino acid level. Based on orientation-dependent contact area, a new type of knowledge-based mean force potential has been developed. Results: We developed a new approach to calculate a knowledge-based potential of mean force using pairwise residue contact area. To test the performance of our approach, we applied it to several decoy sets to measure its ability to discriminate the native structure from decoys. This potential was able to distinguish native structures from the decoys in most cases. Further, the calculated Z-scores were quite high for all protein datasets. Conclusions: This knowledge-based potential of mean force can be used in protein structure prediction, fold recognition, comparative modelling and molecular recognition. The program is available at http://www.bioinf.cs.ipm.ac.ir/softwares/surfield
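As a rough illustration of how decoy discrimination is typically scored, the sketch below computes a native Z-score and the native's rank within a decoy set. The sign convention (higher Z is better, meaning the native energy lies further below the decoy mean) is an assumption chosen to match the abstract's "quite high" wording. The energies are made-up numbers, not results from the paper.

```python
from statistics import mean, pstdev

def native_z_score(native_energy, decoy_energies):
    """Z = (mean decoy energy - native energy) / std of decoy energies.

    Assumed convention: lower energy is better, so a well-separated
    native structure yields a large positive Z.
    """
    sigma = pstdev(decoy_energies)
    if sigma == 0:
        raise ValueError("decoy energies have zero spread; Z-score undefined")
    return (mean(decoy_energies) - native_energy) / sigma

def native_rank(native_energy, decoy_energies):
    """Rank 1 means no decoy has an energy lower than the native."""
    return 1 + sum(1 for e in decoy_energies if e < native_energy)

# Hypothetical energies for one decoy set.
decoy_energies = [-4.0, -2.0, 0.0, 2.0, 4.0]
z = native_z_score(-10.0, decoy_energies)
rank = native_rank(-10.0, decoy_energies)
print(round(z, 3), rank)
```

A potential is usually judged across many decoy sets by how often the native ranks first and by the average Z-score.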