Improving Decoy Databases for Protein Folding Algorithms
Predicting protein structures and simulating protein folding motions are two of the most important problems in computational biology today. Modern folding simulation methods rely on a scoring function which attempts to distinguish the native structure (the most energetically stable 3D structure) from one or more non-native structures. Decoy databases are collections of non-native structures that are widely used to test and verify these scoring functions.
We present a method to evaluate and improve the quality of decoy databases by adding novel structures and/or removing redundant structures. We test our approach on 13 different decoy databases of varying size and type and show significant improvement across a variety of metrics. The greatest improvement comes from the addition of novel structures, indicating that our improved databases contain more informative structures that are more likely to fool scoring functions. We also test our improved databases on a popular modern scoring function. We show that they contain a greater number of native-like structures than the original databases, thereby producing a more rigorous database for testing scoring functions. This work can aid the development and testing of better scoring functions, which, in turn, will improve the quality of protein folding simulations.
Knowledge-based energy functions for computational studies of proteins
This chapter discusses the theoretical framework and methods for developing
knowledge-based potential functions essential for protein structure prediction,
protein-protein interaction, and protein sequence design. We discuss in some
detail the Miyazawa-Jernigan contact statistical potential,
distance-dependent statistical potentials, as well as geometric statistical
potentials. We also describe a geometric model for developing both linear and
non-linear potential functions by optimization. Applications of knowledge-based
potential functions in protein-decoy discrimination, in protein-protein
interactions, and in protein design are then described. Finally, several issues
concerning knowledge-based potential functions are discussed.
Comment: 57 pages, 6 figures. To be published in a book by Springe
Empirical Potential Function for Simplified Protein Models: Combining Contact and Local Sequence-Structure Descriptors
An effective potential function is critical for protein structure prediction
and folding simulation. Simplified protein models such as those requiring only
Cα or backbone atoms are attractive because they enable efficient
search of the conformational space. We show that residue-specific reduced
discrete-state models can represent the backbone conformations of proteins with small
RMSD values. However, no potential functions exist that are designed for such
simplified protein models. In this study, we develop optimal potential
functions by combining contact interaction descriptors and local
sequence-structure descriptors. The form of the potential function is a
weighted linear sum of all descriptors, and the optimal weight coefficients are
obtained through optimization using both native and decoy structures. The
performance of the potential function in discriminating native protein
structures from decoys is evaluated using several benchmark decoy sets. Our
potential function, requiring only backbone atoms or Cα atoms, has
comparable or better performance than several residue-based potential functions
that require additional coordinates of side chain centers or coordinates of all
side chain atoms. By reducing the residue alphabet to size 5 for the local
structure-sequence relationship, the performance of the potential function can
be further improved. Our results also suggest that local sequence-structure
correlation may play an important role in reducing the entropic cost of protein
folding.
Comment: 20 pages, 5 figures, 4 tables. In press, Protein
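As an illustration of the weighted linear form described in this abstract, the following is a minimal sketch, not the authors' method. It assumes a perceptron-style update as a stand-in for the unspecified optimization, and the descriptor vectors, the function names `score` and `fit_weights`, and the learning rate are all hypothetical.

```python
def score(weights, descriptors):
    """Weighted linear sum of descriptor values (lower is treated as better)."""
    return sum(w * d for w, d in zip(weights, descriptors))


def fit_weights(native, decoys, epochs=100, rate=0.1):
    """Perceptron-style updates until the native scores below every decoy.

    This is an assumed stand-in for the optimization mentioned in the abstract.
    """
    weights = [0.0] * len(native)
    for _ in range(epochs):
        violated = False
        for decoy in decoys:
            if score(weights, native) >= score(weights, decoy):
                # Shift weights so the decoy score rises relative to the native score.
                weights = [w + rate * (d - n)
                           for w, d, n in zip(weights, decoy, native)]
                violated = True
        if not violated:
            break
    return weights


# Hypothetical descriptor vectors: [contact count, local-structure mismatch]
native = [12.0, 1.0]
decoys = [[9.0, 3.0], [12.0, 4.0], [7.0, 2.0]]

weights = fit_weights(native, decoys)
native_is_best = all(score(weights, native) < score(weights, d) for d in decoys)
print(weights, native_is_best)
```

With these toy vectors the loop stops once the native structure scores strictly lower than each decoy. Real descriptor sets may not be linearly separable, in which case the loop simply ends after `epochs` passes.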
Protein folding using contact maps
We present the development of the idea of using dynamics in the space of
contact maps as a computational approach to the protein folding problem. We
first introduce two important technical ingredients, the reconstruction of a
three dimensional conformation from a contact map and the Monte Carlo dynamics
in contact map space. We then discuss two approximations to the free energy of
the contact maps and a method to derive energy parameters based on perceptron
learning. Finally we present results, first for predictions based on threading
and then for energy minimization of crambin and of a set of 6 immunoglobulins.
The main result is that we proved that the two simple approximations we studied
for the free energy are not suitable for protein folding. Perspectives are
discussed in the last section.
Comment: 29 pages, 10 figures
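For readers unfamiliar with the representation, a contact map can be sketched as a binary matrix marking residue pairs that lie close together in space. The snippet below shows only the forward direction (coordinates to contact map), not the reconstruction or Monte Carlo steps the abstract describes. The 8.0 cutoff, the minimum sequence separation of 3, and the toy coordinates are assumptions chosen for illustration.

```python
import math


def contact_map(coords, cutoff=8.0, min_separation=3):
    """Return a symmetric 0/1 matrix; 1 marks residue pairs closer than cutoff.

    Pairs fewer than min_separation positions apart in sequence are skipped,
    since chain neighbours are trivially close.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Hypothetical coordinates: six residues laid out as a hairpin.
coords = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0),
]
cmap = contact_map(coords)
n_contacts = sum(cmap[i][j]
                 for i in range(len(coords))
                 for j in range(i + 1, len(coords)))
print(n_contacts)
```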
Using the Unfolded State as the Reference State Improves the Performance of Statistical Potentials
Distance-dependent statistical potentials are an important class of energy functions extensively used in modeling protein structures and energetics. These potentials are obtained by statistically analyzing the proximity of atoms in all combinatorial amino-acid pairs in proteins with known structures. In model evaluation, the value of a reference state is usually subtracted from the statistical potential for better selectivity. An ideal reference state should include the general chemical properties of polypeptide chains so that only the unique factors stabilizing the native structures are retained after calibration against the reference state. However, reference states available as of this writing rarely model specific chemical constraints of peptide bonds and therefore poorly reflect the behavior of polypeptide chains. In this work, we propose a statistical potential based on unfolded state ensemble (SPOUSE), in which the reference state is derived from the unfolded state ensembles of proteins produced according to the statistical coil model. Owing to its better representation of the features of polypeptides, SPOUSE outperforms three of the most widely used distance-dependent potentials not only in native conformation identification, but also in the selection of close-to-native models and in the correlation coefficients between energy and model error. Furthermore, SPOUSE shows promise for further improvement through integration with orientation-dependent side-chain potentials.
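The role of the reference state can be made concrete with the generic inverse-Boltzmann form commonly used for distance-dependent statistical potentials. The notation below is an assumption for illustration and is not taken from the SPOUSE paper: $N_{\mathrm{obs}}$ counts atom pairs of types $a$ and $b$ at distance $r$ in known structures, and $N_{\mathrm{ref}}$ is the corresponding count expected in the reference state.

```latex
u(a, b, r) = -k_B T \,\ln \frac{N_{\mathrm{obs}}(a, b, r)}{N_{\mathrm{ref}}(a, b, r)}
           = \bigl[-k_B T \ln N_{\mathrm{obs}}(a, b, r)\bigr]
           - \bigl[-k_B T \ln N_{\mathrm{ref}}(a, b, r)\bigr]
```

Written this way, the second bracket is the reference-state term that is subtracted. Different potentials differ mainly in how $N_{\mathrm{ref}}$ is estimated, and the abstract describes taking it from unfolded-state ensembles.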
A pairwise residue contact area-based mean force potential for discrimination of native protein structure
Background: An energy function that can distinguish a correct protein fold from incorrect ones is very important for protein structure prediction and protein folding. Knowledge-based mean force potentials are certainly the most popular type of interaction function for protein threading. They are derived from statistical analyses of interacting groups in experimentally determined protein structures. These potentials are developed at the atom or the amino acid level. Based on orientation-dependent contact area, a new type of knowledge-based mean force potential has been developed. Results: We developed a new approach to calculate a knowledge-based potential of mean force using pairwise residue contact area. To test the performance of our approach, we applied it to several decoy sets to measure its ability to discriminate native structures from decoys. The potential distinguished native structures from the decoys in most cases. Further, the calculated Z-scores were quite high for all protein datasets. Conclusions: This knowledge-based potential of mean force can be used in protein structure prediction, fold recognition, comparative modelling and molecular recognition. The program is available at http://www.bioinf.cs.ipm.ac.ir/softwares/surfield
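The Z-score mentioned in this abstract is usually the native structure's energy expressed in standard deviations from the mean decoy energy. The following is a minimal sketch under that assumption. The function name `native_z_score` and the energy values are hypothetical, and sign conventions vary between papers (some report the magnitude so that larger is better).

```python
from statistics import mean, pstdev


def native_z_score(native_energy, decoy_energies):
    """Standard deviations separating the native energy from the decoy mean.

    Uses the population standard deviation of the decoy energies; a more
    negative value means the native structure is better separated.
    """
    return (native_energy - mean(decoy_energies)) / pstdev(decoy_energies)


# Hypothetical energies in arbitrary units.
decoy_energies = [-10.0, -12.0, -8.0, -10.0]
z = native_z_score(-16.0, decoy_energies)
print(round(z, 3))
```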