88 research outputs found

    Development and testing of new force fields for molecular dynamics simulations

    Recent progress in modeling protein folding in Dr. Shaw's laboratory was achieved only after improving the covalent-force potentials taken from the standard AMBER force field. Even so, the force field used does not satisfactorily reproduce the folded structures of some larger proteins: the RMS deviation between the computed and experimentally determined 3D structures remains significant, about 5 Å. The objective of this research is to develop and test new polarizable atomic force fields (FFs) for "in-vacuum" and "in-water" non-bonded interactions based on the AMBER ff99SBILDN force field, improved by the inclusion of new terms. FF parameter optimization will be done using our set of molecular crystals, with crystallographic data from the Cambridge Structural Database and sublimation/solvation thermodynamic characteristics from various sources.

    Exact Solution of the Munoz-Eaton Model for Protein Folding

    A transfer-matrix formalism is introduced to evaluate exactly the partition function of the Munoz-Eaton model, relating the folding kinetics of proteins of known structure to their thermodynamics and topology. This technique can be used for a generic protein, for any choice of the energy and entropy parameters, and in principle allows the model to be used as a first tool to characterize the dynamics of a protein of known native state and equilibrium population. Applications to a β-hairpin and to protein CI-2, with comparisons to previous results, are also shown. Comment: 4 pages, 5 figures, RevTeX 4. To be published in Phys. Rev. Lett.
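To make the idea of an exactly computable partition function concrete, here is a minimal Python sketch. It is not the transfer-matrix formalism of the abstract. It assumes the commonly quoted form of the model, in which each peptide bond is native (1) or non-native (0), a contact energy `eps[a][b]` counts only when every bond from `a` to `b` is native, and each native bond carries an entropy cost `ds[k]` (in units of k_B). Under that assumption, a simple recursion over native stretches gives the exact sum and can be checked against brute-force enumeration. All names and parameters are illustrative.

```python
import itertools
import math


def stretch_weight(i, j, eps, ds, beta):
    """Boltzmann weight of one native stretch covering bonds i..j (0-based, inclusive).

    An empty stretch (i > j) has weight 1.
    """
    energy = sum(eps[a][b] for a in range(i, j + 1) for b in range(a, j + 1))
    entropy = sum(ds[i:j + 1])
    return math.exp(-beta * energy + entropy)


def partition_function(eps, ds, beta=1.0):
    """Exact partition function via a recursion over native stretches.

    a[j] sums all states of bonds 1..j (1-based) in which bond j is non-native;
    bonds 0 and n+1 are virtual non-native bonds that close the chain.
    """
    n = len(ds)
    a = [1.0] + [0.0] * (n + 1)
    for j in range(1, n + 2):
        # The previous non-native bond sits at i; bonds i+1..j-1 form one native stretch.
        a[j] = sum(a[i] * stretch_weight(i, j - 2, eps, ds, beta) for i in range(j))
    return a[n + 1]


def brute_force(eps, ds, beta=1.0):
    """Reference value: enumerate all 2**n bond configurations (small n only)."""
    n = len(ds)
    z = 0.0
    for m in itertools.product((0, 1), repeat=n):
        energy = sum(
            eps[a][b]
            for a in range(n)
            for b in range(a, n)
            if all(m[a:b + 1])
        )
        entropy = sum(s for s, mk in zip(ds, m) if mk)
        z += math.exp(-beta * energy + entropy)
    return z
```

The recursion works because, in this form of the model, interactions never cross a non-native bond, so the total weight factorizes over native stretches. As written, the recursion recomputes each stretch weight from scratch; caching those weights would be the obvious next step for longer chains.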

    Nucleation phenomena in protein folding: The modulating role of protein sequence

    For the vast majority of naturally occurring, small, single domain proteins, folding is often described as a two-state process that lacks detectable intermediates. This observation has often been rationalized on the basis of a nucleation mechanism for protein folding, whose basic premise is that the native state is achieved promptly once a specific set of contacts, the so-called folding nucleus, has formed. Here we propose a methodology to identify folding nuclei in small lattice polymers and apply it to the study of protein molecules with chain length N=48. To investigate the extent to which protein topology is a robust determinant of the nucleation mechanism, we compare the nucleation scenario of a native-centric model with that of a sequence-specific model sharing the same native fold. To evaluate the impact of the sequence's finer details on the nucleation mechanism, we consider the folding of two non-homologous sequences. We conclude that in a sequence-specific model the folding nucleus is, to some extent, formed by the most stable contacts in the protein and that the less stable linkages in the folding nucleus are solely determined by the fold's topology. We have also found that, independently of protein sequence, the folding nucleus performs the same 'topological' function. This unifying feature of the nucleation mechanism results from the residues forming the folding nucleus being distributed along the protein chain in a similar and well-defined manner that is determined by the fold's topological features. Comment: 10 Figures. J. Physics: Condensed Matter (to appear)

    Prediction of Amyloidogenic and Disordered Regions in Protein Chains

    The determination of factors that influence protein conformational changes is very important for the identification of potentially amyloidogenic and disordered regions in polypeptide chains. In our work we introduce a new parameter, mean packing density, to detect both amyloidogenic and disordered regions in a protein sequence. It has been shown that regions with strong expected packing density are responsible for amyloid formation. Our predictions are consistent with known disease-related amyloidogenic regions for eight of 12 amyloid-forming proteins and peptides in which the positions of amyloidogenic regions have been revealed experimentally. Our findings support the concept that the mechanism of amyloid fibril formation is similar for different peptides and proteins. Moreover, we have demonstrated that regions with weak expected packing density are responsible for the appearance of disordered regions. Our method has been tested on datasets of globular proteins and long disordered protein segments, and it shows improved performance over other widely used methods. Thus, we demonstrate that the expected packing density is a useful value with which one can predict both intrinsically disordered and amyloidogenic regions of a protein based on sequence alone. Our results are important for understanding the structural characteristics of protein folding and misfolding.
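A sequence-only predictor of this kind can be sketched as a sliding-window profile followed by thresholding. The Python below is only an illustration of that general scheme, not the authors' method. The window size, thresholds, minimum region length, and the per-residue scale passed in as `scale` are all assumptions, and the published expected-packing-density values are not reproduced here.

```python
def mean_profile(sequence, scale, window=5):
    """Average per-residue scale values in a centered sliding window.

    Windows are truncated at the chain ends rather than padded.
    """
    half = window // 2
    values = [scale[res] for res in sequence]
    profile = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        profile.append(sum(values[lo:hi]) / (hi - lo))
    return profile


def regions(profile, predicate, min_len=3):
    """Return half-open (start, end) index ranges where `predicate` holds
    for at least `min_len` consecutive positions."""
    found, start = [], None
    for i, value in enumerate(profile + [None]):  # None is an end-of-chain sentinel
        ok = value is not None and predicate(value)
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= min_len:
                found.append((start, i))
            start = None
    return found


# Usage with a TOY scale: these numbers are placeholders, not measured values.
toy_scale = {"G": 0.0, "S": 0.0, "I": 1.0, "V": 1.0}
profile = mean_profile("GGGGIIIIIGGGG", toy_scale, window=3)
high_density = regions(profile, lambda v: v >= 0.6)  # candidate amyloidogenic stretch
low_density = regions(profile, lambda v: v <= 0.1)   # candidate disordered stretches
```

Passing the scale in as an argument keeps the sketch independent of any particular parameter set; a real scale and calibrated thresholds would have to come from the original work.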

    Conformations of Proteins in Equilibrium

    We introduce a simple theoretical approach for an equilibrium study of proteins with known native state structures. We test our approach with results on well-studied globular proteins, Chymotrypsin Inhibitor (2ci2), Barnase and the alpha spectrin SH3 domain, and present evidence for a hierarchical onset of order on lowering the temperature, with significant organization at the local level even at high temperatures. A further application to the folding process of HIV-1 protease shows that the model can be reliably used to identify key folding sites that are responsible for the development of drug resistance. Comment: 6 pages, 3 eps figures

    ComSin: database of protein structures in bound (complex) and unbound (single) states in relation to their intrinsic disorder

    Most of the proteins in a cell assemble into complexes to carry out their function. In this work, we have created a new database (named ComSin) of protein structures in bound (complex) and unbound (single) states to provide researchers with exhaustive information on structures of the same or homologous proteins in bound and unbound states. From the complete Protein Data Bank (PDB), we selected 24 910 pairs of protein structures in bound and unbound states, and identified regions of intrinsic disorder. For 2448 pairs, the proteins in bound and unbound states are identical, while 7129 pairs have a sequence identity of 90% or higher. The developed server enables one to search for proteins in bound and unbound states with several options, including sequence similarity between the corresponding proteins in bound and unbound states and validation of interaction interfaces of protein complexes. In addition, through our web server one can obtain the information needed for studying disorder-to-order and order-to-disorder transitions upon complex formation, and analyze structural differences between proteins in bound and unbound states. The database is available at http://antares.protres.ru/comsin/

    Using simple artificial intelligence methods for predicting amyloidogenesis in antibodies

    Background: All polypeptide backbones have the potential to form amyloid fibrils, which are associated with a number of degenerative disorders. However, the likelihood that amyloidosis would actually occur under physiological conditions depends largely on the amino acid composition of a protein. We explore using a naive Bayesian classifier and a weighted decision tree for predicting the amyloidogenicity of immunoglobulin sequences. Results: The average accuracy based on leave-one-out (LOO) cross validation of a Bayesian classifier generated from 143 amyloidogenic sequences is 60.84%. This is consistent with the average accuracy of 61.15% for a holdout test set comprised of 103 AM and 28 non-amyloidogenic sequences. The LOO cross validation accuracy increases to 81.08% when the training set is augmented by the holdout test set. In comparison, the average classification accuracy for the holdout test set obtained using a decision tree is 78.64%. Non-amyloidogenic sequences are predicted with average LOO cross validation accuracies between 74.05% and 77.24% using the Bayesian classifier, depending on the training set size. The accuracy for the holdout test set was 89%. For the decision tree, the non-amyloidogenic prediction accuracy is 75.00%. Conclusions: This exploratory study indicates that both classification methods may be promising in providing straightforward predictions on the amyloidogenicity of a sequence. Nevertheless, the number of available sequences that satisfy the premises of this study is limited, and is consequently smaller than the ideal training set size. Increasing the size of the training set clearly increases the accuracy, and the expansion of the training set to include not only more derivatives, but more alignments, would make the method more sound. The accuracy of the classifiers may also be improved when additional factors, such as structural and physico-chemical data, are considered. The development of this type of classifier has significant applications in evaluating engineered antibodies, and may be adapted for evaluating engineered proteins in general.
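For readers unfamiliar with the evaluation protocol, the following Python sketch shows a naive Bayes classifier scored by leave-one-out cross validation. It is a generic illustration, not the study's classifier. The feature choice (the residue at each position of equal-length, pre-aligned sequences), the add-one smoothing, and all function names are assumptions made here.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus an alignment gap


def train(seqs, labels):
    """Collect, per class, the class size and per-position residue counts."""
    model = {}
    for label in set(labels):
        members = [s for s, lab in zip(seqs, labels) if lab == label]
        counts = [Counter(s[p] for s in members) for p in range(len(seqs[0]))]
        model[label] = (len(members), counts)
    return model


def predict(model, seq):
    """Return the class with the highest log posterior (add-one smoothing)."""
    total = sum(n for n, _ in model.values())
    best, best_score = None, -math.inf
    for label, (n, counts) in sorted(model.items()):
        score = math.log(n / total)  # class prior
        for p, res in enumerate(seq):
            score += math.log((counts[p][res] + 1) / (n + len(ALPHABET)))
        if score > best_score:
            best, best_score = label, score
    return best


def loo_accuracy(seqs, labels):
    """Leave-one-out: hold out each sequence once, train on the rest."""
    hits = 0
    for i in range(len(seqs)):
        model = train(seqs[:i] + seqs[i + 1:], labels[:i] + labels[i + 1:])
        hits += predict(model, seqs[i]) == labels[i]
    return hits / len(seqs)
```

Each sequence is classified by a model that never saw it, which is why LOO accuracy is a fairer estimate than accuracy on the training set itself, especially when few sequences are available.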

    Exploring the Universe of Protein Structures beyond the Protein Data Bank

    It is currently believed that the atlas of existing protein structures is faithfully represented in the Protein Data Bank. However, whether this atlas covers the full universe of all possible protein structures is still a highly debated issue. By using a sophisticated numerical approach, we performed an exhaustive exploration of the conformational space of a 60 amino acid polypeptide chain described with an accurate all-atom interaction potential. We generated a database of around 30,000 compact folds with at least of secondary structure corresponding to local minima of the potential energy. This ensemble plausibly represents the universe of protein folds of similar length; indeed, all the known folds are represented in the set with good accuracy. However, we discover that the known folds form a rather small subset, which cannot be reproduced by choosing random structures in the database. Rather, natural and possible folds differ in their contact order, which is on average significantly smaller in the former. This suggests the presence of an evolutionary bias, possibly related to kinetic accessibility, towards structures with shorter loops between contacting residues. Besides their conceptual relevance, the new structures open a range of practical applications such as the development of accurate structure prediction strategies, the optimization of force fields, and the identification and design of novel folds.
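Contact order, the quantity that separates natural from merely possible folds in this abstract, is commonly defined as the average sequence separation between contacting residues, divided by the chain length. The Python sketch below computes that relative contact order from one coordinate per residue. The single-point representation, the 8 Å cutoff, and the minimum sequence separation are assumptions made here; the abstract does not state which contact definition the authors used.

```python
import math


def relative_contact_order(coords, cutoff=8.0, min_separation=1):
    """Relative contact order: sum(|i - j|) over contacts / (n_contacts * chain_length).

    `coords` holds one (x, y, z) point per residue, e.g. C-alpha positions.
    Two residues are in contact if they are at least `min_separation` apart
    in sequence and within `cutoff` in space (both are illustrative choices).
    """
    n = len(coords)
    total_separation = 0
    n_contacts = 0
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                total_separation += j - i
                n_contacts += 1
    if n_contacts == 0:
        return 0.0
    return total_separation / (n_contacts * n)


# Usage: a straight 5-residue chain in which only sequence neighbours touch.
straight_chain = [(3.8 * k, 0.0, 0.0) for k in range(5)]
rco = relative_contact_order(straight_chain, cutoff=4.0)
```

A low value means contacts are mostly local in sequence (short loops between contacting residues), which is the bias the abstract attributes to natural folds.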

    Amyloidogenic Regions and Interaction Surfaces Overlap in Globular Proteins Related to Conformational Diseases

    Protein aggregation underlies a wide range of human disorders. The polypeptides involved in these pathologies might be intrinsically unstructured or display a defined 3D structure. Little is known about how globular proteins aggregate into toxic assemblies under physiological conditions, where they display an initially folded conformation. Protein aggregation is, however, always initiated by the establishment of anomalous protein-protein interactions. Therefore, in the present work, we have explored the extent to which protein interaction surfaces and aggregation-prone regions overlap in globular proteins associated with conformational diseases. Computational analysis of the native complexes formed by these proteins shows that aggregation-prone regions do frequently overlap with protein interfaces. The spatial coincidence of interaction sites and aggregating regions suggests that the formation of functional complexes and the aggregation of their individual subunits might compete in the cell. Accordingly, single mutations affecting complex interface or stability usually result in the formation of toxic aggregates. It is suggested that the stabilization of existing interfaces in multimeric proteins or the formation of new complexes in monomeric polypeptides might become effective strategies to prevent disease-linked aggregation of globular proteins.