151 research outputs found

    Development and testing of new force fields for molecular dynamics simulations

    Recent progress in modeling protein folding in Dr. Shaw's laboratory was achieved only after improving the covalent-force potentials taken from the standard AMBER force field. Even so, the force field used does not satisfactorily reproduce the folded structures of some larger proteins, which show a significant RMS deviation of about 5 Å between the computed and experimentally determined 3D structures. The objective of this research is to develop and test new polarizable atomic force fields (FFs) for "in-vacuum" and "in-water" non-bonded interactions, based on the AMBER ff99SBILDN force field and improved by the inclusion of new terms. FF parameter optimization will be done using our set of molecular crystals, with crystallographic data from the Cambridge Structural Database and sublimation/solvation thermodynamic characteristics from various sources.
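    The RMS deviation quoted in this abstract can be illustrated with a minimal sketch. It assumes the two structures are already superimposed and have matching atoms in the same order; the function name `rmsd` and the tuple-based coordinate format are illustrative choices, not taken from the source.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized coordinate sets.

    Assumption: the structures are already superimposed, so no fitting
    (rotation or translation) is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

    With coordinates in Å, the returned value is also in Å. A real comparison would first apply an optimal superposition, which this sketch leaves out.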

    Abundance of intrinsic disorder in SV-IV, a multifunctional androgen-dependent protein secreted from rat seminal vesicle

    The potent immunomodulatory, anti-inflammatory and procoagulant properties of protein no. 4 secreted from the rat seminal vesicle epithelium (SV-IV) have previously been found to be modulated by a supramolecular monomer-trimer equilibrium. More structural details that integrate experimental data into a predictive framework have recently been reported. Unfortunately, homology modelling and fold-recognition strategies were not successful in creating a theoretical model of the structural organization of SV-IV. It was inferred that the global structure of SV-IV is not similar to that of any protein of known three-dimensional structure. Reversing the classical approach to the sequence-structure-function paradigm, in this paper we report novel information obtained by comparing physicochemical parameters of SV-IV with two datasets made of intrinsically unfolded and ideally globular proteins. In addition, we have analysed the SV-IV sequence with several publicly available disorder-oriented predictors. Overall, disorder predictions and a re-examination of existing experimental data strongly suggest that SV-IV needs large plasticity to interact efficiently with the different targets that characterize its multifaceted biological function, and that it should therefore be classified as an intrinsically disordered protein.

    KineticDB: a database of protein folding kinetics

    We propose here KineticDB, a systematically compiled database of protein folding kinetics, which contains about 90 unique proteins. The main goal of the KineticDB is to provide users with a diverse set of protein folding rates determined experimentally. The search for determinants of protein folding is still in progress, aimed at obtaining a new understanding of the folding process. Comparison with experimental protein folding rates has been the main tool for validation of both theoretical models and empirical relationships during the last 10 years. It is, therefore, necessary to provide a researcher with as much data as possible in a simple and easy-to-use way. At present, the KineticDB contains the results of folding kinetics measurements of single-domain proteins and separate protein domains as well as short peptides without disulfide bonds. It includes data on about 90 unique proteins and many mutants that have been systematically accumulated over the last 10 years and is the largest collection of protein folding kinetic data presented as a database. The KineticDB is available at http://kineticdb.protres.ru/db/index.pl

    Prediction of Amyloidogenic and Disordered Regions in Protein Chains

    The determination of factors that influence protein conformational changes is very important for the identification of potentially amyloidogenic and disordered regions in polypeptide chains. In our work we introduce a new parameter, mean packing density, to detect both amyloidogenic and disordered regions in a protein sequence. It has been shown that regions with strong expected packing density are responsible for amyloid formation. Our predictions are consistent with known disease-related amyloidogenic regions for eight of 12 amyloid-forming proteins and peptides in which the positions of amyloidogenic regions have been revealed experimentally. Our findings support the concept that the mechanism of amyloid fibril formation is similar for different peptides and proteins. Moreover, we have demonstrated that regions with weak expected packing density are responsible for the appearance of disordered regions. Our method has been tested on datasets of globular proteins and long disordered protein segments, and it shows improved performance over other widely used methods. Thus, we demonstrate that the expected packing density is a useful value with which one can predict both intrinsically disordered and amyloidogenic regions of a protein based on sequence alone. Our results are important for understanding the structural characteristics of protein folding and misfolding
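    The idea of scanning a sequence for regions of strong or weak expected packing density can be sketched as a sliding-window average over a per-residue scale. Everything below is an assumption for illustration: the window size, the threshold handling, the function names, and especially `TOY_SCALE`, whose values are placeholders and not the published packing-density values.

```python
def window_profile(sequence, scale, window=5):
    """Average a per-residue scale over a centred sliding window.

    Windows are truncated at the sequence ends.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    values = [scale[res] for res in sequence]
    profile = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        profile.append(sum(values[lo:hi]) / (hi - lo))
    return profile


def regions_above(profile, threshold, min_length=1):
    """Return (start, end) index pairs, end exclusive, of runs above threshold."""
    regions, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            if i - start >= min_length:
                regions.append((start, i))
            start = None
    if start is not None and len(profile) - start >= min_length:
        regions.append((start, len(profile)))
    return regions


# Hypothetical toy scale for three residue types; NOT the published values.
TOY_SCALE = {"A": 1.0, "G": 0.0, "V": 2.0}
```

    Runs above a high threshold would correspond to candidate amyloidogenic regions in the abstract's terms; candidate disordered regions could be found the same way by negating the profile and the threshold.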

    Prediction of peptide and protein propensity for amyloid formation

    Understanding which peptides and proteins have the potential to undergo amyloid formation and what driving forces are responsible for amyloid-like fiber formation and stabilization remains limited. This is mainly because proteins that can undergo structural changes, which lead to amyloid formation, are quite diverse and share no obvious sequence or structural homology, despite the structural similarity found in the fibrils. To address these issues, a novel approach based on recursive feature selection and feed-forward neural networks was undertaken to identify key features highly correlated with the self-assembly problem. This approach allowed the identification of seven physicochemical and biochemical properties of the amino acids highly associated with the self-assembly of peptides and proteins into amyloid-like fibrils (normalized frequency of β-sheet, normalized frequency of β-sheet from LG, weights for β-sheet at the window position of 1, isoelectric point, atom-based hydrophobic moment, helix termination parameter at position j+1 and ΔG° values for peptides extrapolated in 0 M urea). Moreover, these features enabled the development of a new predictor (available at http://cran.r-project.org/web/packages/appnn/index.html) capable of accurately and reliably predicting the amyloidogenic propensity from the polypeptide sequence alone, with a prediction accuracy of 84.9% against an external validation dataset of sequences with experimental in vitro evidence of amyloid formation.

    On Side-Chain Conformational Entropy of Proteins

    The role of side-chain entropy (SCE) in protein folding has long been speculated about but is still not fully understood. Utilizing a newly developed Monte Carlo method, we conducted a systematic investigation of how the SCE relates to the size of the protein and how it differs among a protein's X-ray, NMR, and decoy structures. We estimated the SCE for a set of 675 nonhomologous proteins, and observed that there is a significant SCE for both exposed and buried residues for all these proteins—the contribution of buried residues approaches ∼40% of the overall SCE. Furthermore, the SCE can be quite different for structures with similar compactness or even similar conformations. As a striking example, we found that proteins' X-ray structures appear to pack more “cleverly” than their NMR or decoy counterparts in the sense of retaining higher SCE while achieving comparable compactness, which suggests that the SCE plays an important role in favouring native protein structures. By including a SCE term in a simple free energy function, we can significantly improve the discrimination of native protein structures from decoys

    ComSin: database of protein structures in bound (complex) and unbound (single) states in relation to their intrinsic disorder

    Most of the proteins in a cell assemble into complexes to carry out their function. In this work, we have created a new database (named ComSin) of protein structures in bound (complex) and unbound (single) states to provide researchers with exhaustive information on structures of the same or homologous proteins in bound and unbound states. From the complete Protein Data Bank (PDB), we selected 24 910 pairs of protein structures in bound and unbound states, and identified regions of intrinsic disorder. For 2448 pairs, the proteins in bound and unbound states are identical, while 7129 pairs have a sequence identity of 90% or higher. The developed server enables one to search for proteins in bound and unbound states with several options, including sequence similarity between the corresponding proteins in bound and unbound states and validation of interaction interfaces of protein complexes. In addition, through our web server, one can obtain the information needed for studying disorder-to-order and order-to-disorder transitions upon complex formation, and analyze structural differences between proteins in bound and unbound states. The database is available at http://antares.protres.ru/comsin/

    IUPred2A: context-dependent prediction of protein disorder as a function of redox state and protein binding

    The structural states of proteins include ordered globular domains as well as intrinsically disordered protein regions (IDRs) that exist as highly flexible conformational ensembles in isolation. Various computational tools have been developed to discriminate between ordered and disordered segments based on the amino acid sequence. However, properties of IDRs can also depend on various conditions, including binding to globular protein partners or environmental factors, such as redox potential. These cases provide further challenges for the computational characterization of disordered segments. In this work we present IUPred2A, a combined web interface that allows users to generate energy-estimation-based predictions for ordered and disordered residues by IUPred2 and for disordered binding regions by ANCHOR2. The updated web server retains the robustness of the original programs but offers several new features. While only minor bug fixes are implemented for IUPred, the next version of ANCHOR is significantly improved through a new architecture and parameters optimized on novel datasets. In addition, redox-sensitive regions can also be highlighted through a novel experimental feature.

    Predicting mostly disordered proteins by using structure-unknown protein data

    BACKGROUND: Predicting intrinsically disordered proteins is important in structural biology because they are thought to carry out various cellular functions even though they have no stable three-dimensional structure. We know the structures of far more ordered proteins than disordered proteins. The structural distribution of proteins in nature can therefore be inferred to differ from that of proteins whose structures have been determined experimentally. We know many more protein sequences than we do protein structures, and many of the known sequences can be expected to be those of disordered proteins. Thus it would be efficient to use the information from structure-unknown proteins in order to avoid training data sparseness. We propose a novel method for predicting which proteins are mostly disordered by using a spectral graph transducer (SGT) and training with a huge number of structure-unknown sequences as well as structure-known sequences. RESULTS: When the proposed method was evaluated on data that included 82 disordered proteins and 526 ordered proteins, its sensitivity was 0.723 and its specificity was 0.977. It resulted in a Matthews correlation coefficient 0.202 points higher than that obtained using FoldIndex, 0.221 points higher than that obtained using the method based on plotting hydrophobicity against the number of contacts and 0.07 points higher than that obtained using support vector machines (SVMs). To examine robustness against training data sparseness, we investigated the correlation between two results obtained when the method was trained on different datasets and tested on the same dataset. The correlation coefficient for the proposed method is 0.14 higher than that for the method using SVMs.
When the proposed SGT-based method was compared with four per-residue predictors (VL3, GlobPlot, DISOPRED2 and IUPred (long)), its sensitivity was 0.834 for disordered proteins, which is 0.052–0.523 higher than that of the per-residue predictors, and its specificity was 0.991 for ordered proteins, which is 0.036–0.153 higher than that of the per-residue predictors. The proposed method was also evaluated on data that included 417 partially disordered proteins. It predicted the frequency of disordered proteins to be 1.95% for the proteins with 5%–10% disordered sequences, 1.46% for the proteins with 10%–20% disordered sequences and 16.57% for proteins with 20%–40% disordered sequences. CONCLUSION: The proposed method, which utilizes the information of structure-unknown data, predicts disordered proteins more accurately than other methods and is less affected by training data sparseness
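    The evaluation measures named in this abstract (sensitivity, specificity and the Matthews correlation coefficient) can be computed from confusion-matrix counts as in the sketch below. The function names are illustrative, and `mcc_from_rates` is a hypothetical helper that converts reported rates into fractional counts; it only approximates what the authors would have computed from their actual per-protein predictions.

```python
import math


def sensitivity(tp, fn):
    """Fraction of true positives among all actual positives."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true negatives among all actual negatives."""
    return tn / (tn + fp)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def mcc_from_rates(sens, spec, n_pos, n_neg):
    """Approximate MCC from sensitivity, specificity and class sizes.

    Hypothetical helper: uses fractional counts, so the result is only an
    estimate of the value obtained from real predictions.
    """
    tp = sens * n_pos
    fn = n_pos - tp
    tn = spec * n_neg
    fp = n_neg - tn
    return mcc(tp, tn, fp, fn)
```

    For example, `mcc_from_rates(0.723, 0.977, 82, 526)` gives a rough estimate for the first evaluation set described above. The abstract reports only differences in MCC relative to other methods, so that estimate cannot be checked against a stated value.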