28 research outputs found

    DECK: Distance and environment-dependent, coarse-grained, knowledge-based potentials for protein-protein docking

    Background: Computational approaches to protein-protein docking typically include scoring aimed at improving the rank of the near-native structure relative to the false-positive matches. Knowledge-based potentials improve modeling of protein complexes by taking advantage of the rapidly increasing amount of experimentally derived information on protein-protein association. An essential element of knowledge-based potentials is defining the reference state for an optimal description of the residue-residue (or atom-atom) pairs in the non-interaction state. Results: The study presents a new Distance- and Environment-dependent, Coarse-grained, Knowledge-based (DECK) potential for scoring of protein-protein docking predictions. Training sets of protein-protein matches were generated based on bound and unbound forms of proteins taken from the DOCKGROUND resource. Each residue was represented by a pseudo-atom in the geometric center of the side chain. To capture the long-range and the multi-body interactions, residues in different secondary structure elements at protein-protein interfaces were considered as different residue types. Five reference states for the potentials were defined and tested. The optimal reference state was selected and the cutoff effect on the distance-dependent potentials investigated. The potentials were validated on the docking decoy sets, showing better performance than the existing potentials used in scoring of protein-protein docking results. Conclusions: A novel residue-based statistical potential for protein-protein docking was developed and validated on docking decoy sets. The results show that the scoring function DECK can successfully identify near-native protein-protein matches and thus is useful in protein docking. In addition to the practical application of the potentials, the study provides insights into the relative utility of the reference states, the scope of the distance dependence, and the coarse-graining of the potentials.
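The DECK abstract does not state the functional form of the potential or the five reference states. As a rough illustration only, the sketch below shows how a generic distance-dependent, residue-level statistical potential can be derived with the inverse-Boltzmann relation, E = -ln(N_obs / N_exp). The reference state used here (pair type and distance bin treated as independent), the bin width, the cutoff, the pseudo-count, and the residue-type labels such as `LEU_H` (residue name plus secondary-structure class) are all assumptions made for the example, not values from the paper.

```python
import math
from collections import Counter

def train_potential(contacts, bin_width=2.0, cutoff=12.0, pseudo=1.0):
    """Derive a binned pair potential from interface contacts.

    contacts: iterable of (type_a, type_b, distance) tuples, where each type
    is a hypothetical label combining residue name and environment.
    """
    n_bins = int(cutoff / bin_width)
    pair_bin = Counter()    # observed counts per (pair, distance bin)
    bin_total = Counter()   # observed counts per distance bin, all pairs
    pair_total = Counter()  # observed counts per pair, all bins
    total = 0
    for a, b, d in contacts:
        if d >= cutoff:
            continue
        k = int(d / bin_width)
        pair = tuple(sorted((a, b)))
        pair_bin[(pair, k)] += 1
        bin_total[k] += 1
        pair_total[pair] += 1
        total += 1
    potential = {}
    for pair in pair_total:
        for k in range(n_bins):
            observed = pair_bin[(pair, k)] + pseudo
            # Assumed reference state: pair type and distance are independent.
            expected = pair_total[pair] * bin_total[k] / total + pseudo
            potential[(pair, k)] = -math.log(observed / expected)
    return potential

def score(potential, contacts, bin_width=2.0, cutoff=12.0):
    """Sum the potential over the contacts of one docking pose (lower is better)."""
    energy = 0.0
    for a, b, d in contacts:
        if d >= cutoff:
            continue
        key = (tuple(sorted((a, b))), int(d / bin_width))
        energy += potential.get(key, 0.0)  # unseen pairs contribute nothing
    return energy
```

Under these assumptions, pair types that occur at a given distance more often than the reference state predicts receive negative energies, so a pose rich in such contacts scores lower.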

    Classification and Scoring of Protein Complexes

    Protein interactions mediate all biological systems in a cell; understanding these interactions means understanding the processes responsible for human life. Protein structures can be obtained experimentally, but experimental methods frequently fail to determine the structures of protein complexes. To address the issue, computational methods have been developed that attempt to predict the structure of a protein complex using information about its constituents. These methods, known as docking, generate thousands of possible poses for each complex and require effective and reliable ways to quickly discriminate the correct pose from the incorrect ones. In this thesis, a new scoring function was developed that uses machine learning techniques and features extracted from the structure of the interacting proteins to correctly classify and rank the putative poses. The developed function has been shown to be competitive with current state-of-the-art solutions.

    SNAPP, CRACLe, PoPP: Predicting Protein Interactions

    Protein-Protein Interactions (PPIs) play a central role in all major signaling events that occur in living cells, from DNA replication to complex, post-translational protein-signaling systems. However, many if not most pairs of interacting proteins remain unknown, and the ability to identify and predict protein-protein interaction sites is a key component in systems and structural biology. Computational techniques such as MD simulations and homology- or template-based modeling constitute the main bioinformatics methods applied to study PPIs, and despite many recent developments, fast and reliable prediction of PPI sites remains a challenge. Using computational geometry, we have developed two novel, geometry-based scoring functions, SNAPP-Surface and SNAPP-Interface, based on Simplicial Neighborhood Analysis of Protein Packing (SNAPP), for the task of analyzing and predicting protein interactions. SNAPP-Surface calculates the likelihood that an amino acid on the surface of a protein will participate in a protein interaction. SNAPP-Surface is used in our novel algorithm and software for predicting protein-protein and protein-peptide binding sites called Critical Residue Analysis and Complementarity Likelihood (CRACLe). CRACLe was designed for accurate and efficient high-throughput screening of individual proteins for potential binding sites. CRACLe can be effectively applied to identify putative binding sites for novel proteins and potentially for building protein-protein networks. SNAPP-Interface is used in our novel protein-peptide docking algorithm called Prediction of Protein-peptide Packing (PoPP) to evaluate protein-peptide interactions. SNAPP-Interface is also useful for discriminating between native-like and decoy protein-protein interactions. The SNAPP, CRACLe, and PoPP software and all curated protein-protein and protein-peptide datasets are freely available at http://chembench.mml.unc.edu/cracle. Doctor of Philosophy.

    Side-Chain Conformational Changes upon Protein-Protein Association

    Conformational changes upon protein-protein association are the key element of the binding mechanism. The study presents a systematic large-scale analysis of such conformational changes in the side chains. The results indicate that short and long side chains have different propensities for the conformational changes. Long side chains with three or more dihedral angles are often subject to large conformational transitions. Shorter residues with one or two dihedral angles typically undergo local conformational changes not leading to a conformational transition. The relationship between the local readjustments and the equilibrium fluctuations of a side chain around its unbound conformation is suggested. Most of the side chains undergo larger changes in the dihedral angle most distant from the backbone. The frequencies of the core-to-surface interface transitions of six nonpolar residues and Tyr are larger than the frequencies of the opposite, surface-to-core transitions. The binding increases both polar and nonpolar interface areas. However, the increase of the nonpolar area is larger for all considered classes of protein complexes, suggesting that the protein association perturbs the unbound interfaces to increase the hydrophobic contribution to the binding free energy. To test modeling approaches to side-chain flexibility in protein docking, conformational changes in the X-ray set were compared with those in the docking decoy sets. The results lead to a better understanding of the conformational changes in proteins and suggest directions for efficient conformational sampling in docking protocols.

    Text Mining for Protein Docking

    The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~25% of the complexes in the dataset. The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound benchmark set, significantly increasing the docking success rate.
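The abstract above mentions extracting binding-residue information from abstracts and representing text as a bag of words, without giving implementation details. The following is a minimal sketch of those two steps only, using the Python standard library. The regular expression, the three-letter residue naming convention, and the function names `extract_residue_mentions` and `bag_of_words` are assumptions for illustration; the Support Vector Machine filtering step is not reproduced here.

```python
import re
from collections import Counter

# Standard three-letter amino acid codes.
THREE_LETTER = ("Ala Arg Asn Asp Cys Gln Glu Gly His Ile "
                "Leu Lys Met Phe Pro Ser Thr Trp Tyr Val").split()

# Hypothetical pattern: residue code, optional space or hyphen, residue number.
RESIDUE_RE = re.compile(r"\b(" + "|".join(THREE_LETTER) + r")[\s-]?(\d+)\b")

def extract_residue_mentions(text):
    """Return (residue name, residue number) pairs mentioned in the text."""
    return [(name, int(num)) for name, num in RESIDUE_RE.findall(text)]

def bag_of_words(text):
    """Return word counts for the text, ignoring case and punctuation."""
    return Counter(re.findall(r"[a-z]+", text.lower()))
```

In a pipeline of the kind described, the word-count vectors would be the features passed to a classifier that separates relevant from irrelevant abstracts, and the residue mentions would be candidate docking constraints.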

    Structural neighboring property for identifying protein-protein binding sites

    Scoring protein interaction decoys using exposed residues (SPIDER): A novel multibody interaction scoring function based on frequent geometric patterns of interfacial residues

    Accurate prediction of the structure of protein-protein complexes in computational docking experiments remains a formidable challenge. It has been recognized that identifying native or native-like poses among multiple decoys is the major bottleneck of the current scoring functions used in docking. We have developed a novel multi-body pose-scoring function that has no theoretical limit on the number of residues contributing to the individual interaction terms. We use a coarse-grained representation of a protein-protein complex where each residue is represented by its side chain centroid. We apply a computational geometry approach called Almost-Delaunay tessellation that transforms protein-protein complexes into a residue contact network, or an undirected graph in which residues are the vertices and contacts are the edges. This treatment forms a family of interfacial graphs representing a dataset of protein-protein complexes. We then employ a frequent subgraph mining approach to identify common interfacial residue patterns that appear in at least a subset of native protein-protein interfaces. The geometrical parameters and frequency of occurrence of each “native” pattern in the training set are used to develop the new SPIDER scoring function. SPIDER was validated using the standard “ZDOCK” benchmark dataset, which was not used in the development of SPIDER. We demonstrate that the SPIDER scoring function ranks native and native-like poses above geometrical decoys and that it outperforms the popular ZRANK scoring function. SPIDER was ranked among the top scoring functions in a recent round of the CAPRI (Critical Assessment of PRedicted Interactions) blind test of protein–protein docking methods.
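To illustrate the coarse-grained representation described in the SPIDER abstract, the sketch below reduces each residue to its side-chain centroid and builds an interfacial contact graph. Note the substitution: it uses a plain distance cutoff between centroids as a stand-in for Almost-Delaunay tessellation, which is not implemented here. The 8.0 Å cutoff, the residue identifiers, and the function names are assumptions for the example.

```python
import math
from itertools import product

def centroid(coords):
    """Geometric centre of a list of (x, y, z) atom coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def interface_graph(chain_a, chain_b, cutoff=8.0):
    """Return inter-chain edges between residues whose centroids are close.

    chain_a, chain_b: dicts mapping a residue identifier to the list of its
    side-chain atom coordinates. A distance cutoff replaces the tessellation
    used by the original method (a simplifying assumption).
    """
    cen_a = {res: centroid(xyz) for res, xyz in chain_a.items()}
    cen_b = {res: centroid(xyz) for res, xyz in chain_b.items()}
    edges = set()
    for (res_a, ca), (res_b, cb) in product(cen_a.items(), cen_b.items()):
        if math.dist(ca, cb) <= cutoff:
            edges.add((res_a, res_b))
    return edges
```

Graphs built this way for many native complexes could then be searched for recurring residue patterns, which is the role frequent subgraph mining plays in the abstract.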

    In silico study of protein-protein interactions

    2011 - 2012. Protein-protein interactions are at the basis of many of the most important molecular processes in the cell, which explains the constantly growing interest within the scientific community in the structural characterization of protein complexes [1]. However, experimental knowledge of the 3D structure of the great majority of such complexes is missing, and this has spurred their accurate prediction through molecular docking simulations, one of the major challenges in the field of structural computational biology and bioinformatics [2,3]. My PhD work aims to contribute to the field by providing novel computational instruments and giving useful insight into specific case studies. In particular, in the first part of my PhD thesis, I present novel methods I developed: i) for analysing and comparing the 3D structure of protein complexes, to immediately extract useful information on the interaction based on a contact map visualization (COCOMAPS [4] web tool, Chapter 2), and ii) for analysing a set of multiple docking solutions, to single out the key inter-residue contacts and to distinguish native-like solutions from the incorrect ones (CONS-COCOMAPS [5] web tool and CONS-RANK program, Chapters 3 and 4, respectively). In the second part of the thesis, these methods have been applied, in combination with classical state-of-the-art computational biology techniques, to predict and analyse the binding mode in real biological systems related to particular diseases. This part of the work was carried out in collaboration with experimental groups, to take advantage of specific biological information on the systems under study. In particular, the interaction between proteins involved in the autoimmune response in celiac disease [6,7] (Chapters 5 and 6) has been studied in collaboration with the group directed by Prof. Sblattero, University of Piemonte Orientale (Italy) and the group directed by Prof. Esposito, University of Salerno (Italy). In addition, recognition properties of the FXa enzymatic system [8] have been studied through dynamic characterization of a FXa pathogenic mutant that causes problems in the blood coagulation cascade (Chapter 7). This study has been performed in collaboration with the group directed by Prof. De Cristofaro, Catholic University School of Medicine, Rome (Italy) and the group directed by Prof. Peyvandi, Ospedale Maggiore Policlinico and Università degli Studi di Milano (Italy)... [edited by author] XI n.s.

    Quality assessment of docked protein interfaces using 3D convolution

    2021 Spring. Includes bibliographical references. Proteins play a vital role in most biological processes, most of which occur through interactions between proteins. When proteins interact they form a complex, whose functionality is different from that of the individual proteins in the complex. Therefore, understanding protein interactions and their interfaces is an important problem. Experimental methods for this task are expensive and time consuming, which has led to the development of docking methods for predicting the structures of protein complexes. These methods produce a large number of potential solutions, and the energy functions used in these methods are not good enough to find solutions that are close to the native state of the complex. Deep learning and its ability to model complex problems have opened up the opportunity to model protein complexes and learn from scratch how to rank docking solutions. As a part of this work, we have developed a 3D convolutional network approach that uses raw atomic densities to address this problem. Our method achieves performance that is on par with state-of-the-art methods. We have evaluated our model on docked protein structures generated by four docking tools, namely ZDOCK, HADDOCK, FRODOCK, and ClusPro, on targets from Docking Benchmark Data version 5 (DBD5).
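The abstract above refers to "raw atomic densities" as the network input but does not say how they are computed. One common way to prepare such input, shown here as a sketch under stated assumptions, is to place a cubic grid over the interface and accumulate a Gaussian contribution from each atom at every voxel centre. The grid size, resolution, Gaussian width `sigma`, and the function name `density_grid` are all hypothetical choices, and only a single channel is produced, whereas a real model would likely use one channel per atom type.

```python
import math

def density_grid(atoms, center, size=8, resolution=1.0, sigma=1.0):
    """Voxelize atoms into a size x size x size grid of Gaussian densities.

    atoms: list of (x, y, z) coordinates; center: (x, y, z) of the grid centre.
    Returns a nested list indexed as grid[i][j][k].
    """
    half = size * resolution / 2.0
    grid = [[[0.0] * size for _ in range(size)] for _ in range(size)]
    for i in range(size):
        for j in range(size):
            for k in range(size):
                # Coordinates of this voxel's centre.
                vx = center[0] - half + (i + 0.5) * resolution
                vy = center[1] - half + (j + 0.5) * resolution
                vz = center[2] - half + (k + 0.5) * resolution
                for ax, ay, az in atoms:
                    d2 = (ax - vx) ** 2 + (ay - vy) ** 2 + (az - vz) ** 2
                    grid[i][j][k] += math.exp(-d2 / (2.0 * sigma ** 2))
    return grid
```

A grid of this shape is the kind of tensor a 3D convolutional layer consumes; the pure-Python loops are for clarity and would normally be vectorized.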