39,796 research outputs found

    Predicting the accuracy of protein-ligand docking on homology models

    Ligand-protein docking is increasingly used in drug discovery. The initial limitations imposed by a reduced availability of target protein structures have been overcome by the use of theoretical models, especially those derived by homology modeling techniques. While this greatly extended the use of docking simulations, it also introduced the need for general and robust criteria to estimate the reliability of docking results given the model quality. To this end, a large-scale experiment was performed on a diverse set including experimental structures and homology models for a group of representative ligand-protein complexes. A wide spectrum of model quality was sampled using templates at different evolutionary distances and different strategies for target-template alignment and modeling. The obtained models were scored by a selection of the most used model quality indices. The binding geometries were generated using AutoDock, one of the most common docking programs. An important result of this study is that quantitative and robust correlations do exist between the accuracy of docking results and the model quality, especially in the binding site. Moreover, state-of-the-art indices for model quality assessment are already an effective tool for an a priori prediction of the accuracy of docking experiments in the context of groups of proteins with conserved structural characteristics. Contract/grant sponsor: National Institutes of Health; contract/grant numbers: ES00768

    From Nonspecific DNA–Protein Encounter Complexes to the Prediction of DNA–Protein Interactions

    ©2009 Gao, Skolnick. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. doi:10.1371/journal.pcbi.1000341
    DNA–protein interactions are involved in many essential biological activities. Because there is no simple mapping code between DNA base pairs and protein amino acids, the prediction of DNA–protein interactions is a challenging problem. Here, we present a novel computational approach for predicting DNA-binding protein residues and DNA–protein interaction modes without knowing the protein's specific DNA target sequence. Given the structure of a DNA-binding protein, the method first generates an ensemble of complex structures obtained by rigid-body docking with a nonspecific canonical B-DNA. Representative models are subsequently selected through clustering and ranking by their DNA–protein interfacial energy. Analysis of these encounter complex models suggests that the recognition sites for specific DNA binding are usually favorable interaction sites for the nonspecific DNA probe and that nonspecific DNA–protein interaction modes exhibit some similarity to specific DNA–protein binding modes. Although the method requires as input the knowledge that the protein binds DNA, in benchmark tests it achieves better performance in identifying DNA-binding sites than three previously established methods, which are based on sophisticated machine-learning techniques. We further apply our method to protein structures predicted through modeling and demonstrate that it performs satisfactorily on protein models whose root-mean-square Cα deviation from the native structure is up to 5 Å. This study provides valuable structural insights into how a specific DNA-binding protein interacts with a nonspecific DNA sequence. The similarity between the specific DNA–protein interaction mode and nonspecific interaction modes may reflect an important sampling step in the search by a DNA-binding protein for its specific DNA targets.

    Encounter complexes and dimensionality reduction in protein-protein association

    An outstanding challenge has been to understand the mechanism whereby proteins associate. We report here the results of exhaustively sampling the conformational space in protein–protein association using a physics-based energy function. The agreement between experimental intermolecular paramagnetic relaxation enhancement (PRE) data and the PRE profiles calculated from the docked structures shows that the method captures both specific and non-specific encounter complexes. To explore the energy landscape in the vicinity of the native structure, the nonlinear manifold describing the relative orientation of two solid bodies is projected onto a Euclidean space in which the shape of low-energy regions is studied by principal component analysis. Results show that the energy surface is canyon-like, with a smooth funnel within a two-dimensional subspace capturing over 75% of the total motion. Thus, proteins tend to associate along preferred pathways, similar to the sliding of a protein along DNA in the process of protein-DNA recognition.

    Consensus virtual screening approaches to predict protein ligands

    In order to exploit the advantages of receptor-based virtual screening, namely time/cost saving and specificity, it is important to rely on algorithms that place a high number of active ligands at the top ranks of a small-molecule database. Towards that goal, consensus methods combining the results of several docking algorithms were developed and compared against the individual algorithms. Furthermore, a recently proposed rescoring method based on drug efficiency indices was evaluated. Among AutoDock Vina 1.0, AutoDock 4.2 and GemDock, AutoDock Vina was the best-performing single method in predicting high-affinity ligands from a database of known ligands and decoys. The rescoring of predicted binding energies with the water/butanol partition coefficient did not lead to an improvement averaged over all receptor targets. Various consensus algorithms were investigated, and a simple combination of AutoDock and AutoDock Vina results gave the most consistent performance, showing early enrichment of known ligands for all receptor targets investigated. If a number of ligands are known for a specific target, every method proposed in this study should be evaluated.
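
The consensus idea in the abstract above can be illustrated with a minimal sketch. The abstract does not state how the AutoDock and AutoDock Vina results were combined, so the mean-rank rule used here is an assumption, as are the function names, the compound names, and the scores, all of which are made up for illustration. The enrichment factor shown is the standard ratio of the active fraction in the top of the ranked list to the active fraction in the whole database.

```python
from typing import Dict, List, Set


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank compounds from 1 (best); a lower (more negative) score is better."""
    ordered = sorted(scores, key=lambda name: (scores[name], name))
    return {name: i + 1 for i, name in enumerate(ordered)}


def consensus_order(score_sets: List[Dict[str, float]]) -> List[str]:
    """Order compounds by their mean rank across programs (assumed rule)."""
    per_program = [ranks(s) for s in score_sets]
    names = sorted(per_program[0])
    mean_rank = {
        n: sum(r[n] for r in per_program) / len(per_program) for n in names
    }
    return sorted(names, key=lambda n: (mean_rank[n], n))


def enrichment_factor(order: List[str], actives: Set[str], fraction: float) -> float:
    """Active fraction in the top `fraction` of the list over the overall active fraction."""
    n_top = max(1, round(len(order) * fraction))
    hits_top = sum(1 for n in order[:n_top] if n in actives)
    return (hits_top / n_top) / (len(actives) / len(order))


# Hypothetical docking scores (kcal/mol) from two programs; not data from the study.
prog_a = {"lig1": -9.1, "lig2": -7.0, "dec1": -8.5, "dec2": -6.2, "dec3": -5.9, "dec4": -5.0}
prog_b = {"lig1": -8.0, "lig2": -8.4, "dec1": -6.1, "dec2": -7.0, "dec3": -5.5, "dec4": -5.2}
actives = {"lig1", "lig2"}

single = sorted(prog_a, key=lambda n: (prog_a[n], n))
combined = consensus_order([prog_a, prog_b])
print("single-program EF:", enrichment_factor(single, actives, 1 / 3))
print("consensus EF:", enrichment_factor(combined, actives, 1 / 3))
```

In this toy data set one program ranks a decoy above a known ligand, and averaging ranks with the second program moves both ligands to the top. Whether a consensus helps on real targets is an empirical question, which is the point of the comparison described in the abstract.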

    GAPDOCK: A genetic algorithm approach to protein docking in CAPRI round 1

    Get PDF
    As part of the first Critical Assessment of PRotein Interactions, round 1, we predict the structure of two protein-protein complexes, by using a genetic algorithm, GAPDOCK, in combination with surface complementarity, buried surface area, biochemical information, and human intervention. Among the five models submitted for target 1, HPr phosphocarrier protein (B. subtilis) and the hexameric HPr kinase (L. lactis), the best correctly predicts 17 of 52 interprotein contacts, whereas for target 2, bovine rotavirus VP6 protein-monoclonal antibody, the best model predicts 27 of 52 correct contacts. Given the difficult nature of the targets, these predictions are very encouraging and compare well with those obtained by other methods. Nevertheless, it is clear that there is a need for improved methods for distinguishing between correct and plausible but incorrect complexes. Proteins 2003;52:10-14

    Evaluation of QSAR and ligand enzyme docking for the identification of ABCB1 substrates

    P-glycoprotein (P-gp) is an efflux pump that belongs to the ATP-binding cassette (ABC) transporter family embedded in the membrane bilayer. P-gp is a polyspecific protein that has demonstrated its function as a transporter of hydrophobic drugs as well as of lipids, steroids and metabolic products. Its role in multidrug resistance (MDR) and in the pharmacokinetic profile of clinically important drug molecules has been widely recognised. In this study, QSAR and enzyme-ligand docking methods were explored in order to classify substrates and non-substrates of P-glycoprotein. A set of 123 compounds designated as substrates (54) or non-substrates (69) by Matsson et al., 2009 was used for the investigation. For QSAR studies, molecular descriptors were calculated using ACD labs/LogD Suite and MOE (CCG Inc.). P-glycoprotein structures available in the Protein Data Bank were used for docking studies and determination of binding scores using MOE software. Binding sites were defined using co-crystallised ligand structures. Three classification algorithms were examined: classification and regression trees, boosted trees (BT) and support vector machine. Models were developed using a training set of 98 compounds and were validated using the remaining compounds as the external test set. A model generated using BT was identified as the best of the three models, with a prediction accuracy of 88%, a Matthews correlation coefficient of 0.77 and a Youden's J index of 0.80 for the test set. Inclusion of various docking scores for different binding sites improved the models only marginally.
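
For readers unfamiliar with the three statistics quoted in the abstract above, the following sketch computes them from a binary confusion matrix using their standard definitions. The counts are hypothetical: they were chosen only to sum to 25 compounds, the size of the external test set implied by 123 minus 98, and they are not the confusion matrix reported in the study. The function names are likewise illustrative.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all compounds classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def youden_j(tp: int, tn: int, fp: int, fn: int) -> float:
    """Youden's J index: sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


# Hypothetical counts for a 25-compound test set (substrate = positive class).
counts = dict(tp=10, tn=12, fp=2, fn=1)

print("accuracy:", accuracy(**counts))
print("MCC:", round(matthews_cc(**counts), 3))
print("Youden's J:", round(youden_j(**counts), 3))
```

Unlike accuracy, both the Matthews coefficient and Youden's J account for performance on each class separately, which matters when substrates and non-substrates are not equally represented.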

    Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening

    This work introduces a number of algebraic topology approaches, such as multicomponent persistent homology, multi-level persistent homology and electrostatic persistence, for the representation, characterization, and description of small molecules and biomolecular complexes. Multicomponent persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with the Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensembles of trees, and deep convolutional neural networks, to demonstrate their descriptive and predictive power for chemical and biological problems. Extensive numerical experiments involving more than 4,000 protein-ligand complexes from the PDBBind database and nearly 100,000 ligands and decoys in the DUD database are performed to test, respectively, the scoring power and the virtual screening power of the proposed topological approaches. It is demonstrated that the present approaches outperform modern machine-learning-based methods in protein-ligand binding affinity prediction and ligand-decoy discrimination.