1,228 research outputs found

    SnugDock: Paratope Structural Optimization during Antibody-Antigen Docking Compensates for Errors in Antibody Homology Models

    High-resolution structures of antibody-antigen complexes are useful for analyzing the binding interface and for making rational choices in antibody engineering. When a crystallographic structure of a complex is unavailable, the structure must be predicted using computational tools. In this work, we illustrate a novel approach, named SnugDock, to predict high-resolution antibody-antigen complex structures by simultaneously optimizing the antibody-antigen rigid-body positions, the relative orientation of the antibody light and heavy chains, and the conformations of the six complementarity determining region loops. This approach is especially useful when the crystal structure of the antibody is not available, requiring allowances for inaccuracies in an antibody homology model which would otherwise frustrate rigid-backbone docking predictions. Local docking using SnugDock with the lowest-energy RosettaAntibody homology model produced more accurate predictions than standard rigid-body docking. SnugDock can be combined with ensemble docking to mimic conformer selection and induced fit, resulting in increased sampling of diverse antibody conformations. The combined algorithm produced four medium and seven acceptable lowest-interface-energy predictions (rated by the Critical Assessment of PRediction of Interactions, CAPRI, criteria) in a test set of fifteen complexes. Structural analysis shows that diverse paratope conformations are sampled, but docked paratope backbones are not necessarily closer to the crystal structure conformations than the starting homology models. The accuracy of SnugDock predictions suggests a new genre of general docking algorithms with flexible binding interfaces, targeted towards making homology models useful for further high-resolution predictions.

    Recognizing protein-protein interfaces with empirical potentials and reduced amino acid alphabets.

    BACKGROUND: In structural genomics, an important goal is the detection and classification of protein-protein interactions, given the structures of the interacting partners. We have developed empirical energy functions to identify native structures of protein-protein complexes among sets of decoy structures. To understand the role of amino acid diversity, we parameterized a series of functions, using a hierarchy of amino acid alphabets of increasing complexity, with 2, 3, 4, 6, and 20 amino acid groups. Compared to previous work, we used the simplest possible functional form, with residue-residue interactions and a stepwise distance-dependence. We used increased computational resources, however, constructing 290,000 decoys for 219 protein-protein complexes, with a realistic docking protocol where the protein partners are flexible and interact through a molecular mechanics energy function. The energy parameters were optimized to correctly assign as many native complexes as possible. To resolve the multiple minimum problem in parameter space, over 64,000 starting parameter guesses were tried for each energy function. The optimized functions were tested by cross validation on subsets of our native and decoy structures, by blind tests on series of native and decoy structures available on the Web, and on models for 13 complexes submitted to the CAPRI structure prediction experiment. RESULTS: Performance is similar to several other statistical potentials of the same complexity. For example, the CAPRI target structure is correctly ranked ahead of 90% of its decoys in 6 cases out of 13. The hierarchy of amino acid alphabets leads to a coherent hierarchy of energy functions, with qualitatively similar parameters for similar amino acid types at all levels. Most remarkably, the performance with six amino acid classes is equivalent to that of the most detailed, 20-class energy function.
    CONCLUSION: This suggests that six carefully chosen amino acid classes are sufficient to encode specificity in protein-protein interactions, and provide a starting point to develop more complicated energy functions.
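    The functional form described in this abstract (residue-residue interactions, a reduced amino acid alphabet, and a stepwise distance-dependence) can be illustrated with a minimal sketch. The two-class grouping, the distance bins, and every energy value below are hypothetical placeholders chosen for illustration; they are not the optimized parameters from the paper.

```python
from itertools import product
from math import dist

# Hypothetical 2-class alphabet: "H" (hydrophobic) and "P" (polar).
# The paper's actual groupings are not given in the abstract.
ALPHABET = {
    **{aa: "H" for aa in "AVLIMFWC"},
    **{aa: "P" for aa in "GSTYNQDEKRHP"},
}

# Hypothetical upper bin edges in angstroms: [0, 4) and [4, 8); zero beyond 8.
BIN_EDGES = [4.0, 8.0]

# Hypothetical step energies per class pair, one value per distance bin.
PARAMS = {
    ("H", "H"): [-1.0, -0.5],
    ("H", "P"): [0.5, 0.1],
    ("P", "P"): [-0.2, 0.0],
}


def pair_energy(aa1, aa2, distance):
    """Stepwise energy for one residue pair at the given distance."""
    key = tuple(sorted((ALPHABET[aa1], ALPHABET[aa2])))
    for bin_index, upper_edge in enumerate(BIN_EDGES):
        if distance < upper_edge:
            return PARAMS[key][bin_index]
    return 0.0  # no interaction beyond the last bin


def interface_energy(chain_a, chain_b):
    """Sum pair energies over all residue pairs across the two chains.

    Each chain is a list of (one_letter_code, (x, y, z)) tuples.
    """
    return sum(
        pair_energy(aa1, aa2, dist(xyz1, xyz2))
        for (aa1, xyz1), (aa2, xyz2) in product(chain_a, chain_b)
    )


chain_a = [("L", (0.0, 0.0, 0.0))]
chain_b = [("I", (3.0, 0.0, 0.0)), ("K", (0.0, 5.0, 0.0))]
print(interface_energy(chain_a, chain_b))
```

    In a decoy-discrimination setting, such a function would be evaluated on the native complex and on each decoy, and the parameters tuned so that the native structure ranks lowest as often as possible.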

    Quality assessment of docked protein interfaces using 3D convolution

    2021 Spring. Includes bibliographical references. Proteins play a vital role in most biological processes, most of which occur through interactions between proteins. When proteins interact they form a complex, whose functionality is different from that of the individual proteins in the complex. Therefore, understanding protein interactions and their interfaces is an important problem. Experimental methods for this task are expensive and time consuming, which has led to the development of docking methods for predicting the structures of protein complexes. These methods produce a large number of potential solutions, and the energy functions used in these methods are not good enough to find solutions that are close to the native state of the complex. Deep learning and its ability to model complex problems has opened up the opportunity to model protein complexes and learn from scratch how to rank docking solutions. As a part of this work, we have developed a 3D convolutional network approach that uses raw atomic densities to address this problem. Our method achieves performance which is on par with state-of-the-art methods. We have evaluated our model on docked protein structures produced by four docking tools, namely ZDOCK, HADDOCK, FRODOCK, and ClusPro, on targets from Docking Benchmark Data version 5 (DBD5).
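    A 3D convolutional network of the kind this abstract describes needs its input as a voxel grid. The sketch below shows one common way to turn atom coordinates into a density grid by placing a Gaussian on each atom. It is an assumption-laden illustration, not the thesis's pipeline: the grid size, spacing, Gaussian width, and the single-channel layout are all hypothetical choices.

```python
from math import exp


def voxelize(atoms, grid_size=8, spacing=1.0, sigma=1.0):
    """Return a grid_size^3 nested list of Gaussian atomic densities.

    atoms: list of (x, y, z) coordinates in angstroms, expressed relative
    to the corner of the grid. All default values are hypothetical.
    """
    grid = [
        [[0.0] * grid_size for _ in range(grid_size)]
        for _ in range(grid_size)
    ]
    for ax, ay, az in atoms:
        for i in range(grid_size):
            for j in range(grid_size):
                for k in range(grid_size):
                    # Coordinates of the voxel centre.
                    cx = (i + 0.5) * spacing
                    cy = (j + 0.5) * spacing
                    cz = (k + 0.5) * spacing
                    d2 = (cx - ax) ** 2 + (cy - ay) ** 2 + (cz - az) ** 2
                    grid[i][j][k] += exp(-d2 / (2.0 * sigma ** 2))
    return grid


# One atom sitting exactly at the centre of voxel (0, 0, 0).
grid = voxelize([(0.5, 0.5, 0.5)], grid_size=4)
print(grid[0][0][0], grid[1][0][0], grid[3][3][3])
```

    The resulting grid (usually one such channel per atom type, centred on the interface) would then be fed to the convolutional layers that score each docked pose.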

    Prediction of Ligand Binding Using an Approach Designed to Accommodate Diversity in Protein-Ligand Interactions

    Computational determination of protein-ligand interaction potential is important for many biological applications, including virtual screening for therapeutic drugs. The novel internal consensus scoring strategy is an empirical approach with an extended set of 9 binding terms combined with a neural network capable of analyzing diverse complexes. Like conventional consensus methods, internal consensus is capable of maintaining multiple distinct representations of protein-ligand interactions. In a typical use, the method was trained using ligand classification data (binding/no binding) for a single receptor. The internal consensus analyses successfully distinguished protein-ligand complexes from decoys (r² = 0.895 for a series of typical proteins). Results are superior to other tested empirical methods. In virtual screening experiments, internal consensus analyses provide consistent enrichment as determined by ROC-AUC and pROC metrics.
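    The ROC-AUC metric mentioned in this abstract equals the probability that a randomly chosen binder is scored above a randomly chosen decoy. A minimal sketch of that pairwise definition follows; the scores and labels are made-up illustrative values, not data from the paper, and the convention that a higher score means "more likely to bind" is an assumption.

```python
def roc_auc(scores, labels):
    """ROC-AUC as the fraction of (active, decoy) pairs ranked correctly.

    scores: higher means predicted more likely to bind (assumed convention).
    labels: 1 for a known binder, 0 for a decoy. Ties count as half.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = sum(
        (a > d) + 0.5 * (a == d)
        for a in actives
        for d in decoys
    )
    return wins / (len(actives) * len(decoys))


# Made-up screening scores: two binders and two decoys.
example_auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
print(example_auc)
```

    A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of binders from decoys. The pairwise loop is quadratic, which is fine for a sketch but slow for large screening libraries.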

    Cavity-based negative images in molecular docking

    In drug development, computer-based methods are constantly evolving as a result of increasing computing power and the growing costs of developing new pharmaceuticals. With virtual screening (VS), it is possible to screen even hundreds of millions of compounds and select the best molecule candidates for in vitro testing instead of investing time and resources in analysing all molecules systematically in laboratories. However, there is a constant need to generate more reliable and effective software for VS. For example, molecular docking, one of the most central methods in structure-based VS, can be a very successful approach for certain targets while failing completely with others. However, it is not necessarily the docking sampling but the scoring of the docking poses that is the bottleneck. In this thesis, a novel rescoring method, negative image-based rescoring (R-NiB), is introduced, which generates a negative image of the ligand binding cavity and compares the shape and electrostatic similarity between the generated model and the docked molecule pose. The performance of the method is tested comprehensively using several different protein targets, benchmarking sets and docking software. Additionally, it is compared to other rescoring methods. R-NiB is shown to be a fast and effective method to rescore the docking poses, producing notable improvement in active molecule recognition. Furthermore, a NiB model optimization method based on a greedy algorithm is introduced that uses a set of known active and inactive molecules as a training set. This approach, brute force negative image-based optimization (BR-NiB), is shown to work remarkably well, producing impressive in silico results even with very limited active molecule training sets. Importantly, the results suggest that the in silico hit rates of the optimized models in docking rescoring are on a level needed in real-world VS and drug discovery projects.

    Assessing the structure of proteins and protein complexes through physical and statistical approaches

    Determining the correct state of a protein or a protein complex is of paramount importance for current medical and pharmaceutical research. The stable conformation of such systems depends on two processes called protein folding and protein-protein interaction. In the course of the last 50 years, both processes have been fruitfully studied. Yet a complete understanding has still not been reached, and the accuracy and efficiency of the approaches for studying these problems are not yet optimal. This thesis is devoted to devising physical and statistical methods for recognizing the native state of a protein or a protein complex. The studies will be mostly based on BACH, a knowledge-based potential originally designed for the discrimination of native structures in protein folding problems. The BACH method will be analyzed and extended: first, a new method to account for protein-solvent interaction will be presented. Then, we will describe an extension of BACH aimed at assessing the quality of protein complexes in protein-protein interaction problems. Finally, we will present a procedure aimed at predicting the structure of a complex based on a hierarchy of approaches ranging from rigid docking up to molecular dynamics in explicit solvent. The reliability of the approaches we propose will always be benchmarked against a selection of other state-of-the-art scoring functions which obtained good results in CASP and CAPRI competitions.

    QCSPScore: a new scoring function for driving protein-ligand docking with quantitative chemical shifts perturbations

    Through the use of information about the biological target structure, the optimization of potential drugs can be improved. In this work I have developed a procedure that uses the quantitative change in the chemical shift perturbations (CSP) in the protein from NMR experiments for driving protein-ligand docking. The approach is based on a hybrid scoring function (QCSPScore) which combines traditional DrugScore potentials, which describe the interaction between protein and ligand, with Kendall's rank correlation coefficient, which evaluates docking poses in terms of their agreement with experimental CSP. Prediction of the CSP for a specific ligand pose is done efficiently with an empirical model, taking into account only ring current effects. QCSPScore has been implemented in the AutoDock software package. Compared with previous methods, the use of a rank correlation coefficient is more robust to outliers in the predicted CSP. In addition, the prediction of native-like complex geometries improved because the CSP are already used during the docking process, and not only in a post-filtering step for generated docking poses. Because the experimental information is used quantitatively, the CSP are guaranteed to contribute effectively to aligning the ligand in the binding pocket. The first step in the development of QCSPScore was the analysis of 70 protein-ligand complexes for which reference CSP were computed. The success rate in the docking increased from 71% without involvement of CSP to 100% if CSP were considered at the highest weighting scheme. Global optimization on the combined docking energy hypersurface is therefore successful. In a second step, QCSPScore was used in re-docking three test cases for which reference experimental CSP data were available. Without CSP, i.e. using conventional DrugScore potentials, none of the three test cases could be successfully re-docked. The integration of CSP with the same weighting factor as described above resulted in successful re-docking in all three cases. For two of the three complexes, native-like solutions were only produced if CSP were considered. Conformational changes in the binding pockets of up to 2 Å RMSD did not affect the success of the docking. QCSPScore will be particularly interesting for difficult protein-ligand complexes for which the prediction of native-like geometries has so far been hard. These are in particular cases in which the shape of the binding pocket does not provide sufficient steric restraints, such as flat protein-protein interfaces and the virtual screening of small chemical fragments.
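    The idea of scoring a pose by both an interaction energy and the rank agreement between predicted and experimental CSP can be sketched as follows. This is an illustration under stated assumptions, not the QCSPScore implementation: the abstract does not say how the two terms are combined, so the weighted difference in `hybrid_score`, the `weight` parameter, and all numbers are hypothetical. Kendall's coefficient is computed here in its simple tau-a form.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    balance = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        balance += (sign > 0) - (sign < 0)  # +1 concordant, -1 discordant
    n = len(x)
    return balance / (n * (n - 1) / 2)


def hybrid_score(interaction_energy, predicted_csp, experimental_csp, weight=1.0):
    """Hypothetical combination: lower is better.

    A pose whose predicted CSP ranks agree with experiment (tau near 1)
    is rewarded; disagreement (tau near -1) is penalised.
    """
    tau = kendall_tau(predicted_csp, experimental_csp)
    return interaction_energy - weight * tau


# Made-up per-residue CSP values for one docking pose.
predicted = [0.1, 0.2, 30.0]   # last value is a gross outlier
experimental = [0.05, 0.2, 0.9]
print(kendall_tau(predicted, experimental))
print(hybrid_score(-5.0, predicted, experimental, weight=2.0))
```

    The example also shows why a rank correlation is robust to outliers: the grossly overestimated third value leaves the ordering, and therefore tau, unchanged.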