36 research outputs found

    Solvated interaction energy: from small-molecule to antibody drug design

    Scoring functions are ubiquitous in structure-based drug design as an aid to predicting binding modes and estimating binding affinities. Ideally, a scoring function should be broadly applicable, obviating the need to recalibrate and refit its parameters for every new target and class of ligands. Traditionally, drugs have been small molecules, but in recent years biologics, particularly antibodies, have become an increasingly important if not dominant class of therapeutics. This makes the goal of having a transferable scoring function, i.e., one that spans the range from small-molecule to protein ligands, even more challenging. One such broadly applicable scoring function is the Solvated Interaction Energy (SIE), which has been developed and applied in our lab for the last 15 years, leading to several important applications. This physics-based method arose from efforts to understand the physics governing binding events, with particular care given to the role played by solvation. SIE has been used by us and by many independent labs worldwide for virtual screening and discovery of novel small-molecule binders or optimization of known drugs. Moreover, without any retraining, it was found to be transferable to predictions of antibody-antigen relative binding affinities and to be as accurate as functions trained on protein-protein binding affinities. SIE has been incorporated, in conjunction with other scoring functions, into ADAPT (Assisted Design of Antibody and Protein Therapeutics), our platform for affinity modulation of antibodies. Application of ADAPT resulted in the optimization of several antibodies with 10-to-100-fold improvements in binding affinity. Further applications included broadening the specificity of a single-domain antibody to be cross-reactive with virus variants of both SARS-CoV-1 and SARS-CoV-2, and the design of safer antibodies by engineering a pH switch to make them more selective towards acidic tumors while sparing normal tissues at physiological pH.
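
To put the 10-to-100-fold affinity improvements quoted in this abstract on an energy scale, the standard thermodynamic relation ΔΔG = -RT ln(fold) can be evaluated directly. The following is a minimal sketch, assuming a temperature of 298.15 K; the function name `fold_to_ddg` is hypothetical and is not part of SIE or ADAPT.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def fold_to_ddg(fold_improvement, temperature=298.15):
    """Convert a fold improvement in binding affinity (ratio of dissociation
    constants, old/new) into a change in binding free energy in kcal/mol.
    Negative values mean tighter binding. Temperature is an assumption."""
    if fold_improvement <= 0:
        raise ValueError("fold improvement must be positive")
    return -R_KCAL * temperature * math.log(fold_improvement)

for fold in (10, 100):
    print(f"{fold}-fold -> {fold_to_ddg(fold):.2f} kcal/mol")
```

Under that temperature assumption, each factor of ten in affinity corresponds to roughly 1.4 kcal/mol of binding free energy.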

    Predicting binding poses and affinities for protein-ligand complexes in the 2015 D3R Grand Challenge using a physical model with a statistical parameter estimation

    The 2015 D3R Grand Challenge provided an opportunity to test our new model for the binding free energy of small molecules, as well as to assess our protocol to predict binding poses for protein-ligand complexes. Our pose predictions were ranked between 3rd and 9th for the HSP90 dataset, depending on the assessment metric. For the MAP4K dataset the ranks were widely dispersed, from 2nd to 35th depending on the assessment metric, which does not provide any insight into the accuracy of the method. The main success of our pose prediction protocol was the re-scoring stage using the recently developed Convex-PL potential. We make a thorough analysis of our docking predictions made with AutoDock Vina and discuss the effect of the choice of rigid receptor templates, the number of flexible residues in the binding pocket, the binding pocket size, and the benefits of re-scoring. However, the main challenge was to predict experimentally determined binding affinities for two blind test sets. Our affinity prediction model consisted of two terms, a pairwise-additive enthalpy and a non-pairwise-additive entropy. We trained the free parameters of the model with a regularized regression using affinity and structural data from the PDBBind database. Our model performed very well on the training set but failed on the two test sets. We explain the drawbacks and pitfalls of our model, in particular in terms of relative coverage of the test set by the training set and missed dynamical properties from crystal structures, and discuss different routes to improve it.
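
The abstract above mentions fitting the free parameters of a two-term model with a regularized regression. As an illustration only, the sketch below fits a linear model by ridge (L2-regularized) regression in pure Python. Ridge is an assumed choice of regularizer, the two feature columns merely stand in for an enthalpy-like and an entropy-like descriptor, and none of this reproduces the authors' actual model or data.

```python
def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y by Gaussian elimination
    with partial pivoting. X is a list of feature rows."""
    n_feat = len(X[0])
    # Normal-equation matrix A and right-hand side b.
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(n_feat)] for i in range(n_feat)]
    b = [sum(row[i] * t for row, t in zip(X, y)) for i in range(n_feat)]
    # Forward elimination.
    for col in range(n_feat):
        pivot = max(range(col, n_feat), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n_feat):
            factor = A[r][col] / A[col][col]
            for c in range(col, n_feat):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    w = [0.0] * n_feat
    for i in reversed(range(n_feat)):
        s = sum(A[i][j] * w[j] for j in range(i + 1, n_feat))
        w[i] = (b[i] - s) / A[i][i]
    return w

def predict(X, w):
    """Linear prediction: dot product of each feature row with w."""
    return [sum(xi * wi for xi, wi in zip(row, w)) for row in X]

# Toy data (made up for illustration): two descriptors per complex.
X_train = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y_train = [2.0, 3.0, 5.0]
weights = ridge_fit(X_train, y_train, lam=1.0)
print(weights, predict([[2.0, 1.0]], weights))
```

Increasing `lam` shrinks the weights toward zero, which trades training-set fit for stability. It does not by itself address the issue the abstract raises, namely test complexes that are poorly covered by the training set.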

    Predicting Binding Affinity of CSAR Ligands Using Both Structure-Based and Ligand-Based Approaches

    We report on the prediction accuracy of ligand-based (2D QSAR) and structure-based (MedusaDock) methods used both independently and in consensus for ranking the congeneric series of ligands binding to three protein targets (UK, ERK2, and CHK1) from the CSAR 2011 benchmark exercise. An ensemble of predictive QSAR models was developed using known binders of these three targets extracted from the publicly available ChEMBL database. Selected models were used to predict the binding affinity of CSAR compounds towards the corresponding targets and rank them accordingly; the overall ranking accuracy evaluated by Spearman correlation was as high as 0.78 for UK, 0.60 for ERK2, and 0.56 for CHK1, placing our predictions in the top 10% among all the participants. In parallel, MedusaDock, designed to predict reliable docking poses, was also used for ranking the CSAR ligands according to their docking scores; the resulting accuracies (Spearman correlation) for UK, ERK2, and CHK1 were 0.76, 0.31, and 0.26, respectively. In addition, we explored several consensus approaches combining MedusaDock and QSAR predicted ranks; the best approach yielded Spearman correlation coefficients for UK, ERK2, and CHK1 of 0.82, 0.50, and 0.45, respectively. This study shows that (i) externally validated 2D QSAR models were capable of ranking CSAR ligands at least as accurately as more computationally intensive structure-based approaches used both by us and by other groups and (ii) ligand-based QSAR models can complement structure-based approaches by boosting prediction performance when used in consensus.
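
The ranking metric and the idea of a rank-based consensus described in this abstract can be sketched in a few lines. This is a minimal illustration, assuming a simple average of per-method ranks as the consensus rule; the abstract does not say which consensus schemes were actually tried, and the function names are hypothetical. All score lists are assumed to be oriented the same way (for example, higher means stronger predicted binding), so docking energies would need their sign flipped first.

```python
from statistics import mean

def ranks(values):
    """1-based ranks, smallest value gets rank 1; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def consensus_ranks(*score_lists):
    """Average the per-method ranks of each ligand (assumed consensus rule)."""
    per_method = [ranks(scores) for scores in score_lists]
    return [mean(column) for column in zip(*per_method)]

# Made-up scores for four ligands, for illustration only.
experimental = [5.1, 6.3, 7.0, 8.2]
qsar_scores = [5.0, 6.8, 6.5, 8.0]
docking_scores = [4.0, 5.5, 7.5, 7.0]
combined = consensus_ranks(qsar_scores, docking_scores)
print(spearman(experimental, combined))
```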

    Pre-Training on Large-Scale Generated Docking Conformations with HelixDock to Unlock the Potential of Protein-ligand Structure Prediction Models

    Protein-ligand structure prediction is an essential task in drug discovery, predicting the binding interactions between small molecules (ligands) and target proteins (receptors). Although conventional physics-based docking tools are widely utilized, their accuracy is compromised by limited conformational sampling and imprecise scoring functions. Recent advances have incorporated deep learning techniques to improve the accuracy of structure prediction. Nevertheless, the experimental validation of docking conformations remains costly, which raises concerns regarding the generalizability of these deep learning-based methods due to the limited training data. In this work, we show that by pre-training a geometry-aware SE(3)-equivariant neural network on large-scale docking conformations generated by traditional physics-based docking tools and then fine-tuning with a limited set of experimentally validated receptor-ligand complexes, we can achieve outstanding performance. This process involved the generation of 100 million docking conformations, consuming roughly 1 million CPU core days. The proposed model, HelixDock, aims to acquire the physical knowledge encapsulated by the physics-based docking tools during the pre-training phase. HelixDock has been benchmarked against both physics-based and deep learning-based baselines, showing that it outperforms its closest competitor by over 40% for RMSD. HelixDock also exhibits enhanced performance on a dataset that poses a greater challenge, thereby highlighting its robustness. Moreover, our investigation reveals the scaling laws governing pre-trained structure prediction models, indicating a consistent enhancement in performance with increases in model parameters and pre-training data. This study illuminates the strategic advantage of leveraging a vast and varied repository of generated data to advance the frontiers of AI-driven drug discovery.
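
The benchmark figure in this abstract is expressed in terms of RMSD between predicted and reference ligand poses. The sketch below shows the plain heavy-atom RMSD for poses given in the same receptor frame with matching atom order, plus a success-rate helper. The 2.0 Å cutoff is a common convention in docking studies and is an assumption here, as are the function names; symmetry-aware atom matching is deliberately left out.

```python
import math

def rmsd(predicted, reference):
    """Root-mean-square deviation between two poses, each a list of
    (x, y, z) tuples with atoms in the same order. No superposition
    or symmetry correction is applied."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("poses must be non-empty and of equal length")
    squared = sum((p[k] - r[k]) ** 2
                  for p, r in zip(predicted, reference)
                  for k in range(3))
    return math.sqrt(squared / len(predicted))

def success_rate(rmsd_values, threshold=2.0):
    """Fraction of poses with RMSD at or below the threshold (in Å).
    The default threshold is an assumed convention."""
    return sum(value <= threshold for value in rmsd_values) / len(rmsd_values)

# Made-up coordinates, for illustration only.
reference_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
predicted_pose = [(0.3, 0.0, 0.0), (1.5, 0.4, 0.0)]
print(rmsd(predicted_pose, reference_pose))
```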

    Docking rigid macrocycles using Convex-PL, AutoDock Vina, and RDKit in the D3R Grand Challenge 4

    The D3R Grand Challenge 4 provided a brilliant opportunity to test macrocyclic docking protocols on diverse, high-quality experimental data. We participated in both the pose and affinity prediction exercises. Overall, we aimed to use an automated structure-based docking pipeline built around a set of tools developed in our team. This exercise again demonstrated the crucial importance of correct local ligand geometry for the overall success of docking. Starting from the second part of the pose prediction stage, we developed a stable pipeline for sampling macrocycle conformers. This resulted in sub-angstrom average precision of our pose predictions. In the affinity prediction exercise we obtained average results. However, we could improve these when using docking poses submitted by the best predictors. Our docking tools, including the Convex-PL scoring function, are available at https://team.inria.fr/nano-d/software/

    Modeling Protein-Ligand Interactions with Applications to Drug Design

    3D Convolutional Neural Networks for Computational Drug Discovery

    This thesis describes aspects of the implementation and application of voxel-based convolutional neural networks (CNNs) to problems in computational drug discovery. It opens by justifying the novelty of this approach by presenting a more mainstream approach to the common tasks of virtual screening and binding pose prediction, augmented with more simplistic machine learning methods, and demonstrating their suboptimal performance when applied prospectively. It then describes my contributions to our group’s development of voxel-based CNNs as we honed their implementation and training strategy, and reports our library that facilitates featurization and training using this approach. It continues with a prospective assessment of their performance, analogous to the first prospective evaluation, with the addition of a novel CNN-based pose sampling strategy. Next it makes a foray into model explanation, first in an oblique fashion, by examining the transferability of models to tasks that are distinct from but related to the tasks for which they were trained, and by a comparison with an approach based on exploiting dataset bias using other machine learning methods. Finally it describes the implementation of a more direct approach to model explanation, by using a trained network to perform optimization of inputs with respect to the network as a whole or individual nodes and analyzing the content of the result as well as its utility as a pseudo-pharmacophore.
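
A voxel-based CNN takes a 3D grid as input, so atoms must first be converted into per-channel densities on a regular lattice. The sketch below illustrates that featurization step in pure Python. It is an assumption-laden toy: the Gaussian density, the grid size, the resolution, and the `voxelize` function itself are all hypothetical and do not reproduce the library reported in the thesis.

```python
import math

def voxelize(atoms, center, n_channels, box_size=4.0, resolution=1.0, sigma=1.0):
    """Map atoms onto a cubic grid of Gaussian densities.

    atoms: list of (x, y, z, channel) tuples, where channel is an integer
           atom-type index in range(n_channels).
    Returns a nested list indexed as grid[channel][i][j][k].
    All default parameter values are illustrative assumptions.
    """
    n = int(round(box_size / resolution))
    origin = [c - box_size / 2.0 for c in center]
    grid = [[[[0.0] * n for _ in range(n)] for _ in range(n)]
            for _ in range(n_channels)]
    for x, y, z, channel in atoms:
        for i in range(n):
            vx = origin[0] + (i + 0.5) * resolution  # voxel-center x
            for j in range(n):
                vy = origin[1] + (j + 0.5) * resolution
                for k in range(n):
                    vz = origin[2] + (k + 0.5) * resolution
                    d2 = (x - vx) ** 2 + (y - vy) ** 2 + (z - vz) ** 2
                    grid[channel][i][j][k] += math.exp(-d2 / (2.0 * sigma ** 2))
    return grid

# One atom of type 0 placed exactly on a voxel center (made-up input).
grid = voxelize([(0.5, 0.5, 0.5, 0)], center=(0.0, 0.0, 0.0), n_channels=2)
print(grid[0][2][2][2])
```

A production featurizer would normally use an array library, truncate each Gaussian at a cutoff radius, and apply random rotations and translations during training; those details are omitted here.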

    A teach-discover-treat application of ZincPharmer: An online interactive pharmacophore modeling and virtual screening tool

    The 2012 Teach-Discover-Treat (TDT) community-wide experiment provided a unique opportunity to test prospective virtual screening protocols targeting the anti-malarial target dihydroorotate dehydrogenase (DHODH). Facilitated by ZincPharmer, an open-access online interactive pharmacophore search of the ZINC database, the experience resulted in the development of a novel classification scheme that successfully predicted the bound structure of a non-triazolopyrimidine inhibitor, as well as an overall hit rate of 27% of tested active compounds from multiple novel chemical scaffolds. The general approach entailed exhaustively building and screening sparse pharmacophore models comprising a minimum of three features for each bound ligand in all available DHODH co-crystals and iteratively adding features that increased the number of known binders returned by the query. Collectively, the TDT experiment provided a unique opportunity to teach computational methods of drug discovery, develop innovative methodologies, and prospectively discover new compounds active against DHODH.