17 research outputs found

    Evaluation and Optimization of Virtual Screening Workflows with DEKOIS 2.0 – A Public Library of Challenging Docking Benchmark Sets

    The application of molecular benchmarking sets helps to assess the actual performance of virtual screening (VS) workflows. To improve the efficiency of structure-based VS approaches, the selection and optimization of various parameters can be guided by benchmarking. With the DEKOIS 2.0 library, we aim to further extend and complement the collection of publicly available decoy sets. Based on BindingDB bioactivity data, we provide 81 new and structurally diverse benchmark sets for a wide variety of target classes. To ensure a meaningful selection of ligands, we address several issues that can be found in bioactivity data. We have improved our previously introduced DEKOIS methodology with enhanced physicochemical matching, now including the consideration of molecular charges, as well as a more sophisticated elimination of latent actives in the decoy set (LADS). We evaluate the docking performance of Glide, GOLD, and AutoDock Vina with our data sets and highlight existing challenges for VS tools. All DEKOIS 2.0 benchmark sets will be made accessible at http://www.dekois.com.

    Using Surface Scans for the Evaluation of Halogen Bonds toward the Side Chains of Aspartate, Asparagine, Glutamate, and Glutamine

    Using halogen-specific Connolly-type molecular surfaces, we introduce a new type of surface-based interaction analysis for studying halogen bonding toward model systems of biologically relevant carboxylates (ASP/GLU) and carboxamides (ASN/GLN). Database mining and statistical assessment of the PDB revealed that such interactions are currently widely underrepresented. We observed important distance-dependent adaptations of the binding modes of halobenzenes, from a preferential oxygen-directed to a bifurcated interaction geometry of the carboxylate. In addition, halogen···π contacts perpendicular to the nitrogen atom of the carboxamide become increasingly important for the lighter halogens. Our analysis at the MP2/TZVPP level of theory is backed by CCSD(T)/CBS reference calculations. To put the large interaction energies into perspective, we also performed COSMO-RS calculations of the solvation free energy. By facilitating the visualization of our results mapped onto any binding site of choice, we aim to inspire more design studies showcasing these underrepresented interactions.

    Targeting Histidine Side Chains in Molecular Design through Nitrogen–Halogen Bonds

    Halogen bonds are directional noncovalent interactions that can be used to target electron donors in a protein binding site. In this study, we employ quantum chemical calculations to explore halogen···nitrogen contacts involving histidine side chains. We characterize the energetics at the MP2 level of theory, using SCS-MP2 and CCSD(T)/CBS as reference calculations, and elucidate their energy profile in suboptimal geometries. We derive simple rules allowing medicinal chemists and chemical biologists to easily determine preferred areas of interaction in a binding site and exploit them for scaffold decoration and design. Our work shows that nitrogen–halogen bonds are valuable interactions that are thus far underexploited in patent applications, lead structure selection, and clinical candidate selection. We highlight their potential to increase binding affinities and suggest that they can significantly contribute to inducing and tuning subtype selectivities.

    Machine Learning Estimates of Natural Product Conformational Energies

    Machine learning has been used for estimation of potential energy surfaces to speed up molecular dynamics simulations of small systems. We demonstrate that this approach is feasible for significantly larger, structurally complex molecules, taking the natural product Archazolid A, a potent inhibitor of vacuolar-type ATPase, from the myxobacterium Archangium gephyra as an example. Our model estimates energies of new conformations by exploiting information from previous calculations via Gaussian process regression. Predictive variance is used to assess whether a conformation is in the interpolation region, allowing a controlled trade-off between prediction accuracy and computational speed-up. For energies of relaxed conformations at the density functional level of theory (implicit solvent, DFT/BLYP-disp3/def2-TZVP), mean absolute errors of less than 1 kcal/mol were achieved. The study demonstrates that predictive machine learning models can be developed for structurally complex, pharmaceutically relevant compounds, potentially enabling considerable speed-ups in simulations of larger molecular structures.
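The idea of using the predictive variance to decide whether a conformation lies in the interpolation region can be illustrated with a toy Gaussian process. The sketch below is not the authors' model: it assumes a single scalar descriptor per conformation, a zero-mean prior, a squared-exponential kernel, and a hypothetical variance cutoff `max_var`; all function names are made up for illustration.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel on scalar descriptors (assumed kernel)."""
    return math.exp(-((a - b) ** 2) / (2.0 * length ** 2))

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_chol(L, b):
    """Solve (L L^T) x = b by forward and backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(x_train, y_train, x_new, length=1.0, noise=1e-6):
    """Return the zero-mean GP posterior mean and variance at x_new."""
    n = len(x_train)
    K = [[rbf(x_train[i], x_train[j], length) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_chol(L, y_train)                 # K^{-1} y
    k_star = [rbf(x, x_new, length) for x in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_chol(L, k_star)                      # K^{-1} k_*
    var = rbf(x_new, x_new, length) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, max(var, 0.0)

def predict_or_defer(x_train, y_train, x_new, max_var=0.05):
    """Accept the ML estimate only if the predictive variance is small.

    max_var is a hypothetical cutoff; a rejected point would be sent to
    the expensive reference calculation instead.
    """
    mean, var = gp_predict(x_train, y_train, x_new)
    return mean, var, var <= max_var

# Toy data: descriptor values and (made-up) relative energies.
x_train = [0.0, 1.0, 2.0, 3.0]
y_train = [0.0, 0.8, 0.9, 0.1]
inside = predict_or_defer(x_train, y_train, 1.5)   # between training points
outside = predict_or_defer(x_train, y_train, 10.0)  # far from all training points
```

Raising `max_var` accepts more ML predictions (more speed-up) at the cost of larger expected errors; lowering it does the opposite.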

    Performance of ML models trained separately on each individual MD run and tested on the other MD runs.

    RMSE: root mean square error (kJ/mol), MAE: mean absolute error (kJ/mol), MAE (%): MAE as a percentage of the range of training set energy values, R²: squared Pearson correlation coefficient.
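For reference, the four quantities defined in this caption can be computed as in the following minimal sketch. The function name `regression_metrics` and the example numbers are illustrative assumptions, not values from the study.

```python
import math

def regression_metrics(y_true, y_pred, train_energies):
    """Compute RMSE, MAE, MAE as % of the training energy range, and R^2.

    y_true / y_pred: reference and predicted energies of the test set (kJ/mol).
    train_energies: energies of the training set, used only for the range.
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    energy_range = max(train_energies) - min(train_energies)
    mae_pct = 100.0 * mae / energy_range

    # Squared Pearson correlation between reference and predicted values.
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    r2 = cov * cov / (var_t * var_p)

    return {"rmse": rmse, "mae": mae, "mae_pct": mae_pct, "r2": r2}

# Made-up example values, in kJ/mol.
metrics = regression_metrics(
    y_true=[0.0, 10.0, 20.0, 30.0],
    y_pred=[1.0, 9.0, 22.0, 28.0],
    train_energies=[0.0, 15.0, 30.0],
)
```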

    Influence of sampling.

    Shown are smoothed PCA maps of absolute prediction errors for ML models trained on individual MD data (top row) and ML models trained on randomized subsets of all MD data (bottom row). Color indicates magnitude of error (blue = low, red = high); training samples are shown as black dots.

    Learning using predictive variance.

    Shown is the trade-off between mean absolute error (MAE, solid line, left scale) and number of predicted conformations (m, dashed line, right scale). Results are averaged over all possible orderings of the four MD runs (4! = 24; standard deviations ca. 0.4 kJ/mol and 35 samples). Squared correlation is R² = 0.99.
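A trade-off of this kind can be sketched as follows, under the assumption that it arises from sweeping a cutoff on the predictive variance: conformations below the cutoff are predicted by the model, and the MAE is computed over those only. The function name, cutoff values, and data are hypothetical; the caption does not state how the curve was actually generated.

```python
from itertools import permutations

def tradeoff_curve(abs_errors, variances, thresholds):
    """For each variance cutoff, return (cutoff, m, MAE over accepted points).

    m is the number of conformations whose predictive variance does not
    exceed the cutoff; MAE is None when no conformation is accepted.
    """
    curve = []
    for cutoff in thresholds:
        accepted = [e for e, v in zip(abs_errors, variances) if v <= cutoff]
        m = len(accepted)
        mae = sum(accepted) / m if m else None
        curve.append((cutoff, m, mae))
    return curve

# Made-up absolute errors (kJ/mol) and predictive variances.
curve = tradeoff_curve(
    abs_errors=[0.2, 0.5, 1.5, 3.0],
    variances=[0.01, 0.05, 0.2, 0.8],
    thresholds=[0.02, 0.1, 1.0],
)

# The four MD runs can be ordered in 4! = 24 ways; results such as the
# curve above would be averaged over these orderings.
orderings = list(permutations(["run1", "run2", "run3", "run4"]))
```

In this toy data, a looser cutoff increases m and also increases the MAE, which is the qualitative behaviour the caption describes.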

    Projection of MD conformations of Archazolid A onto two dimensions by principal component analysis.

    Shown are the distribution of individual conformations (left) and the smoothed energy landscape generated by LiSARD [52] (right). Labels indicate reported NMR-motivated structures (A = c5a, B = c5b, P = nmr) and lowest-energy MD conformations (8, 595, 40). Color coding is from lowest (blue) to highest (red) relative energy.

    Performance of ML models trained on randomized subsets of increasing size of the complete MD data.

    See Table 1 for abbreviations.