    Mol2vec: Unsupervised Machine Learning Approach with Chemical Intuition

    Inspired by natural language processing techniques, we here introduce Mol2vec, which is an unsupervised machine learning approach to learn vector representations of molecular substructures. Like the Word2vec models, where vectors of closely related words are in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Compounds can finally be encoded as vectors by summing the vectors of the individual substructures and, for instance, be fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pretrained once, yields dense vector representations, and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as a reference compound representation. Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment-independent and thus can also be easily used for proteins with low sequence similarities
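
The summation step described in this abstract can be sketched as follows. This is a minimal illustration, not the Mol2vec implementation: the substructure identifiers, the toy embedding table, and the function name `embed_compound` are hypothetical, and skipping substructures that are missing from the table is an assumption made here for simplicity.

```python
from typing import Dict, List, Sequence


def embed_compound(
    substructure_ids: Sequence[str],
    embeddings: Dict[str, List[float]],
    dim: int,
) -> List[float]:
    """Encode a compound as the sum of its substructure vectors."""
    total = [0.0] * dim
    for sub_id in substructure_ids:
        vec = embeddings.get(sub_id)
        if vec is None:
            # Assumption: substructures unknown to the table are skipped.
            continue
        for i, value in enumerate(vec):
            total[i] += value
    return total


# Toy, made-up 2-dimensional embeddings for two hypothetical substructures.
toy_embeddings = {"sub_a": [1.0, 2.0], "sub_b": [0.5, -1.0]}
compound_vector = embed_compound(["sub_a", "sub_b"], toy_embeddings, dim=2)
```

The resulting dense vector is what would then be passed to a supervised model as the compound's feature representation.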

    From Cancer to Pain Target by Automated Selectivity Inversion of a Clinical Candidate

    Elimination of inadvertent binding is crucial for inhibitor design targeting conserved protein classes like kinases. Compounds in clinical trials provide a rich source for initiating drug design efforts by exploiting such secondary binding events. Considering both aspects, we shifted the selectivity of tozasertib, originally developed against AurA as cancer target, toward the pain target TrkA. First, selectivity-determining features in binding pockets were identified by fusing interaction grids of several key and off-target conformations. A focused library was subsequently created and prioritized using a multiobjective selection scheme that filters for selective and highly active compounds based on orthogonal methods grounded in computational chemistry and machine learning. Eighteen high-ranking compounds were synthesized and experimentally tested. The top-ranked compound has 10000-fold improved selectivity versus AurA, nanomolar cellular activity, and is highly selective in a kinase panel. This was achieved in a single round of automated in silico optimization, highlighting the power of recent advances in computer-aided drug design to automate design and selection processes

    Coupling Matched Molecular Pairs with Machine Learning for Virtual Compound Optimization

    Matched molecular pair (MMP) analyses are widely used in compound optimization projects to gain insights into structure–activity relationships (SAR). The analysis is traditionally done via statistical methods but can also be employed together with machine learning (ML) approaches to extrapolate to novel compounds. The here introduced MMP/ML method combines a fragment-based MMP implementation with different machine learning methods to obtain automated SAR decomposition and prediction. To test the prediction capabilities and model transferability, two different compound optimization scenarios were designed: (1) “new fragments” which occurs when exploring new fragments for a defined compound series and (2) “new static core and transformations” which resembles for instance the identification of a new compound series. Very good results were achieved by all employed machine learning methods especially for the new fragments case, but overall deep neural network models performed best, allowing reliable predictions also for the new static core and transformations scenario, where comprehensive SAR knowledge of the compound series is missing. Furthermore, we show that models trained on all available data have a higher generalizability compared to models trained on focused series and can extend beyond chemical space covered in the training data. Thus, coupling MMP with deep neural networks provides a promising approach to make high quality predictions on various data sets and in different compound optimization scenarios

    Identification and Visualization of Kinase-Specific Subpockets

    The identification and design of selective compounds is important for the reduction of unwanted side effects as well as for the development of tool compounds for target validation studies. This is, in particular, true for therapeutically important protein families that possess conserved folds and have numerous members such as kinases. To support the design of selective kinase inhibitors, we developed a novel approach that allows identification of specificity determining subpockets between closely related kinases solely based on their three-dimensional structures. To account for the intrinsic flexibility of the proteins, multiple X-ray structures of the target protein of interest as well as of unwanted off-target(s) are taken into account. The binding pockets of these protein structures are calculated and fused to a combined target and off-target pocket, respectively. Subsequently, shape differences between these two combined pockets are identified via fusion rules. The approach provides a user-friendly visualization of target-specific areas in a binding pocket which should be explored when designing selective compounds. Furthermore, the approach can be easily combined with in silico alanine mutation studies to identify selectivity determining residues. The potential impact of the approach is demonstrated in four retrospective experiments on closely related kinases, i.e., p38α vs Erk2, PAK1 vs PAK4, ITK vs AurA, and BRAF vs VEGFR2. Overall, the presented approach does not require any profiling data for training purposes, provides an intuitive visualization of a large number of protein structures at once, and could also be applied to other target classes
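
The fuse-then-compare idea in this abstract can be illustrated with a small sketch. The abstract does not specify the fusion rules, so this example assumes the simplest possible one: each pocket is a set of occupied grid points, fusion is a set union over all structures, and the target-specific region is whatever the fused target pocket covers that the fused off-target pocket does not. The function names and the grid representation are hypothetical.

```python
from typing import Iterable, Set, Tuple

GridPoint = Tuple[int, int, int]


def fuse_pockets(pockets: Iterable[Set[GridPoint]]) -> Set[GridPoint]:
    """Fuse several pockets into one (assumed rule: union of grid points)."""
    fused: Set[GridPoint] = set()
    for pocket in pockets:
        fused |= pocket
    return fused


def target_specific_region(
    target_pockets: Iterable[Set[GridPoint]],
    off_target_pockets: Iterable[Set[GridPoint]],
) -> Set[GridPoint]:
    """Grid points reachable in the fused target pocket but not off-target."""
    return fuse_pockets(target_pockets) - fuse_pockets(off_target_pockets)


# Toy pockets from two target structures and two off-target structures.
target = [{(0, 0, 0), (1, 0, 0)}, {(1, 0, 0), (2, 0, 0)}]
off_target = [{(0, 0, 0)}, {(1, 0, 0)}]
specific = target_specific_region(target, off_target)
```

Using several structures per protein, as the abstract describes, means a grid point counts as accessible if any conformation exposes it, which is one simple way to account for protein flexibility.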

    Additional file 1: of KinMap: a web-based tool for interactive navigation through human kinome data

    (KinMap_Examples.zip) contains the input CSV files used to generate the annotated kinome trees in Fig. 1 (Example_1_Erlotinib_NSCLC.csv), Fig. 2a (Example_2_Sunitinib_Sorafenib_Cancer.csv), and Fig. 2b (Example_3_Kinase_Stats.csv). (ZIP 5 kb)

    Function of the d‑Alanine:d‑Alanine Ligase Lid Loop: A Molecular Modeling and Bioactivity Study

    d-Alanine:d-alanine ligase (Ddl) is an essential ATP-dependent bacterial enzyme involved in peptidoglycan biosynthesis. Discovery of Ddl inhibitors not competitive with ATP has proven to be difficult because the Ddl bimolecular d-alanine binding pocket is very restricted, as is accessibility to the active site for larger molecules in the catalytically active closed conformation of Ddl. A molecular dynamics study of the opening and closing of the Ddl lid loop informs future structure-based design efforts that allow for the flexibility of Ddl. A virtual screen on generated enzyme conformations yielded some hit inhibitors whose bioactivity was determined

    Selective Inhibitors of Aldo-Keto Reductases AKR1C1 and AKR1C3 Discovered by Virtual Screening of a Fragment Library

    Human aldo-keto reductases 1C1–1C4 (AKR1C1–AKR1C4) function in vivo as 3-keto-, 17-keto-, and 20-ketosteroid reductases and regulate the activity of androgens, estrogens, and progesterone and the occupancy and transactivation of their corresponding receptors. Aberrant expression and action of AKR1C enzymes can lead to different pathophysiological conditions. AKR1C enzymes thus represent important targets for development of new drugs. We performed a virtual high-throughput screen of a fragment library that was followed by biochemical evaluation on AKR1C1–AKR1C4 enzymes. Twenty-four structurally diverse compounds were discovered with low μM Ki values for AKR1C1, AKR1C3, or both. Two structural series included the salicylates and the N-phenylanthranilic acids, and additionally a series of inhibitors with completely novel scaffolds was discovered. Two of the best selective AKR1C3 inhibitors had Ki values of 0.1 and 2.7 μM, exceeding expected activity for fragments. The compounds identified represent an excellent starting point for further hit-to-lead development.