22 research outputs found

Mol2vec: Unsupervised Machine Learning Approach with Chemical Intuition

Author: Sabrina Jaeger (4525846)
Samo Turk (351944)
Simone Fulle (1294221)
Publication date: 23/10/2017

Inspired by natural language processing techniques, we introduce Mol2vec, an unsupervised machine learning approach that learns vector representations of molecular substructures. As in Word2vec models, where vectors of closely related words lie in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Compounds can then be encoded as vectors by summing the vectors of their individual substructures and, for instance, fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pretrained once, yields dense vector representations, and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as a reference compound representation. Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment-independent and thus can also be easily used for proteins with low sequence similarities.
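The encoding step described in this abstract (a compound vector obtained by summing its substructure vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: the embedding table `TOY_EMBEDDINGS`, the identifiers `sub_a`, `sub_b`, `sub_c`, and the function `encode_compound` are hypothetical, and skipping unseen substructures is an assumption made only to keep the example short.

```python
from typing import Dict, List, Sequence

# Hypothetical toy embeddings (substructure identifier -> dense vector).
# In a real setting these would come from a trained Word2vec-style model;
# the names and values here are made up for illustration.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "sub_a": [0.5, -1.0, 2.0],
    "sub_b": [1.5, 0.0, -1.0],
    "sub_c": [-1.0, 2.0, 0.5],
}


def encode_compound(
    substructures: Sequence[str],
    embeddings: Dict[str, List[float]],
    dim: int,
) -> List[float]:
    """Sum the vectors of all substructures found in one compound."""
    vector = [0.0] * dim
    for key in substructures:
        sub_vec = embeddings.get(key)
        if sub_vec is None:
            # Assumption of this sketch: unseen substructures are skipped.
            continue
        for i, value in enumerate(sub_vec):
            vector[i] += value
    return vector


# A compound in which "sub_a" occurs twice and "sub_b" once.
compound_vector = encode_compound(["sub_a", "sub_b", "sub_a"], TOY_EMBEDDINGS, 3)
print(compound_vector)
```

Because the result is a fixed-length dense vector regardless of how many substructures the compound contains, it can be passed directly to a supervised model as a feature row.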

FigShare

From Cancer to Pain Target by Automated Selectivity Inversion of a Clinical Candidate

Author: Benjamin Merget (230356)
Sameh Eid (806707)
Samo Turk (351944)
Simone Fulle (1294221)

Elimination of inadvertent binding is crucial for inhibitor design targeting conserved protein classes like kinases. Compounds in clinical trials provide a rich source for initiating drug design efforts by exploiting such secondary binding events. Considering both aspects, we shifted the selectivity of tozasertib, originally developed against AurA as cancer target, toward the pain target TrkA. First, selectivity-determining features in binding pockets were identified by fusing interaction grids of several key and off-target conformations. A focused library was subsequently created and prioritized using a multiobjective selection scheme that filters for selective and highly active compounds based on orthogonal methods grounded in computational chemistry and machine learning. Eighteen high-ranking compounds were synthesized and experimentally tested. The top-ranked compound has 10000-fold improved selectivity versus AurA, nanomolar cellular activity, and is highly selective in a kinase panel. This was achieved in a single round of automated in silico optimization, highlighting the power of recent advances in computer-aided drug design to automate design and selection processes.

FigShare

Coupling Matched Molecular Pairs with Machine Learning for Virtual Compound Optimization

Author: Benjamin Merget (230356)
Friedrich Rippmann (1444003)
Samo Turk (351944)
Simone Fulle (1294221)

Matched molecular pair (MMP) analyses are widely used in compound optimization projects to gain insights into structure–activity relationships (SAR). The analysis is traditionally done via statistical methods but can also be employed together with machine learning (ML) approaches to extrapolate to novel compounds. The MMP/ML method introduced here combines a fragment-based MMP implementation with different machine learning methods to obtain automated SAR decomposition and prediction. To test the prediction capabilities and model transferability, two different compound optimization scenarios were designed: (1) “new fragments,” which occurs when exploring new fragments for a defined compound series, and (2) “new static core and transformations,” which resembles, for instance, the identification of a new compound series. Very good results were achieved by all employed machine learning methods, especially for the new fragments case, but overall deep neural network models performed best, allowing reliable predictions also for the new static core and transformations scenario, where comprehensive SAR knowledge of the compound series is missing. Furthermore, we show that models trained on all available data have higher generalizability than models trained on focused series and can extend beyond the chemical space covered in the training data. Thus, coupling MMP with deep neural networks provides a promising approach to make high-quality predictions on various data sets and in different compound optimization scenarios.

FigShare

Identification and Visualization of Kinase-Specific Subpockets

Author: Andrea Volkamer (1444000)
Friedrich Rippmann (1444003)
Sameh Eid (806707)
Samo Turk (351944)
Simone Fulle (1294221)

The identification and design of selective compounds is important for the reduction of unwanted side effects as well as for the development of tool compounds for target validation studies. This is, in particular, true for therapeutically important protein families that possess conserved folds and have numerous members, such as kinases. To support the design of selective kinase inhibitors, we developed a novel approach that allows identification of specificity-determining subpockets between closely related kinases solely based on their three-dimensional structures. To account for the intrinsic flexibility of the proteins, multiple X-ray structures of the target protein of interest as well as of unwanted off-target(s) are taken into account. The binding pockets of these protein structures are calculated and fused to a combined target and off-target pocket, respectively. Subsequently, shape differences between these two combined pockets are identified via fusion rules. The approach provides a user-friendly visualization of target-specific areas in a binding pocket which should be explored when designing selective compounds. Furthermore, the approach can be easily combined with in silico alanine mutation studies to identify selectivity-determining residues. The potential impact of the approach is demonstrated in four retrospective experiments on closely related kinases, i.e., p38α vs Erk2, PAK1 vs PAK4, ITK vs AurA, and BRAF vs VEGFR2. Overall, the presented approach does not require any profiling data for training purposes, provides an intuitive visualization of a large number of protein structures at once, and could also be applied to other target classes.

FigShare

Additional file 1: of KinMap: a web-based tool for interactive navigation through human kinome data

Author: Andrea Volkamer (1444000)
Friedrich Rippmann (1444003)
Sameh Eid (806707)
Samo Turk (351944)
Simone Fulle (1294221)

(KinMap_Examples.zip) contains the input CSV files used to generate the annotated kinome trees in Fig. 1 (Example_1_Erlotinib_NSCLC.csv), Fig. 2a (Example_2_Sunitinib_Sorafenib_Cancer.csv), and Fig. 2b (Example_3_Kinase_Stats.csv). (ZIP 5 kb)

FigShare

Profiling Prediction of Kinase Inhibitors: Toward the Virtual Assay

Author: Benjamin Merget (230356)
Friedrich Rippmann (1444003)
Sameh Eid (806707)
Samo Turk (351944)
Simone Fulle (1294221)

Kinome-wide screening would have the advantage of providing structure–activity relationships against hundreds of targets simultaneously. Here, we report the generation of ligand-based activity prediction models for over 280 kinases by employing Machine Learning methods on an extensive data set of proprietary bioactivity data combined with open data. High quality (AUC > 0.7) was achieved for ∼200 kinases by (1) combining open with proprietary data, (2) choosing Random Forest over alternative tested Machine Learning methods, and (3) balancing the training data sets. Tests on left-out and external data indicate a high value for virtual screening projects. Importantly, the derived models are evenly distributed across the kinome tree, allowing reliable profiling prediction for all kinase branches. The prediction quality was further improved by employing experimental bioactivity fingerprints of a small kinase subset. Overall, the generated models can support various hit identification tasks, including virtual screening, compound repurposing, and the detection of potential off-targets.
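The quality criterion quoted in this abstract (AUC > 0.7 per kinase model) can be made concrete with a small sketch. It computes the ROC AUC as the fraction of active/inactive pairs that a model ranks correctly, then keeps the models above the threshold. The kinase names, labels, and scores below are invented toy data, and `roc_auc` is a hypothetical helper written for this example, not code from the study.

```python
from typing import Dict, List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the share of (active, inactive) pairs ranked correctly.

    Ties between an active and an inactive score count as half a win.
    """
    actives = [s for l, s in zip(labels, scores) if l == 1]
    inactives = [s for l, s in zip(labels, scores) if l == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")
    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


# Toy data (hypothetical): per-kinase true labels and predicted scores.
toy_predictions: Dict[str, Tuple[List[int], List[float]]] = {
    "kinase_x": ([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]),
    "kinase_y": ([1, 0, 1, 0], [0.2, 0.8, 0.3, 0.7]),
}

AUC_THRESHOLD = 0.7  # quality cut-off quoted in the abstract

auc_by_kinase = {
    name: roc_auc(labels, scores)
    for name, (labels, scores) in toy_predictions.items()
}
high_quality = [name for name, auc in auc_by_kinase.items() if auc > AUC_THRESHOLD]
print(auc_by_kinase, high_quality)
```

The pairwise formulation is quadratic in the number of compounds, so it suits small illustrations; for large data sets a rank-based implementation from an established library would be the more practical choice.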

FigShare

Function of the d‑Alanine:d‑Alanine Ligase Lid Loop: A Molecular Modeling and Bioactivity Study

Author: Blaž Vehar (2032378)
Dušanka Janežič (1478134)
Janez Konc (348025)
Martina Hrast (145538)
Samo Turk (351944)
Stanislav Gobec (111843)

d-Alanine:d-alanine ligase (Ddl) is an essential ATP-dependent bacterial enzyme involved in peptidoglycan biosynthesis. Discovery of Ddl inhibitors not competitive with ATP has proven to be difficult because the Ddl bimolecular d-alanine binding pocket is very restricted, as is accessibility to the active site for larger molecules in the catalytically active closed conformation of Ddl. A molecular dynamics study of the opening and closing of the Ddl lid loop informs future structure-based design efforts that allow for the flexibility of Ddl. A virtual screen on generated enzyme conformations yielded some hit inhibitors whose bioactivity was determined.

FigShare

Selective Inhibitors of Aldo-Keto Reductases AKR1C1 and AKR1C3 Discovered by Virtual Screening of a Fragment Library

Author: Adegoke O. Adeniji (1294296)
Dušanka Janežič (1478134)
Janez Konc (348025)
Petra Brožič (2044423)
Samo Turk (351944)
Stanislav Gobec (111843)
Tea Lanišnik Rižner (2044426)
Trevor M. Penning (770742)

Human aldo-keto reductases 1C1–1C4 (AKR1C1–AKR1C4) function in vivo as 3-keto-, 17-keto-, and 20-ketosteroid reductases and regulate the activity of androgens, estrogens, and progesterone and the occupancy and transactivation of their corresponding receptors. Aberrant expression and action of AKR1C enzymes can lead to different pathophysiological conditions. AKR1C enzymes thus represent important targets for development of new drugs. We performed a virtual high-throughput screen of a fragment library that was followed by biochemical evaluation on AKR1C1–AKR1C4 enzymes. Twenty-four structurally diverse compounds were discovered with low μM Ki values for AKR1C1, AKR1C3, or both. Two structural series included the salicylates and the N-phenylanthranilic acids, and additionally a series of inhibitors with completely novel scaffolds was discovered. Two of the best selective AKR1C3 inhibitors had Ki values of 0.1 and 2.7 μM, exceeding expected activity for fragments. The compounds identified represent an excellent starting point for further hit-to-lead development.

FigShare