2 research outputs found

Generate What You Can Make: Achieving in-house synthesizability with readily available resources in de novo drug design

Authors: Alan Kai Hassen
Andrius Bernatavicius
Antonius P.A. Janssen
Darcy N.R. Reynolds
Djork-Arné Clevert
Gerard J.P. van Westen
Martin Sicho
Mike Preuss
Mirjam C.W. Huizenga
Sohvi Luukkonen
Yorick J. van Aalst
Publication venue
Publication date: 05/03/2024

Molecules generated by Computer-Aided Drug Design often lack the synthesizability needed to be valuable, because Computer-Aided Synthesis Planning (CASP) and CASP-based approximated synthesizability scores have rarely been used as generation objectives, even though they facilitate the in-silico generation of synthesizable molecules. Published scores approximate a general notion of CASP-based synthesizability that assumes nearly unlimited building block resources. However, this approach is disconnected from the reality of drug design in small laboratories, where building block resources are limited, making a notion of in-house synthesizability that uses already available resources highly desirable. In this work, we show a successful de novo drug design workflow that generates active and in-house synthesizable ligands of monoglyceride lipase (MGLL). We demonstrate the successful transfer of CASP from 17.4 million commercial building blocks to a small laboratory setting of roughly 6,000 building blocks, with a decrease of only 12% in CASP success. Moreover, we present a rapidly retrainable in-house synthesizability score that successfully captures our in-house synthesizability without relying on external building block resources. We show that including our in-house synthesizability score in a multi-objective de novo drug design workflow, alongside a simple QSAR model, provides thousands of potentially active and easily in-house synthesizable molecules. Further, we highlight differences between general and in-house synthesizability scores and demonstrate potential problems with the out-of-distribution predictive performance of synthesizability scores on generated molecules. Finally, we experimentally evaluate the synthesis and biochemical activity of three de novo candidates, following their CASP-suggested synthesis routes and using only in-house building blocks. We find one candidate with evident activity, suggesting potential new ligand ideas for MGLL inhibitors while showcasing the usefulness of our in-house synthesizability score.
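
The idea of synthesizability relative to a building block stock can be illustrated with a minimal Python sketch. It is not the authors' CASP pipeline: the `RouteNode` structure, the helper functions, the candidate routes, and both toy stocks are hypothetical and chosen only for illustration. The sketch assumes a planner has already proposed candidate routes per target, and counts a target as solved when at least one route starts exclusively from molecules in the given stock.

```python
from dataclasses import dataclass, field


@dataclass
class RouteNode:
    """One molecule in a retrosynthesis tree (hypothetical structure)."""
    smiles: str
    precursors: list = field(default_factory=list)  # child RouteNode objects


def leaf_building_blocks(node):
    """Collect the starting materials (leaves) of a route as a set of SMILES."""
    if not node.precursors:
        return {node.smiles}
    leaves = set()
    for child in node.precursors:
        leaves |= leaf_building_blocks(child)
    return leaves


def is_solved(candidate_routes, stock):
    """A target is solved if any candidate route uses only stock molecules."""
    return any(leaf_building_blocks(r) <= stock for r in candidate_routes)


def success_rate(targets, stock):
    """Fraction of targets with at least one route solvable from the stock."""
    if not targets:
        return 0.0
    solved = sum(is_solved(routes, stock) for routes in targets.values())
    return solved / len(targets)


def amide(product, amine, acylating_agent):
    """Build a one-step toy route: amine + acylating agent -> amide."""
    return RouteNode(product, [RouteNode(amine), RouteNode(acylating_agent)])


# Toy candidate routes (assumed, not taken from the paper).
targets = {
    "CC(=O)Nc1ccccc1": [
        amide("CC(=O)Nc1ccccc1", "Nc1ccccc1", "CC(=O)Cl"),
        amide("CC(=O)Nc1ccccc1", "Nc1ccccc1", "CC(=O)OC(C)=O"),
    ],
    "CC(=O)Nc1ccc(F)cc1": [
        amide("CC(=O)Nc1ccc(F)cc1", "Nc1ccc(F)cc1", "CC(=O)Cl"),
    ],
}

# A large "commercial" stock versus a small "in-house" stock (both toy sets).
commercial_stock = {"Nc1ccccc1", "Nc1ccc(F)cc1", "CC(=O)Cl", "CC(=O)OC(C)=O"}
in_house_stock = {"Nc1ccccc1", "CC(=O)OC(C)=O"}

commercial_rate = success_rate(targets, commercial_stock)
in_house_rate = success_rate(targets, in_house_stock)
```

In this toy setting the first target remains solvable in-house through its second route, while the second target depends on a building block that only the larger stock contains. A real planner would search for new routes against each stock rather than filter a fixed list, so the sketch understates how much a CASP tool can compensate for a smaller stock.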

ChemRxiv

QSPRpred: a Flexible Open-Source Quantitative Structure-Property Relationship Modelling Tool

Authors: Andrius Bernatavicius
David Alencar Araripe
Gerard J. P. van Westen
Helle W. van den Maagdenberg
J. G. Coen van Hasselt
Linde Schoenmaker
Marina Gorostiola González
Martin Šícho
Michiel Jespers
Olivier J. M. Béquignon
Piet H. van der Graaf
Remco L. van den Broek
Sohvi Luukkonen
Publication venue
Publication date: 05/03/2024

Building reliable and robust quantitative structure-property relationship (QSPR) models is a challenging task. First, the experimental data needs to be obtained, analyzed and curated. Second, the number of available methods is continuously growing and evaluating different algorithms and methodologies can be arduous. Finally, the last hurdle that researchers face is to ensure the reproducibility of their models and facilitate their transferability into practice. In this work, we introduce QSPRpred, a toolkit for analysis of bioactivity data sets and QSPR modelling, which attempts to address the aforementioned challenges. QSPRpred's modular Python API enables users to intuitively describe different parts of a modelling workflow using a plethora of pre-implemented components, but also integrate customized implementations in a "plug-and-play" manner. QSPRpred data sets and models are directly serializable, which means they can be readily reproduced and put into operation after training as the models are saved with all required data pre-processing steps to make predictions on new compounds directly from SMILES strings. The general-purpose character of QSPRpred is also demonstrated by inclusion of support for multi-task and proteochemometric modelling. The package is extensively documented and comes with a large collection of tutorials to help new users. In this paper, we describe all of QSPRpred's functionalities and also conduct a small benchmarking case study to illustrate how different components can be leveraged to compare a diverse set of models. QSPRpred is fully open-source and available at https://github.com/CDDLeiden/QSPRpred.

Scientific Contribution: QSPRpred aims to provide a complex, but comprehensive Python API to conduct all tasks encountered in QSPR modelling from data preparation and analysis to model creation and model deployment. In contrast to similar packages, QSPRpred offers a wider and more exhaustive range of capabilities and integrations with many popular packages that also go beyond QSPR modelling. A significant contribution of QSPRpred is also in its automated and highly standardized serialization scheme, which significantly improves reproducibility and transferability of models.
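
The general idea of saving a model together with its pre-processing, so that a reloaded model predicts directly from SMILES strings, can be sketched in plain Python. This does not use or reproduce the QSPRpred API: the `BundledModel` class, its character-count featurisation, and the weights are hypothetical stand-ins chosen so the example runs with the standard library alone.

```python
import json
from collections import Counter


class BundledModel:
    """Toy linear model that stores its own featurisation settings."""

    def __init__(self, vocabulary, weights, bias=0.0):
        if len(vocabulary) != len(weights):
            raise ValueError("vocabulary and weights must have equal length")
        self.vocabulary = list(vocabulary)
        self.weights = [float(w) for w in weights]
        self.bias = float(bias)

    def featurize(self, smiles):
        """Count how often each vocabulary character occurs in the SMILES.

        This is a deliberately crude descriptor (for example, the 'C' in
        'Cl' is counted as carbon); real tools use chemical descriptors.
        """
        counts = Counter(smiles)
        return [counts[ch] for ch in self.vocabulary]

    def predict(self, smiles_list):
        """Featurise each SMILES string and apply the linear model."""
        predictions = []
        for smiles in smiles_list:
            features = self.featurize(smiles)
            score = sum(w * x for w, x in zip(self.weights, features))
            predictions.append(self.bias + score)
        return predictions

    def to_json(self):
        """Serialise the pre-processing settings and parameters together."""
        return json.dumps({
            "vocabulary": self.vocabulary,
            "weights": self.weights,
            "bias": self.bias,
        })

    @classmethod
    def from_json(cls, payload):
        """Rebuild a model, including its featurisation, from JSON text."""
        state = json.loads(payload)
        return cls(state["vocabulary"], state["weights"], state["bias"])


# Illustrative parameters only; nothing here is fitted to data.
model = BundledModel(vocabulary=["C", "N", "O"], weights=[0.5, -1.0, 0.25], bias=1.0)
restored = BundledModel.from_json(model.to_json())

smiles = ["CCO", "CC(=O)N"]
original_predictions = model.predict(smiles)
restored_predictions = restored.predict(smiles)
```

Because the featurisation settings travel with the parameters, the restored object needs no separately maintained pre-processing code, which is the property the abstract attributes to serialized QSPRpred models. JSON is used here only to keep the sketch dependency-free; the format QSPRpred itself uses is not stated in the abstract.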

ChemRxiv