Machine Learning Small Molecule Properties in Drug Discovery

Arroniz, Carlos; De Fabritiis, Gianni; Majewski, Maciej; Schapin, Nikolai; Varela, Alejandro

Publication date: 2 August 2023

Abstract

Machine learning (ML) is a promising approach for predicting small molecule properties in drug discovery. Here, we provide a comprehensive overview of the ML methods introduced for this purpose in recent years. We review a wide range of properties, including binding affinities, solubility, and ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity). We discuss popular existing datasets, molecular descriptors, and embeddings, such as chemical fingerprints and graph-based neural networks. We also highlight the challenges of predicting and optimizing multiple properties during the hit-to-lead and lead optimization stages of drug discovery, and briefly explore multi-objective optimization techniques that can be used to balance diverse properties while optimizing lead candidates. Finally, we assess techniques for understanding model predictions, especially for critical decision-making in drug discovery. Overall, this review provides insights into the landscape of ML models for small molecule property prediction in drug discovery. So far, there are many diverse approaches, but their performance is often comparable. Neural networks, while more flexible, do not always outperform simpler models. This shows that the availability of high-quality training data remains crucial for training accurate models, and that there is a need for standardized benchmarks, additional performance metrics, and best practices to enable richer comparisons that shed better light on the differences between the many techniques and models.

Comment: 46 pages, 1 figure

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2308.12354

Last time updated on 08/09/2023