18 research outputs found

Generating Focussed Molecule Libraries for Drug Discovery with Recurrent Neural Networks

Author: Kogej Thierry
Segler Marwin H. S.
Tyrchan Christian
Waller Mark P.
Publication date: 05/01/2017

In de novo drug design, computational strategies are used to generate novel molecules with good affinity to the desired biological target. In this work, we show that recurrent neural networks can be trained as generative models for molecular structures, similar to statistical language models in natural language processing. We demonstrate that the properties of the generated molecules correlate very well with the properties of the molecules used to train the model. In order to enrich libraries with molecules active towards a given biological target, we propose to fine-tune the model with small sets of molecules that are known to be active against that target. Against Staphylococcus aureus, the model reproduced 14% of 6051 hold-out test molecules that medicinal chemists designed, whereas against Plasmodium falciparum (malaria) it reproduced 28% of 1240 test molecules. When coupled with a scoring function, our model can perform the complete de novo drug design cycle to generate large sets of novel molecules for drug discovery. Comment: 17 pages, 17 figures
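
The language-model analogy in the abstract above can be illustrated with a much simpler stand-in. The following is a minimal sketch, assuming a character-level bigram (Markov) model rather than the recurrent network the authors trained; the five training strings, the boundary markers `^` and `$`, and the function names are all hypothetical and chosen only for illustration. It does not check that sampled strings are chemically valid.

```python
import random
from collections import Counter, defaultdict

START, END = "^", "$"  # hypothetical boundary markers, not SMILES symbols


def train_bigram(smiles_list):
    """Count character-to-character transitions and normalise to probabilities."""
    counts = defaultdict(Counter)
    for smi in smiles_list:
        chars = [START] + list(smi) + [END]
        for prev, nxt in zip(chars, chars[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {c: n / sum(ctr.values()) for c, n in ctr.items()}
        for prev, ctr in counts.items()
    }


def sample(model, rng, max_len=40):
    """Generate one string character by character until END or max_len."""
    out, prev = [], START
    while len(out) < max_len:
        chars, probs = zip(*model[prev].items())
        nxt = rng.choices(chars, weights=probs, k=1)[0]
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


# Illustrative toy training set (assumption: not taken from the paper).
training = ["CCO", "CCN", "CCCO", "c1ccccc1", "CC(=O)O"]
model = train_bigram(training)
rng = random.Random(0)
generated = [sample(model, rng) for _ in range(5)]
```

In this toy setting, the analogue of fine-tuning would be to keep updating the transition counts with a small set of target-specific strings, which shifts the sampling distribution towards them.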

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare

ChemTS: An Efficient Python Library for de novo Molecular Generation

Author: Terayama Kei
Tsuda Koji
Yang Xiufeng
Yoshizoe Kazuki
Zhang Jinzhe
Publication venue: 'Informa UK Limited'
Publication date: 29/09/2017

Automatic design of organic materials requires black-box optimization in a vast chemical space. In conventional molecular design algorithms, a molecule is built as a combination of predetermined fragments. Recently, deep neural network models such as variational autoencoders (VAEs) and recurrent neural networks (RNNs) have been shown to be effective in de novo design of molecules without any predetermined fragments. This paper presents a novel Python library, ChemTS, that explores the chemical space by combining Monte Carlo tree search (MCTS) and an RNN. In a benchmarking problem of optimizing the octanol-water partition coefficient and synthesizability, our algorithm showed superior efficiency in finding high-scoring molecules. ChemTS is available at https://github.com/tsudalab/ChemTS
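
The selection step of a Monte Carlo tree search, as mentioned in the abstract above, can be sketched as follows. This is a minimal sketch assuming the textbook UCB1 rule; the abstract does not state which selection rule ChemTS uses, and the function names, the exploration constant `c`, and the toy statistics below are hypothetical. It does not use the ChemTS API.

```python
import math


def ucb1(total_reward, visits, parent_visits, c=1.0):
    """Textbook UCB1 score: mean reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    mean = total_reward / visits
    bonus = c * math.sqrt(2 * math.log(parent_visits) / visits)
    return mean + bonus


def select_child(children, parent_visits):
    """Return the key of the child with the highest UCB1 score."""
    return max(
        children,
        key=lambda k: ucb1(children[k]["reward"], children[k]["visits"], parent_visits),
    )


# Toy statistics for three candidate next symbols (illustrative only).
children = {
    "C": {"reward": 3.0, "visits": 5},
    "N": {"reward": 1.5, "visits": 2},
    "O": {"reward": 0.0, "visits": 0},
}
best = select_child(children, parent_visits=7)
```

In a search of this kind, each child would correspond to a candidate next symbol of a partial molecule string, and a generative model such as an RNN would complete the string during rollout so that it can be scored.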

arXiv.org e-Print Archive

Directory of Open Access Journals

The Trends of De Novo Molecular Designs in the Twenty-First Century: A Mini-Review

Author: Basak Sayan
Publication venue: 'Scholink Co, Ltd.'
Publication date: 24/03/2020

The inception of advanced bioactive agents has driven growth in sustained drug delivery and a boom in new medicines. The future of medical and chemical biology relies on the amalgamation of advanced systematic and analytical techniques, tethered together by a robust theoretical framework. De novo drug design is one such exciting strategy that uses computational theories to generate novel molecules with a good affinity to the desired biological target. This mini-review provides a basic overview of the current trends and algorithms that aid the advancement of the de novo molecular framework.

Scholink Journals

Retrosynthetic reaction prediction using neural sequence-to-sequence models

Author: Gomes Joseph
Ho Stephen
Kawthekar Prasad
Liu Bowen
Nguyen Quang Luu
Pande Vijay
Ramsundar Bharath
Shi Jade
Sloane Jack
Wender Paul
Publication date: 06/06/2017

We describe a fully data-driven model that learns to perform a retrosynthetic reaction prediction task, which is treated as a sequence-to-sequence mapping problem. The end-to-end trained model has an encoder-decoder architecture that consists of two recurrent neural networks, an approach that has previously shown great success in solving other sequence-to-sequence prediction tasks such as machine translation. The model is trained on 50,000 experimental reaction examples from the United States patent literature, which span 10 broad reaction types that are commonly used by medicinal chemists. We find that our model performs comparably with a rule-based expert system baseline model, and also overcomes certain limitations associated with rule-based expert systems and with any machine learning approach that contains a rule-based expert system component. Our model provides an important first step towards solving the challenging problem of computational retrosynthetic analysis.
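
Treating retrosynthesis as a sequence-to-sequence problem, as the abstract above describes, means that product and reactant strings are first turned into token sequences. The following is a minimal sketch of that data-preparation step only, assuming character-level tokens; the special tokens, the function names, and the single example pair (ethyl acetate mapped to acetic acid and ethanol) are illustrative assumptions and are not taken from the paper or its dataset. The encoder-decoder network itself is not shown.

```python
PAD, SOS, EOS = "<pad>", "<sos>", "<eos>"  # hypothetical special tokens


def build_vocab(strings):
    """Map special tokens and every character seen to an integer id."""
    chars = sorted({ch for s in strings for ch in s})
    return {tok: i for i, tok in enumerate([PAD, SOS, EOS] + chars)}


def encode(s, vocab, max_len):
    """Wrap a string in SOS/EOS ids and right-pad it to max_len."""
    ids = [vocab[SOS]] + [vocab[ch] for ch in s] + [vocab[EOS]]
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Illustrative (product, reactants) pair written as SMILES strings.
pairs = [("CC(=O)OCC", "CC(=O)O.CCO")]
vocab = build_vocab([s for pair in pairs for s in pair])
max_len = max(len(s) for pair in pairs for s in pair) + 2  # room for SOS and EOS

src = encode(pairs[0][0], vocab, max_len)  # encoder input: product
tgt = encode(pairs[0][1], vocab, max_len)  # decoder target: reactants
```

An encoder network would read `src` and a decoder network would be trained to emit `tgt` one token at a time, in the same way a translation model maps a source sentence to a target sentence.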

arXiv.org e-Print Archive

Directory of Open Access Journals