Generating Focussed Molecule Libraries for Drug Discovery with Recurrent Neural Networks
In de novo drug design, computational strategies are used to generate novel
molecules with good affinity to the desired biological target. In this work, we
show that recurrent neural networks can be trained as generative models for
molecular structures, similar to statistical language models in natural
language processing. We demonstrate that the properties of the generated
molecules correlate very well with the properties of the molecules used to
train the model. In order to enrich libraries with molecules active towards a
given biological target, we propose to fine-tune the model with small sets of
molecules, which are known to be active against that target.
Against Staphylococcus aureus, the model reproduced 14% of 6051 hold-out test
molecules that medicinal chemists designed, whereas against Plasmodium
falciparum (Malaria) it reproduced 28% of 1240 test molecules. When coupled
with a scoring function, our model can perform the complete de novo drug design
cycle to generate large sets of novel molecules for drug discovery.
Comment: 17 pages, 17 figures
ChemTS: An Efficient Python Library for de novo Molecular Generation
Automatic design of organic materials requires black-box optimization in a
vast chemical space. In conventional molecular design algorithms, a molecule is
built as a combination of predetermined fragments. Recently, deep neural
network models such as variational autoencoders (VAEs) and recurrent neural
networks (RNNs) have been shown to be effective in de novo design of molecules
without any predetermined fragments. This paper presents a novel Python library
ChemTS that explores the chemical space by combining Monte Carlo tree search
(MCTS) and an RNN. In a benchmarking problem of optimizing the octanol-water
partition coefficient and synthesizability, our algorithm showed superior
efficiency in finding high-scoring molecules. ChemTS is available at
https://github.com/tsudalab/ChemTS
The Trends of De Novo Molecular Designs in the Twenty-First Century: A Mini-Review
The inception of advanced bioactive agents has driven growth in sustained drug delivery and a boom in new medicines. The future of medical and chemical biology relies on the amalgamation of advanced systematic and analytical techniques, tethered together with a robust theoretical framework. De novo drug design is one such exciting strategy: it uses computational theories to generate novel molecules with good affinity to the desired biological target. This mini-review provides a basic overview of the current trends and algorithms that aid in advancing the de novo molecular framework.
Retrosynthetic reaction prediction using neural sequence-to-sequence models
We describe a fully data-driven model that learns to perform a retrosynthetic
reaction prediction task, which is treated as a sequence-to-sequence mapping
problem. The end-to-end trained model has an encoder-decoder architecture that
consists of two recurrent neural networks, which has previously shown great
success in solving other sequence-to-sequence prediction tasks such as machine
translation. The model is trained on 50,000 experimental reaction examples from
the United States patent literature, which span 10 broad reaction types that
are commonly used by medicinal chemists. We find that our model performs
comparably with a rule-based expert system baseline model, and also overcomes
certain limitations associated with rule-based expert systems and with any
machine learning approach that contains a rule-based expert system component.
Our model provides an important first step towards solving the challenging
problem of computational retrosynthetic analysis.