Mol-CycleGAN - a generative model for molecular optimization
Designing a molecule with desired properties is one of the biggest challenges
in drug development, as it requires optimization of chemical compound
structures with respect to many complex properties. To augment the compound
design process we introduce Mol-CycleGAN - a CycleGAN-based model that
generates optimized compounds with high structural similarity to the original
ones. Namely, given a molecule our model generates a structurally similar one
with an optimized value of the considered property. We evaluate the performance
of the model on selected optimization objectives related to structural
properties (presence of halogen groups, number of aromatic rings) and to a
physicochemical property (penalized logP). In the task of optimization of
penalized logP of drug-like molecules our model significantly outperforms
previous results.
Mol-CycleGAN: a generative model for molecular optimization
Designing a molecule with desired properties is one of the biggest challenges in drug development, as it requires optimization of chemical compound structures with respect to many complex properties. To improve the compound design process, we introduce Mol-CycleGAN, a CycleGAN-based model that generates optimized compounds with high structural similarity to the original ones. Namely, given a molecule, our model generates a structurally similar one with an optimized value of the considered property. We evaluate the performance of the model on selected optimization objectives related to structural properties (presence of halogen groups, number of aromatic rings) and to a physicochemical property (penalized logP). In the task of optimizing penalized logP of drug-like molecules, our model significantly outperforms previous results.
Mol-CycleGAN: a generative model for molecular optimization
During the drug design process, one must develop a molecule whose structure satisfies a number of physicochemical properties. To improve this process, we introduce Mol-CycleGAN, a CycleGAN-based model that generates compounds optimized for a selected property, while aiming to retain the already optimized ones. In the task of constrained optimization of penalized logP of drug-like molecules, our model significantly outperforms previous results.
Graph Neural Networks for Molecules
Graph neural networks (GNNs), which are capable of learning representations
from graphical data, are naturally suitable for modeling molecular systems.
This review introduces GNNs and their various applications for small organic
molecules. GNNs rely on message-passing operations, a generic yet powerful
framework, to update node features iteratively. Many studies design GNN
architectures to effectively learn topological information of 2D molecule
graphs as well as geometric information of 3D molecular systems. GNNs have been
implemented in a wide variety of molecular applications, including molecular
property prediction, molecular scoring and docking, molecular optimization and
de novo generation, molecular dynamics simulation, etc. Besides, the review
also summarizes the recent development of self-supervised learning for
molecules with GNNs.
Comment: A chapter for the book "Machine Learning in Molecular Sciences". 31
pages, 4 figures
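The message-passing idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the architecture of any model in the review: the function name `message_passing_step`, the sum aggregation, and the unweighted update rule are assumptions chosen for brevity, whereas real GNNs apply learned weights and nonlinearities at each step.

```python
def message_passing_step(features, edges):
    """One round of message passing on an undirected graph.

    features: dict mapping node id -> list of floats (node feature vector)
    edges:    list of (u, v) pairs, e.g. bonds between atoms
    Returns a new dict with updated node features.
    """
    dim = len(next(iter(features.values())))
    # Aggregate: each node sums the feature vectors of its neighbors.
    aggregated = {node: [0.0] * dim for node in features}
    for u, v in edges:
        for i in range(dim):
            aggregated[u][i] += features[v][i]
            aggregated[v][i] += features[u][i]
    # Update (assumed rule): add the aggregated message to the node's own
    # features. A trained GNN would use learned weight matrices here.
    return {
        node: [features[node][i] + aggregated[node][i] for i in range(dim)]
        for node in features
    }


# Toy three-atom chain 0 - 1 - 2 with scalar features.
toy_features = {0: [1.0], 1: [2.0], 2: [4.0]}
toy_edges = [(0, 1), (1, 2)]
updated = message_passing_step(toy_features, toy_edges)
```

Calling the function repeatedly lets information travel further along the graph, which is what "updating node features iteratively" refers to.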
Development of Machine Learning Models for Generation and Activity Prediction of the Protein Tyrosine Kinase Inhibitors
The field of computational drug discovery and development continues to grow at a rapid pace, using generative machine learning approaches to present us with solutions to high dimensional and complex problems in drug discovery and design. In this work, we present a platform of Machine Learning based approaches for generation and scoring of novel kinase inhibitor molecules. We utilized a binary Random Forest classification model to develop a Machine Learning based scoring function to evaluate the generated molecules on Kinase Inhibition Likelihood. By training the model on several chemical features of each known kinase inhibitor, we were able to create a metric that captures the differences between a SRC Kinase Inhibitor and a non-SRC Kinase Inhibitor. We implemented the scoring function into a Biased and Unbiased Bayesian Optimization framework to generate molecules based on features of SRC Kinase Inhibitors. We then used similarity metrics such as Tanimoto Similarity to assess their closeness to that of known SRC Kinase Inhibitors. The molecules generated from this experiment demonstrated potential for belonging to the SRC Kinase Inhibitor family though chemical synthesis would be needed to confirm the results. The top molecules generated from the Unbiased and Biased Bayesian Optimization experiments were calculated to respectively have Tanimoto Similarity scores of 0.711 and 0.709 to known SRC Kinase Inhibitors. With calculated Kinase Inhibition Likelihood scores of 0.586 and 0.575, the top molecules generated from the Bayesian Optimization demonstrate a disconnect between the similarity scores to known SRC Kinase Inhibitors and the calculated Kinase Inhibition Likelihood score. It was found that implementing a bias into the Bayesian Optimization process had little effect on the quality of generated molecules. 
In addition, several molecules generated from the Bayesian Optimization process were sent to the School of Pharmacy for chemical synthesis, which gives the experiment more concrete results. The results of this study demonstrated that generating molecules through Bayesian Optimization techniques could aid in the generation of molecules for a specific kinase family, but further expansions of the techniques would be needed for substantial results.
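For readers unfamiliar with the Tanimoto Similarity scores quoted in this abstract, the following is a minimal sketch of the standard definition on binary fingerprints: the number of shared "on" bits divided by the number of bits set in either fingerprint. The function name, the set-of-bit-indices representation, and the zero-for-empty convention are assumptions for illustration; the abstract does not say which fingerprint type or toolkit the authors used.

```python
def tanimoto_similarity(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two binary fingerprints.

    bits_a, bits_b: iterables of the indices of bits that are set.
    """
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Assumed convention: two empty fingerprints have similarity 0.
        return 0.0
    return len(a & b) / union


# Hypothetical fingerprints: two bits shared out of five set overall.
example_score = tanimoto_similarity({1, 2, 3, 4}, {3, 4, 5})
```

A score of 1.0 means the fingerprints are identical, and a score near 0 means they share almost no features, so values around 0.71 indicate substantial but incomplete overlap with the reference molecules.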
Generative models should at least be able to design molecules that dock well: a new benchmark
Designing compounds with desired properties is a key element of the drug discovery process. However, measuring progress in the field has been challenging due to the lack of realistic retrospective benchmarks, and the large cost of prospective validation. To close this gap, we propose a benchmark based on docking, a widely used computational method for assessing molecule binding to a protein. Concretely, the goal is to generate drug-like molecules that are scored highly by SMINA, a popular docking software. We observe that various graph-based generative models fail to propose molecules with a high docking score when trained using a realistically sized training set. This suggests a limitation of the current incarnation of models for de novo drug design. Finally, we also include simpler tasks in the benchmark based on a simpler scoring function. We release the benchmark as an easy-to-use package available at https://github.com/cieplinski-tobiasz/smina-docking-benchmark. We hope that our benchmark will serve as a stepping stone toward the goal of automatically generating promising drug candidates.
Kernel-Elastic Autoencoder for Molecular Design
We introduce the Kernel-Elastic Autoencoder (KAE), a self-supervised
generative model based on the transformer architecture with enhanced
performance for molecular design. KAE is formulated based on two novel loss
functions: modified maximum mean discrepancy and weighted reconstruction. KAE
addresses the long-standing challenge of achieving valid generation and
accurate reconstruction at the same time. KAE achieves remarkable diversity in
molecule generation while maintaining near-perfect reconstructions on the
independent testing dataset, surpassing previous molecule-generating models.
KAE enables conditional generation and allows for decoding based on beam search
resulting in state-of-the-art performance in constrained optimizations.
Furthermore, KAE can generate molecules conditional to favorable binding
affinities in docking applications as confirmed by AutoDock Vina and Glide
scores, outperforming all existing candidates from the training dataset. Beyond
molecular design, we anticipate KAE could be applied to solve problems by
generation in a wide range of applications.