Retrieval-based Controllable Molecule Generation
Generating new molecules with specified chemical and biological properties
via generative models has emerged as a promising direction for drug discovery.
However, existing methods require extensive training/fine-tuning with a large
dataset, often unavailable in real-world generation tasks. In this work, we
propose a new retrieval-based framework for controllable molecule generation.
We use a small set of exemplar molecules, i.e., those that (partially) satisfy
the design criteria, to steer the pre-trained generative model towards
synthesizing molecules that satisfy the given design criteria. We design a
retrieval mechanism that retrieves and fuses the exemplar molecules with the
input molecule, which is trained by a new self-supervised objective that
predicts the nearest neighbor of the input molecule. We also propose an
iterative refinement process to dynamically update the generated molecules and
retrieval database for better generalization. Our approach is agnostic to the
choice of generative models and requires no task-specific fine-tuning. On
various tasks ranging from simple design criteria to a challenging real-world
scenario for designing lead compounds that bind to the SARS-CoV-2 main
protease, we demonstrate our approach extrapolates well beyond the retrieval
database, and achieves better performance and wider applicability than previous
methods.
Comment: 29 pages
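The retrieval step described in the abstract above can be illustrated with a minimal sketch. It covers only a nearest-neighbor lookup over a small exemplar database, not the fusion module or the self-supervised training objective. The abstract does not say which similarity measure is used, so Tanimoto similarity over fingerprint bit sets is an assumption here, and the names `Fingerprint`, `tanimoto`, `retrieve_exemplars`, and the toy data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical representation: a fingerprint is the set of "on" bit indices.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two bit sets."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def retrieve_exemplars(
    query: Fingerprint,
    database: Dict[str, Fingerprint],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k exemplars most similar to the query, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy retrieval database (assumed data, for illustration only).
database = {
    "exemplar_a": frozenset({1, 2, 3, 4}),
    "exemplar_b": frozenset({3, 4, 5, 6}),
    "exemplar_c": frozenset({7, 8, 9}),
}
query = frozenset({1, 2, 3, 5})
top = retrieve_exemplars(query, database, k=2)
print(top)
```

In a real pipeline the retrieved exemplars would be passed to the generative model together with the input molecule; the sketch stops at the lookup itself.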
PrefixMol: Target- and Chemistry-aware Molecule Design via Prefix Embedding
Is there a unified model for generating molecules considering different
conditions, such as binding pockets and chemical properties? Although
target-aware generative models have made significant advances in drug design,
they do not consider chemistry conditions and cannot guarantee the desired
chemical properties. Unfortunately, merging the target-aware and chemical-aware
models into a unified model to meet customized requirements may lead to the
problem of negative transfer. Inspired by the success of multi-task learning in
the NLP area, we use prefix embeddings to provide a novel generative model that
considers both the targeted pocket's circumstances and a variety of chemical
properties. All conditional information is represented as learnable features,
which the generative model subsequently employs as a contextual prompt.
Experiments show that our model exhibits good controllability in both single
and multi-conditional molecular generation. The controllability enables us to
outperform previous structure-based drug design methods. More interestingly, we
open up the attention mechanism and reveal coupling relationships between
conditions, providing guidance for multi-conditional molecule generation.
Quantitative surface field analysis: learning causal models to predict ligand binding affinity and pose.
We introduce the QuanSA method for inducing physically meaningful field-based models of ligand binding pockets based on structure-activity data alone. The method is closely related to the QMOD approach, substituting a learned scoring field for a pocket constructed of molecular fragments. The problem of mutual ligand alignment is addressed in a general way, and optimal model parameters and ligand poses are identified through multiple-instance machine learning. We provide algorithmic details along with performance results on sixteen structure-activity data sets covering many pharmaceutically relevant targets. In particular, we show how models initially induced from small data sets can extrapolatively identify potent new ligands with novel underlying scaffolds with very high specificity. Further, we show that combining predictions from QuanSA models with those from physics-based simulation approaches is synergistic. QuanSA predictions yield binding affinities, explicit estimates of ligand strain, associated ligand pose families, and estimates of structural novelty and confidence. The method is applicable for fine-grained lead optimization as well as potent new lead identification.
Mol-CycleGAN - a generative model for molecular optimization
Designing a molecule with desired properties is one of the biggest challenges
in drug development, as it requires optimization of chemical compound
structures with respect to many complex properties. To augment the compound
design process we introduce Mol-CycleGAN - a CycleGAN-based model that
generates optimized compounds with high structural similarity to the original
ones. Namely, given a molecule our model generates a structurally similar one
with an optimized value of the considered property. We evaluate the performance
of the model on selected optimization objectives related to structural
properties (presence of halogen groups, number of aromatic rings) and to a
physicochemical property (penalized logP). In the task of optimization of
penalized logP of drug-like molecules our model significantly outperforms
previous results.