Learning Multimodal Graph-to-Graph Translation for Molecular Optimization
We view molecular optimization as a graph-to-graph translation problem. The
goal is to learn to map from one molecular graph to another with better
properties based on an available corpus of paired molecules. Since molecules
can be optimized in different ways, there are multiple viable translations for
each input graph. A key challenge is therefore to model diverse translation
outputs. Our primary contributions include a junction tree encoder-decoder for
learning diverse graph translations along with a novel adversarial training
method for aligning distributions of molecules. Diverse output distributions in
our model are explicitly realized by low-dimensional latent vectors that
modulate the translation process. We evaluate our model on multiple molecular
optimization tasks and show that it outperforms previous
state-of-the-art baselines.
Junction Tree Variational Autoencoder for Molecular Graph Generation
We seek to automate the design of molecules based on specific chemical
properties. In computational terms, this task involves continuous embedding and
generation of molecular graphs. Our primary contribution is the direct
realization of molecular graphs, a task previously approached by generating
linear SMILES strings instead of graphs. Our junction tree variational
autoencoder generates molecular graphs in two phases, by first generating a
tree-structured scaffold over chemical substructures, and then combining them
into a molecule with a graph message passing network. This approach allows us
to incrementally expand molecules while maintaining chemical validity at every
step. We evaluate our model on multiple tasks ranging from molecular generation
to optimization. Across these tasks, our model outperforms previous
state-of-the-art baselines by a significant margin.
Unsupervised Protein-Ligand Binding Energy Prediction via Neural Euler's Rotation Equation
Protein-ligand binding prediction is a fundamental problem in AI-driven drug
discovery. Prior work focused on supervised learning methods using a large set
of binding affinity data for small molecules, but it is hard to apply the same
strategy to other drug classes like antibodies as labelled data is limited. In
this paper, we explore unsupervised approaches and reformulate binding energy
prediction as a generative modeling task. Specifically, we train an
energy-based model on a set of unlabelled protein-ligand complexes using SE(3)
denoising score matching and interpret its log-likelihood as binding affinity.
Our key contribution is a new equivariant rotation prediction network called
Neural Euler's Rotation Equations (NERE) for SE(3) score matching. It predicts
a rotation by modeling the force and torque between protein and ligand atoms,
where the force is defined as the gradient of an energy function with respect
to atom coordinates. We evaluate NERE on protein-ligand and antibody-antigen
binding affinity prediction benchmarks. Our model outperforms all unsupervised
baselines (physics-based and statistical potentials) and matches supervised
learning methods in the antibody case.
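
The force-and-torque idea in the abstract above can be illustrated with a small numerical sketch. This is not the NERE network: the learned energy function is replaced by a hypothetical toy pairwise energy (`toy_energy`), the force is taken as the negative gradient of that energy (the usual physics sign convention, assumed here), all atoms are given unit mass, and the angular velocity is applied for one unit time step. All function names are illustrative.

```python
import numpy as np

def toy_energy(ligand, protein):
    # Hypothetical stand-in for a learned energy: a sum of Gaussian wells
    # over all ligand-protein atom pairs (lower energy when atoms are close).
    diff = ligand[:, None, :] - protein[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    return -np.sum(np.exp(-d2))

def forces(ligand, protein):
    # Force on each ligand atom: negative gradient of toy_energy with
    # respect to that atom's coordinates (computed analytically).
    diff = ligand[:, None, :] - protein[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    grad = np.sum(2.0 * np.exp(-d2)[:, :, None] * diff, axis=1)
    return -grad

def rodrigues(omega):
    # Convert a rotation vector (axis * angle) into a 3x3 rotation matrix.
    theta = np.linalg.norm(omega)
    if theta < 1e-12:
        return np.eye(3)
    k = omega / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def rotation_from_torque(ligand, f):
    # Treat the ligand as a rigid body with unit-mass atoms.
    center = ligand.mean(axis=0)
    r = ligand - center
    # Net torque about the ligand center.
    torque = np.sum(np.cross(r, f), axis=0)
    # Inertia tensor: sum_i (|r_i|^2 * I - r_i r_i^T).
    inertia = np.sum(r ** 2) * np.eye(3) - r.T @ r
    # Angular velocity from torque; assumes the atoms are not all collinear,
    # otherwise the inertia tensor is singular.
    omega = np.linalg.solve(inertia, torque)
    return rodrigues(omega)

# Usage on random coordinates (illustrative data only).
rng = np.random.default_rng(0)
ligand = rng.normal(size=(5, 3))
protein = rng.normal(size=(7, 3)) + 1.0
R = rotation_from_torque(ligand, forces(ligand, protein))
center = ligand.mean(axis=0)
rotated_ligand = (ligand - center) @ R.T + center
```

Because the rotation is derived from forces that are themselves gradients of a scalar energy, rotating both protein and ligand together rotates the predicted forces and torque in the same way, which is the kind of equivariance the abstract refers to.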