Metro: Memory-Enhanced Transformer for Retrosynthetic Planning via Reaction Tree
Retrosynthetic planning plays a critical role in drug discovery and organic
chemistry. Starting from a target molecule as the root node, it aims to find a
complete reaction tree subject to the constraint that all leaf nodes belong to
a set of starting materials. Multi-step reactions are crucial because they
determine the process flow of production in the organic chemical industry.
However, existing datasets are not curated for tree-structured multi-step
reactions and do not provide such reaction trees, which limits models'
understanding of organic molecule transformations. In this work, we first
develop a benchmark curated for the retrosynthetic planning task, which
consists of 124,869 reaction trees retrieved from the public USPTO-full
dataset. On top of that, we propose Metro: Memory-Enhanced Transformer for
RetrOsynthetic planning. Specifically, the dependency among molecules in the
reaction tree is captured as context information for multi-step retrosynthesis
predictions through transformers with a memory module. Extensive experiments
show that Metro outperforms existing single-step retrosynthesis
models by at least 10.7% in top-1 accuracy. The experiments demonstrate the
superiority of exploiting context information in the retrosynthetic planning
task. Moreover, the proposed model can be directly used for synthetic
accessibility analysis, as it is trained on reaction trees with the shortest
depths. Our work is the first step towards a new formulation of
retrosynthetic planning in terms of data construction, model design, and
evaluation. Code is available at https://github.com/SongtaoLiu0823/metro
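
The problem statement above (a tree rooted at the target molecule whose leaf nodes must all be starting materials) can be illustrated with a minimal Python sketch. It is not taken from the Metro code base: the `Node` class, the helper names `leaves`, `is_complete`, and `depth`, and the aspirin example are assumptions chosen purely for illustration, and the sketch covers only the tree constraint, not the memory-enhanced transformer itself.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One molecule in a reaction tree: a SMILES string plus its precursors."""
    smiles: str
    children: list["Node"] = field(default_factory=list)


def leaves(node):
    """Yield the SMILES of every leaf molecule under `node`."""
    if not node.children:
        yield node.smiles
    else:
        for child in node.children:
            yield from leaves(child)


def is_complete(root, starting_materials):
    """True when every leaf belongs to the set of starting materials."""
    return all(smiles in starting_materials for smiles in leaves(root))


def depth(node):
    """Number of reaction steps on the longest root-to-leaf path."""
    if not node.children:
        return 0
    return 1 + max(depth(child) for child in node.children)


# Illustrative two-step route (not from the paper's benchmark):
# aspirin <- salicylic acid + acetic anhydride; salicylic acid <- phenol + CO2.
ASPIRIN = "CC(=O)Oc1ccccc1C(=O)O"
SALICYLIC_ACID = "O=C(O)c1ccccc1O"
ACETIC_ANHYDRIDE = "CC(=O)OC(C)=O"
PHENOL = "Oc1ccccc1"
CO2 = "O=C=O"

tree = Node(ASPIRIN, [
    Node(SALICYLIC_ACID, [Node(PHENOL), Node(CO2)]),
    Node(ACETIC_ANHYDRIDE),
])

# Hypothetical stock of purchasable starting materials.
stock = {PHENOL, CO2, ACETIC_ANHYDRIDE}

print(is_complete(tree, stock))  # every leaf is in the stock
print(depth(tree))               # two reaction steps from leaves to target
```

Under these assumptions, a route counts as complete only when `is_complete` holds, and `depth` gives the number of reaction steps, which is the quantity the abstract ties to synthetic accessibility when it refers to reaction trees with the shortest depths.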