DrugAssist: A Large Language Model for Molecule Optimization

Cai, Xibao; Huang, Junhong; Lai, Houtim; Liu, Wei; Wang, Longyue; Wang, Xing; Ye, Geyan; Zeng, Xiangxiang

Authors: Xibao Cai, Junhong Huang, Houtim Lai, Wei Liu, Longyue Wang, Xing Wang, Geyan Ye, Xiangxiang Zeng
Publication date: 28 December 2023

Abstract

Recently, the impressive performance of large language models (LLMs) on a wide range of tasks has attracted an increasing number of attempts to apply LLMs in drug discovery. However, molecule optimization, a critical task in the drug discovery pipeline, is currently an area that has seen little involvement from LLMs. Most existing approaches focus solely on capturing the underlying patterns in chemical structures provided by the data, without taking advantage of expert feedback. These non-interactive approaches overlook the fact that the drug discovery process requires the integration of expert experience and iterative refinement. To address this gap, we propose DrugAssist, an interactive molecule optimization model which performs optimization through human-machine dialogue by leveraging LLMs' strong interactivity and generalizability. DrugAssist has achieved leading results in both single and multiple property optimization, simultaneously showcasing immense potential in transferability and iterative optimization. In addition, we publicly release a large instruction-based dataset called MolOpt-Instructions for fine-tuning language models on molecule optimization tasks. We have made our code and data publicly available at https://github.com/blazerye/DrugAssist, which we hope will pave the way for future research in LLMs' application for drug discovery.

Comment: Geyan Ye and Xibao Cai are equal contributors; Longyue Wang is corresponding author.

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2401.10334

Last time updated on 22/08/2024