The rapid evolution of artificial intelligence in drug discovery encounters
challenges with generalization and extensive training, yet Large Language
Models (LLMs) offer promise in reshaping interactions with complex molecular
data. Our novel contribution, InstructMol, a multi-modal LLM, effectively
aligns molecular structures with natural language via an instruction-tuning
approach, utilizing a two-stage training strategy that adeptly combines limited
domain-specific data with molecular and textual information. InstructMol
showcases substantial performance improvements in drug discovery-related
molecular tasks, surpassing leading LLMs and significantly reducing the gap
with specialized models, thereby establishing a robust foundation for a
versatile and dependable drug discovery assistant