Protein research is crucial in various fundamental disciplines, but
understanding their intricate structure-function relationships remains
challenging. Recent Large Language Models (LLMs) have made significant strides
in comprehending task-specific knowledge, suggesting the potential for
ChatGPT-like systems specialized in protein to facilitate basic research. In
this work, we introduce ProtChatGPT, which aims at learning and understanding
protein structures via natural languages. ProtChatGPT enables users to upload
proteins, ask questions, and engage in interactive conversations to produce
comprehensive answers. The system comprises protein encoders, a
Protein-Language Pertaining Transformer (PLP-former), a projection adapter, and
an LLM. The protein first undergoes protein encoders and PLP-former to produce
protein embeddings, which are then projected by the adapter to conform with the
LLM. The LLM finally combines user questions with projected embeddings to
generate informative answers. Experiments show that ProtChatGPT can produce
promising responses to proteins and their corresponding questions. We hope that
ProtChatGPT could form the basis for further exploration and application in
protein research. Code and our pre-trained model will be publicly available