Recently, the remarkable capabilities of large language models (LLMs) have
been demonstrated across a variety of research domains such as natural language
processing, computer vision, and molecular modeling. We extend this paradigm to
material property prediction by introducing our model, the Materials
Informatics Transformer (MatInFormer). Specifically, we introduce a
novel approach that involves learning the grammar of crystallography through
the tokenization of pertinent space group information. We further illustrate
the adaptability of MatInFormer by incorporating task-specific data pertaining
to Metal-Organic Frameworks (MOFs). Through attention visualization, we uncover
the key features that the model prioritizes during property prediction. The
effectiveness of our proposed model is empirically validated across 14 distinct
datasets, thereby underscoring its potential for high-throughput screening
through accurate material property prediction.
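To make the tokenization idea concrete, the following is a minimal Python sketch of how space group information might be mapped to a token sequence for a transformer. The vocabulary, token names, and function here are illustrative assumptions for exposition, not the paper's exact scheme.

```python
# A minimal sketch of space-group tokenization (illustrative, not the
# paper's exact vocabulary): each crystal is mapped to a short sequence
# of crystallographic tokens that a transformer can consume, with a
# [CLS] token whose final embedding could drive property prediction.

# Hypothetical token vocabulary: special tokens, the 7 crystal systems,
# and the 230 space group numbers.
CRYSTAL_SYSTEMS = [
    "triclinic", "monoclinic", "orthorhombic", "tetragonal",
    "trigonal", "hexagonal", "cubic",
]
SPECIAL = ["[PAD]", "[CLS]"]
VOCAB = SPECIAL + CRYSTAL_SYSTEMS + [f"sg_{n}" for n in range(1, 231)]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}


def tokenize_space_group(space_group_number: int, crystal_system: str) -> list[int]:
    """Map crystallographic descriptors to a token-id sequence."""
    tokens = ["[CLS]", crystal_system, f"sg_{space_group_number}"]
    return [TOKEN_TO_ID[t] for t in tokens]


# Example: rock-salt NaCl crystallizes in space group 225 (Fm-3m, cubic).
ids = tokenize_space_group(225, "cubic")
print(ids)  # [1, 8, 233] under this illustrative vocabulary
```

Under this sketch, the resulting token ids would be fed to a standard transformer encoder alongside composition tokens; the fixed, small crystallographic vocabulary is what lets the model learn a "grammar" of crystallography rather than raw coordinates.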