Quantum state tomography is an elementary tool for fully characterizing an
unknown quantum state. As quantum hardware scales up in size, standard
quantum state tomography becomes increasingly challenging because its
complexity grows exponentially. In this work, we propose a scalable solution
by considering state tomography as a language modeling task, where the unknown
quantum state is treated as an unknown language, the correlations of the quantum
state are interpreted as the semantic information specific to this language, and
the measurement outcomes are the text instances generated from the
language. Based on a customized transformer model from language modeling, we
demonstrate that our method can accurately reconstruct prototypical pure and
mixed quantum states using fewer samples than state-of-the-art methods. More
importantly, our method can reconstruct a class of similar states
simultaneously, in contrast to existing neural network methods, which
need to train a separate model for each unknown state.