Understanding how humans process natural language has long been a vital
research direction. The field of natural language processing (NLP) has recently
experienced a surge in the development of powerful language models. These
models have proven to be invaluable tools for studying another complex system
known to process human language: the brain. Previous studies have demonstrated
that the features of language models can be mapped to fMRI brain activity. This
raises the question: is there a commonality between information processing in
language models and the human brain? To estimate information flow patterns in a
language model, we examined the causal relationships between different layers.
Drawing inspiration from the workspace framework for consciousness, we
hypothesized that features integrating more information would more accurately
predict brain activity at higher levels of the cortical hierarchy. To test this hypothesis, we
classified language model features into two categories based on causal network
measures: 'low in-degree' and 'high in-degree'. We subsequently compared the
brain prediction accuracy maps for these two groups. Our results reveal that
the difference in prediction accuracy follows a hierarchical pattern,
consistent with the cortical hierarchy map revealed by activity time constants.
This finding suggests a parallel between how language models and the human
brain process linguistic information.

Comment: 15 pages, 16 figures