Psychometric Predictive Power of Large Language Models
Next-word probabilities from language models have been shown to successfully
simulate human reading behavior. Building on this, we show that, interestingly,
instruction-tuned large language models (LLMs) yield worse psychometric
predictive power (PPP) for human reading behavior than base LLMs with
equivalent perplexities. In other words, instruction tuning, which helps LLMs
provide human-preferred responses, does not always make them human-like from
the computational psycholinguistics perspective. In addition, we explore
prompting methodologies for simulating human reading behavior with LLMs, showing
that prompts reflecting a particular linguistic hypothesis lead LLMs to exhibit
better PPP, though it remains worse than that of base LLMs. These results
highlight that recent instruction tuning and prompting do not offer better
estimates than direct probability measurements from base LLMs in cognitive modeling.
Comment: 8 pages
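To make the "direct probability measurements" concrete, here is a minimal sketch (not code from the paper; the model choice "gpt2" and the helper name are illustrative assumptions) of computing per-word surprisal, the quantity whose fit to human reading times underlies PPP:

    import math
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Example base LM; any causal LM from the Hugging Face hub could stand in.
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    def surprisals(text):
        """Return (token, surprisal in bits) for every token after the first."""
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits
        # log-probability assigned to each token given its preceding context
        logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
        picked = logprobs[torch.arange(ids.size(1) - 1), ids[0, 1:]]
        tokens = tokenizer.convert_ids_to_tokens(ids[0, 1:])
        return [(t, -lp.item() / math.log(2)) for t, lp in zip(tokens, picked)]

    print(surprisals("The horse raced past the barn fell."))

In PPP evaluations, surprisals like these are typically entered as a predictor in a regression of eye-tracking or self-paced reading times.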
Context Limitations Make Neural Language Models More Human-Like
Language models (LMs) have been used in cognitive modeling as well as
engineering studies -- they compute information-theoretic complexity metrics
that simulate humans' cognitive load during reading. This study highlights a
limitation of modern neural LMs as the model of choice for this purpose: there
is a discrepancy between their context access capacity and that of humans.
Our results showed that constraining the LMs' context access improved their
simulation of human reading behavior. We also showed that LM-human gaps in
context access were associated with specific syntactic constructions;
incorporating syntactic biases into LMs' context access might enhance their
cognitive plausibility.
Comment: Accepted by EMNLP 2022 (main, long paper)
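A minimal sketch of the kind of constraint described here (an assumed setup, not the paper's implementation; "gpt2" and the window sizes are placeholders): recompute surprisal while exposing the model to only the k most recent tokens:

    import math
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")  # example model choice
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    def truncated_surprisal(ids, i, k):
        """Surprisal (bits) of token ids[i] given only the k tokens before it."""
        context = ids[max(0, i - k):i]
        with torch.no_grad():
            logits = model(torch.tensor([context])).logits[0, -1]
        return -torch.log_softmax(logits, dim=-1)[ids[i]].item() / math.log(2)

    ids = tokenizer("After the storm passed, the crew repaired the torn sails.").input_ids
    full_context = [truncated_surprisal(ids, i, k=1024) for i in range(1, len(ids))]
    two_tokens   = [truncated_surprisal(ids, i, k=2)    for i in range(1, len(ids))]

Comparing how well the full_context and two_tokens surprisals fit human reading times is one way to test whether limited context access makes the model more human-like.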
Transformer Language Models Handle Word Frequency in Prediction Head
The prediction head is a crucial component of Transformer language models.
Despite its direct impact on prediction, this component has often been
overlooked in analyzing Transformers. In this study, we investigate the inner
workings of the prediction head, specifically focusing on bias parameters. Our
experiments with BERT and GPT-2 models reveal that the biases in their word
prediction heads play a significant role in the models' ability to reflect word
frequency in a corpus, aligning with the logit adjustment method commonly used
in long-tailed learning. We also quantify the effect of controlling the biases
in practical auto-regressive text generation scenarios; under a particular
setting, more diverse text can be generated without compromising text quality.
Comment: 11 pages, 12 figures, accepted to ACL 2023 Findings (short paper)
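A minimal sketch of this kind of inspection (an assumed setup, not the paper's code; bert-base-uncased and the toy corpus are illustrative stand-ins): read out the prediction-head bias and correlate it with corpus log-frequency, in the spirit of logit adjustment:

    import torch
    from collections import Counter
    from transformers import AutoTokenizer, BertForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = BertForMaskedLM.from_pretrained("bert-base-uncased")
    bias = model.cls.predictions.bias.detach()  # one scalar per vocabulary item

    # Toy stand-in corpus; a real analysis would use a large reference corpus.
    corpus = ["the cat sat on the mat", "the dog chased the cat over the mat"]
    counts = Counter(t for s in corpus for t in tokenizer.tokenize(s))
    ids = torch.tensor([tokenizer.convert_tokens_to_ids(t) for t in counts])
    logfreq = torch.log(torch.tensor([float(c) for c in counts.values()]))

    # Correlation between head bias and log-frequency over the observed tokens;
    # subtracting the bias from the logits at decoding time is one way to
    # "control" it in generation.
    corr = torch.corrcoef(torch.stack([bias[ids], logfreq]))[0, 1]
    print(f"bias vs. log-frequency correlation: {corr:.3f}")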
Assessing Step-by-Step Reasoning against Lexical Negation: A Case Study on Syllogism
Large language models (LLMs) take advantage of step-by-step reasoning
instructions, e.g., chain-of-thought (CoT) prompting. Building on this, their
ability to perform CoT-style reasoning robustly is of interest from a probing
perspective. In this study, we inspect the step-by-step reasoning ability of
LLMs with a focus on negation, which is a core linguistic phenomenon that is
difficult to process. In particular, we introduce several controlled settings
(e.g., reasoning in case of fictional entities) to evaluate the logical
reasoning abilities of the models. We observed that dozens of modern LLMs were
not robust against lexical negation (e.g., plausible -> implausible) when
performing CoT-style reasoning, and the results highlight unique limitations in
each LLM family.
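A minimal sketch of this kind of controlled probe (the premise, question pair, and the `ask` callable are all hypothetical; the paper's actual benchmark differs):

    # Minimal probe: does a CoT-prompted model flip its answer correctly when
    # "plausible" is lexically negated to "implausible"?
    PREMISE = "All mammals are animals. A whale is a mammal."
    PAIRS = [
        ("Is the statement 'a whale is an animal' plausible?", "yes"),
        ("Is the statement 'a whale is an animal' implausible?", "no"),  # negated form
    ]

    def make_prompt(question):
        # Step-by-step (CoT-style) instruction appended to the query.
        return f"{PREMISE}\n{question}\nLet's think step by step. Answer yes or no."

    def robust_to_negation(ask):
        """`ask` is any callable sending a prompt to an LLM and returning its text."""
        answers = [ask(make_prompt(q)).strip().lower() for q, _ in PAIRS]
        # Naive string match, for illustration only.
        return all(ans.startswith(gold) for ans, (_, gold) in zip(answers, PAIRS))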
Lower Perplexity is Not Always Human-Like
In computational psycholinguistics, various language models have been
evaluated against human reading behavior (e.g., eye movement) to build
human-like computational models. However, most previous efforts have focused
almost exclusively on English, despite the recent trend toward linguistic
universality in the broader community. To fill this gap, this paper
investigates whether the established results in computational psycholinguistics
can be generalized across languages. Specifically, we re-examine an established
generalization -- the lower perplexity a language model has, the more
human-like the language model is -- in Japanese, a language whose typological
structure differs markedly from English. Our experiments demonstrate that this established
generalization exhibits a surprising lack of universality; namely, lower
perplexity is not always human-like. Moreover, this discrepancy between English
and Japanese is further explored from the perspective of (non-)uniform
information density. Overall, our results suggest that a cross-lingual
evaluation will be necessary to construct human-like computational models.
Comment: Accepted by ACL 2021
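To make the perplexity/human-likeness distinction concrete, here is a minimal sketch (toy synthetic data; real studies in this line of work use mixed-effects regressions over eye-tracking corpora) of scoring how much surprisal improves a reading-time regression beyond a baseline:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    n = 200
    length = rng.integers(2, 12, n).astype(float)    # word length, a baseline predictor
    surprisal = rng.gamma(2.0, 2.0, n)               # per-word surprisal from some LM
    rt = 180 + 10 * length + 12 * surprisal + rng.normal(0, 20, n)  # toy reading times

    base = LinearRegression().fit(length[:, None], rt)
    X = np.column_stack([length, surprisal])
    full = LinearRegression().fit(X, rt)

    # A PPP-style score: fit gained by adding surprisal over the baseline.
    gain = full.score(X, rt) - base.score(length[:, None], rt)
    print(f"fit gain from surprisal: {gain:.3f}")

A model's perplexity and the fit gain it produces here need not move together, which is exactly the dissociation this paper reports when moving from English to Japanese.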