Psychometric Predictive Power of Large Language Models
Next-word probabilities from language models have been shown to successfully
simulate human reading behavior. Building on this, we show that, interestingly,
instruction-tuned large language models (LLMs) yield worse psychometric
predictive power (PPP) for human reading behavior than base LLMs with
equivalent perplexities. In other words, instruction tuning, which helps LLMs
provide human-preferred responses, does not always make them human-like from
the computational psycholinguistics perspective. In addition, we explore
prompting methodologies in simulating human reading behavior with LLMs, showing
that prompts reflecting a particular linguistic hypothesis lead LLMs to exhibit
better PPP, though still worse than that of base LLMs. These findings highlight
that recent instruction tuning and prompting do not offer better estimates than
direct probability measurements from base LLMs in cognitive modeling.
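The psychometric metrics above rest on per-word surprisal, the information-theoretic cost of each word under a language model. A minimal sketch of the computation, using made-up next-word probabilities (the studies use real LLM probabilities, not these toy values):

```python
import math

def surprisal(prob: float) -> float:
    """Surprisal in bits: -log2 P(word | context)."""
    return -math.log2(prob)

# Hypothetical conditional probabilities for the words of "the cat sat",
# standing in for a model's softmax outputs.
probs = {"the": 0.20, "cat": 0.05, "sat": 0.10}
per_word = {w: surprisal(p) for w, p in probs.items()}

for w, s in per_word.items():
    print(f"{w}: {s:.2f} bits")  # rarer words carry more bits
```

In the studies, these per-word surprisals are then regressed against human reading times; better fit means higher psychometric predictive power.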
Can Language Models Be Tricked by Language Illusions? Easier with Syntax, Harder with Semantics
Language models (LMs) have been argued to overlap substantially with human
beings in grammaticality judgment tasks. But when humans systematically make
errors in language processing, should we expect LMs to behave like cognitive
models of language and mimic human behavior? We answer this question by
investigating LMs' more subtle judgments associated with "language illusions"
-- sentences that are vague in meaning, implausible, or ungrammatical but
receive unexpectedly high acceptability judgments by humans. We looked at three
illusions: the comparative illusion (e.g. "More people have been to Russia than
I have"), the depth-charge illusion (e.g. "No head injury is too trivial to be
ignored"), and the negative polarity item (NPI) illusion (e.g. "The hunter who
no villager believed to be trustworthy will ever shoot a bear"). We found that
probabilities assigned by LMs were more likely to align with human judgments of
being "tricked" by the NPI illusion, which hinges on a structural dependency,
than with the comparative and depth-charge illusions, which require
sophisticated semantic understanding. No single LM or metric yielded results
entirely consistent with human behavior. Ultimately, we show that LMs are
limited both as cognitive models of human language processing and in their
capacity to recognize nuanced but critical information in complicated language
materials.
Comment: Accepted by The SIGNLL Conference on Computational Natural Language Learning 202
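The comparison logic behind "being tricked" can be sketched as a sentence-probability contrast: the model counts as tricked if it scores the illusion sentence at least as highly as a matched control. The per-token probabilities below are invented for illustration; the study uses probabilities from real LMs:

```python
import math

def sentence_logprob(token_probs):
    """Log-probability of a sentence as the sum of per-token log-probs."""
    return sum(math.log(p) for p in token_probs)

# Hypothetical token probabilities for an illusion sentence and a
# minimally different, well-formed control.
illusion = [0.1, 0.2, 0.05, 0.1]
control = [0.1, 0.2, 0.10, 0.1]

# The model is "tricked" if the illusion scores no worse than the control.
tricked = sentence_logprob(illusion) >= sentence_logprob(control)
print("model tricked:", tricked)
```

Humans reliably accept the illusion sentences, so alignment here means the model should be tricked too; the finding above is that this happens for the structural (NPI) illusion more than the semantic ones.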
A Cross-Linguistic Pressure for Uniform Information Density in Word Order
While natural languages differ widely in both canonical word order and word
order flexibility, their word orders still follow shared cross-linguistic
statistical patterns, often attributed to functional pressures. In the effort
to identify these pressures, prior work has compared real and counterfactual
word orders. Yet one functional pressure has been overlooked in such
investigations: the uniform information density (UID) hypothesis, which holds
that information should be spread evenly throughout an utterance. Here, we ask
whether a pressure for UID may have influenced word order patterns
cross-linguistically. To this end, we use computational models to test whether
real orders lead to greater information uniformity than counterfactual orders.
In our empirical study of 10 typologically diverse languages, we find that: (i)
among SVO languages, real word orders consistently have greater uniformity than
reverse word orders, and (ii) only linguistically implausible counterfactual
orders consistently exceed the uniformity of real orders. These findings are
compatible with a pressure for information uniformity in the development and
usage of natural languages.
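A common operationalization of UID is the variance of per-word surprisal across an utterance: lower variance means information is spread more evenly. A minimal sketch with hypothetical surprisal sequences (in the study, these come from an LM scoring real versus counterfactual word orders of the same sentence):

```python
from statistics import pvariance

def uid_variance(surprisals):
    """Lower variance of per-word surprisal = more uniform density."""
    return pvariance(surprisals)

# Hypothetical per-word surprisals (bits) for the same content in two orders.
real_order = [3.1, 2.8, 3.0, 2.9, 3.2]      # evenly spread information
counterfactual = [5.5, 1.2, 4.8, 1.5, 2.0]  # peaks and troughs

print(uid_variance(real_order) < uid_variance(counterfactual))
```

The paper's test is essentially this comparison run at scale: real orders in SVO languages show consistently lower surprisal variance than reversed orders.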
Crosslinguistic Word Orders Enable an Efficient Tradeoff of Memory and Surprisal
Neural models of language use: Studies of language comprehension and production in context
Artificial neural network models of language are mostly known and appreciated today for providing a backbone for formidable AI technologies. This thesis takes a different perspective. Through a series of studies on language comprehension and production, it investigates whether artificial neural networks—beyond being useful in countless AI applications—can serve as accurate computational simulations of human language use, and thus as a new core methodology for the language sciences.
Context Limitations Make Neural Language Models More Human-Like
Language models (LMs) have been used in cognitive modeling as well as
engineering studies -- they compute information-theoretic complexity metrics
that simulate humans' cognitive load during reading. This study highlights a
limitation of modern neural LMs as the model of choice for this purpose: there
is a discrepancy between their context access capacities and those of humans.
Our results showed that constraining the LMs' context access improved their
simulation of human reading behavior. We also showed that LM-human gaps in
context access were associated with specific syntactic constructions;
incorporating syntactic biases into LMs' context access might enhance their
cognitive plausibility.
Comment: Accepted by EMNLP 2022 (main long)
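The paper's manipulation, restricting how much preceding text the model can condition on, can be sketched as follows. Here `toy_model` is a stand-in scoring function whose probabilities are invented (higher with more usable context, capped at 0.5), not one of the neural LMs used in the study:

```python
import math

def toy_model(word, context):
    # Hypothetical: confidence grows with context length, capped at 0.5.
    return min(0.5, 0.1 * (len(context) + 1))

def limited_context_surprisals(words, k):
    """Surprisal of each word when the model sees only the last k words."""
    out = []
    for i, w in enumerate(words):
        context = words[max(0, i - k):i]  # truncate context to k words
        out.append(-math.log2(toy_model(w, context)))
    return out

words = "the dog that chased the cat barked".split()
full = limited_context_surprisals(words, k=len(words))
two = limited_context_surprisals(words, k=2)
```

With truncation, later words get less context and hence higher surprisal under this toy model; the paper's finding is that such constrained surprisals fit human reading times better than full-context ones.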