Which Presuppositions are Subject to Contextual Felicity Constraints?
Some sentences with presupposition triggers can be felicitously uttered when their presuppositions are not entailed by the context, whereas others are infelicitous in such environments, a phenomenon known as Missing Accommodation / Informative Presupposition or varying Contextual Felicity Constraints (CFCs). Despite an abundance of recent quantitative work on presuppositions, this aspect of their behavior has received comparatively little experimental attention. Here, we present the results of a semantic rating study testing the relative CFC strength of thirteen presupposition triggers, making this the largest cross-trigger comparison reported in the literature to date. The results support a three-way categorical analysis of presupposition triggers, based on whether they impose strong, weak, or no CFCs. We observe that the strong-CFC triggers are all focus-associating, suggesting that (at least some of) the variation in behavior arises from naturally occurring semantic classes. We compare our results to three previous proposals for CFC variation and argue that none yet accounts for the full empirical picture.
Testing the Predictions of Surprisal Theory in 11 Languages
A fundamental result in psycholinguistics is that less predictable words take
a longer time to process. One theoretical explanation for this finding is
Surprisal Theory (Hale, 2001; Levy, 2008), which quantifies a word's
predictability as its surprisal, i.e. its negative log-probability given a
context. While findings supporting the predictions of Surprisal Theory have
been replicated widely, most studies have focused on a very narrow slice of data:
native English speakers reading English texts. Indeed, no comprehensive
multilingual analysis exists. We address this gap in the current literature by
investigating the relationship between surprisal and reading times in eleven
different languages, distributed across five language families. Deriving
estimates from language models trained on monolingual and multilingual corpora,
we test three predictions associated with surprisal theory: (i) whether
surprisal is predictive of reading times; (ii) whether expected surprisal, i.e.
contextual entropy, is predictive of reading times; and (iii) whether the
linking function between surprisal and reading times is linear. We find that
all three predictions are borne out crosslinguistically. By focusing on a more
diverse set of languages, we argue that these results offer the most robust
link to date between information theory and incremental language processing
across languages.
Comment: This is a pre-MIT Press publication version of the paper.
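The core quantity in the abstract above, surprisal, is simply the negative log-probability of a word given its context. A minimal sketch with an invented bigram model (the words and probabilities below are made up for illustration; the actual studies derive estimates from trained language models):

```python
import math

# Toy bigram model: p(word | previous word). All probabilities here are
# invented for illustration; a real study would use a trained LM.
bigram_p = {
    ("the", "dog"): 0.20,
    ("the", "hypothesis"): 0.01,
}

def surprisal(prev, word):
    """Surprisal in bits: -log2 p(word | prev)."""
    return -math.log2(bigram_p[(prev, word)])

# Less predictable continuations get higher surprisal, and Surprisal
# Theory predicts correspondingly longer reading times for them.
print(surprisal("the", "dog"))         # ~2.32 bits
print(surprisal("the", "hypothesis"))  # ~6.64 bits
```

The contextual entropy tested in prediction (ii) is then just the expectation of this quantity over the next-word distribution.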
Presupposition and Accommodation
Materials, data and scripts for Ethan Gotlieb Wilcox's dissertation.
Using the Interpolated Maze Task to Assess Incremental Processing in English Relative Clauses
In English, Subject Relative Clauses are processed more quickly than Object Relative Clauses, but open questions remain about where in the clause the slowdown occurs. The surprisal theory of incremental processing, under which processing difficulty corresponds to probabilistic expectations about upcoming material, predicts that slowdown should occur immediately on the material that disambiguates the subject from the object relative clause. However, evidence from eye-tracking and self-paced reading studies suggests that slowdown occurs downstream of the RC-disambiguating material, on the relative clause verb. These methods, however, suffer from well-known spillover effects, which make their results difficult to interpret. To address these issues, we introduce a novel variant of the Maze task for reading times (Forster, Guerrera, & Elliot, 2009), called the Interpolated Maze, and deploy it in two English web-based experiments. In Experiment 1, we find that the locus of reading-time differences between SRCs and ORCs falls immediately on the disambiguating definite determiner. Experiment 2 provides a control, showing that ORCs are read more slowly than lexically matched, non-anomalous material. These results provide new evidence for the locus of processing difficulty in relative clauses and support the surprisal theory of incremental processing.
Mouse Tracking for Reading (MoTR)
This OSF repository contains the raw data and Bayesian models for the Provo experiment and the attachment-preference experiment implemented with Mouse Tracking for Reading (MoTR).
MoTR runs in the browser, enabling cheaper data collection, and collection in places where no eye-tracking equipment is available. In a MoTR trial, participants are presented with text that is blurred, except for a small spotlight region around the tip of the mouse. Participants move the mouse over the text, bringing individual words into focus in order to read. Mouse movement is recorded and can be analyzed similarly to eye-tracking data. We implement MoTR experiments in Magpie and validate the method in two suites of experiments. First, we conduct a cross-methodological replication of three experiments from Witzel et al. (2012) and Boyce et al. (2020) that test preference for high vs. low attachment. Second, we record MoTR data for the Provo Corpus, for which eye-tracking data exists. We find strong correlations between eye-tracking and MoTR reading times (RTs), ranging from 0.67 to 0.78. In an analysis similar to Smith and Levy (2013), we find a linear effect of by-word surprisal (estimated from GPT-2) on MoTR RTs. MoTR RTs replicate previous self-paced reading results and, for the first time, reveal how regressions are implicated in the processing of these phenomena.
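The linear surprisal effect reported above can be illustrated with a simple least-squares fit of per-word RTs on surprisal. The data points below are invented for illustration; the actual analyses use model-derived surprisals and mixed-effects regressions:

```python
# Toy per-word data: (surprisal in bits, reading time in ms).
# Values are invented for illustration only.
data = [(2.0, 210.0), (4.0, 240.0), (6.0, 268.0), (8.0, 301.0)]

def ols_slope_intercept(pairs):
    """Ordinary least squares fit: rt = a + b * surprisal."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    b = (sum((x - mx) * (y - my) for x, y in pairs)
         / sum((x - mx) ** 2 for x, _ in pairs))
    a = my - b * mx
    return a, b

a, b = ols_slope_intercept(data)
# A positive slope b means each extra bit of surprisal
# adds roughly b milliseconds of reading time.
print(round(b, 2))  # 15.05
```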
A Targeted Assessment of Incremental Processing in Neural Language Models and Humans
We present a targeted, scaled-up comparison of incremental processing in
humans and neural language models by collecting by-word reaction time data for
sixteen different syntactic test suites across a range of structural phenomena.
Human reaction time data comes from a novel online experimental paradigm called
the Interpolated Maze task. We compare human reaction times to by-word
probabilities for four contemporary language models, with different
architectures and trained on a range of data set sizes. We find that across
many phenomena, both humans and language models show increased processing
difficulty in ungrammatical sentence regions, with human and model `accuracy'
scores (à la Marvin and Linzen, 2018) approximately equal. However, although language
model outputs match humans in direction, we show that models systematically
under-predict the difference in magnitude of incremental processing difficulty
between grammatical and ungrammatical sentences. Specifically, when models
encounter syntactic violations they fail to accurately predict the longer
reaction times observed in the human data. These results call into question
whether contemporary language models are approaching human-like performance for
sensitivity to syntactic violations.
Comment: To appear at ACL 202
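The `accuracy' scores mentioned above (in the style of Marvin and Linzen, 2018) can be sketched as the fraction of items on which the critical region is harder in the ungrammatical condition than in the grammatical one; the difficulty values below are invented for illustration:

```python
# Each item pairs processing difficulty on the critical region in the
# grammatical vs. ungrammatical condition (e.g. surprisal for models,
# reaction time for humans). Numbers are invented for illustration.
items = [
    (3.1, 7.8),  # ungrammatical region harder: counts as correct
    (2.5, 6.0),
    (4.2, 3.9),  # grammatical region harder: counts as an error
    (3.0, 5.5),
]

def accuracy(pairs):
    """Fraction of items where the ungrammatical region is harder."""
    return sum(u > g for g, u in pairs) / len(pairs)

print(accuracy(items))  # 0.75
```

The magnitude under-prediction described in the abstract is then the further observation that even on correct items, the model's grammatical-vs-ungrammatical difficulty gap is smaller than the human one.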
An Information-Theoretic Analysis of Targeted Regressions during Reading
Regressions, or backward saccades, are common during reading, accounting for between 5% and 20% of all saccades. And yet, relatively little is known about what causes them. We provide an information-theoretic operationalization of two previous qualitative hypotheses about regressions, which we dub reactivation and reanalysis. We argue that these hypotheses make different predictions about the pointwise mutual information, or pmi, between a regression's source and target. Intuitively, the pmi between two words measures how much more (or less) likely one word is to be present given the other. On one hand, the reactivation hypothesis predicts that regressions occur between words that are associated, implying high, positive values of pmi. On the other hand, the reanalysis hypothesis predicts that regressions should occur between words that are disassociated from each other, implying low, negative values of pmi. As a second theoretical contribution, we expand on previous theories by considering not only pmi but also expected values of pmi, E[pmi], where the expectation is taken over all possible realizations of the regression's target. The rationale for this is that language processing involves making inferences under uncertainty, and readers may be uncertain about what they have read, especially if a previous word was skipped. To test both theories, we use contemporary language models to estimate pmi-based statistics over word pairs in three corpora of eye-tracking data in English, as well as in six languages across three language families (Indo-European, Uralic, and Turkic). Our results are consistent across the languages and models tested: Positive values of pmi and E[pmi] consistently help to predict the patterns of regressions during reading, whereas negative values of pmi and E[pmi] do not.
Our information-theoretic interpretation increases the predictive scope of both theories, and our studies present the first systematic crosslinguistic analysis of regressions in the literature. Our results support the reactivation hypothesis and, more broadly, expand the number of language-processing behaviors that can be linked to information-theoretic principles.
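As an illustration of the quantities above: pmi(s, t) = log [p(s, t) / (p(s) p(t))], and E[pmi] weights pmi over candidate targets by the reader's uncertainty about the target. A toy sketch with invented word pairs and probabilities (the paper estimates these with language models):

```python
import math

# Toy probabilities (invented): a joint distribution over a regression's
# (source, target) word pairs, plus unigram marginals.
p_joint = {("bank", "river"): 0.04, ("bank", "money"): 0.01}
p_uni = {"bank": 0.10, "river": 0.20, "money": 0.25}

def pmi(source, target):
    """log2 p(s, t) / (p(s) p(t)): positive if the words are
    associated, negative if they are disassociated."""
    return math.log2(p_joint[(source, target)]
                     / (p_uni[source] * p_uni[target]))

def expected_pmi(source):
    """E[pmi]: pmi averaged over candidate targets, weighted by the
    (toy) conditional probability of each target given the source."""
    cands = [t for (s, t) in p_joint if s == source]
    z = sum(p_joint[(source, t)] for t in cands)
    return sum((p_joint[(source, t)] / z) * pmi(source, t) for t in cands)

print(pmi("bank", "river"))  # 1.0 bit: an associated pair
print(pmi("bank", "money"))  # ~-1.32 bits: a disassociated pair
print(expected_pmi("bank"))
```

Under the reactivation hypothesis, regressions should land where pmi (or E[pmi]) is high and positive, as for the first pair; under reanalysis, where it is negative, as for the second.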
How does dependency type mediate gender agreement in Russian?
Natural languages often exhibit agreement, where two words must be matched for certain features. It is well known that people use knowledge about agreement to drive expectations during online processing. What is less well known is how the type of dependency mediates this expectation, and thus the processing difficulty of a gender-mismatched word. To test this, we collect incremental processing data on three types of gender agreement mismatches in Russian: (i) past-tense verbs and subjects, (ii) attributive adjectives and nouns, and (iii) predicate adjectives and nouns. We collect two types of incremental processing data: eye-tracking and Mouse Tracking for Reading (MoTR), in which a participant reveals and reads text by moving their mouse, whose position is recorded. We find that while participants are surprised by ungrammatical conditions, this surprise is mediated both by the type of agreement and by the gender of the agreeing noun.