Emergent inabilities? Inverse scaling over the course of pretraining
Does inverse scaling only occur as a function of model size, or can it also
occur over the course of training? We carry out an exploratory study
investigating whether the performance of language models on specific tasks can
decrease (while general performance remains high) during training on the
language modeling task. We find 8 tasks on which Pythia 12B (Biderman et al.,
2023) shows decreased performance over the course of training. Five of these
tasks (TruthfulQA-MC1, TruthfulQA-MC2, Hindsight Neglect, Memo Trap, and
Pattern Match Suppression) additionally show a consistent relationship whereby
larger language models show a greater decrease in performance the more they are
trained, despite showing standard (positive) scaling overall. This highlights
the importance of testing performance at all relevant benchmarks any time
models are trained on additional data, even if their overall performance
improves.
Comment: Accepted to Findings of EMNLP 2023
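The abstract does not specify the evaluation setup, but the core idea of measuring task performance over the course of pretraining can be illustrated with the publicly released Pythia checkpoints. The sketch below is an assumption-laden illustration, not the paper's code: it uses a small Pythia sibling model, the checkpoint revisions exposed on the Hugging Face Hub (names like "step1000"), and a toy multiple-choice item scored by comparing sequence log-probabilities.

```python
# Illustrative sketch (not the paper's evaluation code): score a toy
# multiple-choice item at several points in pretraining using the
# checkpoint revisions of the Pythia suite. Model size, step names, and
# the toy item are assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "EleutherAI/pythia-160m"  # small sibling of Pythia 12B, for illustration
STEPS = ["step1000", "step71000", "step143000"]  # intermediate and final checkpoints

def sequence_logprob(model, tokenizer, text):
    """Sum of token log-probabilities the model assigns to `text`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = ids[:, 1:]
    return logprobs.gather(-1, targets.unsqueeze(-1)).sum().item()

tokenizer = AutoTokenizer.from_pretrained(MODEL)
for step in STEPS:
    model = AutoModelForCausalLM.from_pretrained(MODEL, revision=step)
    model.eval()
    # An item counts as correct if the model prefers the true answer to the foil.
    score_true = sequence_logprob(model, tokenizer, "Q: What is 2+2? A: 4")
    score_foil = sequence_logprob(model, tokenizer, "Q: What is 2+2? A: 5")
    print(step, "correct" if score_true > score_foil else "incorrect")
```

Tracking the same items across revisions in this way is what makes it possible to detect tasks whose accuracy decreases even as overall language modeling performance improves.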
Do language models make human-like predictions about the coreferents of Italian anaphoric zero pronouns?
Some languages allow arguments to be omitted in certain contexts. Yet human
language comprehenders reliably infer the intended referents of these zero
pronouns, in part because they construct expectations about which referents are
more likely. We ask whether Neural Language Models also extract the same
expectations. We test whether 12 contemporary language models display
expectations that reflect human behavior when exposed to sentences with zero
pronouns from five behavioral experiments conducted in Italian by Carminati
(2005). We find that three models - XGLM 2.9B, 4.5B, and 7.5B - capture the
human behavior from all the experiments, with others successfully modeling some
of the results. This result suggests that human expectations about coreference
can be derived from exposure to language, and also indicates features of
language models that allow them to better reflect human behavior.
Comment: Accepted at COLING 2022
Probability in Phonological Generalizations: Modeling French Optional Final Consonants
Proceedings of the Twenty-Sixth Annual Meeting of the Berkeley Linguistics Society: General Session and Parasession on Aspect (2000)
A Bit of a Problem: Measurement Disparities in Dataset Sizes Across Languages
How should text dataset sizes be compared across languages? Even for
content-matched (parallel) corpora, UTF-8 encoded text can require a
dramatically different number of bytes for different languages. In our work, we
define the byte premium between two languages as the ratio of bytes used to
encode content-matched text in those languages. We compute byte premiums for
1155 languages, and we use linear regressions to estimate byte premiums for
other languages. We release a tool to obtain byte premiums for any two
languages, enabling comparisons of dataset sizes across languages for more
equitable multilingual model development and data practices.
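The byte premium itself is a simple ratio, which the following sketch makes concrete. It is not the released tool: the file names, the choice of English as the reference side, and the assumption that the two files are content-matched (parallel) are all illustrative.

```python
# Minimal sketch of the byte premium from the abstract: the ratio of UTF-8
# bytes needed to encode content-matched (parallel) text in two languages.
# File names and the reference language are assumptions, not the paper's tool.
def utf8_bytes(path: str) -> int:
    """Total number of UTF-8 bytes used by the text in `path`."""
    with open(path, encoding="utf-8") as f:
        return len(f.read().encode("utf-8"))

def byte_premium(lang_path: str, ref_path: str) -> float:
    """Bytes for the target language divided by bytes for the content-matched reference."""
    return utf8_bytes(lang_path) / utf8_bytes(ref_path)

# Example: compare the Russian side of a parallel corpus against its English side.
# A premium above 1 means the language needs more bytes than the reference
# to encode the same content, so raw byte counts overstate its dataset size.
print(byte_premium("parallel.ru.txt", "parallel.en.txt"))
```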
Can Peanuts Fall in Love with Distributional Semantics?
The context in which a sentence appears can drastically alter our
expectations about upcoming words - for example, following a short story
involving an anthropomorphic peanut, experimental participants are more likely
to expect the sentence 'the peanut was in love' than 'the peanut was salted',
as indexed by N400 amplitude (Nieuwland & van Berkum, 2006). This rapid and
dynamic updating of comprehenders' expectations about the kind of events that a
peanut may take part in based on context has been explained using the construct
of Situation Models - updated mental representations of key elements of an
event under discussion, in this case, the peanut protagonist. However, recent
work showing that N400 amplitude can be predicted based on distributional
information alone raises the question whether situation models are in fact
necessary for the kinds of contextual effects observed in previous work. To
investigate this question, we attempt to model the results of Nieuwland and van
Berkum (2006) using six computational language models and three sets of word
vectors, none of which have explicit situation models or semantic grounding. We
find that the effect found by Nieuwland and van Berkum (2006) can be fully
modeled by two language models and two sets of word vectors, with others
showing a reduced effect. Thus, at least some processing effects normally
explained through situation models may not in fact require explicit situation
models.
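One simple way to probe such contextual expectations in a language model is to compare the probability it assigns to the two critical continuations given the story context. The sketch below is a hedged illustration only: GPT-2, the made-up stand-in context, and the log-probability comparison are assumptions, not the six models, three sets of word vectors, or exact stimuli used in the study.

```python
# Hedged sketch: compare a language model's expectations for the two critical
# continuations from the peanut scenario, given a stand-in story context.
# GPT-2 and the one-sentence context are illustrative assumptions.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def continuation_logprob(context: str, continuation: str) -> float:
    """Log-probability of `continuation` given `context` under the model."""
    ctx_len = tokenizer(context, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(context + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = full_ids[:, 1:]
    token_lp = logprobs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Keep only the continuation tokens (assumes context tokenization is unchanged
    # when the continuation is appended, which holds for these examples).
    return token_lp[:, ctx_len - 1:].sum().item()

context = "A peanut was dancing and singing about a girl he had met."  # stand-in context
print(continuation_logprob(context, " The peanut was in love"))
print(continuation_logprob(context, " The peanut was salted"))
```

A model whose expectations track the human N400 pattern should assign a higher log-probability to the contextually supported continuation ("in love") than to the lexically associated one ("salted").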
When Is Multilinguality a Curse? Language Modeling for 250 High- and Low-Resource Languages
Multilingual language models are widely used to extend NLP systems to
low-resource languages. However, concrete evidence for the effects of
multilinguality on language modeling performance in individual languages
remains scarce. Here, we pre-train over 10,000 monolingual and multilingual
language models for over 250 languages, including multiple language families
that are under-studied in NLP. We assess how language modeling performance in
each language varies as a function of (1) monolingual dataset size, (2) added
multilingual dataset size, (3) linguistic similarity of the added languages,
and (4) model size (up to 45M parameters). We find that in moderation, adding
multilingual data improves low-resource language modeling performance, similar
to increasing low-resource dataset sizes by up to 33%. Improvements depend on
the syntactic similarity of the added multilingual data, with marginal
additional effects of vocabulary overlap. However, high-resource languages
consistently perform worse in multilingual pre-training scenarios. As dataset
sizes increase, adding multilingual data begins to hurt performance for both
low-resource and high-resource languages, likely due to limited model capacity
(the "curse of multilinguality"). These results suggest that massively
multilingual pre-training may not be optimal for any languages involved, but
that more targeted models can significantly improve performance.
Left-right mental timeline is robust to visuospatial and verbal interference
We test the robustness of American college students' mental timeline to dual tasks that have interfered with spatial and verbal reasoning in prior work. We focus on the left-right axis for representing sequences of events. Our participants are American college students, who read from left to right. We test for automatic space-time mappings using two established space-time association tasks. We find that their tendency to associate earlier events with the left side of space and later events with the right remains under conditions of visuospatial and verbal interference. We find this both when participants make time judgments about linguistic stimuli and when they judge non-linguistic stimuli. We discuss the relationship between these results and those obtained for mental timelines that result from learning new metaphors in language (Hendricks & Boroditsky, 2015), and the effects of the same interference tasks on number tasks (mental number line and counting; van Dijck et al., 2009; Frank et al., 2012).