Do Language Models Understand Anything? On the Ability of LSTMs to Understand Negative Polarity Items
In this paper, we attempt to link the inner workings of a neural language
model to linguistic theory, focusing on a complex phenomenon well discussed in
formal linguistics: (negative) polarity items. We briefly discuss the leading
hypotheses about the licensing contexts that allow negative polarity items and
evaluate to what extent a neural language model has the ability to correctly
process a subset of such constructions. We show that the model finds a relation
between the licensing context and the negative polarity item and appears to be
aware of the scope of this context, which we extract from a parse tree of the
sentence. With this research, we hope to pave the way for other studies linking
formal linguistics to deep learning.
Comment: Accepted to the EMNLP workshop "Analyzing and interpreting neural networks for NLP"
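
To illustrate the evaluation paradigm described above (a sketch only: the paper studies LSTM language models, whereas this example scores sentences with an off-the-shelf GPT-2 via HuggingFace), one can compare a model's log-probability for a sentence containing a licensed NPI against a minimally different, unlicensed variant:

```python
# Minimal sketch: score a licensed vs. an unlicensed NPI sentence with GPT-2.
# The original paper uses LSTM language models; GPT-2 stands in here.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sentence_logprob(sentence: str) -> float:
    """Sum of token log-probabilities under the language model."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Score each token given its preceding context.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = ids[:, 1:]
    return log_probs.gather(2, targets.unsqueeze(-1)).sum().item()

licensed = "Nobody has ever been there."      # "ever" inside negation's scope
unlicensed = "Somebody has ever been there."  # no licensing context

print(sentence_logprob(licensed) > sentence_logprob(unlicensed))  # typically True
```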
Transparency at the Source: Evaluating and Interpreting Language Models With Access to the True Distribution
We present a setup for training, evaluating and interpreting neural language
models that uses artificial, language-like data. The data is generated using a
massive probabilistic grammar (based on state-split PCFGs) that is itself
derived from a large natural language corpus, but also provides us with
complete control over the generative process. We describe and release both
grammar and corpus, and test the naturalness of our generated data. This
approach allows us to define closed-form expressions to efficiently compute
exact lower bounds on obtainable perplexity using both causal and masked
language modelling. Our results show striking differences between neural
language modelling architectures and training objectives in how closely they
approximate the lower bound on perplexity. Our approach also allows us to
directly compare learned representations to symbolic rules in the underlying
source. We experiment with various techniques for interpreting model behaviour
and learning dynamics. With access to the underlying true source, our results
also show striking differences in learning dynamics between different
classes of words.
Comment: Findings of EMNLP 2023
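
To illustrate the lower-bound idea (a minimal sketch using a hypothetical toy bigram source rather than the paper's state-split PCFG): with access to the true next-token distributions, obtainable perplexity is bounded below by the exponential of the source's average conditional entropy, which can be estimated by sampling:

```python
# Minimal sketch: Monte Carlo estimate of the perplexity lower bound exp(H)
# for a toy source whose true next-token distributions are known exactly.
import math
import random

# Hypothetical toy source: true next-token distribution for each context.
true_dist = {
    "<s>": {"the": 0.7, "a": 0.3},
    "the": {"cat": 0.5, "dog": 0.5},
    "a":   {"cat": 0.2, "dog": 0.8},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}

def sample_sentence(rng: random.Random) -> list[str]:
    tokens, ctx = [], "<s>"
    while ctx != "</s>":
        nxt = rng.choices(list(true_dist[ctx]), weights=true_dist[ctx].values())[0]
        tokens.append(nxt)
        ctx = nxt
    return tokens

def perplexity_lower_bound(n_samples: int = 10_000) -> float:
    rng = random.Random(0)
    total_surprisal, total_tokens = 0.0, 0
    for _ in range(n_samples):
        ctx = "<s>"
        for tok in sample_sentence(rng):
            total_surprisal += -math.log(true_dist[ctx][tok])  # true surprisal
            ctx, total_tokens = tok, total_tokens + 1
    # Average surprisal under the true source estimates its conditional
    # entropy H; no model can beat exp(H) in expected perplexity.
    return math.exp(total_surprisal / total_tokens)

print(perplexity_lower_bound())
```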
Analysing Neural Language Models: Contextual Decomposition Reveals Default Reasoning in Number and Gender Assignment
Extensive research has recently shown that recurrent neural language models
are able to process a wide range of grammatical phenomena. How these models
are able to perform these remarkable feats so well, however, is still an open
question. To gain more insight into what information LSTMs base their
decisions on, we propose a generalisation of Contextual Decomposition (GCD).
In particular, this setup enables us to accurately distil which part of a
prediction stems from semantic heuristics, which part truly emanates from
syntactic cues, and which part arises instead from the model's own biases.
We investigate this technique on tasks pertaining to syntactic agreement and
co-reference resolution and discover that the model strongly relies on a
default reasoning effect to perform these tasks.
Comment: To appear at CoNLL 2019
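
For intuition on what Contextual Decomposition computes (a minimal sketch for a single linear+ReLU layer, not the paper's generalised LSTM formulation): the input is partitioned into a part of interest (beta) and the remainder (gamma), and each layer's output is decomposed into the two corresponding contributions:

```python
# Minimal sketch of the Contextual Decomposition idea for one ReLU layer.
import torch

def cd_linear(beta, gamma, weight, bias):
    """Linear layers decompose exactly; the bias is assigned to gamma here
    (one of several conventions discussed in the CD literature)."""
    return beta @ weight.T, gamma @ weight.T + bias

def cd_relu(beta, gamma):
    """Shapley-style linearisation of ReLU over the two parts: beta's
    contribution is its average marginal effect over both orderings."""
    f = torch.relu
    beta_out = 0.5 * ((f(beta + gamma) - f(gamma)) + (f(beta) - f(torch.zeros_like(beta))))
    gamma_out = f(beta + gamma) - beta_out  # the two parts sum to the real output
    return beta_out, gamma_out

# Usage: decompose how much of an activation stems from x[0] alone.
torch.manual_seed(0)
W, b = torch.randn(3, 4), torch.randn(3)
x = torch.randn(4)
mask = torch.tensor([1.0, 0.0, 0.0, 0.0])
beta, gamma = x * mask, x * (1 - mask)

b1, g1 = cd_linear(beta, gamma, W, b)
b2, g2 = cd_relu(b1, g1)
assert torch.allclose(b2 + g2, torch.relu(x @ W.T + b), atol=1e-6)
```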
Structural Persistence in Language Models: Priming as a Window into Abstract Language Representations
Acknowledgments: We would like to thank the anonymous reviewers for their extensive and thoughtful feedback and suggestions, which greatly improved our work, as well as the action editor for his helpful guidance. We would also like to thank members of the ILLC, past and present, for their useful comments and feedback, specifically Dieuwke Hupkes, Mario Giulianelli, Sandro Pezzelle, and Ece Takmaz. Arabella Sinclair worked on this project while affiliated with the University of Amsterdam. The project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 819455).
Attribution and Alignment: Effects of Local Context Repetition on Utterance Production and Comprehension in Dialogue
Language models are often used as the backbone of modern dialogue systems.
These models are pre-trained on large amounts of written, fluent language.
Repetition is typically penalised when evaluating language model generations;
however, it is a key component of dialogue. Humans use local and
partner-specific repetitions; these are preferred by human users and lead to
more successful communication in dialogue. In this study, we evaluate (a)
whether language models produce human-like levels of repetition in dialogue,
and (b) what processing mechanisms related to lexical re-use they employ
during comprehension. We believe that such a joint analysis of model
production and comprehension behaviour can inform the development of
cognitively inspired dialogue generation systems.
Comment: CoNLL 2023
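
One simple way to operationalise local repetition (an illustrative metric, not necessarily the paper's exact measure) is the fraction of an utterance's n-grams that already occur in the recent dialogue context:

```python
# Minimal sketch: share of an utterance's n-grams re-used from recent context.
def ngrams(tokens: list[str], n: int) -> set[tuple[str, ...]]:
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def repetition_rate(utterance: str, context_turns: list[str], n: int = 2) -> float:
    """Fraction of the utterance's n-grams already present in the context."""
    utt = ngrams(utterance.lower().split(), n)
    ctx = set().union(*(ngrams(t.lower().split(), n) for t in context_turns))
    return len(utt & ctx) / len(utt) if utt else 0.0

context = ["shall we meet at the town hall ?", "sure , the town hall works"]
print(repetition_rate("great , see you at the town hall", context))
# 3 of 7 bigrams re-used: ~0.43
```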
DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers
In recent years, many interpretability methods have been proposed to help interpret the internal states of Transformer models, at different levels of precision and complexity. Here, to analyze encoder-decoder Transformers, we propose a simple new method: DecoderLens. Inspired by the LogitLens (for decoder-only Transformers), this method allows the decoder to cross-attend to representations of intermediate encoder layers, instead of using the final encoder output as is normally done in encoder-decoder models. The method thus maps previously uninterpretable vector representations to human-interpretable sequences of words or symbols. We report results from the DecoderLens applied to models trained on question answering, logical reasoning, speech recognition and machine translation. The DecoderLens reveals several specific subtasks that are solved at low or intermediate layers, shedding new light on the information flow inside the encoder component of this important class of models.
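
The core trick is easy to emulate with off-the-shelf encoder-decoder models. The sketch below (assuming a recent HuggingFace transformers version and a T5 checkpoint, and ignoring any extra normalisation or fine-tuning the full method may involve) lets the decoder cross-attend to each intermediate encoder layer in turn:

```python
# Minimal sketch: decode from intermediate encoder layers of T5.
import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast
from transformers.modeling_outputs import BaseModelOutput

tok = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small").eval()

inputs = tok("translate English to German: The house is small.", return_tensors="pt")
with torch.no_grad():
    enc = model.encoder(**inputs, output_hidden_states=True)

# hidden_states[0] is the embedding layer; the last entry is the final layer.
for layer, h in enumerate(enc.hidden_states):
    out = model.generate(
        encoder_outputs=BaseModelOutput(last_hidden_state=h),
        attention_mask=inputs.attention_mask,
        max_new_tokens=20,
    )
    print(layer, tok.decode(out[0], skip_special_tokens=True))
```

Watching how the decoded strings sharpen from gibberish at low layers into the correct translation at higher ones is exactly the kind of layerwise picture the abstract describes.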