14 research outputs found
Subspace Chronicles: How Linguistic Information Emerges, Shifts and Interacts during Language Model Training
Representational spaces learned via language modeling are fundamental to
Natural Language Processing (NLP), yet there is limited understanding of
how and when, during training, various types of linguistic information
emerge and interact. Leveraging a novel information-theoretic probing suite,
which enables direct comparisons not only of task performance but also of
tasks' representational subspaces, we analyze nine tasks covering syntax, semantics
and reasoning, across 2M pre-training steps and five seeds. We identify
critical learning phases across tasks and time, during which subspaces emerge,
share information, and later disentangle to specialize. Across these phases,
syntactic knowledge is acquired rapidly after 0.5% of full training. Continued
performance improvements primarily stem from the acquisition of open-domain
knowledge, while semantics and reasoning tasks benefit from later boosts to
long-range contextualization and higher specialization. Measuring cross-task
similarity further reveals that linguistically related tasks share information
throughout training, and do so more during the critical phase of learning than
before or after. Our findings have implications for model interpretability,
multi-task learning, and learning from limited data.
Comment: Accepted at EMNLP 2023 (Findings)
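The cross-task subspace comparison described in the abstract can be illustrated with a minimal sketch (a hypothetical illustration, not the paper's actual probing suite): treating each linear probe's weight matrix as spanning a subspace of the embedding space and comparing two probes via the principal angles between those subspaces.

```python
import numpy as np

def subspace_similarity(W_a, W_b):
    """Mean cosine of the principal angles between the row spaces of two
    linear-probe weight matrices (1.0 = identical subspaces)."""
    # Orthonormal bases whose columns span each probe's subspace
    Q_a, _ = np.linalg.qr(W_a.T)
    Q_b, _ = np.linalg.qr(W_b.T)
    # Singular values of Q_a^T Q_b are the cosines of the principal angles
    sigma = np.linalg.svd(Q_a.T @ Q_b, compute_uv=False)
    return float(np.mean(sigma))

# Toy probes over a 64-dimensional embedding space (random weights,
# standing in for trained syntax/semantics probes)
rng = np.random.default_rng(0)
W_syntax = rng.normal(size=(8, 64))
W_sem = rng.normal(size=(8, 64))

self_sim = subspace_similarity(W_syntax, W_syntax)   # identical subspaces
cross_sim = subspace_similarity(W_syntax, W_sem)     # unrelated subspaces
```

Under this sketch, linguistically related tasks would show higher mean cosines than unrelated ones, which is the kind of cross-task similarity signal the abstract describes tracking over training.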
Establishing Trustworthiness: Rethinking Tasks and Model Evaluation
Language understanding is a multi-faceted cognitive capability, which the
Natural Language Processing (NLP) community has striven to model
computationally for decades. Traditionally, facets of linguistic intelligence
have been compartmentalized into tasks with specialized model architectures and
corresponding evaluation protocols. With the advent of large language models
(LLMs), the community has witnessed a dramatic shift towards general-purpose,
task-agnostic approaches powered by generative models. As a consequence, the
traditional compartmentalized notion of language tasks is breaking down, posing
increasing challenges for evaluation and analysis. At the same
time, LLMs are being deployed in more real-world scenarios, including
previously unforeseen zero-shot setups, increasing the need for trustworthy and
reliable systems. Therefore, we argue that it is time to rethink what
constitutes tasks and model evaluation in NLP, and pursue a more holistic view
on language, placing trustworthiness at the center. Towards this goal, we
review existing compartmentalized approaches for understanding the origins of a
model's functional capacity, and provide recommendations for more multi-faceted
evaluation protocols.
Comment: Accepted at EMNLP 2023 (Main Conference), camera-ready
Experimental Standards for Deep Learning Research: A Natural Language Processing Perspective
The field of Deep Learning (DL) has undergone explosive growth during the
last decade, with a substantial impact on Natural Language Processing (NLP) as
well. Yet, compared to more established disciplines, a lack of common
experimental standards remains an open challenge to the field at large.
Starting from fundamental scientific principles, we distill ongoing discussions
on experimental standards in NLP into a single, widely-applicable methodology.
Following these best practices is crucial to strengthen experimental evidence,
improve reproducibility, and support scientific progress. These standards are
further collected in a public repository so that they can transparently adapt
to future needs.
Probing for Labeled Dependency Trees
Probing has become an important tool for analyzing representations in Natural
Language Processing (NLP). For graphical NLP tasks such as dependency parsing,
linear probes are currently limited to extracting undirected or unlabeled parse
trees, which do not capture the full task. This work introduces DepProbe, a
linear probe which can extract labeled and directed dependency parse trees from
embeddings while using fewer parameters and compute than prior methods.
Leveraging its full task coverage and lightweight parametrization, we
investigate its predictive power for selecting the best transfer language for
training a full biaffine attention parser. Across 13 languages, our proposed
method identifies the best source treebank 94% of the time, outperforming
competitive baselines and prior work. Finally, we analyze the informativeness
of task-specific subspaces in contextual embeddings, as well as the benefits a
full parser's non-linear parametrization provides.
Comment: Accepted at ACL 2022 (Main Conference)
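The core idea of a linear probe for labeled, directed dependencies can be sketched roughly as follows (a hypothetical toy illustration with random weights, not the actual DepProbe implementation or its decoding algorithm): one linear map transforms embeddings into a space where distances encode attachment, and a second linear map predicts the relation label for each word.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, N_LABELS = 16, 3  # toy embedding size and relation-label inventory

# Hypothetical "learned" probe parameters (random here, for illustration)
B = rng.normal(size=(8, DIM))          # structural map: distances -> attachments
L = rng.normal(size=(N_LABELS, DIM))   # relational map: embedding -> label

def probe(embeddings):
    """Toy decoder: attach each word to its nearest neighbor in the
    transformed space (DepProbe decodes a proper spanning tree instead),
    and label each word via argmax over the relational map."""
    proj = embeddings @ B.T
    n = len(embeddings)
    heads, labels = [], []
    for i in range(n):
        dists = [np.linalg.norm(proj[i] - proj[j]) if j != i else np.inf
                 for j in range(n)]
        heads.append(int(np.argmin(dists)))              # greedy head choice
        labels.append(int(np.argmax(L @ embeddings[i])))  # relation label
    return heads, labels

sent = rng.normal(size=(5, DIM))  # five "word" embeddings
heads, labels = probe(sent)
```

The point of the sketch is the parameter count: two small linear maps suffice to produce directed, labeled structure, which is what makes such a probe far cheaper than a full biaffine parser.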
Sort by Structure: Language Model Ranking as Dependency Probing
Making an informed choice of pre-trained language model (LM) is critical for
performance, yet environmentally costly, and as such widely underexplored. The
field of Computer Vision has begun to tackle encoder ranking, with promising
forays into Natural Language Processing; however, these efforts lack coverage of
linguistic tasks such as structured prediction. We propose probing to rank LMs,
specifically for parsing dependencies in a given language, by measuring the
degree to which labeled trees are recoverable from an LM's contextualized
embeddings. Across 46 typologically and architecturally diverse LM-language
pairs, our probing approach predicts the best LM choice 79% of the time using
orders of magnitude less compute than training a full parser. Within this
study, we identify and analyze one recently proposed decoupled LM, RemBERT,
and find that it contains strikingly little inherent dependency information,
yet often yields the best parser after full fine-tuning. Without this outlier,
our approach identifies the best LM in 89% of cases.
Comment: Accepted at NAACL 2022 (Main Conference)