Substitution-based Semantic Change Detection using Contextual Embeddings
Measuring semantic change has thus far remained a task where methods using
contextual embeddings have struggled to improve upon simpler techniques relying
only on static word vectors. Moreover, many of the previously proposed
approaches suffer from downsides related to scalability and ease of
interpretation. We present a simplified approach to measuring semantic change
using contextual embeddings, relying only on the most probable substitutes for
masked terms. Not only is this approach directly interpretable, it is also far
more efficient in terms of storage, achieves superior average performance
across the most frequently cited datasets for this task, and allows for more
nuanced investigation of change than is possible with static word vectors.
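As a rough illustration of how "most probable substitutes" might be turned into a change score, the sketch below compares the substitute distributions of a target word in two time periods. The abstract does not state which comparison measure is used; the Jensen-Shannon divergence here is an assumption, the function names are hypothetical, and the substitute lists are made-up toy data standing in for the output of a masked language model.

```python
from collections import Counter
from math import log2


def substitute_distribution(substitutes):
    """Turn a list of substitute tokens into a relative-frequency distribution."""
    counts = Counter(substitutes)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def jensen_shannon_divergence(p, q):
    """Base-2 Jensen-Shannon divergence between two distributions (range 0 to 1)."""
    vocab = set(p) | set(q)
    # Mixture distribution over the union of both vocabularies.
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}

    def kl_to_mixture(a):
        # KL(a || m); terms with zero probability contribute nothing.
        return sum(a[w] * log2(a[w] / m[w]) for w in a if a[w] > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


# Toy data (hypothetical): substitutes predicted for the masked target word
# in an earlier and a later corpus.
early_substitutes = ["aircraft", "airplane", "jet", "aircraft"]
late_substitutes = ["level", "surface", "aircraft", "level"]

change_score = jensen_shannon_divergence(
    substitute_distribution(early_substitutes),
    substitute_distribution(late_substitutes),
)
print(f"change score: {change_score:.3f}")
```

Because only substitute counts are kept per target word, a representation of this kind is small to store and can be read directly: the substitutes that gain or lose probability mass indicate which senses are appearing or fading.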
Interpretable Word Sense Representations via Definition Generation: The Case of Semantic Change Analysis
We propose using automatically generated natural language definitions of
contextualised word usages as interpretable word and word sense
representations. Given a collection of usage examples for a target word, and
the corresponding data-driven usage clusters (i.e., word senses), a definition
is generated for each usage with a specialised Flan-T5 language model, and the
most prototypical definition in a usage cluster is chosen as the sense label.
We demonstrate how the resulting sense labels can make existing approaches to
semantic change analysis more interpretable, and how they can allow users --
historical linguists, lexicographers, or social scientists -- to explore and
intuitively explain diachronic trajectories of word meaning. Semantic change
analysis is only one of many possible applications of the 'definitions as
representations' paradigm. Beyond being human-readable, contextualised
definitions also outperform token or usage sentence embeddings in
word-in-context semantic similarity judgements, making them a new promising
type of lexical representation for NLP.
Comment: ACL 202
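The step of choosing the "most prototypical definition in a usage cluster" can be sketched as picking the definition closest to the cluster centroid. The abstract does not say how prototypicality is measured, so everything below is an assumption: the bag-of-words vectors stand in for whatever definition embeddings are actually used, and the function names and example definitions are hypothetical.

```python
from collections import Counter
from math import sqrt


def bow_vector(text):
    """Very simple stand-in embedding: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_prototypical(definitions):
    """Return the definition whose vector is closest to the cluster centroid."""
    vectors = [bow_vector(d) for d in definitions]
    # Summing the vectors gives the centroid up to a scale factor,
    # and cosine similarity is unaffected by scale.
    centroid = Counter()
    for vector in vectors:
        centroid.update(vector)
    scores = [cosine(vector, centroid) for vector in vectors]
    best_index = max(range(len(definitions)), key=scores.__getitem__)
    return definitions[best_index]


# Hypothetical generated definitions for the usages in one cluster.
cluster_definitions = [
    "a flat surface",
    "a flat level surface",
    "an aircraft",
]
sense_label = most_prototypical(cluster_definitions)
print(sense_label)
```

The selected string then serves as the human-readable sense label for the whole cluster, which is what makes the cluster interpretable without inspecting its individual usages.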
Lexical semantic change discovery
While there is a large amount of research in the field of Lexical Semantic Change Detection, only a few approaches go beyond a standard benchmark evaluation of existing models. In this paper, we propose a shift of focus from change detection to change discovery, i.e., discovering novel word senses over time from the full corpus vocabulary. By heavily fine-tuning a type-based and a token-based approach on recently published German data, we demonstrate that both models can successfully be applied to discover new words undergoing meaning change. Furthermore, we provide an almost fully automated framework for both evaluation and discovery.