2,586 research outputs found

Learning Topic-Sensitive Word Representations

Author: Bisazza Arianna
Fadaee Marzieh
Monz Christof
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017

Distributed word representations are widely used for modeling words in NLP tasks. Most existing models generate one representation per word and do not consider the different meanings a word can have. We present two approaches to learning multiple topic-sensitive representations per word using a Hierarchical Dirichlet Process. We observe that by modeling topics and integrating topic distributions for each document, we obtain representations that are able to distinguish between different meanings of a given word. Our models yield statistically significant improvements on the lexical substitution task, indicating that commonly used single word representations, even when combined with contextual information, are insufficient for this task. Comment: 5 pages, 1 figure, Accepted at ACL 201

arXiv.org e-Print Archive

UvA-DARE

International Migration, Integration and Social Cohesion online publications

The Sensitivity of Language Models and Humans to Winograd Schema Perturbations

Author: Abdou Mostafa
Barrett Maria
Belinkov Yonatan
Elliott Desmond
Ravishankar Vinit
Søgaard Anders
Publication date: 01/01/2020

Large-scale pretrained language models are the major driving force behind recent improvements in performance on the Winograd Schema Challenge, a widely employed test of common sense reasoning ability. We show, however, with a new diagnostic dataset, that these models are sensitive to linguistic perturbations of the Winograd examples that minimally affect human understanding. Our results highlight interesting differences between humans and language models: language models are more sensitive to number or gender alternations and synonym replacements than humans, and humans are more stable and consistent in their predictions, maintain a much higher absolute performance, and perform better on non-associative instances than associative ones. Overall, humans are correct more often than out-of-the-box models, and the models are sometimes right for the wrong reasons. Finally, we show that fine-tuning on a large, task-specific dataset can offer a solution to these issues. Comment: ACL 202

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System

Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution

Author: Arefyev Nikolay
Panchenko Alexander
Podolskiy Alexander
Sheludko Boris
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 07/06/2022

Lexical substitution, i.e., the generation of plausible words that can replace a particular target word in a given context, is an extremely powerful technology that can be used as a backbone of various NLP applications, including word sense induction and disambiguation, lexical relation extraction, and data augmentation. In this paper, we present a large-scale comparative study of lexical substitution methods employing both older and more recent language models and masked language models (LMs and MLMs), such as context2vec, ELMo, BERT, RoBERTa, and XLNet. We show that the already competitive results achieved by SOTA LMs/MLMs can be further substantially improved if information about the target word is injected properly. Several existing and new target word injection methods are compared for each LM/MLM using both intrinsic evaluation on lexical substitution datasets and extrinsic evaluation on word sense induction (WSI) datasets. On two WSI datasets we obtain new SOTA results. In addition, we analyze the types of semantic relations between target words and their substitutes generated by different models or given by annotators. Comment: arXiv admin note: text overlap with arXiv:2006.0003

arXiv.org e-Print Archive