Learning Topic-Sensitive Word Representations
Distributed word representations are widely used for modeling words in NLP
tasks. Most of the existing models generate one representation per word and do
not consider different meanings of a word. We present two approaches to learn
multiple topic-sensitive representations per word using the Hierarchical
Dirichlet Process. We observe that by modeling topics and integrating topic
distributions for each document we obtain representations that are able to
distinguish between different meanings of a given word. Our models yield
statistically significant improvements for the lexical substitution task
indicating that commonly used single word representations, even when combined
with contextual information, are insufficient for this task.
Comment: 5 pages, 1 figure, Accepted at ACL 201
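The idea of topic-sensitive representations can be illustrated with a minimal sketch (an assumed simplification, not the paper's implementation): each word keeps one vector per topic, and its representation in a given document is the average of those vectors weighted by the document's topic distribution.

```python
# Toy sketch: topic-sensitive word representations. Each word has one
# vector per topic; a word's representation in a document is the
# topic-weighted average, using the document's topic distribution.

def topic_sensitive_embedding(word, doc_topic_dist, embeddings):
    """Combine a word's per-topic vectors, weighted by the document's
    topic distribution (a list of probabilities, one per topic)."""
    vectors = embeddings[word]          # one vector per topic
    dim = len(vectors[0])
    out = [0.0] * dim
    for weight, vec in zip(doc_topic_dist, vectors):
        for i in range(dim):
            out[i] += weight * vec[i]
    return out

# Hypothetical 2-topic, 2-dimensional embeddings for "bank".
embeddings = {"bank": [[1.0, 0.0],    # finance-topic vector
                       [0.0, 1.0]]}   # river-topic vector

finance_doc = topic_sensitive_embedding("bank", [0.9, 0.1], embeddings)
river_doc = topic_sensitive_embedding("bank", [0.1, 0.9], embeddings)
print(finance_doc)  # [0.9, 0.1]
print(river_doc)    # [0.1, 0.9]
```

The two document contexts pull "bank" toward different vectors, which is what lets such representations distinguish meanings of the same word.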
The Sensitivity of Language Models and Humans to Winograd Schema Perturbations
Large-scale pretrained language models are the major driving force behind
recent improvements in performance on the Winograd Schema Challenge, a widely
employed test of common sense reasoning ability. We show, however, with a new
diagnostic dataset, that these models are sensitive to linguistic perturbations
of the Winograd examples that minimally affect human understanding. Our results
highlight interesting differences between humans and language models: language
models are more sensitive to number or gender alternations and synonym
replacements than humans, and humans are more stable and consistent in their
predictions, maintain a much higher absolute performance, and perform better on
non-associative instances than associative ones. Overall, humans are correct
more often than out-of-the-box models, and the models are sometimes right for
the wrong reasons. Finally, we show that fine-tuning on a large, task-specific
dataset can offer a solution to these issues.
Comment: ACL 202
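One of the perturbation types mentioned above, synonym replacement, can be sketched as follows (an assumed illustration, not the paper's dataset-construction procedure): swap a content word for a synonym while leaving the pronoun and its two candidate referents untouched.

```python
# Toy sketch of a minimal Winograd-style perturbation: replace a word
# with a synonym while keeping the coreference structure intact.

SYNONYMS = {"suitcase": "bag"}  # hypothetical synonym table

def perturb(sentence, synonyms):
    """Replace each token that has a synonym; leave everything else unchanged."""
    return " ".join(synonyms.get(w, w) for w in sentence.split())

original = "The trophy does not fit in the suitcase because it is too big ."
print(perturb(original, SYNONYMS))
# The trophy does not fit in the bag because it is too big .
```

A perturbation like this barely affects human interpretation, which is what makes model sensitivity to it diagnostic.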
Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution
Lexical substitution, i.e. generation of plausible words that can replace a
particular target word in a given context, is an extremely powerful technology
that can be used as a backbone of various NLP applications, including word
sense induction and disambiguation, lexical relation extraction, data
augmentation, etc. In this paper, we present a large-scale comparative study of
lexical substitution methods that employ both older and the most recent language
models and masked language models (LMs and MLMs), such as context2vec, ELMo,
BERT, RoBERTa, and XLNet. We show that the already competitive results achieved
by SOTA LMs/MLMs can be substantially improved if information about the target
word is injected properly. Several existing and new target-word injection
methods are compared for each LM/MLM using both intrinsic evaluation on lexical
substitution datasets and extrinsic evaluation on word sense induction (WSI)
datasets. On two WSI datasets we obtain new SOTA results. In addition, we
analyze the types of semantic relations between target words and their
substitutes generated by different models or given by annotators.
Comment: arXiv admin note: text overlap with arXiv:2006.0003
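The target-injection idea can be sketched in a minimal form (an assumed simplification with hypothetical toy vectors, not the paper's method): rank substitute candidates by combining how well each fits the context with how similar it is to the target word, rather than scoring context fit alone.

```python
import math

# Toy sketch of target-word injection for lexical substitution:
# score = alpha * fit(candidate, context) + (1 - alpha) * sim(candidate, target)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rank_substitutes(target, context_vec, candidates, vectors, alpha=0.5):
    """Return candidates sorted by combined context-fit and target-similarity."""
    tvec = vectors[target]
    scored = [(c, alpha * cosine(vectors[c], context_vec)
                  + (1 - alpha) * cosine(vectors[c], tvec))
              for c in candidates]
    return sorted(scored, key=lambda p: p[1], reverse=True)

# Hypothetical 2-d vectors: target "bright" used in a context about light.
vectors = {
    "bright": [1.0, 1.0],   # ambiguous: "shining" and "smart" senses
    "shiny":  [1.0, 0.2],   # fits a light context
    "clever": [0.2, 1.0],   # fits an intelligence context
}
light_context = [1.0, 0.0]
ranking = rank_substitutes("bright", light_context, ["shiny", "clever"], vectors)
print(ranking[0][0])  # "shiny" wins in the light context
```

Without the target-similarity term, a pure context model could just as well propose any word that fits the slot; injecting the target keeps substitutes semantically close to the word being replaced.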