155 research outputs found

Ditto: A Simple and Efficient Approach to Improve Sentence Embeddings

Authors: Chen Qian, Deng Chong, Liu Jiaqing, Ma Yukun, Wang Wen, Yu Hai, Zhang Chong, Zhang Qinglin, Zheng Siqi
Publication date: 23/10/2023

Prior studies diagnose the anisotropy problem in sentence representations from pre-trained language models, such as BERT, without fine-tuning. Our analysis reveals that the sentence embeddings from BERT suffer from a bias towards uninformative words, which limits performance on semantic textual similarity (STS) tasks. To address this bias, we propose a simple and efficient unsupervised approach, Diagonal Attention Pooling (Ditto), which weights words with model-based importance estimations and computes the weighted average of word representations from pre-trained models as sentence embeddings. Ditto can be easily applied to any pre-trained language model as a postprocessing operation. Unlike prior sentence embedding approaches, Ditto neither adds parameters nor requires any learning. Empirical evaluations demonstrate that Ditto can alleviate the anisotropy problem and improve various pre-trained models on STS tasks.

Comment: 8 pages, accepted by EMNLP 2023 as a short paper. The source code can be found at https://github.com/alibaba-damo-academy/SpokenNLP/tree/main/ditt

arXiv.org e-Print Archive
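
The pooling step described in the Ditto abstract above (weight each word by a model-based importance score, then take the weighted average of the word representations) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the importance score of token i is the diagonal entry of a single self-attention matrix, as the name Diagonal Attention Pooling suggests, and that the weights are normalized by their sum. The function name `diagonal_attention_pooling` and the toy inputs are hypothetical.

```python
from typing import List


def diagonal_attention_pooling(
    hidden: List[List[float]], attention: List[List[float]]
) -> List[float]:
    """Weighted average of token vectors, using the attention diagonal as weights.

    hidden:    n token vectors, each of dimension d (assumed model output).
    attention: n x n self-attention matrix from one head (assumed input).
    """
    n = len(hidden)
    if n == 0 or len(attention) != n or any(len(row) != n for row in attention):
        raise ValueError("attention must be an n x n matrix matching n token vectors")

    # Assumption: the importance of token i is its attention to itself.
    weights = [attention[i][i] for i in range(n)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("diagonal weights must sum to a positive value")

    # Assumption: normalize by the weight sum so the result is a true average.
    dim = len(hidden[0])
    return [
        sum(weights[i] * hidden[i][j] for i in range(n)) / total
        for j in range(dim)
    ]


# Toy example: three tokens with 2-dimensional representations.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attention = [[0.1, 0.6, 0.3], [0.2, 0.6, 0.2], [0.3, 0.4, 0.3]]
sentence_embedding = diagonal_attention_pooling(hidden, attention)
print(sentence_embedding)  # approximately [0.4, 0.9]
```

Because the sketch only reads quantities a pre-trained model already produces, it adds no parameters and needs no training, which matches the postprocessing framing in the abstract.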

Dynabench: Rethinking Benchmarking in NLP

Authors: Bansal Mohit, Bartolo Max, Geiger Atticus, Jia Robin, Kaushik Divyansh, Kiela Douwe, Ma Zhiyi, Nie Yixin, Potts Christopher, Prasad Grusha, Riedel Sebastian, Ringshia Pratik, Singh Amanpreet, Stenetorp Pontus, Thrush Tristan, Vidgen Bertie, Waseem Zeerak, Williams Adina, Wu Zhengxuan
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 11/06/2021

We introduce Dynabench, an open-source platform for dynamic dataset creation and model benchmarking. Dynabench runs in a web browser and supports human-and-model-in-the-loop dataset creation: annotators seek to create examples that a target model will misclassify, but that another person will not. In this paper, we argue that Dynabench addresses a critical need in our community: contemporary models quickly achieve outstanding performance on benchmark tasks but nonetheless fail on simple challenge examples and falter in real-world scenarios. With Dynabench, dataset creation, model development, and model assessment can directly inform each other, leading to more robust and informative benchmarks. We report on four initial NLP tasks, illustrating these concepts and highlighting the promise of the platform, and address potential objections to dynamic benchmarking as a new standard for the field.

UCL Discovery
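
The acceptance criterion described in the Dynabench abstract above (an example should fool the target model but not another person) can be written as a small check. This is only a sketch of that criterion, not code from the Dynabench platform. The function name `fools_model_not_humans`, the string labels, and the use of a strict majority over several validators are assumptions made for illustration. The abstract itself only mentions "another person".

```python
from typing import List


def fools_model_not_humans(
    intended_label: str, model_prediction: str, validator_labels: List[str]
) -> bool:
    """Return True if the model misclassifies the example but human validators do not.

    intended_label:   label the annotator intended for the example.
    model_prediction: label predicted by the target model.
    validator_labels: labels given by other people (assumption: majority vote).
    """
    model_is_wrong = model_prediction != intended_label

    # Assumption: humans "do not misclassify" when a strict majority of
    # validators agree with the annotator's intended label.
    agreeing = sum(1 for label in validator_labels if label == intended_label)
    humans_agree = len(validator_labels) > 0 and 2 * agreeing > len(validator_labels)

    return model_is_wrong and humans_agree


# The model is fooled and two of three validators confirm the label: keep it.
kept = fools_model_not_humans("entailment", "contradiction",
                              ["entailment", "entailment", "neutral"])

# The model is correct, so the example is not adversarial for it.
rejected = fools_model_not_humans("entailment", "entailment", ["entailment"])
```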