317 research outputs found
Self-Guided Contrastive Learning for BERT Sentence Representations
Although BERT and its variants have reshaped the NLP landscape, it still
remains unclear how best to derive sentence embeddings from such pre-trained
Transformers. In this work, we propose a contrastive learning method that
utilizes self-guidance for improving the quality of BERT sentence
representations. Our method fine-tunes BERT in a self-supervised fashion, does
not rely on data augmentation, and enables the usual [CLS] token embeddings to
function as sentence vectors. Moreover, we redesign the contrastive learning
objective (NT-Xent) and apply it to sentence representation learning. We
demonstrate with extensive experiments that our approach is more effective than
competitive baselines on diverse sentence-related tasks. We also show it is
efficient at inference and robust to domain shifts.Comment: ACL 202
Probing Out-of-Distribution Robustness of Language Models with Parameter-Efficient Transfer Learning
As the size of the pre-trained language model (PLM) continues to increase,
numerous parameter-efficient transfer learning methods have been proposed
recently to compensate for the tremendous cost of fine-tuning. Despite the
impressive results achieved by large pre-trained language models (PLMs) and
various parameter-efficient transfer learning (PETL) methods on sundry
benchmarks, it remains unclear if they can handle inputs that have been
distributionally shifted effectively. In this study, we systematically explore
how the ability to detect out-of-distribution (OOD) changes as the size of the
PLM grows or the transfer methods are altered. Specifically, we evaluated
various PETL techniques, including fine-tuning, Adapter, LoRA, and
prefix-tuning, on three different intention classification tasks, each
utilizing various language models with different scales.Comment: *SEM 202
- …