74 research outputs found
Self-Guided Contrastive Learning for BERT Sentence Representations
Although BERT and its variants have reshaped the NLP landscape, it remains
unclear how best to derive sentence embeddings from such pre-trained
Transformers. In this work, we propose a contrastive learning method that
utilizes self-guidance for improving the quality of BERT sentence
representations. Our method fine-tunes BERT in a self-supervised fashion, does
not rely on data augmentation, and enables the usual [CLS] token embeddings to
function as sentence vectors. Moreover, we redesign the contrastive learning
objective (NT-Xent) and apply it to sentence representation learning. We
demonstrate with extensive experiments that our approach is more effective than
competitive baselines on diverse sentence-related tasks. We also show it is
efficient at inference and robust to domain shifts. Comment: ACL 202
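For reference, the NT-Xent objective mentioned above is, in its standard
SimCLR form, a temperature-scaled cross-entropy over positive pairs. The
paper redesigns this objective, so the PyTorch sketch below shows only the
standard baseline form, assuming paired sentence embeddings z1 and z2 of
shape (N, d):

```python
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Standard NT-Xent loss over N positive pairs (z1[i], z2[i]).

    This is the SimCLR formulation; the paper proposes a redesigned
    variant, so treat this as a baseline sketch, not the proposed loss.
    """
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2N, d), unit norm
    sim = z @ z.t() / tau                               # cosine similarity / temperature
    sim.fill_diagonal_(float("-inf"))                   # exclude self-similarity
    n = z1.size(0)
    # The positive for index i is i + n (and vice versa).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)
```

Casting the loss as a cross-entropy over the full similarity matrix keeps
the implementation to a few lines and avoids explicit bookkeeping of
positive and negative pairs.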
IT Investment Portfolio for Mobile Office
The adoption of the mobile office has been researched for many years, driven by rapid growth in the use of wireless communication and portable devices. Yet within the vast territory of IT investment portfolio studies for the mobile office, a large piece of terrain remains uncharted. The aim of this article is to empirically examine the IT investment portfolio framework, with an emphasis on the mobile office environment. Using the theoretical framework of the IT investment portfolio, our hypotheses concerned the effect of the IT investment portfolio on the performance of mobile business services, moderated by mobile savvy. To measure mobile office performance and mobile savvy, we conducted a survey with a total of 127 participants.
Probing Out-of-Distribution Robustness of Language Models with Parameter-Efficient Transfer Learning
As the size of pre-trained language models (PLMs) continues to increase,
numerous parameter-efficient transfer learning (PETL) methods have been
proposed to compensate for the tremendous cost of fine-tuning. Despite the
impressive results achieved by large PLMs and various PETL methods on
sundry benchmarks, it remains unclear whether they can effectively handle
distributionally shifted inputs. In this study, we systematically explore
how out-of-distribution (OOD) detection ability changes as the size of the
PLM grows or the transfer method is altered. Specifically, we evaluated
full fine-tuning and various PETL techniques, including Adapter, LoRA, and
prefix-tuning, on three different intent classification tasks, each using
language models of various scales. Comment: *SEM 202
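As an illustration of how such an evaluation can score OOD detection, a
common baseline (an assumption here; the paper's exact detector may differ)
is the maximum softmax probability (MSP) over the intent classes. The score
applies unchanged regardless of whether the classifier was adapted with
full fine-tuning, Adapter, LoRA, or prefix-tuning, since only the output
logits are needed:

```python
import torch
import torch.nn.functional as F

def msp_ood_score(logits: torch.Tensor) -> torch.Tensor:
    """MSP-based OOD score: higher means more likely out-of-distribution.

    `logits` has shape (batch, num_intents). In-distribution inputs tend
    to yield confident predictions, so 1 - max softmax probability serves
    as a simple anomaly score that can be thresholded or ranked (e.g.,
    for AUROC) against held-out OOD inputs.
    """
    probs = F.softmax(logits, dim=-1)
    return 1.0 - probs.max(dim=-1).values
```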
Prompt-Augmented Linear Probing: Scaling Beyond The Limit of Few-shot In-Context Learners
Through in-context learning (ICL), large-scale language models are effective
few-shot learners without additional model fine-tuning. However, ICL
performance does not scale well with the number of available training
samples, as it is limited by the inherent input-length constraint of the
underlying language model. Meanwhile, many studies have revealed that
language models are also powerful feature extractors, allowing them to be
utilized in a black-box manner and enabling the linear probing paradigm, in
which lightweight discriminators are trained on top of pre-extracted input
representations. This paper proposes prompt-augmented linear probing (PALP),
a hybrid of linear probing and ICL that leverages the best of both worlds.
PALP inherits the scalability of linear probing and, from ICL, the
capability of encouraging language models to derive more meaningful
representations by tailoring inputs into a more comprehensible form. Through
in-depth investigations on various datasets, we verify that PALP
significantly enhances the input representations, closing the gap between
ICL in the data-hungry scenario and fine-tuning in the data-abundant
scenario with little training overhead, potentially making PALP a strong
alternative in black-box scenarios. Comment: AAAI 202
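A minimal sketch of the PALP recipe as described: wrap each input in a
prompt template, extract a representation from a frozen language model, and
train a lightweight linear discriminator on top. The backbone, template,
and mean pooling below are illustrative assumptions rather than the paper's
exact configuration:

```python
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tok = AutoTokenizer.from_pretrained("gpt2")      # illustrative backbone choice
lm = AutoModel.from_pretrained("gpt2").eval()
tok.pad_token = tok.eos_token                    # GPT-2 has no pad token by default

def embed(texts, template="Review: {x} Sentiment:"):
    """Prompt-augment each input, then mean-pool the LM's last hidden states."""
    batch = tok([template.format(x=t) for t in texts],
                return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        h = lm(**batch).last_hidden_state        # (batch, seq, dim)
    mask = batch["attention_mask"].unsqueeze(-1) # exclude padding from the pool
    return ((h * mask).sum(1) / mask.sum(1)).numpy()

# Linear probing on the prompt-augmented representations (toy data).
train_texts, train_labels = ["great movie", "terrible plot"], [1, 0]
clf = LogisticRegression(max_iter=1000).fit(embed(train_texts), train_labels)
print(clf.predict(embed(["a wonderful film"])))
```

Because the language model stays frozen and only the linear head is
trained, the approach keeps the black-box usage pattern of linear probing
while the prompt supplies the task framing that ICL would otherwise
provide.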
Universal Domain Adaptation for Robust Handling of Distributional Shifts in NLP
When deploying machine learning systems in the wild, it is highly desirable
for them to effectively leverage prior knowledge in unfamiliar domains while
also raising alarms on anomalous inputs. To address these requirements,
Universal Domain Adaptation (UniDA) has emerged as a novel research area in
computer vision, focusing on achieving both adaptation ability and
robustness (i.e., the ability to detect out-of-distribution samples). While
UniDA has led to significant progress in computer vision, its application to
language input remains largely unexplored despite its feasibility. In this
paper, we propose a comprehensive benchmark for natural language that offers
thorough viewpoints on a model's generalizability and robustness. Our
benchmark encompasses multiple datasets with varying difficulty levels and
characteristics, including temporal shifts and diverse domains. On top of
our testbed, we validate existing UniDA methods from computer vision and
state-of-the-art domain adaptation techniques from the NLP literature,
yielding valuable findings: UniDA methods originally designed for image
input can be effectively transferred to the natural language domain, while
the results also underscore the effect of adaptation difficulty in
determining a model's performance. Comment: Findings of EMNLP 202
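As a sketch of the decision rule UniDA methods generally share, predictions
fall back to an "unknown" label when the model is insufficiently confident.
The entropy-based criterion and threshold below are generic assumptions,
since each benchmarked method defines its own rejection score:

```python
import torch
import torch.nn.functional as F

def unida_predict(logits: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """Predict a known class, or -1 ("unknown"/OOD) when normalized entropy
    exceeds `threshold`. `logits` has shape (batch, num_known_classes)."""
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)
    entropy = entropy / torch.log(torch.tensor(float(logits.size(-1))))  # scale to [0, 1]
    preds = probs.argmax(dim=-1)
    preds[entropy > threshold] = -1  # reject low-confidence inputs as unknown
    return preds
```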