3,799 research outputs found
Self-Guided Contrastive Learning for BERT Sentence Representations
Although BERT and its variants have reshaped the NLP landscape, it remains
unclear how best to derive sentence embeddings from such pre-trained
Transformers. In this work, we propose a contrastive learning method that
utilizes self-guidance for improving the quality of BERT sentence
representations. Our method fine-tunes BERT in a self-supervised fashion, does
not rely on data augmentation, and enables the usual [CLS] token embeddings to
function as sentence vectors. Moreover, we redesign the contrastive learning
objective (NT-Xent) and apply it to sentence representation learning. We
demonstrate with extensive experiments that our approach is more effective than
competitive baselines on diverse sentence-related tasks. We also show it is
efficient at inference and robust to domain shifts. Comment: ACL 202
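The abstract mentions redesigning the NT-Xent contrastive objective. As background, here is a minimal NumPy sketch of the *standard* NT-Xent loss (not the paper's modified variant): each embedding's positive pair is contrasted against every other embedding in the batch.

```python
import numpy as np

def nt_xent_loss(z, tau=0.5):
    """Standard NT-Xent loss over a batch of 2N embeddings,
    where rows 2k and 2k+1 are two views of example k."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine sim via L2 norm
    sim = z @ z.T / tau                               # temperature-scaled similarities
    np.fill_diagonal(sim, -np.inf)                    # exclude self-similarity
    n = z.shape[0]
    pos = np.arange(n) ^ 1                            # partner index of each row
    log_denom = np.log(np.exp(sim).sum(axis=1))
    return float(-(sim[np.arange(n), pos] - log_denom).mean())
```

The loss is the mean negative log-probability of picking the positive partner under a softmax over all non-self similarities; identical positive pairs drive it toward its lower bound.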
Computer use at work is associated with self-reported depressive and anxiety disorder
Adjusted OR* of DAD considering the combined effect of computer use and occupational group, education, and job status.
Prompt-Augmented Linear Probing: Scaling Beyond The Limit of Few-shot In-Context Learners
Through in-context learning (ICL), large-scale language models are effective
few-shot learners without additional model fine-tuning. However, the ICL
performance does not scale well with the number of available training samples
as it is limited by the inherent input length constraint of the underlying
language model. Meanwhile, many studies have revealed that language models are
also powerful feature extractors, allowing them to be utilized in a black-box
manner and enabling the linear probing paradigm, where lightweight
discriminators are trained on top of the pre-extracted input representations.
This paper proposes prompt-augmented linear probing (PALP), a hybrid of linear
probing and ICL that leverages the best of both worlds. PALP inherits the
scalability of linear probing and the ability of prompting to steer language
models toward more meaningful representations by tailoring the input into a
more natural form. Through in-depth investigations on various datasets, we
verify that PALP significantly enhances the input representations, closing the
gap between ICL in the data-hungry scenario and fine-tuning in the
data-abundant scenario with little training overhead, potentially making PALP
a strong alternative in black-box scenarios. Comment: AAAI 202
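The core recipe the abstract describes can be sketched in a few lines: wrap each input in a prompt, extract a frozen feature vector from a black-box model, and train a lightweight linear discriminator on top. Everything below is a simplified illustration; the `PROMPT` template and the `encode` callback are hypothetical stand-ins, not the paper's actual templates or model.

```python
import numpy as np

# Hypothetical task template: PALP's key idea is to wrap each input in a
# prompt *before* extracting frozen features from the language model.
PROMPT = "Review: {}\nSentiment:"

def extract_features(texts, encode):
    # `encode` stands in for a frozen black-box LM returning one vector per input
    return np.stack([encode(PROMPT.format(t)) for t in texts])

def train_linear_probe(X, y, lr=0.5, steps=1000):
    # Plain logistic-regression probe on frozen features (y in {0, 1})
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        logits = np.clip(X @ w + b, -30, 30)   # clip to avoid exp overflow
        p = 1.0 / (1.0 + np.exp(-logits))
        g = p - y                               # gradient of log loss w.r.t. logits
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def predict(X, w, b):
    return (X @ w + b > 0).astype(int)
```

Because only the probe is trained, the number of labeled examples is limited by the discriminator, not by the model's context window, which is the scalability argument the abstract makes.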
Continuous Decomposition of Granularity for Neural Paraphrase Generation
While Transformers have had significant success in paragraph generation, they
treat sentences as linear sequences of tokens and often neglect their
hierarchical information. Prior work has shown that decomposing the levels of
granularity (e.g., word, phrase, or sentence) for input tokens has produced
substantial improvements, suggesting the possibility of enhancing Transformers
via more fine-grained modeling of granularity. In this work, we propose a
continuous decomposition of granularity for neural paraphrase generation
(C-DNPG). In order to efficiently incorporate granularity into sentence
encoding, C-DNPG introduces a granularity-aware attention (GA-Attention)
mechanism which extends the multi-head self-attention with: 1) a granularity
head that automatically infers the hierarchical structure of a sentence by
neurally estimating the granularity level of each input token; and 2) two novel
attention masks, namely, granularity resonance and granularity scope, to
efficiently encode granularity into attention. Experiments on two benchmarks,
Quora question pairs and Twitter URLs, show that C-DNPG outperforms baseline
models by a remarkable margin and achieves state-of-the-art results on many
metrics. Qualitative analysis reveals that C-DNPG indeed captures fine-grained
levels of granularity effectively. Comment: Accepted to be published in COLING 202
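The two ingredients the abstract names, a granularity head that scores each token and masks that modulate attention, can be illustrated schematically. The sketch below uses a sigmoid head and a simple "similar granularity attends more" soft mask; this is a simplification for illustration, not the paper's exact resonance/scope formulation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def granularity_aware_attention(X, Wq, Wk, Wv, wg):
    """Schematic single-head GA-Attention over token embeddings X (n, d).
    wg parameterizes a sigmoid 'granularity head' giving each token a
    level in (0, 1); the soft mask below favors attention between tokens
    of similar granularity (a stand-in for resonance/scope masks)."""
    g = 1.0 / (1.0 + np.exp(-(X @ wg)))                 # granularity per token
    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(Wq.shape[1])
    mask = 1.0 - np.abs(g[:, None] - g[None, :])        # close levels -> near 1
    attn = softmax(scores + np.log(mask + 1e-9))        # mask folded into logits
    return attn @ (X @ Wv), g, attn
```

Adding the log-mask to the logits keeps the operation differentiable, so the granularity head can be trained end-to-end with the rest of the network.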
Evaluation of Left Atrial Volumes Using Multidetector Computed Tomography: Comparison with Echocardiography
OBJECTIVE: To prospectively assess the relationship between two different measurement methods for evaluating left atrial (LA) volume using cardiac multidetector computed tomography (MDCT), and to compare the results between cardiac MDCT and echocardiography.

MATERIALS AND METHODS: Thirty-five patients (20 men, 15 women; mean age, 60 years) underwent cardiac MDCT angiography for coronary artery disease. The LA volumes were measured using two different methods: the two-dimensional (2D) length-based (LB) method, measured along the three orthogonal planes of the LA, and the 3D volumetric threshold-based (VTB) method, measured by threshold-based 3D segmentation of the LA. The results obtained by cardiac MDCT were compared with those obtained by echocardiography.

RESULTS: The LA end-systolic and end-diastolic volumes (LAESV and LAEDV) measured by the 2D-LB method correlated well with those measured by the 3D-VTB method using cardiac MDCT (r = 0.763, r = 0.786, p = 0.001). However, there was a significant difference in the LAESVs between the two measurement methods using cardiac MDCT (p < 0.05). The LAESV measured by cardiac MDCT correlated well with measurements by echocardiography (r = 0.864, p = 0.001), though with a significant difference (p < 0.01) in their volumes. Cardiac MDCT overestimated the LAESV by 22% compared to measurements by echocardiography.

CONCLUSION: A significant correlation was found between the two different measurement methods for evaluating LA volumes by cardiac MDCT. Further, cardiac MDCT correlates well with echocardiography in evaluating the LA volume. However, there are significant differences in the LAESV between the two measurement methods using cardiac MDCT and between cardiac MDCT and echocardiography.
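For context on the two quantities compared above: 2D length-based volumes are commonly computed with the prolate-ellipsoid approximation from three orthogonal diameters, and the reported 22% figure is a relative overestimation. The sketch below assumes that common convention; the study's exact measurement protocol may differ.

```python
import math

def la_volume_ellipsoid(d1_cm, d2_cm, d3_cm):
    # Prolate-ellipsoid approximation from three orthogonal LA diameters (cm);
    # a common convention for 2D length-based volumes (assumption: the study's
    # 2D-LB method follows it). Result is in mL, since 1 cm^3 = 1 mL.
    return math.pi / 6.0 * d1_cm * d2_cm * d3_cm

def overestimation_pct(mdct_ml, echo_ml):
    # Relative difference of MDCT vs echocardiography, in percent
    # (the form of the reported ~22% overestimation).
    return 100.0 * (mdct_ml - echo_ml) / echo_ml
```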