3,799 research outputs found
Self-Guided Contrastive Learning for BERT Sentence Representations
Although BERT and its variants have reshaped the NLP landscape, it remains
unclear how best to derive sentence embeddings from such pre-trained
Transformers. In this work, we propose a contrastive learning method that
utilizes self-guidance for improving the quality of BERT sentence
representations. Our method fine-tunes BERT in a self-supervised fashion, does
not rely on data augmentation, and enables the usual [CLS] token embeddings to
function as sentence vectors. Moreover, we redesign the contrastive learning
objective (NT-Xent) and apply it to sentence representation learning. We
demonstrate with extensive experiments that our approach is more effective than
competitive baselines on diverse sentence-related tasks. We also show it is
efficient at inference and robust to domain shifts. Comment: ACL 202
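The abstract mentions redesigning the NT-Xent contrastive objective. As background, here is a minimal NumPy sketch of the *standard* NT-Xent loss (not the paper's modified variant): each embedding's positive pair is contrasted against every other embedding in the batch.

```python
import numpy as np

def nt_xent_loss(z, tau=0.5):
    """Standard NT-Xent loss over a batch of 2N embeddings,
    where rows 2k and 2k+1 are two views of example k."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine sim via L2 norm
    sim = z @ z.T / tau                               # temperature-scaled similarities
    np.fill_diagonal(sim, -np.inf)                    # exclude self-similarity
    n = z.shape[0]
    pos = np.arange(n) ^ 1                            # partner index of each row
    log_denom = np.log(np.exp(sim).sum(axis=1))
    return float(-(sim[np.arange(n), pos] - log_denom).mean())
```

The loss is the mean negative log-probability of picking the positive partner under a softmax over all non-self similarities; identical positive pairs drive it toward its lower bound.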
Computer use at work is associated with self-reported depressive and anxiety disorder
Adjusted OR* of DAD considering the combined effect of computer use and occupational group, education, and job status.
Prompt-Augmented Linear Probing: Scaling Beyond The Limit of Few-shot In-Context Learners
Through in-context learning (ICL), large-scale language models are effective
few-shot learners without additional model fine-tuning. However, the ICL
performance does not scale well with the number of available training samples
as it is limited by the inherent input length constraint of the underlying
language model. Meanwhile, many studies have revealed that language models are
also powerful feature extractors, allowing them to be utilized in a black-box
manner and enabling the linear probing paradigm, where lightweight
discriminators are trained on top of the pre-extracted input representations.
This paper proposes prompt-augmented linear probing (PALP), a hybrid of linear
probing and ICL that leverages the best of both worlds. PALP inherits the
scalability of linear probing and the ability of prompting to steer language
models toward more meaningful representations by tailoring the input into a
more natural form. Through in-depth investigations on various datasets, we
verify that PALP significantly enhances the input representations, closing the
gap between ICL in the data-hungry scenario and fine-tuning in the
data-abundant scenario with little training overhead, potentially making PALP
a strong alternative in black-box scenarios. Comment: AAAI 202
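The core recipe the abstract describes can be sketched in a few lines: wrap each input in a prompt, extract a frozen feature vector from a black-box model, and train a lightweight linear discriminator on top. Everything below is a simplified illustration; the `PROMPT` template and the `encode` callback are hypothetical stand-ins, not the paper's actual templates or model.

```python
import numpy as np

# Hypothetical task template: PALP's key idea is to wrap each input in a
# prompt *before* extracting frozen features from the language model.
PROMPT = "Review: {}\nSentiment:"

def extract_features(texts, encode):
    # `encode` stands in for a frozen black-box LM returning one vector per input
    return np.stack([encode(PROMPT.format(t)) for t in texts])

def train_linear_probe(X, y, lr=0.5, steps=1000):
    # Plain logistic-regression probe on frozen features (y in {0, 1})
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        logits = np.clip(X @ w + b, -30, 30)   # clip to avoid exp overflow
        p = 1.0 / (1.0 + np.exp(-logits))
        g = p - y                               # gradient of log loss w.r.t. logits
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def predict(X, w, b):
    return (X @ w + b > 0).astype(int)
```

Because only the probe is trained, the number of labeled examples is limited by the discriminator, not by the model's context window, which is the scalability argument the abstract makes.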
Continuous Decomposition of Granularity for Neural Paraphrase Generation
While Transformers have had significant success in paragraph generation, they
treat sentences as linear sequences of tokens and often neglect their
hierarchical information. Prior work has shown that decomposing the levels of
granularity (e.g., word, phrase, or sentence) for input tokens has produced
substantial improvements, suggesting the possibility of enhancing Transformers
via more fine-grained modeling of granularity. In this work, we propose a
continuous decomposition of granularity for neural paraphrase generation
(C-DNPG). In order to efficiently incorporate granularity into sentence
encoding, C-DNPG introduces a granularity-aware attention (GA-Attention)
mechanism which extends the multi-head self-attention with: 1) a granularity
head that automatically infers the hierarchical structure of a sentence by
neurally estimating the granularity level of each input token; and 2) two novel
attention masks, namely, granularity resonance and granularity scope, to
efficiently encode granularity into attention. Experiments on two benchmarks,
Quora question pairs and Twitter URLs, show that C-DNPG outperforms baseline
models by a remarkable margin and achieves state-of-the-art results on many
metrics. Qualitative analysis reveals that C-DNPG indeed captures fine-grained
levels of granularity effectively. Comment: Accepted to be published in COLING 202
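The two ingredients the abstract names, a granularity head that scores each token and masks that modulate attention, can be illustrated schematically. The sketch below uses a sigmoid head and a simple "similar granularity attends more" soft mask; this is a simplification for illustration, not the paper's exact resonance/scope formulation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def granularity_aware_attention(X, Wq, Wk, Wv, wg):
    """Schematic single-head GA-Attention over token embeddings X (n, d).
    wg parameterizes a sigmoid 'granularity head' giving each token a
    level in (0, 1); the soft mask below favors attention between tokens
    of similar granularity (a stand-in for resonance/scope masks)."""
    g = 1.0 / (1.0 + np.exp(-(X @ wg)))                 # granularity per token
    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(Wq.shape[1])
    mask = 1.0 - np.abs(g[:, None] - g[None, :])        # close levels -> near 1
    attn = softmax(scores + np.log(mask + 1e-9))        # mask folded into logits
    return attn @ (X @ Wv), g, attn
```

Adding the log-mask to the logits keeps the operation differentiable, so the granularity head can be trained end-to-end with the rest of the network.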
Evaluation of Left Atrial Volumes Using Multidetector Computed Tomography: Comparison with Echocardiography
OBJECTIVE: To prospectively assess the relationship between two different measurement methods for evaluating left atrial (LA) volume using cardiac multidetector computed tomography (MDCT), and to compare the results between cardiac MDCT and echocardiography.

MATERIALS AND METHODS: Thirty-five patients (20 men, 15 women; mean age, 60 years) underwent cardiac MDCT angiography for coronary artery disease. The LA volumes were measured using two different methods: the two-dimensional (2D) length-based (LB) method, measured along the three orthogonal planes of the LA, and the 3D volumetric threshold-based (VTB) method, measured by threshold-based 3D segmentation of the LA. The results obtained by cardiac MDCT were compared with those obtained by echocardiography.

RESULTS: The LA end-systolic and end-diastolic volumes (LAESV and LAEDV) measured by the 2D-LB method correlated well with those measured by the 3D-VTB method using cardiac MDCT (r = 0.763, r = 0.786, p = 0.001). However, there was a significant difference in the LAESVs between the two measurement methods using cardiac MDCT (p < 0.05). The LAESV measured by cardiac MDCT correlated well with measurements by echocardiography (r = 0.864, p = 0.001), though with a significant difference (p < 0.01) in their volumes. Cardiac MDCT overestimated the LAESV by 22% compared to measurements by echocardiography.

CONCLUSION: A significant correlation was found between the two different measurement methods for evaluating LA volumes by cardiac MDCT. Further, cardiac MDCT correlates well with echocardiography in evaluating the LA volume. However, there are significant differences in the LAESV between the two measurement methods using cardiac MDCT and between cardiac MDCT and echocardiography.
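For context on the two quantities compared above: 2D length-based volumes are commonly computed with the prolate-ellipsoid approximation from three orthogonal diameters, and the reported 22% figure is a relative overestimation. The sketch below assumes that common convention; the study's exact measurement protocol may differ.

```python
import math

def la_volume_ellipsoid(d1_cm, d2_cm, d3_cm):
    # Prolate-ellipsoid approximation from three orthogonal LA diameters (cm);
    # a common convention for 2D length-based volumes (assumption: the study's
    # 2D-LB method follows it). Result is in mL, since 1 cm^3 = 1 mL.
    return math.pi / 6.0 * d1_cm * d2_cm * d3_cm

def overestimation_pct(mdct_ml, echo_ml):
    # Relative difference of MDCT vs echocardiography, in percent
    # (the form of the reported ~22% overestimation).
    return 100.0 * (mdct_ml - echo_ml) / echo_ml
```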