3,799 research outputs found

    Self-Guided Contrastive Learning for BERT Sentence Representations

    Although BERT and its variants have reshaped the NLP landscape, it remains unclear how best to derive sentence embeddings from such pre-trained Transformers. In this work, we propose a contrastive learning method that utilizes self-guidance for improving the quality of BERT sentence representations. Our method fine-tunes BERT in a self-supervised fashion, does not rely on data augmentation, and enables the usual [CLS] token embeddings to function as sentence vectors. Moreover, we redesign the contrastive learning objective (NT-Xent) and apply it to sentence representation learning. We demonstrate with extensive experiments that our approach is more effective than competitive baselines on diverse sentence-related tasks. We also show that it is efficient at inference and robust to domain shifts. (Comment: ACL 2021)
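    For context on the loss family mentioned above, the following is a minimal PyTorch sketch of the standard (SimCLR-style) NT-Xent objective applied to paired sentence embeddings; it is a generic illustration that assumes two "views" z1 and z2 of each sentence are available (e.g., [CLS] vectors from two encoder passes), and it is not the paper's redesigned variant.

        import torch
        import torch.nn.functional as F

        def nt_xent_loss(z1, z2, temperature=0.05):
            """Standard NT-Xent loss: each pair (z1[i], z2[i]) is a positive;
            all other in-batch embeddings serve as negatives."""
            batch_size = z1.size(0)
            z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # [2B, dim], unit norm
            sim = z @ z.t() / temperature                        # cosine similarities
            sim.fill_diagonal_(float("-inf"))                    # never pick yourself
            # The positive for index i sits at i + B (and vice versa).
            targets = torch.cat([torch.arange(batch_size) + batch_size,
                                 torch.arange(batch_size)])
            return F.cross_entropy(sim, targets.to(z.device))

    In a self-guided setup along these lines, z1 and z2 would simply be two embeddings of the same sentence produced by the encoder itself, which is why no external data augmentation is needed.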

    Computer use at work is associated with self-reported depressive and anxiety disorder

    Adjusted OR* of depressive and anxiety disorder (DAD) for the combined effect of computer use and occupational group, education, and job status. (DOC 61 kb)

    Prompt-Augmented Linear Probing: Scaling Beyond The Limit of Few-shot In-Context Learners

    Through in-context learning (ICL), large-scale language models are effective few-shot learners without additional model fine-tuning. However, ICL performance does not scale well with the number of available training samples, as it is limited by the inherent input-length constraint of the underlying language model. Meanwhile, many studies have revealed that language models are also powerful feature extractors, allowing them to be utilized in a black-box manner and enabling the linear probing paradigm, where lightweight discriminators are trained on top of pre-extracted input representations. This paper proposes prompt-augmented linear probing (PALP), a hybrid of linear probing and ICL that leverages the best of both worlds. PALP inherits the scalability of linear probing and the capability of steering language models toward more meaningful representations by tailoring the input into a more comprehensible form. Through in-depth investigations on various datasets, we verify that PALP significantly enhances the input representations, closing the gap between ICL in the data-hungry scenario and fine-tuning in the data-abundant scenario with little training overhead, potentially making PALP a strong alternative in black-box scenarios. (Comment: AAAI 2023)
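    The linear-probing half of PALP can be pictured with the hedged sketch below: a frozen language model encodes prompt-augmented inputs into fixed-size features, and a lightweight classifier is trained on top of them. The template string, the choice of bert-base-uncased as a stand-in encoder, and mean pooling are illustrative assumptions, not the paper's actual configuration.

        import torch
        from transformers import AutoTokenizer, AutoModel
        from sklearn.linear_model import LogisticRegression

        TEMPLATE = "Review: {text} Sentiment:"   # hypothetical prompt template

        tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
        model = AutoModel.from_pretrained("bert-base-uncased").eval()

        @torch.no_grad()
        def extract_features(texts):
            """Encode prompt-augmented inputs and pool one fixed-size vector each."""
            feats = []
            for text in texts:
                enc = tokenizer(TEMPLATE.format(text=text),
                                return_tensors="pt", truncation=True)
                hidden = model(**enc).last_hidden_state          # [1, seq, dim]
                feats.append(hidden.mean(dim=1).squeeze(0))      # mean-pool tokens
            return torch.stack(feats).numpy()

        # Linear probe: a lightweight discriminator on the frozen features.
        train_texts, train_labels = ["great movie", "terrible plot"], [1, 0]
        probe = LogisticRegression(max_iter=1000)
        probe.fit(extract_features(train_texts), train_labels)
        print(probe.predict(extract_features(["surprisingly good"])))

    Because only the probe's weights are trained, the number of labeled examples can grow without ever exceeding the language model's input-length limit, which is the scalability argument the abstract makes.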

    Continuous Decomposition of Granularity for Neural Paraphrase Generation

    While Transformers have had significant success in paraphrase generation, they treat sentences as linear sequences of tokens and often neglect their hierarchical information. Prior work has shown that decomposing the levels of granularity (e.g., word, phrase, or sentence) for input tokens yields substantial improvements, suggesting the possibility of enhancing Transformers via more fine-grained modeling of granularity. In this work, we propose a continuous decomposition of granularity for neural paraphrase generation (C-DNPG). To efficiently incorporate granularity into sentence encoding, C-DNPG introduces a granularity-aware attention (GA-Attention) mechanism that extends multi-head self-attention with: 1) a granularity head that automatically infers the hierarchical structure of a sentence by neurally estimating the granularity level of each input token; and 2) two novel attention masks, namely granularity resonance and granularity scope, to efficiently encode granularity into attention. Experiments on two benchmarks, Quora question pairs and Twitter URLs, show that C-DNPG outperforms baseline models by a remarkable margin and achieves state-of-the-art results on many metrics. Qualitative analysis reveals that C-DNPG indeed captures fine-grained levels of granularity effectively. (Comment: Accepted for publication at COLING 2022)
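    The abstract does not spell out the resonance and scope masks, so the sketch below only illustrates the overall wiring it describes: a small granularity head predicts a per-token level, and that level biases the self-attention logits. The class name, the single-head formulation, and the particular bias term are all invented for illustration.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class GranularityAwareAttention(nn.Module):
            """Schematic single-head variant: tokens with similar predicted
            granularity levels attend to each other more strongly."""

            def __init__(self, dim):
                super().__init__()
                self.q, self.k, self.v = (nn.Linear(dim, dim) for _ in range(3))
                self.granularity_head = nn.Linear(dim, 1)   # per-token level
                self.scale = dim ** -0.5

            def forward(self, x):                            # x: [batch, seq, dim]
                q, k, v = self.q(x), self.k(x), self.v(x)
                logits = q @ k.transpose(-2, -1) * self.scale        # [B, S, S]
                g = torch.sigmoid(self.granularity_head(x))          # [B, S, 1]
                level_gap = (g - g.transpose(-2, -1)).abs()          # [B, S, S]
                attn = F.softmax(logits - level_gap, dim=-1)         # penalize gaps
                return attn @ v, g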

    Evaluation of Left Atrial Volumes Using Multidetector Computed Tomography: Comparison with Echocardiography

    OBJECTIVE: To prospectively assess the relationship between two different measurement methods for evaluating left atrial (LA) volume using cardiac multidetector computed tomography (MDCT), and to compare the results between cardiac MDCT and echocardiography. MATERIALS AND METHODS: Thirty-five patients (20 men, 15 women; mean age, 60 years) underwent cardiac MDCT angiography for coronary artery disease. LA volumes were measured using two different methods: the two-dimensional (2D) length-based (LB) method, measured along three orthogonal planes of the LA, and the 3D volumetric threshold-based (VTB) method, based on threshold 3D segmentation of the LA. The results obtained by cardiac MDCT were compared with those obtained by echocardiography. RESULTS: The LA end-systolic and end-diastolic volumes (LAESV and LAEDV) measured by the 2D-LB method correlated well with those measured by the 3D-VTB method using cardiac MDCT (r = 0.763 and r = 0.786, respectively; p = 0.001). However, there was a significant difference in LAESV between the two measurement methods using cardiac MDCT (p < 0.05). The LAESV measured by cardiac MDCT correlated well with that measured by echocardiography (r = 0.864, p = 0.001), albeit with a significant difference (p < 0.01) in volumes; cardiac MDCT overestimated the LAESV by 22% compared with echocardiography. CONCLUSION: A significant correlation was found between the two different measurement methods for evaluating LA volumes by cardiac MDCT, and cardiac MDCT correlates well with echocardiography in evaluating LA volume. However, there are significant differences in LAESV between the two measurement methods using cardiac MDCT and between cardiac MDCT and echocardiography.
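    For the 2D length-based (LB) method, a common way to turn three orthogonal linear dimensions into a volume is the ellipsoid formula V = (pi/6) * D1 * D2 * D3; whether the study used exactly this formula is an assumption here, since the abstract only states that three orthogonal planes of the LA were measured. A toy computation:

        import math

        def la_volume_ellipsoid(d1_mm, d2_mm, d3_mm):
            """Length-based left atrial volume from three orthogonal diameters,
            returned in millilitres (1 mL = 1000 mm^3). Illustrative only; the
            study's exact 2D-LB formula is not given in the abstract."""
            return math.pi / 6 * d1_mm * d2_mm * d3_mm / 1000.0

        # Toy (non-study) dimensions: 60 x 45 x 40 mm gives roughly 56.5 mL.
        print(round(la_volume_ellipsoid(60, 45, 40), 1))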