4,315 research outputs found

    Data Augmentation for Spoken Language Understanding via Joint Variational Generation

    Full text link
    Data scarcity is one of the main obstacles to domain adaptation in spoken language understanding (SLU) due to the high cost of creating manually tagged SLU datasets. Recent work on neural text generative models, particularly latent variable models such as the variational autoencoder (VAE), has shown promising results in generating plausible and natural sentences. In this paper, we propose a novel generative architecture which leverages the generative power of latent variable models to jointly synthesize fully annotated utterances. Our experiments show that existing SLU models trained on the additional synthetic examples achieve performance gains. Our approach not only helps alleviate the data scarcity issue in the SLU task for many datasets but also consistently improves language understanding performance across various SLU models, supported by extensive experiments and rigorous statistical testing. Comment: 8 pages, 3 figures, 4 tables, accepted at AAAI 2019
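
    As a loose illustration of the joint-generation idea described above (not the paper's actual architecture), the following PyTorch sketch shows a sequence VAE whose decoder reads a shared latent code and emits both word logits and slot-tag logits, so a sampled latent can be decoded into a fully annotated utterance. All layer sizes and the model layout are assumptions made for the example.

```python
# Minimal sketch: a sequence VAE that jointly reconstructs tokens and slot tags
# from one shared latent code. Hyperparameters and layout are illustrative only.
import torch
import torch.nn as nn

class JointVAE(nn.Module):
    def __init__(self, vocab_size=1000, n_tags=20, emb=64, hid=128, z_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.to_mu = nn.Linear(hid, z_dim)
        self.to_logvar = nn.Linear(hid, z_dim)
        self.decoder = nn.GRU(emb + z_dim, hid, batch_first=True)
        self.word_head = nn.Linear(hid, vocab_size)   # reconstructs utterance tokens
        self.tag_head = nn.Linear(hid, n_tags)        # reconstructs slot-tag sequence

    def forward(self, tokens):
        x = self.embed(tokens)                                  # (B, T, emb)
        _, h = self.encoder(x)                                  # (1, B, hid)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()    # reparameterization trick
        z_seq = z.unsqueeze(1).expand(-1, x.size(1), -1)
        out, _ = self.decoder(torch.cat([x, z_seq], dim=-1))
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return self.word_head(out), self.tag_head(out), kl

# Hypothetical usage: cross-entropy on both heads plus the KL term gives the ELBO.
word_logits, tag_logits, kl = JointVAE()(torch.randint(0, 1000, (4, 12)))
```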

    Learning to Compose Task-Specific Tree Structures

    Full text link
    For years, recursive neural networks (RvNNs) have been shown to be suitable for encoding text into fixed-length vectors and have achieved good performance on several natural language processing tasks. However, the main drawback of RvNNs is that they require structured input, which makes data preparation and model implementation difficult. In this paper, we propose Gumbel Tree-LSTM, a novel tree-structured long short-term memory architecture that efficiently learns how to compose task-specific tree structures from plain text data alone. Our model uses the Straight-Through Gumbel-Softmax estimator to dynamically decide the parent node among candidates and to calculate gradients of the discrete decision. We evaluate the proposed model on natural language inference and sentiment analysis, and show that our model outperforms or is at least comparable to previous models. We also find that our model converges significantly faster than other models. Comment: AAAI 2018
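
    A small sketch of the discrete-selection trick the abstract refers to: the Straight-Through Gumbel-Softmax estimator samples a hard one-hot choice of parent node in the forward pass while letting gradients flow through the soft relaxation. This shows only the selection step under assumed shapes, not the full Gumbel Tree-LSTM.

```python
# Toy sketch of Straight-Through Gumbel-Softmax parent selection: given scores
# for each candidate parent (e.g., each adjacent pair of child nodes), sample a
# hard one-hot choice while keeping soft gradients for backpropagation.
import torch
import torch.nn.functional as F

def select_parent(candidates, scorer, tau=1.0):
    """candidates: (num_pairs, dim) composed parent representations."""
    logits = scorer(candidates).squeeze(-1)                 # (num_pairs,)
    choice = F.gumbel_softmax(logits, tau=tau, hard=True)   # straight-through one-hot
    return choice @ candidates                              # selected parent, differentiable

# Hypothetical usage with random candidate representations:
dim = 16
scorer = torch.nn.Linear(dim, 1)
pairs = torch.randn(5, dim, requires_grad=True)   # 5 candidate parents
parent = select_parent(pairs, scorer)
parent.sum().backward()                           # gradients flow through the soft weights
```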

    Self-Guided Contrastive Learning for BERT Sentence Representations

    Full text link
    Although BERT and its variants have reshaped the NLP landscape, it remains unclear how best to derive sentence embeddings from such pre-trained Transformers. In this work, we propose a contrastive learning method that utilizes self-guidance to improve the quality of BERT sentence representations. Our method fine-tunes BERT in a self-supervised fashion, does not rely on data augmentation, and enables the usual [CLS] token embeddings to function as sentence vectors. Moreover, we redesign the contrastive learning objective (NT-Xent) and apply it to sentence representation learning. We demonstrate with extensive experiments that our approach is more effective than competitive baselines on diverse sentence-related tasks. We also show it is efficient at inference and robust to domain shifts. Comment: ACL 2021
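
    For reference, here is the standard NT-Xent objective the abstract builds on, in minimal PyTorch form: each sentence's two views are positives and all other in-batch pairs serve as negatives. The paper redesigns this objective; the sketch below is only the generic SimCLR-style version with assumed batch and embedding sizes.

```python
# Minimal NT-Xent (normalized temperature-scaled cross-entropy) sketch for
# sentence embeddings: positives sit on the diagonal of the similarity matrix.
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.05):
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    sim = z1 @ z2.t() / temperature            # (B, B) scaled cosine similarities
    targets = torch.arange(z1.size(0))         # each row's positive is its own index
    return F.cross_entropy(sim, targets)

# Hypothetical usage: two views of the same batch of sentences
# (e.g., embeddings taken from different layers or passes).
z1, z2 = torch.randn(8, 768), torch.randn(8, 768)
loss = nt_xent(z1, z2)
```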

    Computer use at work is associated with self-reported depressive and anxiety disorder

    Get PDF
    Adjusted OR* of depressive and anxiety disorder (DAD), considering the combined effect of computer use and occupational group, education, and job status. (DOC 61 kb)

    Prompt-Augmented Linear Probing: Scaling Beyond The Limit of Few-shot In-Context Learners

    Full text link
    Through in-context learning (ICL), large-scale language models are effective few-shot learners without additional model fine-tuning. However, ICL performance does not scale well with the number of available training samples, as it is limited by the inherent input-length constraint of the underlying language model. Meanwhile, many studies have revealed that language models are also powerful feature extractors, allowing them to be utilized in a black-box manner and enabling the linear probing paradigm, where lightweight discriminators are trained on top of the pre-extracted input representations. This paper proposes prompt-augmented linear probing (PALP), a hybrid of linear probing and ICL that leverages the best of both worlds. PALP inherits the scalability of linear probing and ICL's ability to steer language models toward more meaningful representations by tailoring the input into a more comprehensible form. Through in-depth investigations on various datasets, we verify that PALP significantly enhances the input representations, closing the gap between ICL in the data-hungry scenario and fine-tuning in the data-abundant scenario with little training overhead, potentially making PALP a strong alternative in black-box scenarios. Comment: AAAI 2023
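
    A rough sketch of the linear-probing half of PALP, under assumed pieces: a hand-written prompt template (hypothetical here), a frozen language model used as a black-box feature extractor (stubbed out with random features so the snippet runs), and a lightweight linear classifier trained on the cached representations. None of these stand-ins are the paper's exact prompts or extractor.

```python
# Sketch: prompt-augment the raw inputs, encode them once with a frozen LM,
# then train a lightweight linear probe on the cached features.
import numpy as np
from sklearn.linear_model import LogisticRegression

TEMPLATE = "Review: {text}\nSentiment (positive or negative):"   # hypothetical prompt

def encode(texts):
    # Placeholder for a frozen LM used as a black-box feature extractor
    # (e.g., mean-pooled hidden states). Random features keep the sketch runnable.
    rng = np.random.default_rng(0)
    return rng.standard_normal((len(texts), 768))

train_texts = ["great movie", "terrible plot"]
train_labels = [1, 0]
features = encode([TEMPLATE.format(text=t) for t in train_texts])  # prompt-augmented inputs
probe = LogisticRegression(max_iter=1000).fit(features, train_labels)
preds = probe.predict(encode([TEMPLATE.format(text="loved it")]))
```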

    Continuous Decomposition of Granularity for Neural Paraphrase Generation

    Full text link
    While Transformers have had significant success in paraphrase generation, they treat sentences as linear sequences of tokens and often neglect their hierarchical information. Prior work has shown that decomposing the levels of granularity (e.g., word, phrase, or sentence) of input tokens yields substantial improvements, suggesting the possibility of enhancing Transformers via more fine-grained modeling of granularity. In this work, we propose a continuous decomposition of granularity for neural paraphrase generation (C-DNPG). To efficiently incorporate granularity into sentence encoding, C-DNPG introduces a granularity-aware attention (GA-Attention) mechanism which extends multi-head self-attention with: 1) a granularity head that automatically infers the hierarchical structure of a sentence by neurally estimating the granularity level of each input token; and 2) two novel attention masks, namely granularity resonance and granularity scope, to efficiently encode granularity into attention. Experiments on two benchmarks, Quora question pairs and Twitter URLs, show that C-DNPG outperforms baseline models by a remarkable margin and achieves state-of-the-art results on many metrics. Qualitative analysis reveals that C-DNPG indeed captures fine-grained levels of granularity effectively. Comment: Accepted to COLING 2022
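
    As a loose sketch of injecting a learned granularity signal into self-attention: a small head predicts a scalar granularity level per token, and attention weights are rescaled by how close the query and key granularities are. The paper's actual resonance and scope masks are more involved; everything below (single head, mask form, sizes) is an assumption for illustration only.

```python
# Loose sketch of granularity-aware attention: a per-token granularity scalar
# modulates standard scaled dot-product attention. Not the paper's exact masks.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GranularityAttention(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.gran_head = nn.Linear(dim, 1)   # estimates granularity in (0, 1) per token
        self.scale = dim ** -0.5

    def forward(self, x):                                   # x: (B, T, dim)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        g = torch.sigmoid(self.gran_head(x))                # (B, T, 1)
        scores = q @ k.transpose(-2, -1) * self.scale       # (B, T, T)
        # Soft mask: token pairs with similar granularity attend to each other more.
        mask = 1.0 - (g - g.transpose(-2, -1)).abs()        # (B, T, T)
        attn = F.softmax(scores, dim=-1) * mask
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-9)
        return attn @ v

out = GranularityAttention()(torch.randn(2, 10, 64))        # hypothetical usage
```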

    Evaluation of Left Atrial Volumes Using Multidetector Computed Tomography: Comparison with Echocardiography

    Get PDF
    OBJECTIVE: To prospectively assess the relationship between two different measurement methods for evaluating left atrial (LA) volume using cardiac multidetector computed tomography (MDCT), and to compare the results between cardiac MDCT and echocardiography. MATERIALS AND METHODS: Thirty-five patients (20 men, 15 women; mean age, 60 years) underwent cardiac MDCT angiography for coronary artery disease. The LA volumes were measured using two different methods: the two-dimensional (2D) length-based (LB) method, measured along three orthogonal planes of the LA, and the 3D volumetric threshold-based (VTB) method, based on threshold 3D segmentation of the LA. The results obtained by cardiac MDCT were compared with those obtained by echocardiography. RESULTS: The LA end-systolic and end-diastolic volumes (LAESV and LAEDV) measured by the 2D-LB method correlated well with those measured by the 3D-VTB method using cardiac MDCT (r = 0.763 and r = 0.786, respectively; p = 0.001). However, there was a significant difference in the LAESVs between the two measurement methods using cardiac MDCT (p < 0.05). The LAESV measured by cardiac MDCT correlated well with measurements by echocardiography (r = 0.864, p = 0.001), albeit with a significant difference (p < 0.01) in the measured volumes: cardiac MDCT overestimated the LAESV by 22% compared to echocardiography. CONCLUSION: A significant correlation was found between the two different measurement methods for evaluating LA volumes by cardiac MDCT. Further, cardiac MDCT correlates well with echocardiography in evaluating LA volume. However, there are significant differences in the LAESV between the two measurement methods using cardiac MDCT and between cardiac MDCT and echocardiography.
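
    For a concrete sense of the length-based measurement and the reported overestimation, the snippet below applies the common prolate-ellipsoid formula (V = π/6 × D1 × D2 × D3) to three orthogonal diameters and computes a relative difference. The ellipsoid formula is an assumption about the 2D-LB protocol, and the diameters are made-up example values, not study data.

```python
# Numeric sketch of a length-based (ellipsoid-model) LA volume estimate and the
# relative difference used to express an overestimation percentage.
import math

def la_volume_ellipsoid(d1_cm, d2_cm, d3_cm):
    """Prolate-ellipsoid volume in mL from three orthogonal diameters in cm."""
    return math.pi / 6.0 * d1_cm * d2_cm * d3_cm

def percent_difference(measured, reference):
    return 100.0 * (measured - reference) / reference

vol_mdct = la_volume_ellipsoid(5.2, 4.8, 4.5)   # hypothetical MDCT diameters
vol_echo = la_volume_ellipsoid(4.9, 4.5, 4.2)   # hypothetical echo diameters
print(f"MDCT {vol_mdct:.1f} mL vs echo {vol_echo:.1f} mL "
      f"({percent_difference(vol_mdct, vol_echo):+.0f}%)")
```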