674 research outputs found

    Simple is Better! Lightweight Data Augmentation for Low Resource Slot Filling and Intent Classification

    Neural-based models have achieved outstanding performance on slot filling and intent classification when fairly large in-domain training data are available. However, as new domains are frequently added, creating sizeable data is expensive. We show that lightweight augmentation, a set of augmentation methods involving word-span and sentence-level operations, alleviates data scarcity problems. Our experiments on limited-data settings show that lightweight augmentation yields significant performance improvements on slot filling on the ATIS and SNIPS datasets, and achieves competitive performance with respect to more complex, state-of-the-art augmentation approaches. Furthermore, lightweight augmentation is also beneficial when combined with pre-trained LM-based models, as it improves BERT-based joint intent and slot filling models.
    Comment: Accepted at PACLIC 2020 - The 34th Pacific Asia Conference on Language, Information and Computation
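    A plausible reading of "word-span and sentence-level operations" is slot-value substitution: a slot's value span is swapped for another value of the same slot type observed in training. The Python sketch below is a minimal illustration under that assumption, for BIO-tagged utterances; the helper names collect_slot_values and substitute_slot_values are hypothetical, not the paper's code.

        import random

        # A sketch of one span-level operation (slot-value substitution) for
        # slot filling on BIO-tagged data; the choice of operation is an
        # assumption drawn from the abstract, not the paper's exact recipe.

        def collect_slot_values(dataset):
            """Map each slot type to the value spans observed in the data."""
            values = {}
            for tokens, tags in dataset:
                span, slot = [], None
                for tok, tag in zip(tokens, tags):
                    if tag.startswith("B-"):
                        if span:
                            values.setdefault(slot, []).append(span)
                        span, slot = [tok], tag[2:]
                    elif tag.startswith("I-") and span:
                        span.append(tok)
                    else:
                        if span:
                            values.setdefault(slot, []).append(span)
                        span, slot = [], None
                if span:
                    values.setdefault(slot, []).append(span)
            return values

        def substitute_slot_values(tokens, tags, slot_values, rng=random):
            """Swap each slot span for another observed value of the same type."""
            new_tokens, new_tags = [], []
            i = 0
            while i < len(tokens):
                if tags[i].startswith("B-"):
                    slot = tags[i][2:]
                    j = i + 1
                    while j < len(tags) and tags[j] == "I-" + slot:
                        j += 1
                    value = rng.choice(slot_values[slot])
                    new_tokens.extend(value)
                    new_tags.extend(["B-" + slot] + ["I-" + slot] * (len(value) - 1))
                    i = j
                else:
                    new_tokens.append(tokens[i])
                    new_tags.append(tags[i])
                    i += 1
            return new_tokens, new_tags

    Applied to ATIS-style data, each call yields a new utterance whose slot annotations remain valid by construction.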

    Leksička dekompozicija i razumijevanje glagola kretanja u talijanskom i engleskom jeziku (Lexical Decomposition and the Comprehension of Motion Verbs in Italian and English)

    In this paper, we combine linguistic analyses based on Event Templates (Rappaport Hovav and Levin 1998a) and psychological proposals on the complexity of verb meanings to develop an analysis of the proposed complexity differences in motion verbs in English (as a satellite-framed language) and Italian (as a verb-framed language). The key prediction from this analysis is that in both languages manner-of-motion verbs take longer to process than path-motion verbs: that is, independently of language-specific lexicalization patterns, the more complex the structure, the longer it takes to process. We also outline some recent findings that have a bearing on this prediction.
    In this paper, the linguistic analysis is based on Event Templates (Rappaport Hovav and Levin 1998a) and on psychological assumptions about the complexity of verb meanings. The authors combine these approaches to develop an analysis of the proposed complexity differences among motion verbs in English (a satellite-framed, or S-language) and Italian (a verb-framed, or V-language). The key assumption of this analysis is that in both languages verbs describing manner of motion are processed more slowly than verbs describing path of motion: independently of language-specific lexicalization patterns, the more complex the structure, the more time is needed to process it. The paper also highlights recent findings that bear on this key assumption.

    Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning

    Compositional generalization is a basic mechanism in human language learning with which current neural networks struggle. The recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability by learning specialized encodings for each decoding step. We introduce two key modifications to this model that encourage more disentangled representations and improve its compute and memory efficiency, allowing us to tackle compositional generalization in a more realistic setting. Specifically, instead of adaptively re-encoding source keys and values at each time step, we disentangle their representations and re-encode keys only periodically, at some interval. Our new architecture leads to better generalization performance across existing tasks and datasets, and on a new machine translation benchmark that we create by detecting naturally occurring compositional patterns in relation to a training set. We show that this methodology better emulates real-world requirements than artificial challenges do.
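    To make the modification concrete, the PyTorch sketch below computes attention values once from the encoder output and re-encodes only the keys every few decoding steps; PeriodicKeyDecoder, the GRU state update, and conditioning the key encoder by adding the decoder state to every source position are illustrative assumptions of ours, not the authors' exact architecture.

        import torch
        import torch.nn as nn

        class PeriodicKeyDecoder(nn.Module):
            """Decoder sketch: values encoded once, keys refreshed periodically."""

            def __init__(self, d_model=256, nhead=4, reencode_interval=4):
                super().__init__()
                self.reencode_interval = reencode_interval
                self.key_encoder = nn.TransformerEncoder(
                    nn.TransformerEncoderLayer(d_model, nhead, batch_first=True),
                    num_layers=1)
                self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
                self.step = nn.GRUCell(d_model, d_model)

            def forward(self, src_states, tgt_embeds):
                # src_states: (B, S, D) encoder outputs; tgt_embeds: (B, T, D).
                values = src_states            # encoded once, never refreshed
                keys = src_states
                h = src_states.mean(dim=1)     # crude initial decoder state
                outputs = []
                for t in range(tgt_embeds.size(1)):
                    if t % self.reencode_interval == 0:
                        # Re-encode keys only periodically, conditioning on the
                        # decoding history via the current decoder state.
                        keys = self.key_encoder(src_states + h.unsqueeze(1))
                    ctx, _ = self.attn(h.unsqueeze(1), keys, values)
                    h = self.step(tgt_embeds[:, t] + ctx.squeeze(1), h)
                    outputs.append(h)
                return torch.stack(outputs, dim=1)

    Relative to re-encoding both keys and values at every step, this cuts the re-encoding cost by roughly the interval factor while keeping values stable across steps.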

    SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization

    Data scarcity has been a long-standing issue in the field of open-domain social dialogue. To quench this thirst, we present SODA: the first publicly available, million-scale, high-quality social dialogue dataset. By contextualizing social commonsense knowledge from a knowledge graph, we are able to distill an exceptionally broad spectrum of social interactions from a large language model. Human evaluation shows that conversations in SODA are more consistent, specific, and (surprisingly) natural than those in prior human-authored datasets. Using SODA, we train COSMO: a generalizable conversation model that is significantly more natural and consistent on unseen datasets than the best-performing conversation models (e.g., GODEL, BlenderBot-1, Koala, Vicuna). Experiments reveal that COSMO is sometimes even preferred to the original human-written gold responses. Additionally, our results shed light on the distinction between knowledge-enriched conversations and natural social chitchat. We plan to make our data, model, and code public.
    Comment: EMNLP 2023. Dataset, model, and code can be found at https://hyunw.kim/sodavers
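    As a concrete picture of the distillation loop, the Python sketch below verbalizes an ATOMIC-style (head, relation, tail) commonsense triple into a short narrative and prompts a caller-supplied language model for a conversation grounded in it; the prompt wording and helper names are illustrative assumptions, not the released pipeline.

        # Sketch of SODA-style dialogue distillation from commonsense triples.
        # `generate` is any text-generation callable; the templates and the
        # prompt are assumptions for illustration, not the paper's prompts.

        def triple_to_narrative(head, relation, tail):
            """Verbalize a commonsense triple into a short narrative seed."""
            templates = {
                "xWant": head + ". As a result, they want " + tail + ".",
                "xReact": head + ". They feel " + tail + ".",
                "xNeed": "Before that, they needed " + tail + ". " + head + ".",
            }
            return templates.get(relation, head + ". " + tail + ".")

        def distill_dialogue(triple, generate):
            """Contextualize a triple, then ask an LLM for a grounded dialogue."""
            head, relation, tail = triple
            narrative = triple_to_narrative(head, relation, tail)
            prompt = ("The following is a short story:\n" + narrative + "\n"
                      "Write a natural two-person conversation grounded in "
                      "this story:\n")
            return generate(prompt)

        # Usage with any backend (my_llm_generate is a hypothetical wrapper):
        # dialogue = distill_dialogue(
        #     ("PersonX moves to a new city", "xWant", "to make new friends"),
        #     generate=my_llm_generate)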
    • …