674 research outputs found
Simple is Better! Lightweight Data Augmentation for Low Resource Slot Filling and Intent Classification
Neural-based models have achieved outstanding performance on slot filling and
intent classification, when fairly large in-domain training data are available.
However, as new domains are frequently added, creating sizeable data is
expensive. We show that lightweight augmentation, a set of augmentation methods
involving word span and sentence level operations, alleviates data scarcity
problems. Our experiments on limited data settings show that lightweight
augmentation yields significant performance improvement on slot filling on the
ATIS and SNIPS datasets, and achieves competitive performance with respect to
more complex, state-of-the-art, augmentation approaches. Furthermore,
lightweight augmentation is also beneficial when combined with pre-trained
LM-based models, as it improves BERT-based joint intent and slot filling
models.
Comment: Accepted at PACLIC 2020 - The 34th Pacific Asia Conference on
Language, Information and Computation
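The abstract names word-span and sentence-level operations without detailing them, so the following is only a minimal sketch of what such lightweight augmentation can look like for slot filling; the `span_deletion` policy, its parameters, and the ATIS-style example are illustrative assumptions, not the paper's exact methods:

```python
import random

def span_deletion(tokens, labels, p=0.1, seed=0):
    """Word-span operation (sketch): randomly drop tokens that carry
    no slot label ('O'), keeping tokens and BIO labels aligned so
    slot annotations are never corrupted."""
    rng = random.Random(seed)
    kept_tokens, kept_labels = [], []
    for tok, lab in zip(tokens, labels):
        if lab == "O" and rng.random() < p:
            continue  # only unlabeled tokens are deletion candidates
        kept_tokens.append(tok)
        kept_labels.append(lab)
    return kept_tokens, kept_labels

def sentence_concatenation(a, b):
    """Sentence-level operation (sketch): join two utterances of the
    same intent into one longer training example."""
    return (a[0] + b[0], a[1] + b[1])

tokens = ["show", "me", "flights", "from", "boston", "to", "denver"]
labels = ["O", "O", "O", "O", "B-fromloc", "O", "B-toloc"]
aug_tokens, aug_labels = span_deletion(tokens, labels, p=0.3, seed=1)

merged = sentence_concatenation(
    (["book", "a", "flight"], ["O", "O", "O"]),
    (["to", "denver"], ["O", "B-toloc"]),
)
```

Because deletion is restricted to `O`-labeled tokens, every augmented example remains a valid slot-filling instance, which is what makes operations this simple usable without filtering.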
Leksička dekompozicija i razumijevanje glagola kretanja u talijanskom i engleskom jeziku (Lexical decomposition and the comprehension of motion verbs in Italian and English)
In this paper, we combine linguistic analyses based on Event Templates (Rappaport Hovav
and Levin 1998a) and psychological proposals on the complexity of verb meanings to
develop an analysis of the proposed complexity differences in motion verbs in English (as
a satellite-framed language) and Italian (as a verb-framed language). The key prediction
from this analysis is that for both languages manner-of-motion verbs take longer to be
processed than path-motion verbs: that is to say, independently of the language-specific
lexicalization patterns, the more complex the structure, the longer the time to process it.
We also outline some recent findings that have a bearing on this prediction.
In this paper, the linguistic analysis is based on Event Templates (following Rappaport Hovav
and Levin 1998a) and on psychological assumptions about the complexity of verb meanings. The
authors combine these approaches to develop an analysis of the proposed complexity differences
between motion verbs in English (a satellite-framed, or S-, language) and Italian (a verb-framed,
or V-, language). The key assumption of this analysis is that in both languages verbs describing
manner of motion are processed more slowly than verbs describing path of motion. Independently of
language-specific lexicalization patterns, the more complex the structure, the more time is needed
to process it. The paper also highlights recent research findings that bear on this key assumption.
Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning
Compositional generalization is a basic mechanism in human language learning,
which current neural networks struggle with. A recently proposed Disentangled
sequence-to-sequence model (Dangle) shows promising generalization capability
by learning specialized encodings for each decoding step. We introduce two key
modifications to this model which encourage more disentangled representations
and improve its compute and memory efficiency, allowing us to tackle
compositional generalization in a more realistic setting. Specifically, instead
of adaptively re-encoding source keys and values at each time step, we
disentangle their representations and only re-encode keys periodically, at some
interval. Our new architecture leads to better generalization performance
across existing tasks and datasets, and a new machine translation benchmark
which we create by detecting naturally occurring compositional patterns in
relation to a training set. We show this methodology better emulates real-world
requirements than artificial challenges.
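As a toy illustration of the described modification (values are encoded once and disentangled from keys, while keys are re-encoded only periodically instead of at every decoding step), here is a minimal sketch; the scalar stand-in "encoder", the attention arithmetic, and the interval value are placeholders, not the actual Dangle architecture:

```python
import math
import random

def encode(src, seed):
    """Stand-in for a step-conditioned encoder: a seeded random
    reweighting of each source feature."""
    rng = random.Random(seed)
    return [x * rng.uniform(-1, 1) for x in src]

def decode(src, num_steps=6, interval=3):
    """Toy decoding loop: values are encoded once (disentangled from
    keys); keys are re-encoded only every `interval` steps rather
    than adaptively at every time step."""
    values = encode(src, seed=999)   # encoded once, reused at every step
    keys = None
    reencodings = 0
    outputs = []
    for t in range(num_steps):
        if t % interval == 0:        # periodic key re-encoding
            keys = encode(src, seed=t)
            reencodings += 1
        # toy attention: softmax over key scores, weighted sum of values
        exps = [math.exp(k) for k in keys]
        z = sum(exps)
        outputs.append(sum(e / z * v for e, v in zip(exps, values)))
    return outputs, reencodings

outputs, n_reencodings = decode([0.5, -1.2, 2.0, 0.3])
```

With `num_steps=6` and `interval=3`, keys are rebuilt only at steps 0 and 3, which is the source of the compute and memory savings relative to re-encoding at every step.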
SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization
Data scarcity has been a long-standing issue in the field of open-domain
social dialogue. To quench this thirst, we present SODA: the first publicly
available, million-scale high-quality social dialogue dataset. By
contextualizing social commonsense knowledge from a knowledge graph, we are
able to distill an exceptionally broad spectrum of social interactions from a
large language model. Human evaluation shows that conversations in SODA are
more consistent, specific, and (surprisingly) natural than those in prior
human-authored datasets.
Using SODA, we train COSMO: a generalizable conversation model that is
significantly more natural and consistent on unseen datasets than
best-performing conversation models (e.g., GODEL, BlenderBot-1, Koala, Vicuna).
Experiments reveal COSMO is sometimes even preferred to the original
human-written gold responses. Additionally, our results shed light on the
distinction between knowledge-enriched conversations and natural social
chitchats. We plan to make our data, model, and code public.Comment: EMNLP 2023. Dataset, model, and code can be found at
https://hyunw.kim/sodavers
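The contextualization step described above (expanding social commonsense knowledge into a prompt that a large language model completes into a full conversation) can be sketched roughly as follows; the relation names (ATOMIC-style `xWant`/`xEffect`) and the templates are illustrative assumptions, not SODA's actual knowledge graph or prompts:

```python
def contextualize(head, relation, tail):
    """Sketch of commonsense contextualization: expand a social
    commonsense triple into a short narrative, then wrap it in a
    dialogue-generation prompt for an LLM to complete."""
    templates = {
        "xWant": "{head}. Afterwards, they want {tail}.",
        "xEffect": "{head}. As a result, {tail}.",
    }
    narrative = templates[relation].format(head=head, tail=tail)
    return (
        f"Narrative: {narrative}\n"
        "Write a natural two-person conversation grounded in this narrative:\n"
    )

prompt = contextualize("Alex moved to a new city", "xWant", "to make new friends")
```

Grounding each generated conversation in a distinct triple is what lets the distillation cover a broad spectrum of social interactions rather than collapsing onto generic chitchat.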
- …