A Survey of Multi-task Learning in Natural Language Processing: Regarding Task Relatedness and Training Methods
Multi-task learning (MTL) has become increasingly popular in natural language
processing (NLP) because it improves the performance of related tasks by
exploiting their commonalities and differences. Nevertheless, it is still not
well understood how multi-task learning should be implemented based on the
relatedness of training tasks. In this survey, we review recent advances in
multi-task learning methods in NLP, with the aim of summarizing them into two
general multi-task training methods based on their task relatedness: (i) joint
training and (ii) multi-step training. We present examples across various NLP
downstream applications, summarize the task relationships, and discuss future
directions of this promising topic.
Comment: Accepted to EACL 2023 as a regular long paper
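As a concrete illustration of the joint-training method the survey names, below is a minimal PyTorch sketch of hard parameter sharing: a shared encoder is updated by summing the losses of several task-specific heads in one step (multi-step training would instead visit the tasks sequentially, e.g., pretrain on one task, then fine-tune on another). The encoder architecture, task names, sizes, and hyperparameters are illustrative assumptions, not details from the survey.

```python
# Minimal joint multi-task training sketch (hard parameter sharing).
# All module names, tasks, and sizes are illustrative, not from the survey.
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """Encoder shared across tasks; it receives gradients from every task."""
    def __init__(self, vocab_size=30000, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)

    def forward(self, token_ids):
        out, _ = self.lstm(self.embed(token_ids))
        return out.mean(dim=1)  # crude sentence representation

# Task-specific heads, e.g., sentiment (2 classes) and topic (4 classes).
encoder = SharedEncoder()
heads = nn.ModuleDict({
    "sentiment": nn.Linear(256, 2),
    "topic": nn.Linear(256, 4),
})
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(heads.parameters()), lr=1e-3
)
loss_fn = nn.CrossEntropyLoss()

def joint_step(batches):
    """One joint-training step: sum per-task losses, update shared weights once."""
    optimizer.zero_grad()
    total = 0.0
    for task, (token_ids, labels) in batches.items():
        logits = heads[task](encoder(token_ids))
        total = total + loss_fn(logits, labels)
    total.backward()
    optimizer.step()
    return float(total)

# Usage with toy data: each task contributes one (inputs, labels) batch.
batches = {
    "sentiment": (torch.randint(0, 30000, (8, 16)), torch.randint(0, 2, (8,))),
    "topic": (torch.randint(0, 30000, (8, 16)), torch.randint(0, 4, (8,))),
}
print(joint_step(batches))
```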
GL-CLeF: A Global-Local Contrastive Learning Framework for Cross-lingual Spoken Language Understanding
Due to the high data demands of current methods, attention to zero-shot
cross-lingual spoken language understanding (SLU) has grown, as such approaches
greatly reduce human annotation effort. However, existing models rely solely on
shared parameters, which can only perform implicit alignment across languages.
We present the Global-Local Contrastive Learning Framework (GL-CLeF) to address
this shortcoming. Specifically, we employ contrastive learning, leveraging
bilingual dictionaries to construct multilingual views of the same utterance,
then encourage their representations to be more similar than those of negative
example pairs, thereby explicitly aligning representations of similar sentences
across languages. In addition, a key step in GL-CLeF is its proposed Local and
Global components, which achieve fine-grained cross-lingual transfer (i.e.,
sentence-level Local intent transfer, token-level Local slot transfer, and
semantic-level Global transfer across intent and slot). Experiments on
MultiATIS++ show that GL-CLeF achieves the best performance and successfully
pulls representations of similar sentences across languages closer.
Comment: Accepted at ACL 2022 Main Conference
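To make the dictionary-based contrastive step concrete, here is a hedged, sentence-level sketch: a bilingual dictionary builds a code-switched view of an utterance, and an InfoNCE-style loss pulls each sentence toward its own multilingual view while pushing it away from the other sentences in the batch. The toy dictionary, the stand-in encodings, and the temperature are assumptions; the paper's actual Local and Global components operate at a finer granularity (token-level slots, intent-slot interaction) than this example.

```python
# Hedged sketch of dictionary-based contrastive alignment as in GL-CLeF.
# Dictionary, encodings, and temperature are illustrative assumptions.
import random
import torch
import torch.nn.functional as F

BILINGUAL_DICT = {"book": "livre", "flight": "vol", "cheap": "pas cher"}  # toy en->fr

def code_switch(tokens, p=0.5):
    """Build a multilingual view: swap each word for its translation with prob p."""
    return [BILINGUAL_DICT.get(t, t) if random.random() < p else t for t in tokens]

def info_nce(anchor, positive, temperature=0.1):
    """Pull each anchor toward its own multilingual view, away from other
    sentences in the batch. anchor, positive: (batch, dim) representations."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature       # (batch, batch) similarity matrix
    targets = torch.arange(a.size(0))      # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

# Usage: in practice both views would pass through a shared multilingual
# encoder; here the encodings are random stand-ins for demonstration.
print(code_switch("book a cheap flight".split()))
anchor = torch.randn(4, 128)
positive = anchor + 0.1 * torch.randn(4, 128)  # pretend encoding of switched view
print(info_nce(anchor, positive))
```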
Robustification of Multilingual Language Models to Real-world Noise with Robust Contrastive Pretraining
Advances in neural modeling have achieved state-of-the-art (SOTA) results on
public natural language processing (NLP) benchmarks, at times surpassing human
performance. However, there is a gap between public benchmarks and real-world
applications where noise such as typos or grammatical mistakes is abundant,
resulting in degraded performance. Unfortunately, works that assess the
robustness of neural models on noisy data and suggest improvements are limited
to the English language. Upon analyzing noise in different languages, we
observe that noise types vary across languages and thus require their own
investigation. Thus, to benchmark the performance of pretrained multilingual
models, we construct noisy datasets covering five languages and four NLP tasks.
We see a gap in performance between clean and noisy data. After investigating
ways to boost the zero-shot cross-lingual robustness of multilingual pretrained
models, we propose Robust Contrastive Pretraining (RCP). RCP combines data
augmentation with a contrastive loss term at the pretraining stage and achieves
large improvements on noisy (and original) test data across two sentence-level
classification (+3.2%) and two sequence-labeling (+10 F1) multilingual tasks.
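The RCP recipe, as described above, has two ingredients: noise-based data augmentation and a contrastive term added to the pretraining objective. The sketch below illustrates that combination under stated assumptions; the typo-noise model, the loss weighting, and the stand-in representations are hypothetical, not the paper's exact setup.

```python
# Hedged sketch of the Robust Contrastive Pretraining recipe: pair each clean
# sentence with a synthetically noised copy and add a contrastive term to the
# usual pretraining loss. Noise model and weighting are assumptions.
import random
import torch
import torch.nn.functional as F

def add_typos(text, p=0.1):
    """Character-level noise: randomly drop or swap adjacent characters."""
    chars, out, i = list(text), [], 0
    while i < len(chars):
        if random.random() < p and i + 1 < len(chars):
            out.extend([chars[i + 1], chars[i]])  # swap adjacent characters
            i += 2
        elif random.random() < p:
            i += 1  # drop this character
        else:
            out.append(chars[i])
            i += 1
    return "".join(out)

def contrastive_term(clean_reps, noisy_reps, temperature=0.1):
    """InfoNCE term matching clean/noisy views of the same sentence."""
    c = F.normalize(clean_reps, dim=-1)
    n = F.normalize(noisy_reps, dim=-1)
    logits = c @ n.t() / temperature
    return F.cross_entropy(logits, torch.arange(c.size(0)))

def rcp_loss(pretrain_loss, clean_reps, noisy_reps, weight=0.1):
    """Total objective: standard pretraining loss plus weighted contrastive term."""
    return pretrain_loss + weight * contrastive_term(clean_reps, noisy_reps)

# Usage: representations would come from the multilingual encoder applied to a
# clean batch and its noised counterpart; random stand-ins are used here.
print(add_typos("results on public NLP benchmarks"))
clean = torch.randn(4, 128)
noisy = clean + 0.05 * torch.randn(4, 128)
print(rcp_loss(torch.tensor(2.3), clean, noisy))
```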