Attention-Informed Mixed-Language Training for Zero-shot Cross-lingual Task-oriented Dialogue Systems
Recently, data-driven task-oriented dialogue systems have achieved promising
performance in English. However, developing dialogue systems that support
low-resource languages remains a long-standing challenge due to the absence of
high-quality data. To circumvent expensive and time-consuming data
collection, we introduce Attention-Informed Mixed-Language Training (MLT), a
novel zero-shot adaptation method for cross-lingual task-oriented dialogue
systems. It leverages very few task-related parallel word pairs to generate
code-switching sentences for learning the inter-lingual semantics across
languages. Instead of manually selecting the word pairs, we propose to extract
source words based on the scores computed by the attention layer of a trained
English task-related model and then generate word pairs using existing
bilingual dictionaries. Furthermore, extensive experiments with different
cross-lingual embeddings demonstrate the effectiveness of our approach.
Finally, with very few word pairs, our model achieves significant zero-shot
adaptation performance improvements in both cross-lingual dialogue state
tracking and natural language understanding (i.e., intent detection and slot
filling) tasks compared to the current state-of-the-art approaches, which
utilize a much larger amount of bilingual data.
Comment: Accepted as an oral presentation at AAAI 2020
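As an illustration of the paper's core idea, here is a minimal Python sketch of attention-informed code-switching: the most-attended source words are swapped with translations from a bilingual dictionary to produce a mixed-language training sentence. The function name, toy scores, and dictionary below are illustrative assumptions, not the authors' implementation.

```python
import random

def code_switch(tokens, attention_scores, bilingual_dict, top_k=2):
    """Replace the top-k most-attended source words with dictionary translations."""
    # Rank token positions by attention score, highest first.
    ranked = sorted(range(len(tokens)),
                    key=lambda i: attention_scores[i], reverse=True)
    switched, replaced = list(tokens), 0
    for i in ranked:
        translations = bilingual_dict.get(tokens[i].lower())
        if translations:
            switched[i] = random.choice(translations)  # pick one translation
            replaced += 1
            if replaced == top_k:
                break
    return switched

# Toy example: attention peaks on the task-related words "book" and "restaurant".
tokens = ["i", "want", "to", "book", "a", "restaurant"]
scores = [0.02, 0.05, 0.03, 0.45, 0.05, 0.40]
en_de = {"book": ["buchen"], "restaurant": ["Restaurant"]}
print(code_switch(tokens, scores, en_de))
# -> ['i', 'want', 'to', 'buchen', 'a', 'Restaurant']
```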
Cross-lingual Intermediate Fine-tuning improves Dialogue State Tracking
Recent progress in task-oriented neural dialogue systems is largely focused
on a handful of languages, as annotation of training data is tedious and
expensive. Machine translation has been used to make systems multilingual, but
this can introduce a pipeline of errors. Another promising solution is using
cross-lingual transfer learning through pretrained multilingual models.
Existing methods train multilingual models with additional code-mixed task data
or refine the cross-lingual representations through parallel ontologies. In
this work, we enhance the transfer learning process by intermediate fine-tuning
of pretrained multilingual models, where the multilingual models are fine-tuned
with different but related data and/or tasks. Specifically, we use parallel and
conversational movie subtitles datasets to design cross-lingual intermediate
tasks suitable for downstream dialogue tasks. We use only 200K lines of
parallel data for intermediate fine-tuning, which is already available for 1782
language pairs. We test our approach on the cross-lingual dialogue state
tracking task for the parallel MultiWoZ (English -> Chinese, Chinese ->
English) and Multilingual WoZ (English -> German, English -> Italian) datasets.
We achieve large improvements (> 20% on joint goal accuracy) over the vanilla
baseline on the parallel MultiWoZ dataset with only 10% of the target-language
task data, and on the Multilingual WoZ dataset in a zero-shot setup.
Comment: EMNLP 2021 Camera Ready
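As a rough sketch of intermediate fine-tuning on parallel subtitles, the snippet below runs one masked-language-modeling step over a concatenated sentence pair with a pretrained multilingual model, a TLM-style objective. The model choice, example pair, and masking rate are assumptions for illustration; the paper's exact intermediate tasks may differ.

```python
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling)

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

# One parallel pair (placeholder sentences, not from the actual subtitle corpus).
src, tgt = "Where is the nearest station?", "Wo ist der naechste Bahnhof?"
enc = tokenizer(src, tgt, truncation=True)  # encode the pair as one sequence
batch = collator([enc])                     # pad and randomly mask 15% of tokens

model.train()
loss = model(**batch).loss  # cross-entropy over the masked positions
loss.backward()             # one step; wrap in an optimizer loop in practice
```

In practice such a step would loop over the 200K parallel subtitle lines before the model is fine-tuned on the downstream dialogue state tracking data.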
GL-CLeF: A Global-Local Contrastive Learning Framework for Cross-lingual Spoken Language Understanding
Because current methods demand large amounts of data, zero-shot cross-lingual
spoken language understanding (SLU) has drawn growing attention, as such
approaches greatly reduce human annotation effort. However, existing models rely solely on
shared parameters, which can only perform implicit alignment across languages.
We present the Global-Local Contrastive Learning Framework (GL-CLeF) to address
this shortcoming. Specifically, we employ contrastive learning, leveraging
bilingual dictionaries to construct multilingual views of the same utterance,
and then encourage their representations to be more similar than those of
negative example pairs, which explicitly aligns representations of similar
sentences across languages. In addition, a key step in GL-CLeF is its proposed
Local and Global components, which achieve fine-grained cross-lingual
transfer (i.e., sentence-level Local intent transfer, token-level Local slot
transfer, and semantic-level Global transfer across intent and slot).
Experiments on MultiATIS++ show that GL-CLeF achieves the best performance and
successfully pulls representations of similar sentences across languages
closer.
Comment: Accepted at ACL 2022 Main Conference
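A minimal sketch of the sentence-level contrastive idea: given embeddings of the original utterances and their code-switched views, an in-batch InfoNCE loss pulls each positive pair together and pushes the other utterances in the batch apart. The tensor names, dimensions, and temperature below are illustrative; this is not the paper's full Global-Local objective.

```python
import torch
import torch.nn.functional as F

def info_nce(h_src, h_cs, temperature=0.07):
    """In-batch contrastive loss between two views of the same utterances."""
    h_src = F.normalize(h_src, dim=-1)
    h_cs = F.normalize(h_cs, dim=-1)
    logits = h_src @ h_cs.T / temperature  # (B, B) cosine-similarity matrix
    labels = torch.arange(h_src.size(0))   # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

h_src = torch.randn(8, 768)  # e.g., [CLS] embeddings of English utterances
h_cs = torch.randn(8, 768)   # embeddings of their code-switched views
print(info_nce(h_src, h_cs))
```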
On the Importance of Word Order Information in Cross-lingual Sequence Labeling
Word order generally varies across languages. In this paper,
we hypothesize that cross-lingual models that fit the word order of the
source language may fail to handle target languages. To verify this
hypothesis, we investigate whether making models insensitive to the word order
of the source language can improve the adaptation performance in target
languages. To do so, we reduce the source language word order information
fitted to sequence encoders and observe the performance changes. In addition,
based on this hypothesis, we propose a new method for fine-tuning multilingual
BERT in downstream cross-lingual sequence labeling tasks. Experimental results
on dialogue natural language understanding, part-of-speech tagging, and named
entity recognition tasks show that reducing word order information fitted to
the model can achieve better zero-shot cross-lingual performance. Furthermore,
our proposed methods can also be applied to strong cross-lingual baselines and
improve their performance.
Comment: Accepted at AAAI-2021
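One simple way to reduce source-language word order information, sketched below: randomly shuffle each training sequence (together with its per-token labels) before encoding, so the model cannot fit to the absolute source order. The helper and example are hypothetical; the paper also studies other order-reduction variants.

```python
import random

def shuffle_tokens(tokens, labels):
    """Shuffle a token sequence and its per-token labels in lockstep."""
    paired = list(zip(tokens, labels))
    random.shuffle(paired)
    shuffled_tokens, shuffled_labels = zip(*paired)
    return list(shuffled_tokens), list(shuffled_labels)

# Toy slot-filling example with BIO labels.
tokens = ["find", "flights", "to", "boston", "tomorrow"]
labels = ["O", "O", "O", "B-city", "B-date"]
print(shuffle_tokens(tokens, labels))
```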