On the Importance of Word Order Information in Cross-lingual Sequence Labeling
Word order varies across languages. In this paper, we hypothesize that
cross-lingual models that fit to the word order of the source language may
fail to handle target languages. To verify this
hypothesis, we investigate whether making models insensitive to the word order
of the source language can improve the adaptation performance in target
languages. To do so, we reduce the source language word order information
fitted to sequence encoders and observe the performance changes. In addition,
based on this hypothesis, we propose a new method for fine-tuning multilingual
BERT in downstream cross-lingual sequence labeling tasks. Experimental results
on dialogue natural language understanding, part-of-speech tagging, and named
entity recognition tasks show that reducing word order information fitted to
the model can achieve better zero-shot cross-lingual performance. Furthermore,
our proposed methods can also be applied to strong cross-lingual baselines, and
improve their performance. Comment: Accepted in AAAI-202
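The order-reduction idea above can be illustrated with a toy helper that randomizes source-language token order before encoding. This is a hypothetical sketch: the abstract does not specify the exact mechanism, and `shuffle_word_order` is an assumed name, not the paper's method.

```python
import random

def shuffle_word_order(tokens, seed=0):
    """Return the tokens in a random order, leaving the input untouched.

    Randomizing source-language inputs during fine-tuning is one simple
    way to keep an encoder from fitting to source word order; the paper's
    actual order-reduction mechanism is not given in this abstract.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    shuffled = list(tokens)    # copy so the caller's list is unchanged
    rng.shuffle(shuffled)
    return shuffled
```

Applied at training time, each example is seen with a permuted order, so positional cues from the source language carry less signal.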
On the cross-lingual transferability of multilingual prototypical models across NLU tasks
Supervised deep learning-based approaches have been applied to task-oriented
dialog and have proven to be effective for limited domain and language
applications when a sufficient number of training examples are available. In
practice, these approaches suffer from the drawbacks of domain-driven design
and under-resourced languages. Domain and language models are supposed to grow
and change as the problem space evolves. On one hand, research on transfer
learning has demonstrated the cross-lingual ability of multilingual
Transformer-based models to learn semantically rich representations. On the
other hand, meta-learning has enabled the development of task and language
learning algorithms capable of broad generalization. In this context, this
article investigates the cross-lingual transferability of combining few-shot
learning with prototypical neural networks and multilingual Transformer-based
models.
Experiments on natural language understanding tasks on the MultiATIS++ corpus
show that our approach substantially improves transfer learning performance
between low- and high-resource languages. More generally, our approach
confirms that the meaningful latent space learned in a given language can be
generalized to unseen and under-resourced ones using meta-learning. Comment:
Accepted to the ACL workshop METANLP 202
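The prototypical-network classification the abstract builds on can be sketched in a few lines: each class prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype. This is a minimal stdlib illustration; the paper's actual model produces the embeddings with multilingual Transformers, which this sketch omits.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+

def prototypes(support_embeddings, support_labels):
    """Mean embedding (prototype) per class, as in prototypical networks."""
    grouped = defaultdict(list)
    for emb, label in zip(support_embeddings, support_labels):
        grouped[label].append(emb)
    protos = {}
    for label, vecs in grouped.items():
        dim = len(vecs[0])
        # component-wise mean of all support vectors for this class
        protos[label] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return protos

def classify(query, protos):
    """Label of the nearest prototype by Euclidean distance."""
    return min(protos, key=lambda label: dist(query, protos[label]))
```

Because prototypes are just averages, a handful of labeled examples per class suffices, which is what makes the approach attractive for under-resourced languages.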
Leveraging Multilingual Self-Supervised Pretrained Models for Sequence-to-Sequence End-to-End Spoken Language Understanding
A number of methods have been proposed for End-to-End Spoken Language
Understanding (E2E-SLU) using pretrained models; however, their evaluation
often lacks a multilingual setup and tasks that require prediction of lexical
fillers,
such as slot filling. In this work, we propose a unified method that integrates
multilingual pretrained speech and text models and performs E2E-SLU on six
datasets in four languages in a generative manner, including the prediction of
lexical fillers. We investigate how the proposed method can be improved by
pretraining on widely available speech recognition data using several training
objectives. Pretraining on 7000 hours of multilingual data allows us to
outperform the state of the art outright on two SLU datasets and partly on
two more. Finally, we examine the cross-lingual capabilities of
the proposed model and improve on the best known result on the
PortMEDIA-Language dataset by almost half, achieving a Concept/Value Error Rate
of 23.65%. Comment: IEEE Workshop on Automatic Speech Recognition and Understanding
(ASRU) 202
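Performing SLU "in a generative manner" means the model emits the semantic annotation, lexical fillers included, as a flat target sequence that a seq2seq decoder can produce. The serialization below is a toy format assumed for illustration; the paper's actual target format is not given in the abstract.

```python
def slu_target_sequence(intent, slots):
    """Serialize an SLU annotation as a flat target string.

    `intent` is an intent label; `slots` is a list of (name, value)
    pairs whose values are lexical fillers copied from the utterance.
    The "intent: ... ; name = value" layout is an assumed example
    format, not the one used in the paper.
    """
    parts = [f"intent: {intent}"]
    parts += [f"{name} = {value}" for name, value in slots]
    return " ; ".join(parts)
```

Training then reduces to standard sequence-to-sequence learning against these strings, and slot filling comes for free at decode time.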
Transfer-Free Data-Efficient Multilingual Slot Labeling
Slot labeling (SL) is a core component of task-oriented dialogue (ToD)
systems, where slots and corresponding values are usually language-, task- and
domain-specific. Therefore, extending the system to any new
language-domain-task configuration requires (re)running an expensive and
resource-intensive data annotation process. To mitigate the inherent data
scarcity issue, current research on multilingual ToD assumes that sufficient
English-language annotated data are always available for particular tasks and
domains, and thus operates in a standard cross-lingual transfer setup. In this
work, we depart from this often unrealistic assumption. We examine challenging
scenarios where such transfer-enabling English annotated data cannot be
guaranteed, and focus on bootstrapping multilingual data-efficient slot
labelers in transfer-free scenarios directly in the target languages without
any English-ready data. We propose a two-stage slot labeling approach (termed
TWOSL) which transforms standard multilingual sentence encoders into effective
slot labelers. In Stage 1, relying on SL-adapted contrastive learning with only
a handful of SL-annotated examples, we turn sentence encoders into
task-specific span encoders. In Stage 2, we recast SL from a token
classification into a simpler, less data-intensive span classification task.
Our results on two standard multilingual ToD datasets and across diverse
languages confirm the effectiveness and robustness of TWOSL. It is especially
effective for the most challenging transfer-free few-shot setups, paving the
way for quick and data-efficient bootstrapping of multilingual slot labelers
for ToD.
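TWOSL's Stage 2 recasting of slot labeling as span classification presupposes a set of candidate spans to classify. A common way to generate candidates is exhaustive enumeration up to a length cap, sketched below; this is illustrative and not necessarily the paper's candidate strategy.

```python
def enumerate_spans(tokens, max_len=3):
    """All contiguous spans of up to max_len tokens.

    Each span is (start, end, text) with end exclusive. In a Stage-2
    setup, each span's encoding would be fed to a classifier that
    predicts a slot type or "no slot".
    """
    spans = []
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + max_len, len(tokens)) + 1):
            spans.append((i, j, " ".join(tokens[i:j])))
    return spans
```

Classifying a few dozen spans per sentence is a simpler learning problem than per-token BIO tagging, which is the data-efficiency argument the abstract makes.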