15 research outputs found

Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training

Authors: Gong Yifan, Hu Yuxuan, Li Jinyu, Lin Edward, Liu Linquan, Liu Shujie, Sun Eric, Wang Peidong, Xue Jian, Zhou Long, Zhu Yimeng
Publication date: 07/07/2023

We propose gated language experts and curriculum training to enhance multilingual transformer transducer models without requiring language identification (LID) input from users during inference. Our method incorporates a gating mechanism and LID loss, enabling transformer experts to learn language-specific information. By combining gated transformer experts with shared transformer layers, we construct multilingual transformer blocks and utilize linear experts to effectively regularize the joint network. The curriculum training scheme leverages LID to guide the gated experts in improving their respective language performance. Experimental results on a bilingual task involving English and Spanish demonstrate significant improvements, with average relative word error reductions of 12.5% and 7.3% compared to the baseline bilingual and monolingual models, respectively. Notably, our method achieves performance comparable to the upper-bound model trained and inferred with oracle LID. Extending our approach to trilingual, quadrilingual, and pentalingual models reveals similar advantages to those observed in the bilingual models, highlighting its ease of extension to multiple languages.
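
The abstract above reports average relative word error reductions. As a rough illustration of how such a figure is usually computed, the sketch below takes the relative reduction per language and averages the results. The WER values and the function name are hypothetical, chosen only to make the arithmetic easy to follow; they are not taken from the paper, and the paper's exact averaging procedure is not stated in the abstract.

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Relative reduction: (baseline - new) / baseline."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical WERs in percent, invented for illustration (not from the paper).
baseline = {"en": 8.0, "es": 10.0}
proposed = {"en": 7.0, "es": 8.75}

# Relative reduction for each language, then a plain average across languages.
per_language = {
    lang: relative_wer_reduction(baseline[lang], proposed[lang])
    for lang in baseline
}
average = sum(per_language.values()) / len(per_language)

print(f"average relative WER reduction: {average:.1%}")
```

A relative reduction is measured against the baseline error rate, so a drop from 8.0% to 7.0% WER counts as 12.5% relative even though the absolute change is one point.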

arXiv.org e-Print Archive

Lightweight Adapter Tuning for Multilingual Speech Translation

Authors: Besacier Laurent, Gu Jiatao, Le Hang, Pino Juan, Schwab Didier, Wang Changhan
Publication date: 12/07/2021

Adapter modules were recently introduced as an efficient alternative to fine-tuning in NLP. Adapter tuning consists of freezing the pre-trained parameters of a model and injecting lightweight modules between layers, resulting in the addition of only a small number of task-specific trainable parameters. While adapter tuning was investigated for multilingual neural machine translation, this paper proposes a comprehensive analysis of adapters for multilingual speech translation (ST). Starting from different pre-trained models (a multilingual ST trained on parallel data or a multilingual BART (mBART) trained on non-parallel multilingual data), we show that adapters can be used to: (a) efficiently specialize ST to specific language pairs with a low extra cost in terms of parameters, and (b) transfer from an automatic speech recognition (ASR) task and an mBART pre-trained model to a multilingual ST task. Experiments show that adapter tuning offers competitive results to full fine-tuning, while being much more parameter-efficient.

Comment: Accepted at ACL-IJCNLP 202
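
The adapter idea summarized in the abstract above (small trainable modules added to a frozen model) can be sketched in a few lines of plain Python. This is a minimal illustration of a generic bottleneck adapter with a residual connection, not the paper's implementation: the class name, the dimensions, the ReLU nonlinearity, and the zero initialization of the up-projection are all assumptions made for the example.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


class Adapter:
    """Hypothetical bottleneck adapter: x + W_up @ relu(W_down @ x)."""

    def __init__(self, d_model, d_bottleneck, seed=0):
        rng = random.Random(seed)
        # Down-projection to a small bottleneck dimension.
        self.W_down = [[rng.gauss(0.0, 0.02) for _ in range(d_model)]
                       for _ in range(d_bottleneck)]
        # Assumption: a zero-initialized up-projection, so the adapter
        # starts out as the identity and leaves the frozen model unchanged.
        self.W_up = [[0.0] * d_bottleneck for _ in range(d_model)]

    def __call__(self, x):
        hidden = [max(0.0, v) for v in matvec(self.W_down, x)]  # ReLU
        update = matvec(self.W_up, hidden)
        return [xi + ui for xi, ui in zip(x, update)]  # residual connection

    def num_params(self):
        return (len(self.W_down) * len(self.W_down[0])
                + len(self.W_up) * len(self.W_up[0]))


d_model, d_bottleneck = 8, 2
adapter = Adapter(d_model, d_bottleneck)
frozen_layer_params = d_model * d_model  # one frozen d_model x d_model layer

print("adapter params:", adapter.num_params())
print("frozen layer params:", frozen_layer_params)
```

With these toy sizes the adapter has half as many weights as a single frozen layer. The gap widens as `d_model` grows relative to the bottleneck, which is the sense in which adapter tuning is parameter-efficient.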

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes