Analysis of Multilingual Sequence-to-Sequence speech recognition systems
This paper investigates the applications of various multilingual approaches
developed in conventional hidden Markov model (HMM) systems to
sequence-to-sequence (seq2seq) automatic speech recognition (ASR). On a set
composed of Babel data, we first show the effectiveness of multilingual
training with stacked bottleneck (SBN) features. Then we explore various
architectures and training strategies of multilingual seq2seq models based on
CTC-attention networks including combinations of output layer, CTC and/or
attention component re-training. We also investigate the effectiveness of
language-transfer learning in a very low-resource scenario when the target
language is not included in the original multilingual training data.
Interestingly, we found multilingual features superior to multilingual models,
and this finding suggests that we can efficiently combine the benefits of the
HMM system with the seq2seq system through these multilingual feature
techniques.
Comment: arXiv admin note: text overlap with arXiv:1810.0345