The NTNU Taiwanese ASR System for Formosa Speech Recognition Challenge 2020
This paper describes the NTNU ASR system participating in the Formosa Speech
Recognition Challenge 2020 (FSR-2020) supported by the Formosa Speech in the
Wild project (FSW). FSR-2020 aims at fostering the development of Taiwanese
speech recognition. Apart from the issues of tonal and dialectal variations
of the Taiwanese language, speech artificially contaminated with different
types of real-world noise also has to be dealt with in the final test stage;
all of these make FSR-2020 much more challenging than before. To cope with
the scarcity of training resources, our ASR system combines various deep
learning techniques, such as transfer learning, semi-supervised learning,
front-end speech enhancement and model ensembling, with data cleansing and
data augmentation conducted on the training data. With the best
configuration, our system obtains a 13.1% syllable error rate (SER) on the
final-test set, achieving first place among all participating systems on
Track 3.
Comment: 17 pages, 3 figures, accepted for publication in IJCLCL