2 research outputs found

    Shared-hidden-layer Deep Neural Network for Under-resourced Language

    Training a speech recognizer with under-resourced language data remains difficult. Indonesian is considered under-resourced because it lacks a standard speech corpus, text corpus, and dictionary. In this research, the efficacy of augmenting limited Indonesian speech training data with training data from a highly resourced language, such as English, to train an Indonesian speech recognizer was analyzed. The training was performed in the form of shared-hidden-layer deep-neural-network (SHL-DNN) training. An SHL-DNN has language-independent hidden layers and can be pre-trained and trained on multilingual data in the same way as a monolingual deep neural network. The SHL-DNN trained on Indonesian and English speech proved effective for decreasing the word error rate (WER) when decoding Indonesian dictated speech, achieving a 3.82% absolute decrease compared to a monolingual Indonesian hidden Markov model with Gaussian mixture model emissions (GMM-HMM). The finding was confirmed when the SHL-DNN was also employed to decode Indonesian spontaneous speech, achieving a 4.19% absolute WER decrease.
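    The abstract describes an SHL-DNN as a stack of language-independent hidden layers topped by language-specific output layers. The minimal PyTorch sketch below only illustrates that structure; the layer sizes, sigmoid activation, two-language setup, and senone counts are illustrative assumptions, not the configuration reported in the paper.

    import torch
    import torch.nn as nn

    class SHLDNN(nn.Module):
        """Shared hidden layers plus one output layer per language (sketch)."""
        def __init__(self, feat_dim=440, hidden_dim=1024, n_hidden=5,
                     senones_per_lang=None):
            super().__init__()
            senones_per_lang = senones_per_lang or {"id": 2000, "en": 3000}  # assumed sizes
            layers, in_dim = [], feat_dim
            for _ in range(n_hidden):
                layers += [nn.Linear(in_dim, hidden_dim), nn.Sigmoid()]
                in_dim = hidden_dim
            self.shared = nn.Sequential(*layers)                 # language-independent hidden layers
            self.heads = nn.ModuleDict({lang: nn.Linear(hidden_dim, n)
                                        for lang, n in senones_per_lang.items()})

        def forward(self, x, lang):
            return self.heads[lang](self.shared(x))              # logits for the selected language

    # Multilingual training mixes minibatches from both languages: every batch updates
    # the shared stack, but only the output layer of the batch's own language.
    model = SHLDNN()
    loss_fn = nn.CrossEntropyLoss()
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    batches = [("id", torch.randn(8, 440), torch.randint(0, 2000, (8,))),
               ("en", torch.randn(8, 440), torch.randint(0, 3000, (8,)))]
    for lang, feats, targets in batches:
        opt.zero_grad()
        loss_fn(model(feats, lang), targets).backward()
        opt.step()

    After such multilingual training, decoding a single target language would use the shared layers together with that language's own output layer.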

    Under-resourced speech recognition based on the speech manifold

    Conventional acoustic modeling involves estimating many parameters to model feature distributions effectively. The sparseness of speech and text data, however, degrades the reliability of the estimation process and makes speech recognition a challenging task. In this paper, we propose to use a nonlinear feature transformation based on the speech manifold, called Intrinsic Spectral Analysis (ISA), for under-resourced speech recognition. First, we investigate the usefulness of ISA features in low-resource scenarios for both Gaussian mixture and deep neural network (DNN) acoustic modeling. Moreover, because ISA features are connected to the articulatory configuration space, this feature space is potentially less language-dependent than typical spectral-based features, and exploiting out-of-language data in this feature space is therefore beneficial. We demonstrate the positive effect of ISA in the framework of multilingual DNN systems, where Flemish and Afrikaans are used as the donor and under-resourced target languages, respectively. We compare the performance of ISA with conventional features in both multilingual and under-resourced monolingual conditions.
    Sahraeian, R., Van Compernolle, D., de Wet, F., "Under-resourced speech recognition based on the speech manifold", Proceedings of the 16th Annual Conference of the International Speech Communication Association (Interspeech 2015), pp. 1255-1259, September 6-10, 2015, Dresden, Germany.
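    ISA is described above as a nonlinear feature transformation based on the speech manifold. The sketch below is only a generic Laplacian-eigenmaps-style embedding of spectral frames, offered to make the manifold idea concrete; the neighbourhood size, kernel width, output dimension, and the use of numpy/scipy are assumptions, not the authors' formulation, and a practical system would also need some way to transform unseen frames.

    import numpy as np
    from scipy.sparse.csgraph import laplacian
    from scipy.linalg import eigh

    def manifold_features(frames, n_neighbors=10, sigma=1.0, out_dim=30):
        """Embed spectral frames (n_frames, n_dims) with a Laplacian-eigenmaps-style transform."""
        # Gaussian-kernel affinities restricted to a symmetrized k-nearest-neighbour graph.
        d2 = np.sum((frames[:, None, :] - frames[None, :, :]) ** 2, axis=-1)
        w = np.exp(-d2 / (2.0 * sigma ** 2))
        np.fill_diagonal(w, 0.0)                                  # no self-loops
        far = np.argsort(d2, axis=1)[:, n_neighbors + 1:]         # everything beyond the k nearest
        keep = np.ones_like(w, dtype=bool)
        np.put_along_axis(keep, far, False, axis=1)
        w = np.where(keep | keep.T, w, 0.0)                       # keep an edge if either end selects it
        # Eigenvectors of the normalized graph Laplacian with the smallest eigenvalues give the
        # low-dimensional manifold coordinates; the first (constant) eigenvector is skipped.
        _, vecs = eigh(laplacian(w, normed=True))
        return vecs[:, 1:out_dim + 1]

    # Toy usage: 200 frames of 40-dimensional filterbank features (random stand-in data).
    feats = manifold_features(np.random.randn(200, 40))
    print(feats.shape)                                            # (200, 30)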