11 research outputs found

    Using generalized maxout networks and phoneme mapping for low resource ASR - a case study on Flemish-Afrikaans

    © 2015 IEEE. Recently, multilingual deep neural networks (DNNs) have been successfully used to improve under-resourced speech recognizers. Common approaches use either a merged universal phoneme set based on the International Phonetic Alphabet (IPA) or a language-specific phoneme set to train a multilingual DNN. In this paper, we investigate the effect of both knowledge-based and data-driven phoneme mapping on the multilingual DNN and its application to an under-resourced language. For the data-driven phoneme mapping, we propose to use an approximation of the Kullback-Leibler divergence (KLD) to generate a confusion matrix and find the best-matching phonemes of the target language for each individual phoneme in the donor language. Moreover, we explore the use of the recently proposed generalized maxout network in both multilingual and low-resource monolingual scenarios. We evaluate the proposed phoneme mappings on a phoneme recognition task with both HMM/GMM and DNN systems with generalized maxout architecture, where Flemish and Afrikaans are used as the donor and under-resourced target languages, respectively.
    Sahraeian R., Van Compernolle D., de Wet F., "Using generalized maxout networks and phoneme mapping for low resource ASR - a case study on Flemish-Afrikaans", Proceedings 2015 Pattern Recognition Association of South Africa and Robotics and Mechatronics International Conference (PRASA-RobMech), pp. 112-117, November 26-27, 2015, Port Elizabeth, South Africa. Status: published.
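
The data-driven mapping described in this abstract (score every donor/target phoneme pair with a KLD-based measure, then pick the best-matching target phoneme for each donor phoneme) can be illustrated with a minimal sketch. The abstract does not say which KLD approximation the authors use, so the sketch below makes its own assumption: each phoneme is summarised by a single diagonal-covariance Gaussian, for which the KLD has a closed form. The function names, phoneme labels, and toy statistics are all hypothetical.

```python
import numpy as np


def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) for two diagonal-covariance Gaussians."""
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )


def map_phonemes(donor, target):
    """Map each donor phoneme to the target phoneme with the lowest KLD.

    donor, target: dict of phoneme label -> (mean vector, variance vector).
    Returns the mapping and the KLD matrix (donor rows, target columns).
    """
    donor_labels = list(donor)
    target_labels = list(target)
    kld = np.array(
        [
            [kl_diag_gauss(*donor[d], *target[t]) for t in target_labels]
            for d in donor_labels
        ]
    )
    best = kld.argmin(axis=1)  # index of the closest target phoneme per donor row
    mapping = {d: target_labels[j] for d, j in zip(donor_labels, best)}
    return mapping, kld


# Toy 2-dimensional statistics (hypothetical, not taken from the paper).
ones = np.ones(2)
donor = {"a": (np.array([0.0, 0.0]), ones), "s": (np.array([5.0, 5.0]), ones)}
target = {"A": (np.array([0.2, -0.1]), ones), "S": (np.array([4.8, 5.3]), ones)}

mapping, kld = map_phonemes(donor, target)
print(mapping)
```

In a real system the per-phoneme distributions would come from trained acoustic models rather than hand-written statistics, and the row-wise KLD matrix plays the role of the confusion matrix mentioned in the abstract.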

    Under-resourced speech recognition based on the speech manifold

    Copyright © 2015 ISCA. Conventional acoustic modeling involves estimating many parameters to effectively model feature distributions. The sparseness of speech and text data, however, degrades the reliability of the estimation process and makes speech recognition a challenging task. In this paper, we propose to use a nonlinear feature transformation based on the speech manifold, called Intrinsic Spectral Analysis (ISA), for under-resourced speech recognition. First, we investigate the usefulness of ISA features in low-resource scenarios for both Gaussian mixture and deep neural network (DNN) acoustic modeling. Moreover, due to the connection of ISA features to the articulatory configuration space, this feature space is potentially less language dependent than other typical spectral-based features, and therefore exploiting out-of-language data in this feature space is beneficial. We demonstrate the positive effect of ISA in the framework of multilingual DNN systems, where Flemish and Afrikaans are used as the donor and under-resourced target languages, respectively. We compare the performance of ISA with conventional features in both multilingual and under-resourced monolingual conditions.
    Sahraeian R., Van Compernolle D., de Wet F., "Under-resourced speech recognition based on the speech manifold", Proceedings 16th annual conference of the International Speech Communication Association (ISCA) - Interspeech 2015, pp. 1255-1259, September 6-10, 2015, Dresden, Germany. Status: published.

    Speech recognition for under-resourced languages: Data sharing in hidden Markov model systems

    © 2017. The Author(s). For purposes of automated speech recognition in under-resourced environments, techniques used to share acoustic data between closely related or similar languages become important. Donor languages with abundant resources can potentially be used to increase the recognition accuracy of speech systems developed in the resource-poor target language. The assumption is that adding more data will increase the robustness of the statistical estimations captured by the acoustic models. In this study we investigated data sharing between Afrikaans and Flemish - an under-resourced and well-resourced language, respectively. Our approach was focused on the exploration of model adaptation and refinement techniques associated with hidden Markov model-based speech recognition systems to improve the benefit of sharing data. Specifically, we focused on the use of currently available techniques, some possible combinations and the exact utilisation of the techniques during the acoustic model development process. Our findings show that simply using normal approaches to adaptation and refinement does not result in any benefits when adding Flemish data to the Afrikaans training pool. The only observed improvement was achieved when developing acoustic models on all available data but estimating model refinements and adaptations on the target data only.
    de Wet F., Kleynhans N., Van Compernolle D., Sahraeian R., "Speech recognition for under-resourced languages: Data sharing in hidden Markov model systems", South African Journal of Science, vol. 113, no. 1/2, 9 pp., January/February 2017. Status: published.
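
The recipe reported as beneficial in this abstract (build the model on all available data, then estimate adaptations on the target data only) can be illustrated on a toy scale. The abstract does not name the adaptation techniques used, so the sketch below swaps in one standard choice as an assumption: MAP adaptation of a single Gaussian mean, with the pooled estimate acting as the prior. The function name, the value of `tau`, and the synthetic data are hypothetical.

```python
import numpy as np


def map_adapt_mean(prior_mean, target_frames, tau=10.0):
    """MAP update of a Gaussian mean.

    prior_mean:    mean estimated beforehand (here: on the pooled data).
    target_frames: array of shape (n_frames, dim) from the target language.
    tau:           prior weight; larger values keep the result nearer the prior.
    """
    n = len(target_frames)
    return (tau * prior_mean + target_frames.sum(axis=0)) / (tau + n)


rng = np.random.default_rng(0)
# Synthetic stand-ins: plentiful donor frames, scarce target frames.
donor_frames = rng.normal(loc=0.0, scale=1.0, size=(5000, 2))
target_frames = rng.normal(loc=1.0, scale=1.0, size=(50, 2))

# Step 1: estimate the model on all available data.
pooled_mean = np.vstack([donor_frames, target_frames]).mean(axis=0)

# Step 2: estimate the adaptation on the target data only.
adapted_mean = map_adapt_mean(pooled_mean, target_frames, tau=10.0)

print("pooled mean: ", pooled_mean)
print("adapted mean:", adapted_mean)
```

The pooled estimate is dominated by the donor frames, while the adapted mean moves toward the target statistics. A full HMM system would apply updates of this kind per Gaussian component and per state rather than to a single mean.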

    Knowledge-based phoneme mapping between Flemish and Afrikaans

    Sahraeian R., Kleynhans N., de Wet F., Van Compernolle D., "Knowledge-based phoneme mapping between Flemish and Afrikaans", Technical report KUL/ESAT/PSI/1502, KU Leuven, ESAT, January 2015, Leuven, Belgium. Status: published.

    Speech recognition for under-resourced languages: Data sharing in hidden Markov model systems

    For purposes of automated speech recognition in under-resourced environments, techniques used to share acoustic data between closely related or similar languages become important. Donor languages with abundant resources can potentially be used to increase the recognition accuracy of speech systems developed in the resource-poor target language. The assumption is that adding more data will increase the robustness of the statistical estimations captured by the acoustic models. In this study we investigated data sharing between Afrikaans and Flemish - an under-resourced and well-resourced language, respectively. Our approach was focused on the exploration of model adaptation and refinement techniques associated with hidden Markov model-based speech recognition systems to improve the benefit of sharing data. Specifically, we focused on the use of currently available techniques, some possible combinations and the exact utilisation of the techniques during the acoustic model development process. Our findings show that simply using normal approaches to adaptation and refinement does not result in any benefits when adding Flemish data to the Afrikaans training pool. The only observed improvement was achieved when developing acoustic models on all available data but estimating model refinements and adaptations on the target data only.
    Significance: acoustic modelling for under-resourced languages; automatic speech recognition for Afrikaans; data sharing between Flemish and Afrikaans to improve acoustic modelling for Afrikaans.