    Large Vocabulary Children’s Speech Recognition with DNN-HMM and SGMM Acoustic Modeling

    In this paper, large vocabulary children’s speech recognition is investigated using the Deep Neural Network - Hidden Markov Model (DNN-HMM) hybrid and the Subspace Gaussian Mixture Model (SGMM) acoustic modeling approaches. In the investigated scenario, training data is limited to about 7 hours of speech from children aged 7-13, and testing data consists of clean read speech from children in the same age range. To tackle inter-speaker acoustic variability, both speaker adaptive training based on feature space maximum likelihood linear regression and vocal tract length normalization are adopted. Experimental results show that both the DNN-HMM and SGMM systems achieve very good recognition results, with the best results obtained by the DNN-HMM system.