Combined Acoustic and Pronunciation Modelling for Non-Native Speech Recognition
In this paper, we present several adaptation methods for non-native speech
recognition. We have tested pronunciation modelling, MLLR and MAP non-native
pronunciation adaptation, and HMM model retraining on the HIWIRE
foreign-accented English speech database. The "phonetic confusion" scheme we
have developed associates each spoken phone with several sequences of
confused phones. In our experiments, we have used different combinations of
acoustic models representing the canonical and the foreign pronunciations:
spoken and native models, and models adapted to the non-native accent with MAP
and MLLR. The joint use of pronunciation modelling and acoustic adaptation led
to further improvements in recognition accuracy. The best combination of the
above-mentioned techniques resulted in a relative word error reduction ranging
from 46% to 71%.
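As a rough illustration of the phonetic confusion idea described in this abstract, the sketch below expands a canonical pronunciation into variants by substituting, for each phone, one of several confused phone sequences. The confusion table, the phone symbols, and the function name `expand_pronunciation` are hypothetical and are not taken from the paper.

```python
from itertools import product

# Hypothetical confusion table (an assumption for illustration only):
# canonical phone -> alternative phone sequences a non-native speaker might produce.
CONFUSIONS = {
    "th": [("th",), ("s",), ("t",)],
    "ih": [("ih",), ("iy",)],
}


def expand_pronunciation(canonical, confusions):
    """Return all variants obtained by replacing each phone with one of its
    confused sequences; phones absent from the table are kept unchanged."""
    options = [confusions.get(phone, [(phone,)]) for phone in canonical]
    variants = []
    for choice in product(*options):
        # Flatten the chosen sequences into a single pronunciation.
        variants.append(tuple(ph for seq in choice for ph in seq))
    return variants


variants = expand_pronunciation(("th", "ih", "n"), CONFUSIONS)
for v in variants:
    print(" ".join(v))
```

In a recogniser, such variants would typically be added to the lexicon alongside the canonical entry; how the paper weights or prunes them is not stated in the abstract.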
Automatic Speech Segmentation Based on HMM
This contribution deals with automatic phoneme segmentation using HMMs. Automating the speech segmentation task is important for applications in which large amounts of data must be processed, so that manual segmentation is out of the question. In this paper we focus on automatic segmentation of recordings that will be used to create a triphone synthesis unit database. For speech synthesis, speech unit quality is crucial, so maximal segmentation accuracy is needed. In this work, different kinds of HMMs with various parameters have been trained, and their usefulness for automatic segmentation is discussed. Finally, segmentation accuracy tests of all models are presented.
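HMM-based segmentation is commonly done by forced alignment: the unit sequence is known, and dynamic programming finds the frame boundaries. The sketch below is a simplified illustration with one state per unit and no transition probabilities; these simplifications, the function name `force_align`, and the toy scores are assumptions, not the models evaluated in the paper.

```python
def force_align(frame_loglik, n_units):
    """Align frames to a known left-to-right sequence of units.

    frame_loglik[t][u] is the log-likelihood of frame t under unit u.
    Assumes len(frame_loglik) >= n_units. Returns (labels, boundaries),
    where boundaries are the frame indices at which a new unit starts.
    """
    n_frames = len(frame_loglik)
    neg_inf = float("-inf")
    score = [[neg_inf] * n_units for _ in range(n_frames)]
    advanced = [[False] * n_units for _ in range(n_frames)]
    score[0][0] = frame_loglik[0][0]  # the path must start in the first unit

    for t in range(1, n_frames):
        for u in range(n_units):
            stay = score[t - 1][u]
            advance = score[t - 1][u - 1] if u > 0 else neg_inf
            if advance > stay:
                score[t][u] = advance + frame_loglik[t][u]
                advanced[t][u] = True
            else:
                score[t][u] = stay + frame_loglik[t][u]

    # Backtrack from the last unit at the last frame.
    labels = [0] * n_frames
    u = n_units - 1
    for t in range(n_frames - 1, -1, -1):
        labels[t] = u
        if advanced[t][u]:
            u -= 1
    boundaries = [t for t in range(1, n_frames) if labels[t] != labels[t - 1]]
    return labels, boundaries


# Toy scores: frames 0-2 favour unit 0, frames 3-4 favour unit 1.
toy = [[-1, -5], [-1, -5], [-1, -5], [-5, -1], [-5, -1]]
labels, boundaries = force_align(toy, 2)
print(labels, boundaries)
```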
Speech Recognition on an FPGA Using Discrete and Continuous Hidden Markov Models
Speech recognition is a computationally demanding task, particularly the stage which uses Viterbi decoding to convert pre-processed speech data into words or sub-word units. Any device that can reduce the load on, for example, a PC's processor is advantageous. Hence we present FPGA implementations of the decoder based alternately on discrete and continuous hidden Markov models (HMMs) representing monophones. We demonstrate that the discrete version can process speech nearly 5,000 times faster than real time, using just 12% of the slices of a Xilinx Virtex XCV1000, but with a lower recognition rate than the continuous implementation, which is 75 times faster than real time and occupies 45% of the same device.
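For readers unfamiliar with the decoding stage mentioned here, the following is a minimal software sketch of Viterbi decoding for a discrete HMM in the log domain. It is a generic textbook formulation, not the FPGA design from the paper; the two-state model and its probabilities are invented for illustration, and the function assumes a non-empty observation sequence.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_path, best_log_prob) for a discrete HMM.

    log_start[s], log_trans[p][s] and log_emit[s][o] are log probabilities.
    """
    # best[t][s]: log probability of the best path ending in state s at time t.
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []
    for o in obs[1:]:
        scores, ptrs = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            ptrs[s] = prev
        best.append(scores)
        back.append(ptrs)

    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for ptrs in reversed(back):
        path.append(ptrs[path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical two-state model with binary observation symbols.
states = ["a", "b"]
log_start = {"a": math.log(0.5), "b": math.log(0.5)}
log_trans = {
    "a": {"a": math.log(0.9), "b": math.log(0.1)},
    "b": {"a": math.log(0.1), "b": math.log(0.9)},
}
log_emit = {
    "a": {0: math.log(0.9), 1: math.log(0.1)},
    "b": {0: math.log(0.1), 1: math.log(0.9)},
}
path, score = viterbi([0, 0, 1, 1], states, log_start, log_trans, log_emit)
print(path, math.exp(score))
```

Working with sums of log probabilities rather than products avoids numerical underflow on long utterances.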
Implementing a simple continuous speech recognition system on an FPGA
Speech recognition is a computationally demanding task, particularly the stage which uses Viterbi decoding to convert pre-processed speech data into words or sub-word units. We present an FPGA implementation of the decoder based on continuous hidden Markov models (HMMs) representing monophones, and demonstrate that it can process speech 75 times faster than real time, using 45% of the slices of a Xilinx Virtex XCV100.
Improving large vocabulary continuous speech recognition by combining GMM-based and reservoir-based acoustic modeling
In earlier work we have shown that good phoneme recognition is possible with a so-called reservoir, a special type of recurrent neural network. In this paper, different architectures based on Reservoir Computing (RC) for large vocabulary continuous speech recognition are investigated. Besides experiments with HMM hybrids, it is shown that an RC-HMM tandem can achieve the same recognition accuracy as a classical HMM, which is a promising result for such a fairly new paradigm. It is also demonstrated that a state-level combination of the scores of the tandem and the baseline HMM leads to a significant improvement over the baseline. A relative word error rate reduction on the order of 20% is possible.
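The abstract does not say how the state-level scores are combined. One common choice, assumed here purely for illustration, is a weighted sum of per-state log-likelihoods. The sketch also shows how a relative word error rate reduction is computed. The function names, the weight, and all numbers are hypothetical.

```python
def combine_state_scores(baseline_loglik, tandem_loglik, weight=0.5):
    """Log-linear interpolation of two per-state log-likelihood tables.

    `weight` is the share given to the tandem system (assumed tunable).
    """
    return {
        state: (1.0 - weight) * baseline_loglik[state] + weight * tandem_loglik[state]
        for state in baseline_loglik
    }


def relative_wer_reduction(baseline_wer, new_wer):
    """Relative reduction: the fraction of the baseline errors that is removed."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical scores for two HMM states on a single frame.
combined = combine_state_scores({"s1": -2.0, "s2": -4.0}, {"s1": -4.0, "s2": -1.0})
print(combined)

# Hypothetical example: a drop from 10.0% to 8.0% WER is a 20% relative reduction.
print(relative_wer_reduction(10.0, 8.0))
```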
Language identification through parallel phone recognition, by Christine S. Chou
Thesis (M.Eng.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1994. Includes bibliographical references (leaves 32-33).