

Combined Acoustic and Pronunciation Modelling for Non-Native Speech Recognition

Author: Bouselmi Ghazi
Fohr Dominique
Illina Irina
Publication date: 27/08/2007

In this paper, we present several adaptation methods for non-native speech recognition. We tested pronunciation modelling, MLLR and MAP non-native pronunciation adaptation, and HMM model retraining on the HIWIRE foreign-accented English speech database. The "phonetic confusion" scheme we developed associates with each spoken phone several sequences of confused phones. In our experiments, we used different combinations of acoustic models representing the canonical and the foreign pronunciations: spoken and native models, and models adapted to the non-native accent with MAP and MLLR. The joint use of pronunciation modelling and acoustic adaptation led to further improvements in recognition accuracy. The best combination of the above-mentioned techniques resulted in a relative word error reduction ranging from 46% to 71%.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HTK - Tutorial (Part I + II)

Author: Schiel Florian
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 01/01/1997

Open Access LMU

Knowledge Resources in Automatic Speech Recognition and Understanding for Romanian Language

Author: Corneliu Octavian Dumitru
Diana Mihaela Militaru
Inge Gavat
Publication venue: IntechOpen
Publication date: 01/11/2008

IntechOpen

Crossref

Automatic Speech Segmentation Based on HMM

Author: Kroul M.
Publication venue: Společnost pro radioelektronické inženýrství
Publication date: 01/01/2007

This contribution deals with the problem of automatic phoneme segmentation using HMMs. Automating the speech segmentation task is important for applications where a large amount of data must be processed, so manual segmentation is out of the question. In this paper we focus on automatic segmentation of recordings that will be used to create a triphone synthesis unit database. For speech synthesis, speech unit quality is a crucial aspect, so maximal segmentation accuracy is needed. In this work, different kinds of HMMs with various parameters have been trained, and their usefulness for automatic segmentation is discussed. At the end of this work, segmentation accuracy tests of all models are presented.

Directory of Open Access Journals

DSpace@TUL

Digital library of Brno University of Technology

Speech Recognition on an FPGA Using Discrete and Continuous Hidden Markov Models

Author: Melnikoff Stephen Jonathan
Quigley Steven Francis
Russell Martin
Publication venue: Springer Verlag
Publication date: 01/01/2002

Speech recognition is a computationally demanding task, particularly the stage which uses Viterbi decoding for converting pre-processed speech data into words or sub-word units. Any device that can reduce the load on, for example, a PC’s processor, is advantageous. Hence we present FPGA implementations of the decoder based alternately on discrete and continuous hidden Markov models (HMMs) representing monophones. We demonstrate that the discrete version can process speech nearly 5,000 times faster than real time, using just 12% of the slices of a Xilinx Virtex XCV1000, but with a lower recognition rate than the continuous implementation, which is 75 times faster than real time and occupies 45% of the same device.
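To make the Viterbi decoding stage mentioned in this abstract concrete, the following is a minimal software sketch for a discrete-observation HMM. It is not the authors' FPGA design: the function name `viterbi`, the two-state toy model, and all probabilities below are assumptions chosen purely for illustration.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_path) for a discrete-observation HMM."""
    # delta[s]: best log-probability of any state path ending in s so far.
    delta = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    backptrs = []
    for o in obs[1:]:
        new_delta, ptr = {}, {}
        for s in states:
            # Pick the predecessor that maximises the path score into s.
            prev = max(states, key=lambda p: delta[p] + log_trans[p][s])
            new_delta[s] = delta[prev] + log_trans[prev][s] + log_emit[s][o]
            ptr[s] = prev
        delta = new_delta
        backptrs.append(ptr)
    # Trace back from the best final state.
    last = max(states, key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path

def logs(table):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}

# Hypothetical two-state model with observation symbols 0 and 1.
states = ["a", "b"]
log_start = logs({"a": 0.5, "b": 0.5})
log_trans = {"a": logs({"a": 0.7, "b": 0.3}), "b": logs({"a": 0.4, "b": 0.6})}
log_emit = {"a": logs({0: 0.9, 1: 0.1}), "b": logs({0: 0.2, 1: 0.8})}

log_prob, path = viterbi([0, 0, 1, 1], states, log_start, log_trans, log_emit)
print(path, math.exp(log_prob))
```

Working in log-probabilities replaces multiplications with additions and avoids numerical underflow on long utterances. The inner loop over states has no dependencies between states at a given time step, which is what makes this stage a natural candidate for parallel hardware.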

University of Birmingham Research Portal

Implementing a simple continuous speech recognition system on an FPGA

Author: Melnikoff Stephen Jonathan
Quigley Steven Francis
Russell Martin
Publication venue: IEEE
Publication date: 01/01/2002

Speech recognition is a computationally demanding task, particularly the stage which uses Viterbi decoding for converting pre-processed speech data into words or sub-word units. We present an FPGA implementation of the decoder based on continuous hidden Markov models (HMMs) representing monophones, and demonstrate that it can process speech 75 times real time, using 45% of the slices of a Xilinx Virtex XCV100.

University of Birmingham Research Portal

Improving large vocabulary continuous speech recognition by combining GMM-based and reservoir-based acoustic modeling

Author: Demuynck Kris
Martens Jean-Pierre
Triefenbach Fabian
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2012

In earlier work we have shown that good phoneme recognition is possible with a so-called reservoir, a special type of recurrent neural network. In this paper, different architectures based on Reservoir Computing (RC) for large vocabulary continuous speech recognition are investigated. Besides experiments with HMM hybrids, it is shown that an RC-HMM tandem can achieve the same recognition accuracy as a classical HMM, which is a promising result for such a fairly new paradigm. It is also demonstrated that a state-level combination of the scores of the tandem and the baseline HMM leads to a significant improvement over the baseline. A word error rate reduction of the order of 20% relative is possible.
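As a reading aid for the "20% relative" figure, the sketch below shows how a relative word error rate reduction is conventionally computed. The function name and the example error rates are hypothetical and are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Fraction by which new_wer improves on baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer

# Hypothetical numbers: a baseline at 10.0% WER improved to 8.0% WER.
reduction = relative_wer_reduction(10.0, 8.0)
print(f"{reduction:.0%} relative reduction")
```

A relative reduction is measured against the baseline error rate, so the same absolute gain of two points would count as a smaller relative reduction on a weaker baseline.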

Crossref

Ghent University Academic Bibliography

Language identification through parallel phone recognition

Author: Chou Christine S. (Christine Susan)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1994

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1994. Includes bibliographical references (leaves 32-33). M.Eng.

DSpace@MIT