Adjusted Viterbi training for hidden Markov models

Authors: Koloydenko A.; Lember J.
Publication date: 01/01/2005

To estimate the emission parameters in hidden Markov models one commonly uses the EM algorithm or a variant of it. Our primary motivation, however, is the Philips speech recognition system, wherein the EM algorithm is replaced by the Viterbi training algorithm. Viterbi training is faster and computationally less involved than EM, but it is also biased and need not even be consistent. We propose an alternative to Viterbi training -- adjusted Viterbi training -- that has the same order of computational complexity as Viterbi training but gives more accurate estimators. Elsewhere, we studied adjusted Viterbi training for a special case of mixtures, supporting the theory by simulations. This paper proves that adjusted Viterbi training is also possible for more general hidden Markov models. Comment: 45 pages, 2 figures
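For orientation, the baseline that the abstract contrasts with EM can be sketched as follows. This is a minimal illustration of plain (unadjusted) Viterbi training, not the adjusted variant the paper proposes. It assumes one-dimensional Gaussian emissions with a fixed standard deviation and a fixed transition matrix, so only the emission means are re-estimated. The function names `viterbi_path` and `viterbi_training` and all parameters are hypothetical.

```python
import math


def viterbi_path(obs, means, trans, init, sigma=1.0):
    """Most likely state path for a Gaussian-emission HMM (log domain)."""
    n_states = len(means)

    def log_emit(x, k):
        # Gaussian log-density up to an additive constant shared by all states.
        return -0.5 * ((x - means[k]) / sigma) ** 2 - math.log(sigma)

    score = [math.log(init[k]) + log_emit(obs[0], k) for k in range(n_states)]
    back = []
    for x in obs[1:]:
        prev, score, ptr = score, [], []
        for k in range(n_states):
            # Best predecessor of state k at this time step.
            best = max(range(n_states),
                       key=lambda j: prev[j] + math.log(trans[j][k]))
            ptr.append(best)
            score.append(prev[best] + math.log(trans[best][k]) + log_emit(x, k))
        back.append(ptr)

    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


def viterbi_training(obs, means, trans, init, n_iter=10):
    """Alternate Viterbi decoding with hard-assignment mean updates."""
    means = list(means)
    for _ in range(n_iter):
        path = viterbi_path(obs, means, trans, init)
        for k in range(len(means)):
            assigned = [x for x, s in zip(obs, path) if s == k]
            if assigned:  # keep the old mean if the state was never visited
                means[k] = sum(assigned) / len(assigned)
    return means
```

Each observation contributes to exactly one state's estimate, whereas EM weights every observation by its posterior state probabilities. That hard assignment is what makes the procedure cheap, and it is the source of the bias the abstract mentions.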

arXiv.org e-Print Archive

Adjusted Viterbi Training: a proof of concept

Authors: Koloydenko A.; Lember J.
Publication venue: Eurandom
Publication date: 01/01/2005

Repository TU/e

Pure OAI Repository

A minimax search algorithm for robust continuous speech recognition

Authors: Hirose K; Huo Q; Jiang H
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2000

In this paper, we propose a novel implementation of a minimax decision rule for continuous density hidden Markov-model-based robust speech recognition. By combining the idea of the minimax decision rule with a normal Viterbi search, we derive a recursive minimax search algorithm, where the minimax decision rule is repetitively applied to determine the partial paths during the search procedure. Because of the intrinsic nature of a recursive search, the proposed method can be easily extended to perform continuous speech recognition. Experimental results on Japanese isolated digits and TIDIGITS, where the mismatch between training and testing conditions is caused by additive white Gaussian noise, show the viability and efficiency of the proposed minimax search algorithm.

CiteSeerX

HKU Scholars Hub

Theory of Segmentation

Authors: Alexey Koloydenko; Jüri Lember; Kristi Kuljus
Publication venue: IntechOpen
Publication date: 01/04/2011

IntechOpen

Royal Holloway Research Online

Accuracy of MAP segmentation with hidden Potts and Markov mesh prior models via Path Constrained Viterbi Training, Iterated Conditional Modes and Graph Cut based algorithms

Authors: Baumgartner Josef; Flesia Ana Georgina; Gimenez Javier; Martinez Jorge
Publication date: 11/07/2013

In this paper, we study the statistical classification accuracy of two different Markov field environments for pixelwise image segmentation, considering the labels of the image as hidden states and solving the estimation of such labels as a solution of the MAP equation. The emission distribution is assumed to be the same in all models, and the difference lies in the Markovian prior hypothesis made over the labeling random field. The a priori labeling knowledge will be modeled with a) a second order anisotropic Markov Mesh and b) a classical isotropic Potts model. Under such models, we will consider three different segmentation procedures: 2D Path Constrained Viterbi training for the Hidden Markov Mesh, a Graph Cut based segmentation for the first order isotropic Potts model, and ICM (Iterated Conditional Modes) for the second order isotropic Potts model. We provide a unified view of all three methods, and investigate goodness of fit for classification, studying the influence of parameter estimation, computational gain, and extent of automation on the statistical measures Overall Accuracy, Relative Improvement and Kappa coefficient, allowing robust and accurate statistical analysis on synthetic and real-life experimental data coming from the field of Dental Diagnostic Radiography. All algorithms, using the learned parameters, generate good segmentations with little interaction when the images have a clear multimodal histogram. Suboptimal learning proves to be frail in the case of non-distinctive modes, which limits the complexity of usable models, and hence the achievable error rate as well. All Matlab code written is provided in a toolbox available for download from our website, following the Reproducible Research Paradigm.

arXiv.org e-Print Archive

CiteSeerX

Whole Word Phonetic Displays for Speech Articulation Training

Author: Meng Fansheng
Publication venue: ODU Digital Commons
Publication date: 01/04/2006

The main objective of this dissertation is to investigate and develop speech recognition technologies for speech training for people with hearing impairments. During the course of this work, a computer aided speech training system for articulation speech training was also designed and implemented. The speech training system places emphasis on displays to improve children's pronunciation of isolated Consonant-Vowel-Consonant (CVC) words, with displays at both the phonetic level and whole word level. This dissertation presents two hybrid methods for combining Hidden Markov Models (HMMs) and Neural Networks (NNs) for speech recognition. The first method uses NN outputs as posterior probability estimators for HMMs. The second method uses NNs to transform the original speech features to normalized features with reduced correlation. Based on experimental testing, both of the hybrid methods give higher accuracy than standard HMM methods. The second method, using the NN to create normalized features, outperforms the first method in terms of accuracy. Several graphical displays were developed to provide real time visual feedback to users, to help them to improve and correct their pronunciations.

Old Dominion University

Hidden Markov models with kernel density estimation of emission probabilities and their use in activity recognition

Authors: Piccardi M; Pérez O
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 11/10/2007

In this paper, we present a modified hidden Markov model with emission probabilities modelled by kernel density estimation and its use for activity recognition in videos. In the proposed approach, kernel density estimation of the emission probabilities is carried out simultaneously with the estimation of all the other model parameters by an adapted Baum-Welch algorithm. This allows us to retain maximum-likelihood estimation while overcoming the known limitations of mixtures of Gaussians in modelling certain probability distributions. Experiments on activity recognition have been performed on groundtruthed data from the CAVIAR video surveillance database and are reported in the paper. The error on the training and validation sets with kernel density estimation remains around 14-16%, while for the conventional Gaussian mixture approach it varies between 15 and 24%, strongly depending on the initial values chosen for the parameters. Overall, kernel density estimation proves capable of providing more flexible modelling of the emission probabilities and, unlike Gaussian mixtures, does not suffer from being highly parametric and difficult to initialise. © 2007 IEEE
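To make the idea of a kernel-density emission model concrete, here is a minimal sketch of a weighted Gaussian kernel density estimate for a single state. It is not the authors' adapted Baum-Welch procedure. The function name `kde_emission`, the one-dimensional observations, the fixed bandwidth and the per-sample weights are all assumptions introduced for illustration; the weights could, for instance, be the state occupancy probabilities of the training samples.

```python
import math


def kde_emission(x, samples, weights, bandwidth):
    """Weighted Gaussian kernel density estimate of an emission density at x.

    samples   -- training observations (1-D, hypothetical)
    weights   -- non-negative weight of each sample for this state
    bandwidth -- kernel standard deviation (assumed fixed)
    """
    total = sum(weights)
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    return sum(
        w * math.exp(-0.5 * ((x - s) / bandwidth) ** 2) / norm
        for s, w in zip(samples, weights)
    ) / total
```

Because the density is built directly from the samples, the only quantities to choose are the bandwidth and the weights. This is consistent with the abstract's remark that the approach avoids the initialisation sensitivity of Gaussian mixtures.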

Crossref

OPUS - University of Technology Sydney