283 research outputs found

Connectionist probability estimators in HMM speech recognition

Author: Bourlard Herve
Cohen Michael
Franco Horacio
Morgan Nelson
Renals Steve
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1994

The authors are concerned with integrating connectionist networks into a hidden Markov model (HMM) speech recognition system. This is achieved through a statistical interpretation of connectionist networks as probability estimators. They review the basis of HMM speech recognition and point out the possible benefits of incorporating connectionist networks. Issues necessary to the construction of a connectionist HMM recognition system are discussed, including choice of connectionist probability estimator. They describe the performance of such a system using a multilayer perceptron probability estimator evaluated on the speaker-independent DARPA Resource Management database. In conclusion, they show that a connectionist component improves a state-of-the-art HMM system
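The statistical interpretation mentioned in this abstract is commonly put to work by dividing network posteriors by class priors, which gives scaled likelihoods that an HMM decoder can use in place of emission densities. The following is a minimal sketch of that conversion, not the authors' code; the function name `scaled_log_likelihoods`, the `floor` parameter, and the example numbers are assumptions made for illustration.

```python
import math

def scaled_log_likelihoods(posteriors, priors, floor=1e-10):
    """Return log(P(q|x) / P(q)) for each state q.

    posteriors: network outputs P(q|x) for one frame (assumed to sum to 1).
    priors: relative state frequencies P(q), e.g. counted from training alignments.
    floor: hypothetical safeguard against log(0).
    """
    if len(posteriors) != len(priors):
        raise ValueError("posteriors and priors must have the same length")
    return [
        math.log(max(post, floor)) - math.log(max(prior, floor))
        for post, prior in zip(posteriors, priors)
    ]

# Illustrative values only: three states, one frame.
posteriors = [0.7, 0.2, 0.1]
priors = [0.5, 0.3, 0.2]
scores = scaled_log_likelihoods(posteriors, priors)
```

A positive score means the network finds the state more likely than its prior alone would suggest; a negative score means less likely.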

Edinburgh Research Archive

Sequence-discriminative training of deep neural networks

Author: Burget Lukás
Ghoshal Arnab
Povey Daniel
Veselý Karel
Publication date: 01/08/2013

Edinburgh Research Explorer

Efficient training algorithms for HMMs using incremental estimation

Author: Gotoh Y.
Hochberg M.M.
Silverman H.F.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1998

Typically, parameter estimation for a hidden Markov model (HMM) is performed using an expectation-maximization (EM) algorithm with the maximum-likelihood (ML) criterion. The EM algorithm is an iterative scheme that is well-defined and numerically stable, but convergence may require a large number of iterations. For speech recognition systems utilizing large amounts of training material, this results in long training times. This paper presents an incremental estimation approach to speed up the training of HMMs without any loss of recognition performance. The algorithm selects a subset of data from the training set, updates the model parameters based on the subset, and then iterates the process until convergence of the parameters. The advantage of this approach is a substantial increase in the number of iterations of the EM algorithm per training token, which leads to faster training. In order to achieve reliable estimation from a small fraction of the complete data set at each iteration, two training criteria are studied: ML and maximum a posteriori (MAP) estimation. Experimental results show that the training of the incremental algorithms is substantially faster than the conventional (batch) method and suffers no loss of recognition performance. Furthermore, the incremental MAP-based training algorithm improves performance over the batch version.
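The subset-then-update loop described in this abstract can be sketched as follows. This is a simplified illustration, not the paper's algorithm: a 1-D Gaussian mixture stands in for the HMM, only the ML criterion is shown, and all names (`em_step`, `incremental_em`, `subset_size`, `var_floor`) and the synthetic data are assumptions.

```python
import math
import random

def gauss(x, mean, var):
    """Density of a 1-D Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def em_step(data, weights, means, variances, var_floor=1e-3):
    """One ML EM update of a 1-D Gaussian mixture using only `data`."""
    k = len(weights)
    occ = [0.0] * k  # soft counts per component
    s1 = [0.0] * k   # responsibility-weighted sum of x
    s2 = [0.0] * k   # responsibility-weighted sum of x^2
    for x in data:
        joint = [weights[j] * gauss(x, means[j], variances[j]) for j in range(k)]
        total = sum(joint) or 1e-300
        for j in range(k):
            g = joint[j] / total
            occ[j] += g
            s1[j] += g * x
            s2[j] += g * x * x
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        c = max(occ[j], 1e-12)
        m = s1[j] / c
        new_w.append(occ[j] / len(data))
        new_m.append(m)
        new_v.append(max(s2[j] / c - m * m, var_floor))
    return new_w, new_m, new_v

def incremental_em(data, weights, means, variances, subset_size, passes, seed=0):
    """Update parameters after every subset instead of after the full set."""
    rng = random.Random(seed)
    data = list(data)
    for _ in range(passes):
        rng.shuffle(data)
        for start in range(0, len(data), subset_size):
            subset = data[start:start + subset_size]
            weights, means, variances = em_step(subset, weights, means, variances)
    return weights, means, variances

# Synthetic data: two clusters centred at -2 and 3 (illustrative only).
gen = random.Random(1)
data = [gen.gauss(-2.0, 1.0) for _ in range(200)] + [gen.gauss(3.0, 1.0) for _ in range(200)]
weights_hat, means_hat, vars_hat = incremental_em(
    data, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0], subset_size=50, passes=5
)
```

With 400 points and subsets of 50, each pass performs eight parameter updates where batch EM would perform one, which is the source of the speed-up the abstract describes. Each update here relies on one small subset alone, which is where the abstract's concern about reliable estimation from a fraction of the data arises.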

Crossref

White Rose Research Online

A lecture transcription system combining neural network acoustic and language models

Author: Bell P
Hori C
McInnes F
Renals S
Swietojanski P
Wu Y
Yamamoto H
Publication date: 01/01/2013

Edinburgh Research Explorer

High-dimensional sequence transduction

Author: Bengio Yoshua
Boulanger-Lewandowski Nicolas
Vincent Pascal
Publication date: 09/12/2012

We investigate the problem of transforming an input sequence into a high-dimensional output sequence in order to transcribe polyphonic audio music into symbolic notation. We introduce a probabilistic model based on a recurrent neural network that is able to learn realistic output distributions given the input and we devise an efficient algorithm to search for the global mode of that distribution. The resulting method produces musically plausible transcriptions even under high levels of noise and drastically outperforms previous state-of-the-art approaches on five datasets of synthesized sounds and real recordings, approximately halving the test error rate

arXiv.org e-Print Archive

CiteSeerX

Crossref

Phone deactivation pruning in large vocabulary continuous speech recognition

Author: Renals Steve
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1996

In this letter, we introduce a new pruning strategy for large vocabulary continuous speech recognition based on direct estimates of local posterior phone probabilities. This approach is well suited to hybrid connectionist/hidden Markov model systems. Experiments on the Wall Street Journal task using a 20000 word vocabulary and a trigram language model have demonstrated that phone deactivation pruning can increase the speed of recognition-time search by up to a factor of 10, with a relative increase in error rate of less than 2%
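One way to picture pruning on local posterior phone probabilities is to drop, at each frame, every search hypothesis whose phone falls below a posterior threshold. The sketch below assumes that reading; the threshold value, the dictionary layout, and the function names are hypothetical and are not taken from the letter.

```python
def active_phones(frame_posteriors, threshold):
    """Return the set of phones whose local posterior reaches the threshold."""
    return {phone for phone, p in frame_posteriors.items() if p >= threshold}

def prune_hypotheses(hypotheses, frame_posteriors, threshold=1e-3):
    """Keep only hypotheses whose current phone is still active at this frame."""
    active = active_phones(frame_posteriors, threshold)
    return [hyp for hyp in hypotheses if hyp["phone"] in active]

# Illustrative frame: posteriors from a (hypothetical) phone classifier.
frame_posteriors = {"ah": 0.90, "eh": 0.0995, "s": 0.0004, "t": 0.0001}
hypotheses = [
    {"word": "up", "phone": "ah"},
    {"word": "set", "phone": "s"},
    {"word": "edge", "phone": "eh"},
    {"word": "top", "phone": "t"},
]
survivors = prune_hypotheses(hypotheses, frame_posteriors)
```

Because the decision depends only on the current frame's posteriors, it can be made before any path scores are extended, which is what makes this kind of pruning cheap.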

Edinburgh Research Archive

Hybrid HMM/ANN Systems for Speech Recognition: Overview and New Research Directions

Author: Bourlard Hervé
Morgan Nelson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/03/2006

Infoscience - École polytechnique fédérale de Lausanne

Whole Word Phonetic Displays for Speech Articulation Training

Author: Meng Fansheng
Publication venue: ODU Digital Commons
Publication date: 01/04/2006

The main objective of this dissertation is to investigate and develop speech recognition technologies for speech training for people with hearing impairments. During the course of this work, a computer aided speech training system for articulation speech training was also designed and implemented. The speech training system places emphasis on displays to improve children's pronunciation of isolated Consonant-Vowel-Consonant (CVC) words, with displays at both the phonetic level and whole word level. This dissertation presents two hybrid methods for combining Hidden Markov Models (HMMs) and Neural Networks (NNs) for speech recognition. The first method uses NN outputs as posterior probability estimators for HMMs. The second method uses NNs to transform the original speech features to normalized features with reduced correlation. Based on experimental testing, both of the hybrid methods give higher accuracy than standard HMM methods. The second method, using the NN to create normalized features, outperforms the first method in terms of accuracy. Several graphical displays were developed to provide real time visual feedback to users, to help them to improve and correct their pronunciations.

Old Dominion University

Activity Recognition Using Hybrid Generative/Discriminative Models on Home Environments Using Binary Sensors

Author: Ordóñez Morales Francisco Javier
Sanchis de Miguel María Araceli
Toledo Heras María Paula de
Publication venue: 'MDPI AG'
Publication date: 01/01/2013

Activities of daily living are good indicators of elderly health status, and activity recognition in smart environments is a well-known problem that has been previously addressed by several studies. In this paper, we describe the use of two powerful machine learning schemes, ANN (Artificial Neural Network) and SVM (Support Vector Machines), within the framework of HMM (Hidden Markov Model) in order to tackle the task of activity recognition in a home setting. The output scores of the discriminative models, after processing, are used as observation probabilities of the hybrid approach. We evaluate our approach by comparing these hybrid models with other classical activity recognition methods using five real datasets. We show how the hybrid models achieve significantly better recognition performance, with significance level p < 0.05, proving that the hybrid approach is better suited for the addressed domain. This work has been supported by the Ambient Assisted Living Programme (Joint Initiative by the European Commission and EU Member States) under the Trainutri (Training and nutrition senior social platform) Project (AAL-2009-2-129) and by the Spanish Government under i-Support (Intelligent Agent Based Driver Decision Support) Project (TRA2011-29454-C03-03).
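A rough sketch of how classifier scores might serve as HMM observation probabilities: a log-domain Viterbi pass in which each frame's emission term is the classifier's score for that state. The sketch assumes the scores have already been processed into probabilities; the paper's own processing step is not reproduced, and the two-state model and all numbers below are invented for illustration.

```python
import math

def viterbi(log_init, log_trans, log_obs):
    """Most likely state sequence given per-frame log observation scores.

    log_init[s]: log initial probability of state s.
    log_trans[r][s]: log probability of moving from state r to state s.
    log_obs[t][s]: log classifier score for state s at frame t.
    """
    n_states = len(log_init)
    delta = [log_init[s] + log_obs[0][s] for s in range(n_states)]
    backptrs = []
    for frame in log_obs[1:]:
        new_delta, ptr = [], []
        for s in range(n_states):
            best_prev = max(range(n_states), key=lambda r: delta[r] + log_trans[r][s])
            ptr.append(best_prev)
            new_delta.append(delta[best_prev] + log_trans[best_prev][s] + frame[s])
        delta = new_delta
        backptrs.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

def logs(values):
    return [math.log(v) for v in values]

# Hypothetical two-activity model with "sticky" transitions.
log_init = logs([0.6, 0.4])
log_trans = [logs([0.9, 0.1]), logs([0.1, 0.9])]
# Hypothetical per-frame classifier scores (rows sum to 1).
scores = [[0.9, 0.1], [0.4, 0.6], [0.9, 0.1], [0.2, 0.8], [0.2, 0.8]]
log_obs = [logs(row) for row in scores]
path = viterbi(log_init, log_trans, log_obs)
```

In this toy input the classifier alone would label the second frame as state 1, while the decoded path keeps it in state 0 because the transition model discourages brief switches.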

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo