Search CORE

444 research outputs found

Sequence-discriminative training of deep neural networks

Author: Burget Lukás
Ghoshal Arnab
Povey Daniel
Veselý Karel
Publication venue
Publication date: 01/08/2013
Field of study

Automatic speech recognition with deep neural networks for impaired speech

Author: España-i-Bonet Cristina
Rodríguez Fonollosa José Adrián
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

The final publication is available at https://link.springer.com/chapter/10.1007%2F978-3-319-49169-1_10Automatic Speech Recognition has reached almost human performance in some controlled scenarios. However, recognition of impaired speech is a difficult task for two main reasons: data is (i) scarce and (ii) heterogeneous. In this work we train different architectures on a database of dysarthric speech. A comparison between architectures shows that, even with a small database, hybrid DNN-HMM models outperform classical GMM-HMM according to word error rate measures. A DNN is able to improve the recognition word error rate a 13% for subjects with dysarthria with respect to the best classical architecture. This improvement is higher than the one given by other deep neural networks such as CNNs, TDNNs and LSTMs. All the experiments have been done with the Kaldi toolkit for speech recognition for which we have adapted several recipes to deal with dysarthric speech and work on the TORGO database. These recipes are publicly available.Peer ReviewedPostprint (author's final draft

UPCommons. Portal del coneixement obert de la UPC

Support Vector Machines for Speech Recognition

Author: Ganapathiraju Aravind
Publication venue: Scholars Junction
Publication date: 01/01/2002
Field of study

Hidden Markov models (HMM) with Gaussian mixture observation densities are the dominant approach in speech recognition. These systems typically use a representational model for acoustic modeling which can often be prone to overfitting and does not translate to improved discrimination. We propose a new paradigm centered on principles of structural risk minimization using a discriminative framework for speech recognition based on support vector machines (SVMs). SVMs have the ability to simultaneously optimize the representational and discriminative ability of the acoustic classifiers. We have developed the first SVM-based large vocabulary speech recognition system that improves performance over traditional HMM-based systems. This hybrid system achieves a state-of-the-art word error rate of 10.6% on a continuous alphadigit task ? a 10% improvement relative to an HMM system. On SWITCHBOARD, a large vocabulary task, the system improves performance over a traditional HMM system from 41.6% word error rate to 40.6%. This dissertation discusses several practical issues that arise when SVMs are incorporated into the hybrid system

CiteSeerX

Mississippi State University Libraries ETD database

Scholars Junction - Mississippi State University Institutional Repository

A new model-discriminant training algorithm for hybrid NN-HMM systems

Author: Caspary P.
Reichl W.
Ruske G.
Publication venue: Sonstige Einrichtungen. DFKI Deutsches Forschungszentrum für Künstliche Intelligenz
Publication date: 01/01/1996
Field of study

This paper describes a hybrid system for continuous speech recognition consisting of a neural network (NN) and a hidden Markov model (HMM). The system is based on a multilayer perceptron, which approximates the a-posteriori probability of a sequence of states, derived from semi-continuous hidden Markov models. The classification is based on a total score for each hybrid model, attained from a Viterbi search on the state probabilities. Due to the unintended discrimination between the states in each model, a new training algorithm for the hybrid neural networks is presented. The utilized error function approximates the misclassification rate of the hybrid system. The discriminance between the correct and the incorrect models is optimized during the training by the "Generalized Probabilistic Descent Algorithm\u27;, resulting in a minimum classification error. No explicit target values for the neural net output nodes are used, as in the usual backpropagation algorithm with a quadratic error function. In basic experiments up to 56% recognition rate were achieved on a vowel classification task and up to 69 % on a consonant cluster classification task

Universaar

Acronym

An Efficient Hidden Markov Model for Offline Handwritten Numeral Recognition

Author: Hemanth S.
Saritha B. S.
Publication venue
Publication date: 01/01/2009
Field of study

Traditionally, the performance of ocr algorithms and systems is based on the recognition of isolated characters. When a system classifies an individual character, its output is typically a character label or a reject marker that corresponds to an unrecognized character. By comparing output labels with the correct labels, the number of correct recognition, substitution errors misrecognized characters, and rejects unrecognized characters are determined. Nowadays, although recognition of printed isolated characters is performed with high accuracy, recognition of handwritten characters still remains an open problem in the research arena. The ability to identify machine printed characters in an automated or a semi automated manner has obvious applications in numerous fields. Since creating an algorithm with a one hundred percent correct recognition rate is quite probably impossible in our world of noise and different font styles, it is important to design character recognition algorithms with these failures in mind so that when mistakes are inevitably made, they will at least be understandable and predictable to the person working with theComment: 6pages, 5 figure

arXiv.org e-Print Archive

CiteSeerX