Improving large vocabulary continuous speech recognition by combining GMM-based and reservoir-based acoustic modeling

Author: Demuynck Kris
Martens Jean-Pierre
Triefenbach Fabian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

In earlier work we have shown that good phoneme recognition is possible with a so-called reservoir, a special type of recurrent neural network. In this paper, different architectures based on Reservoir Computing (RC) for large vocabulary continuous speech recognition are investigated. Besides experiments with HMM hybrids, it is shown that an RC-HMM tandem can achieve the same recognition accuracy as a classical HMM, which is a promising result for such a fairly new paradigm. It is also demonstrated that a state-level combination of the scores of the tandem and the baseline HMM leads to a significant improvement over the baseline. A relative word error rate reduction of the order of 20% is possible.

Crossref

Ghent University Academic Bibliography

Non-Native Pronunciation Variation Modeling for Automatic Speech Recognition

Author: Hong Kook Kim
Mina Kim
Yoo Rhee Oh
Publication venue: 'IntechOpen'
Publication date: 16/08/2010

IntechOpen

Data Mining in Personalized Speech Disorders Therapy Optimization

Author: Danubianu Mirela
Stefan Gheorghe Pentiuc
Tobolcea Iolanda
Publication venue: 'IntechOpen'
Publication date: 21/01/2011

IntechOpen

Articulatory features for robust visual speech recognition

Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2004

Crossref

Phoneme and sentence-level ensembles for speech recognition

Author: Bengio Samy
Dimitrakakis Christos
Publication date: 01/01/2011

We address the question of whether and how boosting and bagging can be used for speech recognition. To do this, we compare two different boosting schemes, one at the phoneme level and one at the utterance level, with a phoneme-level bagging scheme. We control for many parameters and other choices, such as the state inference scheme used. In an unbiased experiment, we clearly show that the gain of boosting methods compared to a single hidden Markov model is in all cases only marginal, while bagging significantly outperforms all other methods. We thus conclude that bagging methods, which have so far been overlooked in favour of boosting, should be examined more closely as a potentially useful ensemble learning technique for speech recognition.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

Chalmers Research

Hochschulschriftenserver - Universität Frankfurt am Main

An Analysis of HMM-based Prediction of Articulatory Movements

Author: Ling Zhen-Hua
Richmond Korin
Yamagishi Junichi
Publication venue: 'Elsevier BV'
Publication date: 01/10/2010

Crossref

Edinburgh Research Explorer

Neural-Symbolic Temporal Decision Trees for Multivariate Time Series Classification

Author: Pagliarini Giovanni
Scaboro Simone
Sciavicco Guido
Serra Giuseppe
Stan Ionel Eduard
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 29th International Symposium on Temporal Representation and Reasoning (TIME 2022)
Publication date: 01/01/2022

Multivariate time series classification is a widely known problem, and its applications are ubiquitous. Due to their strong generalization capability, neural networks have proven to be very powerful for the task, but their applicability is often limited by their intrinsic black-box nature. Recently, temporal decision trees have been shown to be a serious alternative to neural networks for the same task in terms of classification performance, while attaining higher levels of transparency and interpretability. In this work, we propose an initial approach to neural-symbolic temporal decision trees, that is, a hybrid method that leverages both the ability of neural networks to capture temporal patterns and the flexibility of temporal decision trees to make decisions on intervals based on (possibly externally computed) temporal features. Although based on a proof-of-concept implementation, neural-symbolic temporal decision trees show promising results in our experiments on public datasets.

Archivio istituzionale della ricerca - Università degli Studi di Udine

Dagstuhl Research Online Publication Server

Archivio istituzionale della ricerca - Università di Ferrara

The 5th Conference of PhD Students in Computer Science

Publication date: 01/01/2006

University of Szeged

MISPRONUNCIATION DETECTION AND DIAGNOSIS IN MANDARIN ACCENTED ENGLISH SPEECH

Author: Khanal Subash
Publication venue: UKnowledge
Publication date: 01/01/2020

This work presents the development, implementation, and evaluation of a Mispronunciation Detection and Diagnosis (MDD) system, with application to pronunciation evaluation of Mandarin-accented English speech. A comprehensive detection and diagnosis of errors in the Electromagnetic Articulography corpus of Mandarin-Accented English (EMA-MAE) was performed using the expert phonetic transcripts and an Automatic Speech Recognition (ASR) system. Articulatory features derived from the parallel kinematic data available in the EMA-MAE corpus were used to identify the most significant articulatory error patterns seen in L2 speakers during common mispronunciations. Using both acoustic and articulatory information, an ASR-based MDD system was built and evaluated across different feature combinations and Deep Neural Network (DNN) architectures. The MDD system captured mispronunciation errors with a detection accuracy of 82.4%, a diagnostic accuracy of 75.8%, and a false rejection rate of 17.2%. The results demonstrate the advantage of using articulatory features in revealing the significant contributors to mispronunciation as well as improving the performance of MDD systems.

University of Kentucky