105 research outputs found

A minimax search algorithm for robust continuous speech recognition

Author: Hirose K
Huo Q
Jiang H
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2000

In this paper, we propose a novel implementation of a minimax decision rule for continuous density hidden Markov-model-based robust speech recognition. By combining the idea of the minimax decision rule with a normal Viterbi search, we derive a recursive minimax search algorithm, where the minimax decision rule is repetitively applied to determine the partial paths during the search procedure. Because of the intrinsic nature of a recursive search, the proposed method can be easily extended to perform continuous speech recognition. Experimental results on Japanese isolated digits and TIDIGITS, where the mismatch between training and testing conditions is caused by additive white Gaussian noise, show the viability and efficiency of the proposed minimax search algorithm.

CiteSeerX

HKU Scholars Hub

Study of noise robustness of First Formant Bandwidth (F1BW) method

Author: M.P Paulraj
Mohd Yusof Shahrul Azmi
Nazri Ahmad
Siraj Fadzilah
Yaacob S.
Publication date: 01/01/2011

The performance of speech recognition applications under adverse noise conditions is a recurring research topic regardless of the language used. Applications that use vowel phonemes require a high degree of Standard Malay vowel recognition capability. In Malaysia, research in vowel recognition is still lacking, especially on Malay vowels, speaker-independent systems, recognition robustness, and algorithm speed and accuracy. This paper presents a noise robustness study of an improved vowel feature extraction method called First Formant Bandwidth (F1BW) using three classifiers: Multinomial Logistic Regression (MLR), K-Nearest Neighbors (k-NN) and Linear Discriminant Analysis (LDA). Results show that LDA performs best in overall vowel classification, outperforming MLR and k-NN in terms of robustness.

UUM Repository

Scene Analysis for Speech and Audio Recognition

Author: Ellis Daniel P. W.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2003

Focuses on several different approaches to handling sound mixtures: computational auditory scene analysis, multicondition training, and parallel-model-based techniques such as HMM decomposition and multisource decoding.

Columbia University Academic Commons

On the Use of a Multilingual Neural Network Front-End

Author: Fissore L.
Gemello R.
Laface Pietro
Mana F.
Scanzio Stefano
Publication venue: ISCA
Publication date: 01/01/2008

PORTO Publications Open Repository TOrino

Parallel model combination and word recognition in soccer audio

Author: Jackson PJB
Longton JH
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

The audio scene from broadcast soccer can be used for identifying highlights from the game. Audio cues derived from these sources provide valuable information about game events, as can the detection of key words used by the commentators. In this paper we investigate the feasibility of incorporating both commentator word recognition and information about the additive background noise in an HMM structure. A limited set of audio cues, extracted from data collected from the 2006 FIFA World Cup, is used to create an extension to the Aurora-2 database. The new database is then tested with various PMC models and compared to the standard baseline, clean and multi-condition training methods. It is found that incorporating SNR and noise type information into the PMC process is beneficial to recognition performance.

Crossref

University of Surrey

Surrey Research Insight

On the Use of a Multilingual Neural Network Front-End

Author: Fissore L.
Gemello R.
Laface Pietro
Mana F.
Scanzio Stefano
Publication venue: 'The International Fiscal Association of Korea'
Publication date: 01/01/2008

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PLASER: Pronunciation Learning via Automatic Speech Recognition

Author: Brian Mak
Brian Mak Manhung
Fong-ho Chong
Jacqueline Lo
Jimmy Wong
Ka-yee Leung
Kin-wah Chan
Manhung Siu
Mimi Ng
Simon Ho
Yik-cheung Tam
Yu-chung Chan
Publication date: 01/01/2003

PLASER is a multimedia tool with instant feedback designed to teach English pronunciation to high-school students in Hong Kong whose mother tongue is Cantonese Chinese. The objective is to teach correct pronunciation and not to assess a student's overall pronunciation quality. Major challenges related to speech recognition technology include allowance for non-native accents, reliable and corrective feedback, and visualization of errors.

CiteSeerX

Hong Kong University of Science and Technology Institutional Repository