Search CORE

1,633 research outputs found

Likelihood-Maximizing-Based Multiband Spectral Subtraction for Robust Speech Recognition

Author: Bagher BabaAli
Hossein Sameti
Mehran Safayani
Publication venue: Springer Nature
Publication date: 01/01/2009
Field of study

Automatic speech recognition performance degrades significantly when speech is affected by environmental noise. Nowadays, the major challenge is to achieve good robustness in adverse noisy conditions so that automatic speech recognizers can be used in real situations. Spectral subtraction (SS) is a well-known and effective approach; it was originally designed for improving the quality of speech signal judged by human listeners. SS techniques usually improve the quality and intelligibility of speech signal while speech recognition systems need compensation techniques to reduce mismatch between noisy speech features and clean trained acoustic model. Nevertheless, correlation can be expected between speech quality improvement and the increase in recognition accuracy. This paper proposes a novel approach for solving this problem by considering SS and the speech recognizer not as two independent entities cascaded together, but rather as two interconnected components of a single system, sharing the common goal of improved speech recognition accuracy. This will incorporate important information of the statistical models of the recognition engine as a feedback for tuning SS parameters. By using this architecture, we overcome the drawbacks of previously proposed methods and achieve better recognition accuracy. Experimental evaluations show that the proposed method can achieve significant improvement of recognition rates across a wide range of signal to noise ratios

Springer - Publisher Connector

Directory of Open Access Journals

Uncertainty in Signal Estimation and Stochastic Weighted Viterbi Algorithm: A Unified Framework to Address Robustness in Speech Recognition and Speaker Verification

Author: C. Garreton
C. Molina
F. Huenupan
N. Becerra Yoma
Publication venue: 'IntechOpen'
Publication date: 01/06/2007
Field of study

IntechOpen

Dereverberation and Denoising Techniques for ASR Applications

Author: Fernando Santana Pacheco
Rui Seara
Publication venue: 'IntechOpen'
Publication date: 01/11/2008
Field of study

IntechOpen

Crossref

ASR Systems in Noisy Environment: Analysis and Solutions for Increasing Noise Robustness

Author: Pollak P.
Rajnoha J.
Publication venue: Společnost pro radioelektronické inženýrství
Publication date: 01/04/2011
Field of study

This paper deals with the analysis of Automatic Speech Recognition (ASR) suitable for usage within noisy environment and suggests optimum configuration under various noisy conditions. The behavior of standard parameterization techniques was analyzed from the viewpoint of robustness against background noise. It was done for Melfrequency cepstral coefficients (MFCC), Perceptual linear predictive (PLP) coefficients, and their modified forms combining main blocks of PLP and MFCC. The second part is devoted to the analysis and contribution of modified techniques containing frequency-domain noise suppression and voice activity detection. The above-mentioned techniques were tested with signals in real noisy environment within Czech digit recognition task and AURORA databases. Finally, the contribution of special VAD selective training and MLLR adaptation of acoustic models were studied for various signal features

Directory of Open Access Journals

Digital library of Brno University of Technology

Studies on noise robust automatic speech recognition

Author: Kurimo Mikko
Palomäki Kalle J.
Remes Ulpu
Publication venue: Teknillinen korkeakoulu
Publication date: 01/01/2009
Field of study

Noise in everyday acoustic environments such as cars, traffic environments, and cafeterias remains one of the main challenges in automatic speech recognition (ASR). As a research theme, it has received wide attention in conferences and scientific journals focused on speech technology. This article collection reviews both the classic and novel approaches suggested for noise robust ASR. The articles are literature reviews written for the spring 2009 seminar course on noise robust automatic speech recognition (course code T-61.6060) held at TKK

Aaltodoc Publication Archive

Speech Recognition in Unknown Noisy Conditions

Author: Baochun Hou
Ji Ming
Publication venue: 'IntechOpen'
Publication date: 01/06/2007
Field of study

IntechOpen

Normalization and Transformation Techniques for Robust Speaker Recognition

Author: Baojie Li
Dalei Wu
Hui Jiang
Publication venue: 'IntechOpen'
Publication date: 01/11/2008
Field of study

IntechOpen

Crossref

Exploration and Optimization of Noise Reduction Algorithms for Speech Recognition in Embedded Devices

Author: Setiawan Panji
Publication venue: Universität der Bundeswehr München, Fakultät für Elektrotechnik und Informationstechnik
Publication date: 01/01/2009
Field of study

Environmental noise present in real-life applications substantially degrades the performance of speech recognition systems. An example is an in-car scenario where a speech recognition system has to support the man-machine interface. Several sources of noise coming from the engine, wipers, wheels etc., interact with speech. Special challenge is given in an open window scenario, where noise of traffic, park noise, etc., has to be regarded. The main goal of this thesis is to improve the performance of a speech recognition system based on a state-of-the-art hidden Markov model (HMM) using noise reduction methods. The performance is measured with respect to word error rate and with the method of mutual information. The noise reduction methods are based on weighting rules. Least-squares weighting rules in the frequency domain have been developed to enable a continuous development based on the existing system and also to guarantee its low complexity and footprint for applications in embedded devices. The weighting rule parameters are optimized employing a multidimensional optimization task method of Monte Carlo followed by a compass search method. Root compression and cepstral smoothing methods have also been implemented to boost the recognition performance. The additional complexity and memory requirements of the proposed system are minimum. The performance of the proposed system was compared to the European Telecommunications Standards Institute (ETSI) standardized system. The proposed system outperforms the ETSI system by up to 8.6 % relative increase in word accuracy and achieves up to 35.1 % relative increase in word accuracy compared to the existing baseline system on the ETSI Aurora 3 German task. A relative increase of up to 18 % in word accuracy over the existing baseline system is also obtained from the proposed weighting rules on large vocabulary databases. An entropy-based feature vector analysis method has also been developed to assess the quality of feature vectors. The entropy estimation is based on the histogram approach. The method has the advantage to objectively asses the feature vector quality regardless of the acoustic modeling assumption used in the speech recognition system

Universität der Bundeswehr München: AtheneForschung