
6 research outputs found

A Subband-Based SVM Front-End for Robust ASR

Author: Ager Matthew
Cvetkovic Zoran
Sollich Peter
Yousafzai Jibran
Publication venue
Publication date: 24/12/2013

This work proposes a novel support vector machine (SVM) based robust automatic speech recognition (ASR) front-end that operates on an ensemble of the subband components of high-dimensional acoustic waveforms. The key issues of selecting the appropriate SVM kernels for classification in frequency subbands and the combination of individual subband classifiers using ensemble methods are addressed. The proposed front-end is compared with state-of-the-art ASR front-ends in terms of robustness to additive noise and linear filtering. Experiments performed on the TIMIT phoneme classification task demonstrate the benefits of the proposed subband-based SVM front-end: it outperforms the standard cepstral front-end in the presence of noise and linear filtering for signal-to-noise ratios (SNR) below 12 dB. A combination of the proposed front-end with a conventional front-end such as MFCC yields further improvements over the individual front-ends across the full range of noise levels.

arXiv.org e-Print Archive

King's Research Portal

Overcoming HMM Time and Parameter Independence Assumptions for ASR

Author: José
Marta Casar
Publication venue: 'IntechOpen'
Publication date: 01/11/2008

Postprint (published version)

IntechOpen

Crossref

UPCommons. Portal del coneixement obert de la UPC

Discriminative classifiers with adaptive kernels for noise robust speech recognition

Author: Acero
Burges
Deng
F. Flego
Huang
Huang
Jaakkola
Kuo
M.J.F. Gales
Smith
Vapnik
Wu
Publication venue: 'Elsevier BV'
Publication date

Crossref

Speech Recognition

Author
Publication venue: 'IntechOpen'
Publication date: 20/04/2021

Chapters in the first part of the book cover all the essential speech processing techniques for building robust automatic speech recognition systems: the representation of speech signals and the methods for speech-feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition: speaker identification and tracking, prosody modeling in emotion-detection systems, and other applications that are able to operate in real-world environments, such as mobile communication services and smart homes.

Directory of Open Access Books (DOAB)

Augmented statistical models for speech recognition

Author: Gales MJF
Layton MI
Publication venue
Publication date: 01/01/2006

CUED - Cambridge University Engineering Department