Feature extraction using non-linear transformation for robust speech recognition on the Aurora database

Ellis, Daniel P. W.; Hermansky, Hynek; Jain, Pratibha; Kajarekar, Sachin; Sharma, Sangita

Publication date: 1 January 2000
Publisher: Columbia University Libraries/Information Services

Abstract

We evaluate the performance of several feature sets on the Aurora task as defined by ETSI. We show that, after a non-linear transformation, a number of features can be used effectively in an HMM-based recognition system. The non-linear transformation is computed by a neural network that is discriminatively trained on phonetically labeled (forced-aligned) training data. A combination of the non-linearly transformed PLP (perceptual linear predictive coefficients), MSG (modulation-filtered spectrogram), and TRAP (temporal pattern) features yields a 63% improvement in error rate compared to baseline mel-frequency cepstral coefficient features. The use of non-linearly transformed RASTA-like features, with system parameters scaled down to meet the ETSI-imposed memory and latency constraints, still yields a 40% improvement in error rate.
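
To make the idea of a neural-network feature transformation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a one-hidden-layer network, a context window of four frames on each side, and log phone posteriors as the transformed features; none of these details are stated in the abstract. The names `stack_context` and `mlp_features`, all layer sizes, and the random weights are hypothetical stand-ins for a network that would in practice be trained discriminatively on phone labels.

```python
import numpy as np


def stack_context(frames, context=4):
    """Stack each frame with `context` neighbours on each side.

    Edge frames are repeated so the output keeps one row per input frame.
    """
    padded = np.pad(frames, ((context, context), (0, 0)), mode="edge")
    n = frames.shape[0]
    return np.hstack([padded[i:i + n] for i in range(2 * context + 1)])


def mlp_features(frames, w1, b1, w2, b2, context=4):
    """Map acoustic frames to log phone-posterior features (assumed design)."""
    x = stack_context(frames, context)
    hidden = np.tanh(x @ w1 + b1)          # non-linear hidden layer
    logits = hidden @ w2 + b2
    # Numerically stable log-softmax over the phone classes.
    logits = logits - logits.max(axis=1, keepdims=True)
    return logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))


# Hypothetical sizes: 50 frames of 13 base coefficients, 24 phone classes.
rng = np.random.default_rng(0)
n_frames, n_coeffs, n_hidden, n_phones, context = 50, 13, 32, 24, 4
frames = rng.standard_normal((n_frames, n_coeffs))

# Random weights stand in for a trained network (assumption for illustration).
w1 = 0.1 * rng.standard_normal((n_coeffs * (2 * context + 1), n_hidden))
b1 = np.zeros(n_hidden)
w2 = 0.1 * rng.standard_normal((n_hidden, n_phones))
b2 = np.zeros(n_phones)

features = mlp_features(frames, w1, b1, w2, b2, context)
print(features.shape)
```

In a system of this kind, the resulting per-frame vectors would replace or supplement the base features as observations for the HMM recognizer; any further post-processing before the HMM is outside what the abstract describes.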

Available Versions

Columbia University Academic Commons

oai:academiccommons.columbia.e...

Last time updated on 02/10/2018