5 research outputs found

Robust Distributed Speech Recognition Using Auditory Modelling

Author: Ronan Flynn
Edward Jones
Publication venue: 'IntechOpen'
Publication date: 28/11/2012

Towards improving the robustness of distributed speech recognition in packet loss

Author: Alastair James
Ben Milner
Publication venue: 'Elsevier BV'
Publication date: 11/08/2006

This work addresses the problem of achieving robust distributed speech recognition (DSR) performance in the presence of packet loss. The nature of packet loss is analysed by examining packet loss data gathered from a GSM mobile data channel. This analysis is then used to examine the effect of realistic packet loss conditions on DSR systems, and shows that the accuracy of DSR is more sensitive to the burstiness of packet loss than to the actual number of lost packets. This leads to the design of a three-stage packet loss compensation scheme. First, interleaving is applied to the transmitted feature vectors to disperse bursts of packet loss. Second, lost feature vectors are reconstructed prior to recognition using a variety of reconstruction techniques. Third, a weighted-Viterbi decoding method is applied to the recogniser itself, which modifies the contribution of the reconstructed feature vectors according to the accuracy of their reconstruction. Experimental results on both a connected digits task and a large-vocabulary task show that simple methods, such as repetition, are not as effective as interpolation methods. Best performance is given by a novel maximum a posteriori (MAP) estimation, which utilizes temporal statistics of the feature vector stream. This reconstruction method is then combined with weighted-Viterbi decoding, using a novel method to calculate the confidences of reconstructed static and temporal components separately. Using interleaving, results improve significantly, and it is shown that a limited level of interleaving can be applied without increasing the delay to the end-user. Using a combination of these techniques for the connected digits task, word accuracy is increased from 49.5% to 95.3% even with a packet loss rate of 50% and an average burst length of 20 feature vectors.
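
The first two stages described in the abstract above (interleaving, then reconstruction of lost vectors) can be illustrated with a minimal sketch. It is not the authors' implementation: the row/column block interleaver, the function names, and the use of plain linear interpolation (rather than the MAP estimator the abstract reports as best) are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence

Vector = List[float]


def interleave(items: Sequence, rows: int, cols: int) -> list:
    """Block interleaver (assumed design): write row-wise, read column-wise."""
    if len(items) != rows * cols:
        raise ValueError("len(items) must equal rows * cols")
    return [items[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(items: Sequence, rows: int, cols: int) -> list:
    """Inverse of interleave(): the same operation with the dimensions swapped."""
    return interleave(items, cols, rows)


def interpolate_lost(frames: List[Optional[Vector]]) -> List[Vector]:
    """Fill lost frames (None) by linear interpolation between the nearest
    received neighbours; at the edges, repeat the nearest received frame."""
    received = [i for i, f in enumerate(frames) if f is not None]
    if not received:
        raise ValueError("no frames were received")
    out: List[Vector] = []
    for i, frame in enumerate(frames):
        if frame is not None:
            out.append(list(frame))
            continue
        prev = max((j for j in received if j < i), default=None)
        nxt = min((j for j in received if j > i), default=None)
        if prev is None:
            out.append(list(frames[nxt]))
        elif nxt is None:
            out.append(list(frames[prev]))
        else:
            w = (i - prev) / (nxt - prev)  # 0 at prev, 1 at nxt
            out.append([(1 - w) * a + w * b
                        for a, b in zip(frames[prev], frames[nxt])])
    return out


if __name__ == "__main__":
    sent = interleave(list(range(12)), rows=3, cols=4)
    # A burst wipes out three consecutive transmitted positions.
    damaged = [None if 3 <= pos <= 5 else v for pos, v in enumerate(sent)]
    restored_order = deinterleave(damaged, rows=3, cols=4)
    print([i for i, v in enumerate(restored_order) if v is None])
```

With 12 frames in a 3 x 4 block, a burst covering three consecutive transmitted positions comes out of the de-interleaver as three isolated single-frame losses, each of which has received neighbours on both sides to interpolate from.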

Crossref

University of East Anglia digital repository

Towards improving the robustness of distributed speech recognition in packet loss

Author: Alastair James
Ben Milner
Publication date: 01/01/2004

University of East Anglia digital repository

Towards improving the robustness of distributed speech recognition in packet loss

Author: Alastair James
Ben Milner
Publication date: 01/01/2004

This work begins with an analysis of the effect of packet loss on the temporal components of the feature vector stream and its subsequent effect on recognition accuracy. Two methods of packet loss compensation are then compared. Reconstruction methods begin with interpolation and are extended to include prior statistical knowledge of the feature vector stream in the form of MAP estimation of lost vectors. Missing feature theory is also applied to compensate for packet loss in the decoding phase of recognition. The feature vector is considered in terms of three temporal components (static, velocity and acceleration), and the reliability of each is considered individually. Finally, interleaving techniques are applied to reduce the perceived average burst lengths. Experimental results are then presented on the ETSI Aurora connected digit database.
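
To make the point about temporal components concrete, the sketch below shows how one damaged static vector spreads into the velocity and acceleration streams. The abstract does not state how the temporal components are computed, so the regression formula (the widely used delta regression over a window of +/-K frames), the window size K = 2, the edge clamping, and all function names are assumptions.

```python
from typing import List


def delta(seq: List[float], K: int = 2) -> List[float]:
    """Regression-based temporal derivative over +/-K frames (assumed formula):
    d[t] = sum_k k * (c[t+k] - c[t-k]) / (2 * sum_k k^2), edges clamped."""
    n = len(seq)
    denom = 2 * sum(k * k for k in range(1, K + 1))
    out = []
    for t in range(n):
        num = 0.0
        for k in range(1, K + 1):
            ahead = seq[min(t + k, n - 1)]
            behind = seq[max(t - k, 0)]
            num += k * (ahead - behind)
        out.append(num / denom)
    return out


def changed_frames(a: List[float], b: List[float], tol: float = 1e-12) -> List[int]:
    """Indices at which two equally long streams differ."""
    return [t for t, (x, y) in enumerate(zip(a, b)) if abs(x - y) > tol]


if __name__ == "__main__":
    static = [float(t) for t in range(11)]   # one coefficient track, 11 frames
    corrupted = list(static)
    corrupted[5] += 100.0                    # frame 5 lost or badly reconstructed

    velocity_hit = changed_frames(delta(static), delta(corrupted))
    accel_hit = changed_frames(delta(delta(static)), delta(delta(corrupted)))
    print("velocity frames affected:", velocity_hit)
    print("acceleration frames affected:", accel_hit)
```

Under these assumptions a single bad static frame at index 5 alters the velocity at frames 3, 4, 6 and 7 and the acceleration at frames 1 through 9, which is one way to see why the reliability of each component might be assessed separately.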

CiteSeerX

University of East Anglia digital repository