    Efficient Scalable Encoding for Distributed Speech Recognition

    This paper addresses the remote speech recognition problem. Speech features are extracted at a client and transmitted to a remote recognizer. This lets a low-complexity client, which lacks the computational and memory resources to host a complex speech recognizer, use distributed resources to provide speech recognition services to the user. The novelties of the proposed work are that (i) the extracted features are compressed using scalable encoding techniques that provide a multi-resolution bitstream, and (ii) a complete scalable distributed speech recognition (DSR) system is presented in which the proposed scalable encoding technique is combined with a scalable recognition system. The scalable DSR system provides successive approximation in terms of recognition performance (i.e., as additional bits are transmitted, the recognition can be refined to improve performance) and achieves both bandwidth and complexity (latency) reductions. The proposed encoding schemes are well suited to light-weight mobile devices, where varying ambient conditions and limited computational capabilities severely constrain recognition performance. The scalable DSR system can adapt to varying network, system, and user constraints by operating at the "right" trade-off point between transmission rate, recognition performance, and complexity to provide good quality of service (QoS) to the user. The system was tested in two case studies. In the first, the scalable encoder combined with a dynamic time warping-hidden Markov model (DTW-HMM) system reduced the recognition complexity by 25% compared to a system using only an HMM, with no degradation in word error rate (WER). In the second study, a distributed two-..
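The successive-approximation idea in the abstract (more transmitted bits give a finer reconstruction of the features) can be illustrated with a minimal sketch. This is not the paper's encoder: it assumes a simple layered scalar quantizer in which each layer quantizes the residual of the previous one with half the step size. The function names `encode_layers` and `decode_layers`, the parameters `n_layers` and `base_step`, and the sample feature values are all hypothetical.

```python
import numpy as np

def encode_layers(features, n_layers=3, base_step=1.0):
    """Layered (successive-refinement) scalar quantization.

    Hypothetical illustration only: each layer quantizes the residual
    left by the previous layers, using half the previous step size.
    """
    layers = []
    residual = np.asarray(features, dtype=float).copy()
    step = base_step
    for _ in range(n_layers):
        # Quantize the current residual to the nearest multiple of `step`.
        indices = np.round(residual / step).astype(int)
        layers.append((step, indices))
        # What is still unexplained goes to the next (finer) layer.
        residual = residual - indices * step
        step /= 2.0
    return layers

def decode_layers(layers, n_received):
    """Reconstruct the features from the first `n_received` layers only."""
    recon = np.zeros(layers[0][1].shape, dtype=float)
    for step, indices in layers[:n_received]:
        recon += indices * step
    return recon

if __name__ == "__main__":
    x = np.array([0.37, -1.42, 2.05, 0.90])  # made-up feature values
    layers = encode_layers(x)
    for k in range(1, len(layers) + 1):
        err = np.max(np.abs(x - decode_layers(layers, k)))
        print(f"layers received: {k}, max abs error: {err:.3f}")
```

In this sketch a receiver can stop after any layer and still obtain a usable approximation, and the maximum reconstruction error after k layers is bounded by `base_step / 2**k`. How the actual system maps such refinements to recognition performance is described in the paper, not here.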