42 research outputs found

    On the efficient representation and execution of deep acoustic models

    In this paper we present a simple and computationally efficient quantization scheme that enables us to reduce the resolution of the parameters of a neural network from 32-bit floating point values to 8-bit integer values. The proposed quantization scheme leads to significant memory savings and enables the use of optimized hardware instructions for integer arithmetic, thus significantly reducing the cost of inference. Finally, we propose a "quantization aware" training process that applies the proposed scheme during network training and find that it allows us to recover most of the loss in accuracy introduced by quantization. We validate the proposed techniques by applying them to a long short-term memory-based acoustic model on an open-ended large vocabulary speech recognition task. Comment: Accepted conference paper: The Annual Conference of the International Speech Communication Association (Interspeech), 2016.

    Android Based Smart Speech Recognition Application to Perform Various Tasks

    A smart speech recognition application is proposed in this paper. Users can control a variety of applications on an Android-based platform with voice commands, including both native and user-installed applications. Supported tasks include calling, texting, switching sensors (Wi-Fi, GPS, Bluetooth) on and off, and setting alarms. The application provides both online and offline services. It also applies machine learning concepts to identify usage patterns and create an environment that anticipates user requirements: tasks performed repetitively are automated, and services such as activity recognition and recognizing nearby friends via Bluetooth are provided. The importance of the project is that it gives visually challenged people, as well as the general population, an alternative and very easy way to control applications on Android smartphones.

    Intrinsic sparse LSTM using structured targeted dropout for efficient hardware inference

    Recurrent Neural Networks (RNNs) are useful for speech recognition, but their fully-connected structure leads to a large memory footprint, making it difficult to deploy them on resource-constrained embedded systems. Previous structured RNN pruning methods can effectively reduce RNN size; however, it is difficult to find a good balance between high sparsity and high task accuracy, and the pruned models often achieve only moderate speedups on custom hardware accelerators. This work proposes a novel structured pruning method called Structured Targeted Dropout (STD)-Intrinsic Sparse Structures (ISS) that stochastically drops grouped rows and columns of the weight matrices during training. The compressed networks are equivalent to a smaller dense network, which can be efficiently processed by Graphics Processing Units (GPUs). STD-ISS is evaluated on the TIMIT phone recognition task using Long Short-Term Memory (LSTM) RNNs. It outperforms previous state-of-the-art hardware-friendly methods on both accuracy and compression ratio. STD-ISS achieves a size compression ratio of up to 50× with <1% accuracy loss, leading to a 19.1× speedup on the embedded Jetson Xavier NX GPU platform.
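    The core operation of the abstract above, stochastically dropping whole groups of columns during training so the survivors form a smaller dense matrix, can be sketched as follows. This is an illustrative reconstruction, not the paper's algorithm: the contiguous column grouping, the norm-based choice of which groups to target, and the hyperparameter names (`group_size`, `drop_prob`, `targ_frac`) are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def structured_targeted_dropout(W, group_size=4, drop_prob=0.5, targ_frac=0.5):
    """Stochastically zero out whole column groups of weight matrix W.

    Sketch of the structured-targeted-dropout idea: the targ_frac fraction
    of groups with the smallest L2 norm are candidates, and each candidate
    is dropped with probability drop_prob. After training, consistently
    dropped groups can be removed entirely, leaving a smaller dense matrix.
    """
    n_groups = W.shape[1] // group_size
    groups = W.reshape(W.shape[0], n_groups, group_size)
    norms = np.linalg.norm(groups, axis=(0, 2))       # importance score per group
    n_targeted = int(np.ceil(targ_frac * n_groups))
    targeted = np.argsort(norms)[:n_targeted]         # weakest groups are candidates
    mask = np.ones(n_groups, dtype=bool)
    drop = targeted[rng.random(n_targeted) < drop_prob]  # stochastic drop decision
    mask[drop] = False
    return (groups * mask[None, :, None]).reshape(W.shape)
```

    Because entire contiguous groups are zeroed (rather than scattered weights), the surviving structure maps directly onto dense GEMM kernels, which is why this style of pruning translates into real GPU speedups rather than just nominal sparsity.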