    Speech Enhancement Strategy for Speech Recognition Microcontroller under Noisy Environments

    Industrial automation systems with speech control functions are generally fitted with a speech recognition sensor, which serves as the interface through which users issue spoken commands. However, recognition errors are likely when background noise accompanies a command spoken into a speech recognition microcontroller. In this paper, a speech enhancement strategy is proposed for developing noise suppression filters that improve the accuracy of speech recognition microcontrollers. It uses a universal estimator, namely a neural network, to improve recognition accuracy by integrating the improved signals produced by various noise suppression filters. A global optimization algorithm, namely an intelligent particle swarm optimization, optimizes the inbuilt parameters of the neural network in order to maximize the accuracy of speech recognition microcontrollers working in noisy environments. The proposed approach overcomes the limitations of existing noise suppression filters intended to improve recognition accuracy. Its performance was evaluated on a speech recognition microcontroller that is used in electronic products with speech control functions. Results show that the proposed approach improves the accuracy of the speech recognition microcontroller under low signal-to-noise ratio conditions in the industrial environments of automobile engines and factory machines.
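
The abstract above describes tuning the parameters of a combining network with particle swarm optimization. As a rough illustration only, the sketch below runs a plain global-best PSO, not the paper's "intelligent" variant, on a toy objective. Everything named here is an assumption made for the example: `pso_minimize`, `toy_error`, the three-weight `target`, and the inertia and acceleration constants. The toy squared-error objective merely stands in for the recognition error that a real system would measure on the microcontroller.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iters=100, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Plain global-best PSO. Returns (best_position, best_value)."""
    rng = random.Random(seed)
    # Random starting positions in [-1, 1]^dim, zero starting velocities.
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # per-particle best positions
    best_val = [objective(p) for p in pos]    # per-particle best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to own best + pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


# Hypothetical stand-in for recognition error: squared distance between the
# candidate weights and an assumed "ideal" weighting of three filter outputs.
target = [0.5, 0.3, 0.2]


def toy_error(weights):
    return sum((w - t) ** 2 for w, t in zip(weights, target))


best_w, best_err = pso_minimize(toy_error, dim=3)
```

In a real setting the objective would run the recognizer on enhanced audio and return an error rate, which is far more expensive to evaluate than this toy function.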

    Multi-Resolution Fully Convolutional Neural Networks for Monaural Audio Source Separation

    In deep neural networks with convolutional layers, each layer typically has a fixed-size, single-resolution receptive field (RF). Convolutional layers with a large RF capture global information from the input features, while layers with a small RF capture local details at high resolution. In this work, we introduce novel deep multi-resolution fully convolutional neural networks (MR-FCNN), in which each layer has several RF sizes so that it extracts multi-resolution features capturing both the global information and the local details of its input features. The proposed MR-FCNN is applied to separate a target audio source from a mixture of many audio sources. Experimental results show that MR-FCNN improves performance on the audio source separation problem compared to feedforward deep neural networks (DNNs) and single-resolution deep fully convolutional neural networks (FCNNs).
    Comment: arXiv admin note: text overlap with arXiv:1703.0801
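
The idea of one layer with several receptive-field sizes can be sketched in a few lines. The example below is a minimal illustration, not the MR-FCNN architecture itself: it applies fixed moving-average kernels of lengths 1, 3, and 5 to the same 1-D signal and stacks the results as feature channels. The function names `conv1d_same` and `multi_resolution_layer`, the kernel sizes, and the use of fixed rather than learned kernels are all assumptions made for the example.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 1-D cross-correlation; output has the input's length.

    Only odd kernel lengths are supported so the window stays centered.
    """
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd")
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(signal))]


def multi_resolution_layer(signal, kernels):
    """Apply every kernel to the same input; one output channel per kernel."""
    return [conv1d_same(signal, kernel) for kernel in kernels]


# Hypothetical kernels: moving averages with receptive fields of 1, 3, and 5.
kernels = [[1.0 / size] * size for size in (1, 3, 5)]

# A unit impulse shows how far each receptive field spreads the input.
signal = [0.0, 0.0, 1.0, 0.0, 0.0]
features = multi_resolution_layer(signal, kernels)
```

The size-1 kernel leaves the impulse untouched (local detail), while the size-5 kernel spreads it across the whole window (global context). A trained network would learn the kernel weights instead of fixing them.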