Speech Enhancement Strategy for Speech Recognition Microcontroller under Noisy Environments
Industrial automation systems with speech control functions are generally fitted with a speech recognition sensor that serves as the interface through which users speak commands. However, recognition errors are likely when background noise accompanies the command spoken into the speech recognition microcontroller. In this paper, a speech enhancement strategy is proposed for developing noise suppression filters that improve the accuracy of speech recognition microcontrollers. It uses a universal estimator, namely a neural network, to enhance recognition accuracy by integrating the enhanced signals produced by various noise suppression filters. A global optimization algorithm, namely an intelligent particle swarm optimization, optimizes the inbuilt parameters of the neural network in order to maximize the accuracy of speech recognition microcontrollers working in noisy environments. The proposed approach overcomes the limitations of existing noise suppression filters intended to improve recognition accuracy. Its performance was evaluated on a speech recognition microcontroller that is used in electronic products with speech control functions. Results show that the proposed approach improves the accuracy of the speech recognition microcontroller under low signal-to-noise ratio conditions in the industrial environments of automobile engines and factory machines.
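The optimization loop described in this abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not in the abstract: a textbook particle swarm optimizer stands in for the paper's "intelligent" variant, a single linear unit stands in for the neural network, and mean squared error against a clean reference stands in for recognition accuracy. The names `pso_minimize` and `combiner_loss` and the toy signals are hypothetical.

```python
import random

def combiner_loss(weights, filtered, clean):
    """MSE between a weighted sum of filter outputs and a clean reference.
    Assumption: a stand-in objective, not the paper's recognition accuracy."""
    total = 0.0
    for t, target in enumerate(clean):
        estimate = sum(w * f[t] for w, f in zip(weights, filtered))
        total += (estimate - target) ** 2
    return total / len(clean)

def pso_minimize(loss, dim, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook global-best PSO (not the paper's intelligent PSO)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # each particle's best position so far
    best_val = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # swarm-wide best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = loss(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

# Toy data (hypothetical): two filters' outputs for one short clean signal.
clean = [0.0, 1.0, 0.0, -1.0]
filtered = [[0.1, 0.9, -0.1, -1.1], [-0.1, 1.2, 0.1, -0.8]]
weights, error = pso_minimize(lambda p: combiner_loss(p, filtered, clean), dim=2)
```

Here each particle is one candidate weight vector, and the swarm searches for the weighting of filter outputs that best matches the reference. In the setting the abstract describes, the loss would presumably be replaced by a measure derived from the microcontroller's recognition results.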
Multi-Resolution Fully Convolutional Neural Networks for Monaural Audio Source Separation
In deep neural networks with convolutional layers, each layer typically has a
fixed-size, single-resolution receptive field (RF). Convolutional layers with a
large RF capture global information from the input features, while layers with
small RF size capture local details with high resolution from the input
features. In this work, we introduce novel deep multi-resolution fully
convolutional neural networks (MR-FCNN), where each layer has several RF
sizes so that it extracts multi-resolution features capturing both the global
and the local details of its input features. The proposed MR-FCNN is applied to
separate a target audio source from a mixture of many audio sources.
Experimental results show that using MR-FCNN improves the performance compared
to feedforward deep neural networks (DNNs) and single resolution deep fully
convolutional neural networks (FCNNs) on the audio source separation problem.
Comment: arXiv admin note: text overlap with arXiv:1703.0801
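The idea of one layer with several receptive-field sizes can be sketched as follows. This is a simplified illustration under stated assumptions, not the MR-FCNN architecture itself: it works on a 1-D signal, uses fixed averaging kernels as placeholders for learned weights, and the names `conv1d_same` and `multi_resolution_layer` and the kernel sizes are hypothetical.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 1-D cross-correlation; output length equals input length.
    Assumes an odd kernel length so the padding is symmetric."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(signal))]

def multi_resolution_layer(signal, kernel_sizes=(3, 7, 15)):
    """Apply filters with different receptive-field sizes to the same input
    and return one feature map per size (stacked along a channel axis)."""
    feature_maps = []
    for size in kernel_sizes:
        # Placeholder weights: a learned layer would train these values.
        kernel = [1.0 / size] * size
        feature_maps.append(conv1d_same(signal, kernel))
    return feature_maps

# An impulse shows how far each receptive field spreads a single sample.
impulse = [0.0] * 4 + [1.0] + [0.0] * 4
maps = multi_resolution_layer(impulse, kernel_sizes=(3, 7))
```

The small kernel responds only near the impulse, preserving local detail, while the larger kernel spreads it over a wider span, summarizing more context. Keeping both feature maps gives the next layer access to both resolutions.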