14,557 research outputs found
Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection
Sound events often occur in unstructured environments where they exhibit wide
variations in their frequency content and temporal structure. Convolutional
neural networks (CNN) are able to extract higher level features that are
invariant to local spectral and temporal variations. Recurrent neural networks
(RNNs) are powerful in learning the longer term temporal context in the audio
signals. CNNs and RNNs as classifiers have recently shown improved performances
over established methods in various sound recognition tasks. We combine these
two approaches in a Convolutional Recurrent Neural Network (CRNN) and apply it
on a polyphonic sound event detection task. We compare the performance of the
proposed CRNN method with CNN, RNN, and other established methods, and observe
a considerable improvement for four different datasets consisting of everyday
sound events.Comment: Accepted for IEEE Transactions on Audio, Speech and Language
Processing, Special Issue on Sound Scene and Event Analysi
Multi-Resolution Fully Convolutional Neural Networks for Monaural Audio Source Separation
In deep neural networks with convolutional layers, each layer typically has
fixed-size/single-resolution receptive field (RF). Convolutional layers with a
large RF capture global information from the input features, while layers with
small RF size capture local details with high resolution from the input
features. In this work, we introduce novel deep multi-resolution fully
convolutional neural networks (MR-FCNN), where each layer has different RF
sizes to extract multi-resolution features that capture the global and local
details information from its input features. The proposed MR-FCNN is applied to
separate a target audio source from a mixture of many audio sources.
Experimental results show that using MR-FCNN improves the performance compared
to feedforward deep neural networks (DNNs) and single resolution deep fully
convolutional neural networks (FCNNs) on the audio source separation problem.Comment: arXiv admin note: text overlap with arXiv:1703.0801
- …