12,863 research outputs found
Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection
Sound events often occur in unstructured environments where they exhibit wide
variations in their frequency content and temporal structure. Convolutional
neural networks (CNN) are able to extract higher level features that are
invariant to local spectral and temporal variations. Recurrent neural networks
(RNNs) are powerful in learning the longer term temporal context in the audio
signals. CNNs and RNNs as classifiers have recently shown improved performances
over established methods in various sound recognition tasks. We combine these
two approaches in a Convolutional Recurrent Neural Network (CRNN) and apply it
on a polyphonic sound event detection task. We compare the performance of the
proposed CRNN method with CNN, RNN, and other established methods, and observe
a considerable improvement for four different datasets consisting of everyday
sound events.Comment: Accepted for IEEE Transactions on Audio, Speech and Language
Processing, Special Issue on Sound Scene and Event Analysi
Recognition of Harmonic Sounds in Polyphonic Audio using a Missing Feature Approach: Extended Report
A method based on local spectral features and missing feature techniques
is proposed for the recognition of harmonic sounds in mixture
signals. A mask estimation algorithm is proposed for identifying
spectral regions that contain reliable information for each sound
source and then bounded marginalization is employed to treat the
feature vector elements that are determined as unreliable. The proposed
method is tested on musical instrument sounds due to the
extensive availability of data but it can be applied on other sounds
(i.e. animal sounds, environmental sounds), whenever these are harmonic.
In simulations the proposed method clearly outperformed a
baseline method for mixture signals
Automatic Environmental Sound Recognition: Performance versus Computational Cost
In the context of the Internet of Things (IoT), sound sensing applications
are required to run on embedded platforms where notions of product pricing and
form factor impose hard constraints on the available computing power. Whereas
Automatic Environmental Sound Recognition (AESR) algorithms are most often
developed with limited consideration for computational cost, this article seeks
which AESR algorithm can make the most of a limited amount of computing power
by comparing the sound classification performance em as a function of its
computational cost. Results suggest that Deep Neural Networks yield the best
ratio of sound classification accuracy across a range of computational costs,
while Gaussian Mixture Models offer a reasonable accuracy at a consistently
small cost, and Support Vector Machines stand between both in terms of
compromise between accuracy and computational cost
Inferring Room Semantics Using Acoustic Monitoring
Having knowledge of the environmental context of the user i.e. the knowledge
of the users' indoor location and the semantics of their environment, can
facilitate the development of many of location-aware applications. In this
paper, we propose an acoustic monitoring technique that infers semantic
knowledge about an indoor space \emph{over time,} using audio recordings from
it. Our technique uses the impulse response of these spaces as well as the
ambient sounds produced in them in order to determine a semantic label for
them. As we process more recordings, we update our \emph{confidence} in the
assigned label. We evaluate our technique on a dataset of single-speaker human
speech recordings obtained in different types of rooms at three university
buildings. In our evaluation, the confidence\emph{ }for the true label
generally outstripped the confidence for all other labels and in some cases
converged to 100\% with less than 30 samples.Comment: 2017 IEEE International Workshop on Machine Learning for Signal
Processing, Sept.\ 25--28, 2017, Tokyo, Japa
Classification of damage in structural systems using time series analysis and supervised and unsupervised pattern recognition techniques
Peer reviewedPostprin
- …