Modelling of Sound Events with Hidden Imbalances Based on Clustering and Separate Sub-Dictionary Learning
This paper proposes an effective modelling of sound event spectra with a
hidden data-size-imbalance, for improved Acoustic Event Detection (AED). The
proposed method models each event as an aggregated representation of a few
latent factors, while conventional approaches try to find acoustic elements
directly from the event spectra. In the method, all the latent factors across
all events are assigned comparable importance and complexity to overcome the
hidden imbalance of data-sizes in event spectra. To extract latent factors in
each event, the proposed method employs clustering and applies non-negative
matrix factorization to each latent factor, learning its acoustic elements as
a sub-dictionary. Separate sub-dictionary learning effectively models the
acoustic elements with limited data-sizes and avoids over-fitting due to hidden
imbalances in training data. For the task of polyphonic sound event detection
from the DCASE 2013 challenge, an AED system based on the proposed modelling
achieves a detection F-measure of 46.5%, a significant improvement of more than
19% compared to the existing state-of-the-art methods.
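The pipeline described in this abstract (cluster an event's spectra into latent factors, then learn one NMF sub-dictionary per factor) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, the NMF cost function, or any sizes, so everything concrete here is an assumption made for illustration: plain k-means, Euclidean multiplicative-update NMF, the function names, and the default values of `n_factors` and `atoms_per_factor`. Giving every factor the same number of atoms is one possible reading of "comparable importance and complexity", not the authors' stated procedure.

```python
import numpy as np


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (freq x frames) as W @ H.

    Uses multiplicative updates for the Euclidean cost (an assumed choice).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def kmeans_labels(X, k, n_iter=50, seed=0):
    """Plain k-means over the rows of X; returns one cluster index per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every row to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def learn_event_dictionary(V, n_factors=3, atoms_per_factor=4):
    """Hypothetical helper: build one event's dictionary from sub-dictionaries.

    V is a non-negative magnitude spectrogram (freq x frames) for one event.
    Each cluster of frames is treated as a latent factor and receives the
    same number of atoms, regardless of how many frames fall into it.
    """
    labels = kmeans_labels(V.T, n_factors)
    subdicts = []
    for j in range(n_factors):
        frames = V[:, labels == j]
        if frames.shape[1] == 0:
            continue  # skip a factor that ended up with no frames
        W, _ = nmf(frames, atoms_per_factor)
        subdicts.append(W)
    # Concatenate the sub-dictionaries column-wise into one event dictionary.
    return np.hstack(subdicts)
```

Because each sub-dictionary is learned only from the frames of its own cluster, a factor with few frames still receives its full share of atoms instead of being crowded out by a larger one during a single joint factorization.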
Musical instrument classification using non-negative matrix factorization algorithms
In this paper, a class of algorithms for automatic classification of individual musical instrument sounds is presented. Several perceptual features used in general sound classification applications were measured for 300 sound recordings consisting of 6 different musical instrument classes (piano, violin, cello, flute, bassoon and soprano saxophone). In addition, MPEG-7 basic spectral and spectral basis descriptors were considered, providing an effective combination for accurately describing the spectral and timbral audio characteristics. The audio files were split using 70% of the available data for training and the remaining 30% for testing. A classifier was developed based on non-negative matrix factorization (NMF) techniques, thus introducing a novel application of NMF. The standard NMF method was examined, as well as its modifications: the local, the sparse, and the discriminant NMF. Experimental results are presented to compare MPEG-7 spectral basis representations with MPEG-7 basic spectral features alongside the various NMF algorithms. The results indicate that the use of the spectrum projection coefficients for feature extraction and the standard NMF classifier yields an accuracy exceeding 95%. ©2006 IEEE