Improved one-class SVM classifier for sounds classification
©2007 IEEE. This paper proposes to apply optimized One-Class Support Vector Machines (1-SVMs) as a discriminative framework for a specific audio classification problem. First, since an SVM classifier with a Gaussian RBF kernel is sensitive to the kernel width, the width is scaled in a distribution-dependent way so as to avoid both underfitting and overfitting. Moreover, an advanced dissimilarity measure is introduced. We illustrate the performance of these methods on an audio database of environmental sounds relevant to surveillance and security applications. Experiments conducted on a multi-class problem show that, by choosing the SVM parameters adequately, we can efficiently address a sound classification problem characterized by complex real-world datasets.
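The distribution-dependent kernel width described above can be sketched as follows. The median-pairwise-distance heuristic used here is one common distribution-dependent choice, not necessarily the paper's exact scaling rule, and the data and feature dimensions are stand-ins:

```python
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.metrics import pairwise_distances

rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(200, 8))   # stand-in for one class's audio features

# Distribution-dependent kernel width: set the RBF scale from the median
# pairwise distance of the training data, so the kernel neither under-
# nor over-smooths (illustrative heuristic; the paper's rule may differ).
sigma = np.median(pairwise_distances(X_train))
gamma = 1.0 / (2.0 * sigma ** 2)

clf = OneClassSVM(kernel="rbf", gamma=gamma, nu=0.1).fit(X_train)

X_in = rng.normal(0.0, 1.0, size=(50, 8))        # in-class test sounds
X_out = rng.normal(6.0, 1.0, size=(50, 8))       # clearly out-of-class sounds
accept_rate = (clf.predict(X_in) == 1).mean()    # mostly accepted
reject_rate = (clf.predict(X_out) == -1).mean()  # mostly rejected
```

A multi-class sound classifier in this framework would fit one such 1-SVM per class and assign a test sound to the class whose model scores it highest.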
Learning sound representations using trainable COPE feature extractors
Sound analysis research has mainly focused on speech and music
processing. The methodologies deployed there are not suitable for the analysis
of sounds with varying background noise, in many cases with a very low
signal-to-noise ratio (SNR). In this paper, we present a method for the detection of patterns
of interest in audio signals. We propose novel trainable feature extractors,
which we call COPE (Combination of Peaks of Energy). The structure of a COPE
feature extractor is determined using a single prototype sound pattern in an
automatic configuration process, which is a type of representation learning. We
construct a set of COPE feature extractors, configured on a number of training
patterns. Then we take their responses to build feature vectors that we use in
combination with a classifier to detect and classify patterns of interest in
audio signals. We carried out experiments on four public data sets: MIVIA audio
events, MIVIA road events, ESC-10 and TU Dortmund data sets. The results that
we achieved (recognition rate equal to 91.71% on the MIVIA audio events, 94% on
the MIVIA road events, 81.25% on the ESC-10 and 94.27% on the TU Dortmund)
demonstrate the effectiveness of the proposed method and are higher than the
ones obtained by other existing approaches. The COPE feature extractors have
high robustness to variations of SNR. Real-time performance is achieved even
when the values of a large number of features are computed.

Comment: Accepted for publication in Pattern Recognition.
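The configuration step above, which derives a feature extractor from the energy peaks of a single prototype sound, can be sketched in simplified form. The peak-finding and the geometric-mean scoring below are illustrative choices; the published COPE method also learns tolerances around each peak, which this sketch omits:

```python
import numpy as np

def energy_peaks(spec, thresh=0.5):
    """Local maxima of a time-frequency energy map above a fraction of the
    global maximum -- a simplified stand-in for the peaks a COPE extractor
    is configured on."""
    peaks, m = [], spec.max()
    for f in range(1, spec.shape[0] - 1):
        for t in range(1, spec.shape[1] - 1):
            v = spec[f, t]
            if v >= thresh * m and v == spec[f-1:f+2, t-1:t+2].max():
                peaks.append((f, t, v))
    return peaks

def cope_response(spec, peaks):
    """Combine the energies found near each configured peak with a geometric
    mean, so the response is high only when every peak is present
    (hypothetical scoring choice, not the paper's exact formula)."""
    scores = [spec[max(f-1, 0):f+2, max(t-1, 0):t+2].max() for f, t, _ in peaks]
    return float(np.prod(scores) ** (1.0 / max(len(scores), 1)))

# Configure on a toy prototype "spectrogram" containing two energy peaks.
proto = np.zeros((10, 10))
proto[3, 4], proto[7, 6] = 1.0, 0.8
peaks = energy_peaks(proto)
```

The responses of a bank of such extractors, each configured on a different training pattern, form the feature vector passed to the classifier.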
K-Space at TRECVid 2007
In this paper we describe the K-Space participation in TRECVid 2007. K-Space participated in two tasks: high-level feature extraction and interactive search. We present our approaches for each of these activities and provide a brief analysis of our results. Our high-level feature submission utilized multi-modal low-level features, which included visual, audio and temporal elements. Specific concept detectors (such as face detectors) developed by K-Space partners were also used. We experimented with different machine learning approaches, including logistic regression and support vector machines (SVM). Finally, we experimented with both early and late fusion for feature combination.

This year we also participated in interactive search, submitting 6 runs. We developed two interfaces which both utilized the same retrieval functionality. Our objective was to measure the effect of context, which was supported to different degrees in each interface, on user performance.
The first of the two systems was a ‘shot’ based interface,
where the results from a query were presented as a ranked
list of shots. The second interface was ‘broadcast’ based,
where results were presented as a ranked list of broadcasts.
Both systems made use of the outputs of our high-level feature submission as well as low-level visual features.
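The early versus late fusion comparison mentioned above can be sketched as follows. The features, dimensions, and the probability-averaging rule for late fusion are illustrative assumptions, not the submission's actual configuration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 300
y = rng.integers(0, 2, n)                            # toy concept labels
X_visual = y[:, None] + rng.normal(0, 1.0, (n, 5))   # stand-in visual features
X_audio = y[:, None] + rng.normal(0, 1.5, (n, 3))    # noisier stand-in audio features

# Early fusion: concatenate modalities, then train a single classifier.
early = LogisticRegression().fit(np.hstack([X_visual, X_audio]), y)

# Late fusion: train one classifier per modality, then average their
# predicted probabilities (one common late-fusion rule among several).
clf_v = LogisticRegression().fit(X_visual, y)
clf_a = LogisticRegression().fit(X_audio, y)

def late_predict(Xv, Xa):
    p = (clf_v.predict_proba(Xv) + clf_a.predict_proba(Xa)) / 2.0
    return p.argmax(axis=1)
```

Early fusion lets the classifier model cross-modal interactions, while late fusion keeps each modality's model independent and is more robust when one modality is missing or unreliable.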
Mosquito Detection with Neural Networks: The Buzz of Deep Learning
Many real-world time-series analysis problems are characterised by scarce
data. Solutions typically rely on hand-crafted features extracted from the time
or frequency domain allied with classification or regression engines which
condition on this (often low-dimensional) feature vector. The huge advances
enjoyed by many application domains in recent years have been fuelled by the
use of deep learning architectures trained on large data sets. This paper
presents an application of deep learning for acoustic event detection in a
challenging, data-scarce, real-world problem. Our candidate challenge is to
accurately detect the presence of a mosquito from its acoustic signature. We
develop convolutional neural networks (CNNs) operating on wavelet
transformations of audio recordings. Furthermore, we interrogate the network's
predictive power by visualising statistics of network-excitatory samples. These
visualisations offer a deep insight into the relative informativeness of
components in the detection problem. We include comparisons with conventional
classifiers, conditioned on both hand-tuned and generic features, to stress the
strength of automatic deep feature learning. Detection is achieved with
performance metrics significantly surpassing those of existing algorithmic
methods, as well as marginally exceeding those attained by individual human
experts.

Comment: For data and software related to this paper, see
http://humbug.ac.uk/kiskin2017/. Submitted as a conference paper to ECML 2017.
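The wavelet front-end that feeds the CNNs above can be sketched with a minimal filter bank. The frequency-domain Gaussian bandpass used here approximates a complex Morlet wavelet transform; the sampling rate, frequency grid, and bandwidth parameter are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def morlet_scalogram(x, fs, freqs, n_cycles=6.0):
    """Magnitude scalogram of signal x, computed by applying a bank of
    Gaussian bandpass filters in the frequency domain (a close stand-in
    for complex Morlet wavelet filtering)."""
    n = len(x)
    X = np.fft.fft(x)
    f_axis = np.fft.fftfreq(n, d=1.0 / fs)
    out = np.empty((len(freqs), n))
    for i, f0 in enumerate(freqs):
        sigma_f = f0 / n_cycles                             # bandwidth narrows at low freqs
        H = np.exp(-0.5 * ((f_axis - f0) / sigma_f) ** 2)   # Gaussian bandpass ~ Morlet
        out[i] = np.abs(np.fft.ifft(X * H))
    return out

fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 600 * t)                  # toy "wingbeat" tone at 600 Hz
freqs = np.array([200.0, 400.0, 600.0, 800.0])
S = morlet_scalogram(x, fs, freqs)               # energy concentrates in the 600 Hz row
```

The resulting 2-D scalogram is then treated as an image and passed to the convolutional layers.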
Livestock vocalisation classification in farm soundscapes
Livestock vocalisations have been shown to contain information related to animal welfare and behaviour. Automated sound detection has the potential to facilitate a continuous acoustic monitoring system for use in a range of Precision Livestock Farming (PLF) applications. There are few examples of automated livestock vocalisation classification algorithms, and we have found none capable of being easily adapted and applied to different species' vocalisations. In this work, a multi-purpose livestock vocalisation classification algorithm is presented, utilising audio-specific feature extraction techniques and machine learning models. To test the multi-purpose nature of the algorithm, three separate data sets were created targeting livestock-related vocalisations, namely sheep, cattle, and Maremma sheepdogs. Audio data were extracted from continuous recordings conducted on-site at three different operational farming enterprises, reflecting the conditions of real deployment. A comparison of Mel-Frequency Cepstral Coefficient (MFCC) and Discrete Wavelet Transform-based (DWT) features was conducted, with classification determined by a Support Vector Machine (SVM) model. High accuracy was achieved for all data sets (sheep: 99.29%, cattle: 95.78%, dogs: 99.67%). Classification performance alone was insufficient to determine the most suitable feature extraction method for each data set; computational timing results revealed the DWT-based features to be markedly faster to produce (a 14.81-15.38% decrease in execution time). The results demonstrate a highly accurate livestock vocalisation classification algorithm, which forms the foundation for an automated livestock vocalisation detection system.
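The speed advantage of DWT features comes from the transform's structure: each level needs only pairwise sums and differences, with no per-frame FFT. A minimal Haar-DWT feature extractor can be sketched as follows; the log-energy-per-band feature set is an illustrative stand-in, since the paper's exact wavelet and feature definitions are not given here:

```python
import numpy as np

def haar_dwt_features(x, levels=5):
    """Log-energy of each Haar DWT detail band. Each level halves the
    signal with simple sums/differences, which is why DWT features are
    cheap to compute (hypothetical feature set for illustration)."""
    x = np.asarray(x, dtype=float)
    feats = []
    for _ in range(levels):
        if len(x) < 2:
            break
        if len(x) % 2:
            x = x[:-1]
        approx = (x[0::2] + x[1::2]) / np.sqrt(2)   # low-pass half-band
        detail = (x[0::2] - x[1::2]) / np.sqrt(2)   # high-pass half-band
        feats.append(np.log(np.mean(detail ** 2) + 1e-12))
        x = approx
    return np.array(feats)
```

The resulting fixed-length vector, one energy per frequency band, can be fed directly to an SVM classifier; a high-pitched vocalisation concentrates its energy in the first (highest-frequency) bands.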
Identifying microphone from noisy recordings by using representative instance one class-classification approach
The rapid growth of recording technology has created huge challenges for microphone forensics, a subcategory of audio forensic science, because of the availability of numerous digital recording devices and the massive amount of recording data. Demand for fast and efficient methods to assure the integrity and authenticity of information is becoming more and more important in criminal investigation. Machine learning has emerged as an important technique to support the audio analysis processes of microphone forensic practitioners. However, its application to real-life situations using supervised learning still faces great challenges, owing to the expense of collecting data and updating the system. In this paper, we introduce a machine learning approach called One-Class Classification (OCC) and apply it to microphone forensics; we demonstrate its capability on a corpus of audio samples collected from several microphones. In addition, we propose a representative instance classification framework (RICF) that can effectively improve the performance of OCC algorithms on recording signals with noise. Experimental results and analysis indicate that OCC has the potential to benefit microphone forensic practitioners in developing new tools and techniques for effective and efficient analysis. © 2012 Academy Publisher.
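One reading of the representative-instance idea above is to fit the one-class model only on the training instances most representative of the target microphone, so that noisy outliers do not distort the boundary. The centroid-distance selection rule and all data below are illustrative assumptions, not the paper's RICF definition:

```python
import numpy as np
from sklearn.svm import OneClassSVM

def fit_occ_on_representatives(X, keep=0.8, nu=0.1):
    """Drop the fraction of training points farthest from the feature-space
    centroid, then fit a one-class SVM on the remainder (hypothetical
    selection rule for illustration; the paper's rule may differ)."""
    d = np.linalg.norm(X - X.mean(axis=0), axis=1)
    keep_idx = np.argsort(d)[: int(keep * len(X))]
    return OneClassSVM(kernel="rbf", gamma="scale", nu=nu).fit(X[keep_idx])

rng = np.random.default_rng(2)
X_mic = rng.normal(0, 1, (180, 6))     # stand-in features from the target microphone
X_noise = rng.normal(5, 1, (20, 6))    # noisy/corrupted training instances
model = fit_occ_on_representatives(np.vstack([X_mic, X_noise]))
```

A new recording is then attributed to the microphone only if the one-class model accepts its features, so no negative examples from other microphones are needed at training time.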