
    Improved one-class SVM classifier for sounds classification

    No full text
    This paper proposes to apply optimized One-Class Support Vector Machines (1-SVMs) as a discriminative framework to address a specific audio classification problem. First, since an SVM-based classifier with a Gaussian RBF kernel is sensitive to the kernel width, the width is scaled in a distribution-dependent way so as to avoid underfitting and overfitting. Moreover, an advanced dissimilarity measure is introduced. We illustrate the performance of these methods on an audio database containing environmental sounds that may be of great importance for surveillance and security applications. The experiments, conducted on a multi-class problem, show that by choosing the SVM parameters appropriately, we can efficiently address a sound classification problem characterized by complex real-world datasets.
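
    As a hedged illustration of the kernel-width idea only (the paper's exact distribution-dependent scaling rule is not reproduced here), the Python sketch below ties the RBF width of a scikit-learn OneClassSVM to the data via the median pairwise distance; the feature matrix X_train is a placeholder for real per-class sound descriptors such as MFCCs.

    ```python
    import numpy as np
    from sklearn.svm import OneClassSVM
    from sklearn.metrics.pairwise import euclidean_distances

    # Placeholder features: one row per sound of a single class
    # (replace with real MFCC-style descriptors).
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(200, 20))

    # Distribution-dependent width: use the median pairwise distance
    # as the RBF length scale sigma, so gamma = 1 / (2 * sigma^2).
    d = euclidean_distances(X_train, X_train)
    sigma = np.median(d[d > 0])
    clf = OneClassSVM(kernel="rbf", gamma=1.0 / (2.0 * sigma**2), nu=0.1)
    clf.fit(X_train)

    # +1 = accepted as this sound class, -1 = rejected as an outlier.
    print(clf.predict(rng.normal(size=(5, 20))))
    ```

    For the multi-class problem in the paper, one such model would be trained per sound class and a test sound assigned to the class whose model returns the highest decision score.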

    Learning sound representations using trainable COPE feature extractors

    Get PDF
    Sound analysis research has mainly focused on speech and music processing. The deployed methodologies are not suitable for the analysis of sounds with varying background noise, in many cases with a very low signal-to-noise ratio (SNR). In this paper, we present a method for the detection of patterns of interest in audio signals. We propose novel trainable feature extractors, which we call COPE (Combination of Peaks of Energy). The structure of a COPE feature extractor is determined using a single prototype sound pattern in an automatic configuration process, which is a type of representation learning. We construct a set of COPE feature extractors, configured on a number of training patterns, and use their responses to build feature vectors that we combine with a classifier to detect and classify patterns of interest in audio signals. We carried out experiments on four public data sets: MIVIA audio events, MIVIA road events, ESC-10 and TU Dortmund. The results we achieved (recognition rates of 91.71% on MIVIA audio events, 94% on MIVIA road events, 81.25% on ESC-10 and 94.27% on TU Dortmund) demonstrate the effectiveness of the proposed method and are higher than those obtained by other existing approaches. The COPE feature extractors are highly robust to variations in SNR, and real-time performance is achieved even when a large number of feature values is computed. (Accepted for publication in Pattern Recognition.)
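
    The published method configures its extractors on a Gammatone-based time-frequency representation; purely as a hedged sketch of the two phases the abstract describes (configuration on a single prototype, then response on new audio), the Python below uses a plain STFT energy map and hypothetical helpers configure_cope and cope_response.

    ```python
    import numpy as np
    from scipy.signal import stft

    def energy_peaks(x, fs, thresh=0.5):
        """Normalised time-frequency energy map and its above-threshold peaks."""
        _, _, Z = stft(x, fs=fs, nperseg=256)
        E = np.abs(Z) ** 2
        E /= E.max()
        return E, np.argwhere(E > thresh)   # (freq_bin, frame) index pairs

    def configure_cope(prototype, fs):
        """Configuration: store peak positions relative to the strongest peak."""
        E, peaks = energy_peaks(prototype, fs)
        anchor = peaks[np.argmax(E[tuple(peaks.T)])]
        return peaks - anchor               # constellation of offsets

    def cope_response(x, fs, offsets):
        """Response: best fraction of the constellation matched at any anchor."""
        _, peaks = energy_peaks(x, fs)
        peak_set = {tuple(p) for p in peaks}
        return max((sum(tuple(a + o) in peak_set for o in offsets) / len(offsets)
                    for a in peaks), default=0.0)
    ```

    A feature vector for classification would then stack the responses of many such extractors, each configured on a different prototype.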

    K-Space at TRECVid 2007

    Get PDF
    In this paper we describe K-Space participation in TRECVid 2007. K-Space participated in two tasks, high-level feature extraction and interactive search. We present our approaches for each of these activities and provide a brief analysis of our results. Our high-level feature submission utilized multi-modal low-level features which included visual, audio and temporal elements. Specific concept detectors (such as face detectors) developed by K-Space partners were also used. We experimented with different machine learning approaches, including logistic regression and support vector machines (SVMs). Finally, we also experimented with both early and late fusion for feature combination. This year we also participated in interactive search, submitting 6 runs. We developed two interfaces which both utilized the same retrieval functionality. Our objective was to measure the effect of context, which was supported to different degrees in each interface, on user performance. The first of the two systems was a ‘shot’ based interface, where the results of a query were presented as a ranked list of shots. The second interface was ‘broadcast’ based, where results were presented as a ranked list of broadcasts. Both systems made use of the outputs of our high-level feature submission as well as low-level visual features.
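
    As an illustrative sketch only (the actual K-Space features and fusion weights are not given here), the difference between the two fusion strategies can be shown with scikit-learn: early fusion concatenates modality features before a single classifier, while late fusion averages per-concept scores from separate unimodal classifiers.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC

    # Hypothetical per-shot visual and audio features for one binary concept.
    rng = np.random.default_rng(1)
    X_vis, X_aud = rng.normal(size=(300, 64)), rng.normal(size=(300, 32))
    y = rng.integers(0, 2, size=300)

    # Early fusion: concatenate modalities, train one classifier.
    early = LogisticRegression(max_iter=1000).fit(np.hstack([X_vis, X_aud]), y)

    # Late fusion: one classifier per modality, average their scores.
    svm_v = SVC(probability=True).fit(X_vis, y)
    svm_a = SVC(probability=True).fit(X_aud, y)
    fused = 0.5 * (svm_v.predict_proba(X_vis)[:, 1]
                   + svm_a.predict_proba(X_aud)[:, 1])
    ```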

    Mosquito Detection with Neural Networks: The Buzz of Deep Learning

    Full text link
    Many real-world time-series analysis problems are characterised by scarce data. Solutions typically rely on hand-crafted features extracted from the time or frequency domain, allied with classification or regression engines which condition on this (often low-dimensional) feature vector. The huge advances enjoyed by many application domains in recent years have been fuelled by the use of deep learning architectures trained on large data sets. This paper presents an application of deep learning for acoustic event detection in a challenging, data-scarce, real-world problem. Our candidate challenge is to accurately detect the presence of a mosquito from its acoustic signature. We develop convolutional neural networks (CNNs) operating on wavelet transformations of audio recordings. Furthermore, we interrogate the network's predictive power by visualising statistics of network-excitatory samples. These visualisations offer a deep insight into the relative informativeness of components in the detection problem. We include comparisons with conventional classifiers, conditioned on both hand-tuned and generic features, to stress the strength of automatic deep feature learning. Detection is achieved with performance metrics significantly surpassing those of existing algorithmic methods, as well as marginally exceeding those attained by individual human experts. (For data and software related to this paper, see http://humbug.ac.uk/kiskin2017/. Submitted as a conference paper to ECML 2017.)
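
    As a hedged sketch of the pipeline described above, the Python below turns a short audio window into a 2-D wavelet scalogram with PyWavelets and feeds it to a small convolutional network in PyTorch; the wavelet, scales and network shape are illustrative assumptions, not the paper's exact configuration.

    ```python
    import numpy as np
    import pywt
    import torch
    import torch.nn as nn

    def scalogram(x, scales=np.arange(1, 65), wavelet="morl"):
        """Continuous wavelet transform magnitude as a (scales, time) image."""
        coef, _ = pywt.cwt(x, scales, wavelet)
        return np.abs(coef).astype(np.float32)

    # Small illustrative CNN over one-channel scalogram images.
    model = nn.Sequential(
        nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, 2),                  # mosquito vs. background
    )

    x = np.random.randn(8000)              # dummy one-second clip at 8 kHz
    img = torch.from_numpy(scalogram(x))[None, None]  # (batch, channel, H, W)
    logits = model(img)
    ```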

    Livestock vocalisation classification in farm soundscapes

    Get PDF
    Livestock vocalisations have been shown to contain information related to animal welfare and behaviour. Automated sound detection has the potential to facilitate a continuous acoustic monitoring system for use in a range of Precision Livestock Farming (PLF) applications. There are few examples of automated livestock vocalisation classification algorithms, and we have found none capable of being easily adapted and applied to different species' vocalisations. In this work, a multi-purpose livestock vocalisation classification algorithm is presented, utilising audio-specific feature extraction techniques and machine learning models. To test the multi-purpose nature of the algorithm, three separate data sets were created targeting livestock-related vocalisations, namely sheep, cattle, and Maremma sheepdogs. Audio data was extracted from continuous recordings conducted on-site at three different operational farming enterprises, reflecting the conditions of real deployment. A comparison of Mel-Frequency Cepstral Coefficients (MFCCs) and Discrete Wavelet Transform-based (DWT) features was conducted, with classification performed by a Support Vector Machine (SVM) model. High accuracy was achieved for all data sets (sheep: 99.29%, cattle: 95.78%, dogs: 99.67%). Classification performance alone was insufficient to determine the most suitable feature extraction method for each data set; computational timing results revealed the DWT-based features to be markedly faster to produce (a 14.81-15.38% decrease in execution time). These results demonstrate a highly accurate livestock vocalisation classification algorithm, which forms the foundation of an automated livestock vocalisation detection system.
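
    A minimal sketch of the two feature pipelines compared, assuming librosa and PyWavelets for extraction and a scikit-learn SVM; the study's actual window sizes, wavelet family and SVM settings are not reproduced here.

    ```python
    import numpy as np
    import librosa
    import pywt
    from sklearn.svm import SVC

    def mfcc_features(y, sr):
        """Mean MFCC vector over the clip."""
        return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)

    def dwt_features(y, wavelet="db4", level=5):
        """Log energy of each DWT sub-band."""
        coeffs = pywt.wavedec(y, wavelet, level=level)
        return np.array([np.log(np.sum(c ** 2) + 1e-12) for c in coeffs])

    # With clips as (waveform, sample_rate) pairs and integer class labels
    # (placeholders for the real farm recordings):
    # X = np.stack([mfcc_features(y, sr) for y, sr in clips])
    # clf = SVC(kernel="rbf").fit(X, labels)
    ```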

    Identifying microphone from noisy recordings by using representative instance one class-classification approach

    Full text link
    The rapid growth of recording technology has created huge challenges for microphone forensics, a subcategory of audio forensic science, because of the availability of numerous digital recording devices and massive amounts of recording data. Demand for fast and efficient methods to assure the integrity and authenticity of information is becoming ever more important in criminal investigation. Machine learning has emerged as an important technique to support the audio analysis processes of microphone forensic practitioners. However, its application to real-life situations using supervised learning still faces great challenges, due to the expense of collecting data and updating the system. In this paper, we introduce a machine learning approach called One-class Classification (OCC) to microphone forensics and demonstrate its capability on a corpus of audio samples collected from several microphones. In addition, we propose a representative instance classification framework (RICF) that can effectively improve the performance of OCC algorithms on noisy recording signals. Experimental results and analysis indicate that OCC has the potential to benefit microphone forensic practitioners in developing new tools and techniques for effective and efficient analysis.
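
    As a hedged sketch of the one-class setting (not the paper's RICF itself), the Python below trains one scikit-learn OneClassSVM per known microphone on features of its recordings, then checks whether a questioned recording is consistent with each device; the random placeholder data stands in for real per-recording descriptors.

    ```python
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import OneClassSVM

    # Placeholder: {mic_id: (n_recordings, n_features)} feature arrays
    # extracted from each known microphone's reference recordings.
    rng = np.random.default_rng(2)
    feats_by_mic = {m: rng.normal(loc=m, size=(50, 13)) for m in range(3)}

    # One one-class model per device: each learns only what "recordings
    # from this microphone" look like, with no negative examples needed.
    models = {m: make_pipeline(StandardScaler(),
                               OneClassSVM(kernel="rbf", nu=0.05)).fit(X)
              for m, X in feats_by_mic.items()}

    query = rng.normal(loc=1, size=(1, 13))  # questioned recording's features
    for m, mdl in models.items():
        print(m, mdl.predict(query))         # +1 = consistent with mic m
    ```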