1,591 research outputs found
Automatic Environmental Sound Recognition: Performance versus Computational Cost
In the context of the Internet of Things (IoT), sound sensing applications
are required to run on embedded platforms where notions of product pricing and
form factor impose hard constraints on the available computing power. Whereas
Automatic Environmental Sound Recognition (AESR) algorithms are most often
developed with limited consideration for computational cost, this article seeks
which AESR algorithm can make the most of a limited amount of computing power
by comparing the sound classification performance em as a function of its
computational cost. Results suggest that Deep Neural Networks yield the best
ratio of sound classification accuracy across a range of computational costs,
while Gaussian Mixture Models offer a reasonable accuracy at a consistently
small cost, and Support Vector Machines stand between both in terms of
compromise between accuracy and computational cost
Audio-based anomaly detection on edge devices via self-supervision and spectral analysis
In real-world applications, audio surveillance is often performed by large models that can detect many types of anomalies. However, typical approaches are based on centralized solutions characterized by significant issues related to privacy and data transport costs. In addition, the large size of these models prevented a shift to contexts with limited resources, such as edge devices computing. In this work we propose conv-SPAD, a method for convolutional SPectral audio-based Anomaly Detection that takes advantage of common tools for spectral analysis and a simple autoencoder to learn the underlying condition of normality of real scenarios. Using audio data collected from real scenarios and artificially corrupted with anomalous sound events, we test the ability of the proposed model to learn normal conditions and detect anomalous events. It shows performances in line with larger models, often outperforming them. Moreover, the model’s small size makes it usable in contexts with limited resources, such as edge devices hardware
Deep Recurrent Neural Network-Based Autoencoders for Acoustic Novelty Detection
In the emerging field of acoustic novelty detection, most research efforts are devoted to probabilistic approaches such as mixture models or state-space models. Only recent studies introduced (pseudo-)generative models for acoustic novelty detection with recurrent neural networks in the form of an autoencoder. In these approaches, auditory spectral features of the next short term frame are predicted from the previous frames by means of Long-Short Term Memory recurrent denoising autoencoders. The reconstruction error between the input and the output of the autoencoder is used as activation signal to detect novel events. There is no evidence of studies focused on comparing previous efforts to automatically recognize novel events from audio signals and giving a broad and in depth evaluation of recurrent neural network-based autoencoders. The present contribution aims to consistently evaluate our recent novel approaches to fill this white spot in the literature and provide insight by extensive evaluations carried out on three databases: A3Novelty, PASCAL CHiME, and PROMETHEUS. Besides providing an extensive analysis of novel and state-of-the-art methods, the article shows how RNN-based autoencoders outperform statistical approaches up to an absolute improvement of 16.4% average F-measure over the three databases
Motor simulation via coupled internal models using sequential Monte Carlo
We describe a generative Bayesian model for action understanding in which inverse-forward internal model pairs are considered \u27hypotheses\u27 of plausible action goals that are explored in parallel via an approximate inference mechanism based on sequential Monte Carlo methods. The reenactment of internal model pairs can be considered a form of motor simulation, which supports both perceptual prediction and action understanding at the goal level. However, this procedure is generally considered to be computationally inefficient. We present a model that dynamically reallocates computational resources to more accurate internal models depending on both the available prior information and the prediction error of the inverse-forward models, and which leads to successful action recognition. We present experimental results that test the robustness and efficiency of our model in real-world scenarios
- …