Mosquito Detection with Neural Networks: The Buzz of Deep Learning
Many real-world time-series analysis problems are characterised by scarce
data. Solutions typically rely on hand-crafted features extracted from the time
or frequency domain allied with classification or regression engines which
condition on this (often low-dimensional) feature vector. The huge advances
enjoyed by many application domains in recent years have been fuelled by the
use of deep learning architectures trained on large data sets. This paper
presents an application of deep learning for acoustic event detection in a
challenging, data-scarce, real-world problem. Our candidate challenge is to
accurately detect the presence of a mosquito from its acoustic signature. We
develop convolutional neural networks (CNNs) operating on wavelet
transformations of audio recordings. Furthermore, we interrogate the network's
predictive power by visualising statistics of network-excitatory samples. These
visualisations offer a deep insight into the relative informativeness of
components in the detection problem. We include comparisons with conventional
classifiers, conditioned on both hand-tuned and generic features, to stress the
strength of automatic deep feature learning. Detection is achieved with
performance metrics significantly surpassing those of existing algorithmic
methods, as well as marginally exceeding those attained by individual human
experts.
Comment: For data and software related to this paper, see
http://humbug.ac.uk/kiskin2017/. Submitted as a conference paper to ECML 201
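The abstract above does not specify the wavelet front end in detail; as an illustration only, here is a minimal numpy sketch of the kind of continuous-wavelet scalogram a CNN could consume as an "image". The Morlet wavelet, the scale grid, and every constant below are assumptions for the sketch, not the paper's actual configuration:

```python
import numpy as np

def morlet(n, scale, w0=6.0):
    """Complex Morlet wavelet sampled at n points for a given scale."""
    t = (np.arange(n) - n // 2) / scale
    return np.exp(1j * w0 * t) * np.exp(-t**2 / 2) / np.sqrt(scale)

def scalogram(signal, scales, wavelet_len=256):
    """|CWT| magnitudes: one row per scale, one column per sample."""
    rows = []
    for s in scales:
        kernel = morlet(wavelet_len, s)
        # convolve the signal with the scaled wavelet (same-length output)
        rows.append(np.abs(np.convolve(signal, kernel, mode="same")))
    return np.stack(rows)  # shape (n_scales, n_samples): a 2-D CNN input

# toy example: a 600 Hz tone sampled at 8 kHz for one second
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 600 * t)
S = scalogram(x, scales=np.geomspace(2, 64, 32))
print(S.shape)  # (32, 8000)
```

The row with the largest energy sits at the scale whose Morlet centre frequency matches the tone, which is the property that makes such scalograms informative inputs for a convolutional detector.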
Robust sound event detection in bioacoustic sensor networks
Bioacoustic sensors, sometimes known as autonomous recording units (ARUs),
can record sounds of wildlife over long periods of time in scalable and
minimally invasive ways. Deriving per-species abundance estimates from these
sensors requires detection, classification, and quantification of animal
vocalizations as individual acoustic events. Yet, variability in ambient noise,
both over time and across sensors, hinders the reliability of current automated
systems for sound event detection (SED), such as convolutional neural networks
(CNN) in the time-frequency domain. In this article, we develop, benchmark, and
combine several machine listening techniques to improve the generalizability of
SED models across heterogeneous acoustic environments. As a case study, we
consider the problem of detecting avian flight calls from a ten-hour recording
of nocturnal bird migration, recorded by a network of six ARUs in the presence
of heterogeneous background noise. Starting from a CNN yielding
state-of-the-art accuracy on this task, we introduce two noise adaptation
techniques, respectively integrating short-term (60 milliseconds) and long-term
(30 minutes) context. First, we apply per-channel energy normalization (PCEN)
in the time-frequency domain, which applies short-term automatic gain control
to every subband in the mel-frequency spectrogram. Second, we replace the
last dense layer in the network with a context-adaptive neural network (CA-NN)
layer. Combining them yields state-of-the-art results that are unmatched by
artificial data augmentation alone. We release a pre-trained version of our
best performing system under the name of BirdVoxDetect, a ready-to-use detector
of avian flight calls in field recordings.
Comment: 32 pages, in English. Submitted to PLOS ONE journal in February 2019;
revised August 2019; published October 201
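The PCEN transform this abstract applies is defined in the literature as a per-band automatic gain control followed by root compression. The numpy sketch below implements that published recipe on an arbitrary spectrogram; the constants are typical defaults for illustration, not the values used by BirdVoxDetect:

```python
import numpy as np

def pcen(E, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Per-channel energy normalization of a magnitude spectrogram.

    E: array of shape (n_bands, n_frames), non-negative energies.
    A first-order IIR filter M tracks the slowly varying loudness of
    each band; dividing by (eps + M)**alpha acts as per-band automatic
    gain control, and (x + delta)**r - delta**r compresses dynamics.
    M is initialised with the first frame, one simple choice.
    """
    M = np.empty_like(E)
    M[:, 0] = E[:, 0]
    for t in range(1, E.shape[1]):
        M[:, t] = (1 - s) * M[:, t - 1] + s * E[:, t]
    return (E / (eps + M) ** alpha + delta) ** r - delta ** r

# toy spectrogram: one loud band and one quiet band, constant over time
E = np.vstack([np.full(100, 100.0), np.full(100, 0.01)])
P = pcen(E)
print(P[:, -1])  # the two bands end up on a comparable scale
```

The gain-control step is what gives PCEN its robustness to per-sensor loudness differences: a 10,000-fold energy gap between the two toy bands shrinks to roughly the same order of magnitude after normalization.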
Musical notes classification with Neuromorphic Auditory System using FPGA and a Convolutional Spiking Network
In this paper, we explore the capabilities of a sound
classification system that combines both a novel FPGA cochlear
model implementation and a bio-inspired technique based on a
trained convolutional spiking network. The neuromorphic
auditory system that is used in this work produces a form of
representation that is analogous to the spike outputs of the
biological cochlea. The auditory system has been developed using
a set of spike-based processing building blocks in the frequency
domain. These form a set of band-pass filters in the spike domain
that split the audio information into 128 frequency channels, 64
for each of two audio sources. Address Event Representation
(AER) is used to connect the auditory system to the
convolutional spiking network. A convolutional spiking network
layer is developed and trained on a computer with the ability
to detect two kinds of sound: artificial pure tones in the presence
of white noise and electronic musical notes. After the training
process, the presented system is able to distinguish the different
sounds in real time, even in the presence of white noise.
Funding: Ministerio de Economía y Competitividad TEC2012-37868-C04-0
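The FPGA cochlea itself cannot be reproduced here, but the spike-based representation such a front end produces can be illustrated. The following numpy sketch uses a leaky integrate-and-fire neuron to turn a per-band drive signal into a spike train; the threshold and leak values are illustrative assumptions, not parameters from the paper:

```python
import numpy as np

def lif_encode(drive, threshold=1.0, leak=0.9):
    """Leaky integrate-and-fire encoding of a 1-D drive signal.

    The membrane potential leaks toward zero, integrates the input,
    and emits a spike (1) whenever it crosses the threshold, after
    which it resets -- so stronger drive yields a higher spike rate.
    """
    v = 0.0
    spikes = np.zeros(len(drive), dtype=int)
    for i, x in enumerate(drive):
        v = leak * v + x
        if v >= threshold:
            spikes[i] = 1
            v = 0.0
    return spikes

# a strongly driven band spikes more often than a weakly driven one
strong = lif_encode(np.full(200, 0.5))
weak = lif_encode(np.full(200, 0.15))
print(strong.sum(), weak.sum())
```

Encoding band energies as spike rates in this fashion is what allows the downstream convolutional spiking layer to operate purely on discrete events rather than on sampled waveforms.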
An Efficient Deep Learning-based approach for Recognizing Agricultural Pests in the Wild
One of the biggest challenges farmers face is fighting insect pests
during agricultural production. The problem can be mitigated, and economic
losses avoided, by taking timely preventive measures, which requires
identifying insect pests easily and effectively. Most insect species closely
resemble one another, and without proper help from an agricultural
specialist it is very challenging for farmers to identify crop pests
accurately. To address this issue, we have conducted extensive experiments
with different methods to determine the best among them. This paper presents
a detailed overview of experiments conducted mainly on a robust dataset
named IP102, covering transfer learning with fine-tuning, attention
mechanisms, and a custom architecture. Examples from another dataset, D0,
are also included to demonstrate the robustness of the evaluated techniques
FR-ResNet s for Insect Pest Recognition
Insect pests are among the main threats to commercially important crops, and an effective insect pest recognition method can help avoid economic losses. In this paper, we propose a new, simple structure based on the original residual block, named the feature reuse residual block, which combines features from the input signal of a residual block with the residual signal. Each feature reuse residual block enhances representational capacity by learning half of its features and reusing the other half. By stacking feature reuse residual blocks, we obtain the feature reuse residual network (FR-ResNet) and evaluate its performance on the IP102 benchmark dataset. The experimental results show that FR-ResNet achieves a significant performance improvement on insect pest classification. Moreover, to demonstrate the adaptability of our approach, we apply it to various residual networks, including ResNet, Pre-ResNet, and WRN, and test the performance on a series of benchmark datasets: CIFAR-10, CIFAR-100, and SVHN. The experimental results show clear improvements over the original networks. Together, these experiments on the CIFAR-10, CIFAR-100, SVHN, and IP102 benchmark datasets demonstrate the effectiveness of our approach
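The abstract gives no equations for the feature reuse residual block, so the following numpy sketch is only one plausible reading of "learning half and reusing half": half of the output channels pass the input features through unchanged, while the other half are a learned residual transform. The toy 1x1 "convolution", the channel split, and all shapes are illustrative assumptions, not the paper's definition:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, w):
    """Toy per-position linear map over channels, with ReLU."""
    return np.maximum(x @ w, 0.0)

def feature_reuse_block(x, w):
    """Hypothetical feature-reuse residual block (one interpretation):
    the first half of the output channels reuse the input directly;
    the second half are learned, with a residual connection added."""
    c = x.shape[-1] // 2
    reused = x[..., :c]                    # reuse half of the input features
    learned = conv1x1(x, w) + x[..., c:]   # learn the other half residually
    return np.concatenate([reused, learned], axis=-1)

# toy feature map: 10 spatial positions, 8 channels
x = rng.normal(size=(10, 8))
w = rng.normal(size=(8, 4)) * 0.1
y = feature_reuse_block(x, w)
print(y.shape)  # (10, 8): same shape, so blocks can be stacked
```

Because the block preserves the channel count, it can be stacked like an ordinary residual block, which is consistent with the abstract's claim that FR-ResNet is built by stacking these units.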