
    Mosquito Detection with Neural Networks: The Buzz of Deep Learning

    Many real-world time-series analysis problems are characterised by scarce data. Solutions typically rely on hand-crafted features extracted from the time or frequency domain allied with classification or regression engines which condition on this (often low-dimensional) feature vector. The huge advances enjoyed by many application domains in recent years have been fuelled by the use of deep learning architectures trained on large data sets. This paper presents an application of deep learning for acoustic event detection in a challenging, data-scarce, real-world problem. Our candidate challenge is to accurately detect the presence of a mosquito from its acoustic signature. We develop convolutional neural networks (CNNs) operating on wavelet transformations of audio recordings. Furthermore, we interrogate the network's predictive power by visualising statistics of network-excitatory samples. These visualisations offer a deep insight into the relative informativeness of components in the detection problem. We include comparisons with conventional classifiers, conditioned on both hand-tuned and generic features, to stress the strength of automatic deep feature learning. Detection is achieved with performance metrics significantly surpassing those of existing algorithmic methods, as well as marginally exceeding those attained by individual human experts.
    Comment: For data and software related to this paper, see http://humbug.ac.uk/kiskin2017/. Submitted as a conference paper to ECML 201
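    The wavelet-input idea behind this abstract can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the Morlet mother wavelet, sampling rate, tone frequency, and scale range are all illustrative assumptions. The scalogram it produces is the kind of 2-D time-frequency "image" a CNN could take as input.

    ```python
    import numpy as np

    def morlet_scalogram(signal, scales, w0=6.0):
        """Naive continuous wavelet transform with a Morlet mother wavelet.

        Returns a (len(scales), len(signal)) magnitude array -- a 2-D
        time-frequency image suitable as CNN input.
        """
        n = len(signal)
        out = np.empty((len(scales), n))
        for i, s in enumerate(scales):
            # Discretise the wavelet on a support proportional to the scale.
            t = np.arange(-4 * s, 4 * s + 1)
            wavelet = np.exp(1j * w0 * t / s) * np.exp(-0.5 * (t / s) ** 2)
            wavelet /= np.sqrt(s)
            out[i] = np.abs(np.convolve(signal, wavelet, mode="same"))
        return out

    # A synthetic 600 Hz tone standing in for a mosquito wingbeat signature
    # (hypothetical value, chosen only for illustration).
    fs = 8000
    t = np.arange(fs) / fs
    tone = np.sin(2 * np.pi * 600 * t)
    scalogram = morlet_scalogram(tone, scales=np.arange(1, 33))
    print(scalogram.shape)  # (32, 8000)
    ```

    In practice one would use an optimised CWT implementation (e.g. PyWavelets) rather than this direct convolution loop.
    
    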

    Feature Selection for Big Visual Data: Overview and Challenges

    International Conference on Image Analysis and Recognition (ICIAR 2018), Póvoa de Varzim, Portugal

    On Using Backpropagation for Speech Texture Generation and Voice Conversion

    Inspired by recent work on neural network image generation which relies on backpropagation towards the network inputs, we present a proof-of-concept system for speech texture synthesis and voice conversion based on two mechanisms: approximate inversion of the representation learned by a speech recognition neural network, and matching statistics of neuron activations between different source and target utterances. Similar to image texture synthesis and neural style transfer, the system works by optimizing a cost function with respect to the input waveform samples. To this end we use a differentiable mel-filterbank feature extraction pipeline and train a convolutional CTC speech recognition network. Our system is able to extract speaker characteristics from very limited amounts of target speaker data, as little as a few seconds, and can be used to generate realistic speech babble or reconstruct an utterance in a different voice.
    Comment: Accepted to ICASSP 201
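    The core mechanism of optimizing an input to match activation statistics can be sketched in miniature. This is not the paper's system: a single frozen softplus layer stands in for the deep recogniser, a random vector stands in for the waveform, and the matched statistic is simply the per-neuron activation. Only the input is updated, by gradient descent on the statistic-matching loss.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # A tiny frozen "network" standing in for the deep recogniser: one
    # softplus layer (a smooth stand-in for its activations). Weights are
    # never updated; only the input vector is optimised.
    W = rng.standard_normal((16, 64)) * 0.1

    def softplus(h):
        return np.log1p(np.exp(h))

    def stats(x):
        """Per-neuron activation statistic to be matched."""
        return softplus(W @ x)

    target = rng.standard_normal(64)   # stand-in "target utterance"
    t_stats = stats(target)

    x = rng.standard_normal(64)        # "waveform" being synthesised
    init_loss = 0.5 * np.sum((stats(x) - t_stats) ** 2)

    lr = 1.0
    for _ in range(1000):
        h = W @ x
        diff = softplus(h) - t_stats
        # Analytic gradient of 0.5 * ||stats(x) - t_stats||^2 w.r.t. x;
        # the derivative of softplus is the logistic sigmoid.
        sig = 1.0 / (1.0 + np.exp(-h))
        x -= lr * (W.T @ (diff * sig))

    final_loss = 0.5 * np.sum((stats(x) - t_stats) ** 2)
    print(init_loss, final_loss)
    ```

    The same update rule, applied through a full differentiable feature pipeline and recognition network, is what lets the described system reshape a waveform towards a target speaker's statistics.
    
    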

    Open the Black Box – Visualising CNN to Understand Its Decisions on Road Network Performance Level

    Visualisation helps explain the operating mechanisms of deep learning models, but its applications are rarely seen in traffic analysis. This paper employs a convolutional neural network (CNN) to evaluate road network performance level (NPL) and visualises the model to enlighten how it works. A dataset of an urban road network covering a whole year is used to produce performance maps to train a CNN. In this process, a pretrained network is introduced to overcome the common issue of inadequacy of data in transportation research. Gradient weighted class activation mapping (Grad-CAM) is applied to visualise the CNN, and four visualisation experiments are conducted. The results illustrate that the CNN focuses on different areas when it identifies the road network as different NPLs, implying which region contributes the most to the deteriorating performance. There are particular visual patterns when the road network transits from one NPL to another, which may help performance prediction. Misclassified samples are analysed to determine how the CNN fails to make the right decisions, exposing the model's deficiencies. The results indicate visualisation's potential to contribute to comprehensive management strategies and effective model improvement.
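    The Grad-CAM step this abstract relies on is compact enough to sketch directly. Given the last conv layer's feature maps and the gradient of the class score with respect to them, each map is weighted by its spatially averaged gradient, the maps are summed, and a ReLU keeps only positive evidence. The arrays below are synthetic stand-ins; the shapes are illustrative assumptions.

    ```python
    import numpy as np

    def grad_cam(activations, gradients):
        """Grad-CAM heatmap from one conv layer.

        activations, gradients: arrays of shape (K, H, W) -- the K feature
        maps of the last conv layer and the gradient of the class score
        (here, an NPL score) with respect to them.
        """
        alphas = gradients.mean(axis=(1, 2))             # one weight per map
        cam = np.tensordot(alphas, activations, axes=1)  # weighted sum -> (H, W)
        return np.maximum(cam, 0.0)                      # ReLU: positive evidence only

    rng = np.random.default_rng(0)
    A = rng.random((8, 7, 7))           # stand-in feature maps
    G = rng.standard_normal((8, 7, 7))  # stand-in score gradients
    heatmap = grad_cam(A, G)
    print(heatmap.shape)  # (7, 7)
    ```

    Upsampling the heatmap to the input's resolution and overlaying it on the performance map is what localises which region of the road network drives each NPL decision.
    
    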