36 research outputs found

    Mosquito Detection with Neural Networks: The Buzz of Deep Learning

    Many real-world time-series analysis problems are characterised by scarce data. Solutions typically rely on hand-crafted features extracted from the time or frequency domain, allied with classification or regression engines which condition on this (often low-dimensional) feature vector. The huge advances enjoyed by many application domains in recent years have been fuelled by the use of deep learning architectures trained on large data sets. This paper presents an application of deep learning for acoustic event detection in a challenging, data-scarce, real-world problem. Our candidate challenge is to accurately detect the presence of a mosquito from its acoustic signature. We develop convolutional neural networks (CNNs) operating on wavelet transformations of audio recordings. Furthermore, we interrogate the network's predictive power by visualising statistics of network-excitatory samples. These visualisations offer a deep insight into the relative informativeness of components in the detection problem. We include comparisons with conventional classifiers, conditioned on both hand-tuned and generic features, to stress the strength of automatic deep feature learning. Detection is achieved with performance metrics significantly surpassing those of existing algorithmic methods, as well as marginally exceeding those attained by individual human experts.
    Comment: For data and software related to this paper, see http://humbug.ac.uk/kiskin2017/. Submitted as a conference paper to ECML 2017.
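    The abstract describes CNNs operating on wavelet transformations of audio. As a rough, hedged sketch of that pipeline (not the authors' code), one could compute a wavelet scalogram per audio window and score it with a small binary CNN; the window length, wavelet, scales, and network layout below are illustrative assumptions.

        # Sketch only: scalogram of a short audio window scored by a small CNN.
        import numpy as np
        import pywt
        import torch
        import torch.nn as nn

        def scalogram(window, scales=np.arange(1, 65), wavelet="morl"):
            """Magnitude of the continuous wavelet transform of a 1-D audio window."""
            coeffs, _ = pywt.cwt(window, scales, wavelet)
            return np.abs(coeffs).astype(np.float32)       # shape: (n_scales, n_samples)

        class MosquitoCNN(nn.Module):
            def __init__(self):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
                )
                self.classifier = nn.Linear(32, 2)         # mosquito vs. background

            def forward(self, x):                          # x: (batch, 1, scales, time)
                return self.classifier(self.features(x).flatten(1))

        # Example: score a 0.1 s window sampled at 8 kHz (hypothetical values).
        window = np.random.randn(800)
        x = torch.from_numpy(scalogram(window)).unsqueeze(0).unsqueeze(0)
        logits = MosquitoCNN()(x)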

    Unsupervised classification to improve the quality of a bird song recording dataset

    Open audio databases such as Xeno-Canto are widely used to build datasets to explore bird song repertoire or to train models for automatic bird sound classification by deep learning algorithms. However, such databases suffer from the fact that bird sounds are weakly labelled: a species name is attributed to each audio recording without timestamps that provide the temporal localization of the bird song of interest. Manual annotations can solve this issue, but they are time-consuming, expert-dependent, and cannot run on large datasets. Another solution consists of using a labelling function that automatically segments audio recordings before assigning a label to each segmented audio sample. Although labelling functions were introduced to expedite strong label assignment, their classification performance remains mostly unknown. To address this issue and reduce label noise (wrong label assignment) in large bird song datasets, we introduce a novel, data-centric labelling function composed of three successive steps: 1) time-frequency sound unit segmentation, 2) feature computation for each sound unit, and 3) classification of each sound unit as bird song or noise with either an unsupervised DBSCAN algorithm or the supervised BirdNET neural network. The labelling function was optimized, validated, and tested on the songs of 44 West-Palearctic common bird species. We first showed that the segmentation of bird songs alone aggregated from 10% to 83% label noise, depending on the species. We also demonstrated that our labelling function was able to significantly reduce the initial label noise present in the dataset by up to a factor of three. Finally, we discuss different opportunities to design suitable labelling functions to build high-quality animal vocalization datasets with minimal expert annotation effort.
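    The three-step labelling function described above (sound-unit segmentation, per-unit features, unsupervised classification) can be illustrated with a hedged sketch; the segmentation threshold, feature choices, and DBSCAN parameters below are assumptions for demonstration, not the paper's values.

        # Sketch only: segment a recording into sound units, describe each unit with
        # simple spectral features, and let DBSCAN flag outliers as candidate noise.
        import numpy as np
        import librosa
        from sklearn.cluster import DBSCAN

        def label_sound_units(path, sr=22050, top_db=30, eps=0.8, min_samples=5):
            y, sr = librosa.load(path, sr=sr)

            # 1) Segmentation: split the recording into non-silent sound units.
            intervals = librosa.effects.split(y, top_db=top_db)

            # 2) Features: a few spectral descriptors per sound unit.
            feats = []
            for start, end in intervals:
                unit = y[start:end]
                feats.append([
                    librosa.feature.spectral_centroid(y=unit, sr=sr).mean(),
                    librosa.feature.spectral_bandwidth(y=unit, sr=sr).mean(),
                    librosa.feature.zero_crossing_rate(unit).mean(),
                    float(end - start) / sr,               # duration in seconds
                ])
            feats = np.asarray(feats)
            feats = (feats - feats.mean(0)) / (feats.std(0) + 1e-9)

            # 3) Unsupervised classification: DBSCAN outliers (label -1) are treated
            #    as candidate noise, clustered units as candidate bird song.
            labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(feats)
            return intervals, labels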

    Automated call detection for acoustic surveys with structured calls of varying length

    Funding: Y.W. is partly funded by the China Scholarship Council (CSC) for Ph.D. study at the University of St Andrews, UK.
    1. When recorders are used to survey acoustically conspicuous species, identification of calls of the target species in recordings is essential for estimating density and abundance. We investigate how well deep neural networks identify vocalisations consisting of phrases of varying lengths, each containing a variable number of syllables. We use recordings of Hainan gibbon (Nomascus hainanus) vocalisations to develop and test the methods.
    2. We propose two methods for exploiting the two-level structure of such data. The first combines convolutional neural network (CNN) models with a hidden Markov model (HMM) and the second uses a convolutional recurrent neural network (CRNN). Both models learn acoustic features of syllables via a CNN and the temporal combination of syllables into phrases either via an HMM or a recurrent network. We compare their performance to the commonly used CNNs LeNet and VGGNet, and to a support vector machine (SVM). We also propose a dynamic programming method to evaluate how well phrases are predicted, which is useful for evaluating performance when vocalisations are labelled by phrases rather than syllables.
    3. Our methods perform substantially better than the commonly used methods when applied to the gibbon acoustic recordings. The CRNN has an F-score of 90% on phrase prediction, which is 18% higher than the best of the SVM, LeNet, and VGGNet methods. HMM post-processing raised the F-score of these last three methods to as much as 87%. The number of phrases is overestimated by the CNNs and SVM, leading to error rates between 49% and 54%. With the HMM, these error rates can be reduced to as little as 0.4%. Similarly, the error rate of the CRNN's prediction is no more than 0.5%.
    4. CRNNs are better at identifying phrases of varying lengths composed of a varying number of syllables than simpler CNN or SVM models. We find a CRNN model to be best at this task, with a CNN combined with an HMM performing almost as well. We recommend that these kinds of models be used for species whose vocalisations are structured into phrases of varying lengths.
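    As a minimal, hedged sketch of the CRNN idea summarised in point 2 (not the paper's implementation), a small CNN can extract per-frame syllable features from a spectrogram and a recurrent layer can model how syllables string together into phrases; the layer sizes and frame-level binary output below are illustrative assumptions.

        # Sketch only: CNN front-end over a mel spectrogram + bidirectional GRU,
        # producing a per-frame call / no-call prediction.
        import torch
        import torch.nn as nn

        class CRNN(nn.Module):
            def __init__(self, n_mels=64, hidden=128, n_classes=2):
                super().__init__()
                self.cnn = nn.Sequential(
                    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
                    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
                )
                self.rnn = nn.GRU(32 * (n_mels // 4), hidden, batch_first=True,
                                  bidirectional=True)
                self.out = nn.Linear(2 * hidden, n_classes)

            def forward(self, spec):                       # spec: (batch, 1, n_mels, time)
                z = self.cnn(spec)                         # (batch, 32, n_mels // 4, time)
                z = z.permute(0, 3, 1, 2).flatten(2)       # (batch, time, 32 * n_mels // 4)
                z, _ = self.rnn(z)                         # temporal model over frames
                return self.out(z)                         # (batch, time, n_classes)

        # Example: frame-wise predictions for a 64-mel, 500-frame spectrogram.
        logits = CRNN()(torch.randn(1, 1, 64, 500))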

    Searching for periodic signals in kinematic distributions using continuous wavelet transforms

    Many models of physics beyond the Standard Model include towers of particles whose masses follow an approximately periodic pattern with little spacing between them. These resonances might be too weak to detect individually, but could be discovered as a group by looking for periodic signals in kinematic distributions. The continuous wavelet transform, which indicates how much a given frequency is present in a signal at a given time, is an ideal tool for this. In this paper, we present a series of methods through which continuous wavelet transforms can be used to discover periodic signals in kinematic distributions. Some of these methods are based on a simple test statistic, while others make use of machine learning techniques. Some of the methods are meant to be used with a particular model in mind, while others are model-independent. We find that continuous wavelet transforms can give bounds comparable to current searches and, in some cases, be sensitive to signals that would go undetected by standard experimental strategies.
    Comment: 22 pages, 7 figures, matches version published in EPJ
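    To make the basic idea concrete, the following hedged sketch builds a toy binned kinematic distribution with an injected periodic modulation, forms a residual against a smooth background estimate, and inspects its continuous wavelet transform; the toy background, injected signal, and wavelet scales are assumptions chosen for illustration, not taken from the paper.

        # Sketch only: CWT of the background-subtracted residual of a toy mass spectrum.
        import numpy as np
        import pywt

        bins = np.linspace(200.0, 2000.0, 400)             # toy "mass" bin centres in GeV
        background = 1e5 * np.exp(-bins / 300.0)           # smooth falling background
        signal = 0.02 * background * np.sin(2 * np.pi * bins / 150.0)  # periodic bumps
        counts = np.random.poisson(background + signal)

        # Residual relative to the smooth expectation, in units of its statistical error.
        residual = (counts - background) / np.sqrt(background)

        # Large |coefficients| at a given scale and bin indicate that a periodicity of
        # roughly that scale is present around that mass value.
        scales = np.arange(2, 64)
        coeffs, freqs = pywt.cwt(residual, scales, "morl")
        power = np.abs(coeffs) ** 2                        # (n_scales, n_bins) map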

    Improving Template-Based Bird Sound Identification

    Automatic bird sound recognition has been studied by computer scientists since the late 1990s. Various techniques have been exploited, but no general method that comes even close to matching the performance of a human expert has yet been developed. In this thesis, the subject is approached by reviewing alternative methods to cross-correlation as a similarity measure between two signals in template-based bird sound recognition models. Template-specific binary classification models are fit with different methods and their performance is compared. The methods considered are template averaging and processing before applying cross-correlation, the use of texture features as additional predictors, and feature extraction through transfer learning with convolutional neural networks. It is shown that the classification performance of template-specific models can be improved by template refinement and by utilizing neural networks' ability to automatically extract relevant features from bird sound spectrograms.
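    The cross-correlation baseline that this thesis builds on can be sketched as follows; this is a hedged illustration rather than the thesis's implementation, and the STFT parameters and the use of skimage's match_template for normalized cross-correlation are assumptions for demonstration.

        # Sketch only: peak normalized cross-correlation of a template spectrogram
        # slid over the spectrogram of a longer recording.
        import numpy as np
        import librosa
        from skimage.feature import match_template

        def template_score(recording, template, sr=22050, n_fft=1024, hop=256):
            """Peak normalized cross-correlation of a template over a recording."""
            def logspec(y):
                S = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop))
                return librosa.amplitude_to_db(S, ref=np.max)

            S_rec, S_tpl = logspec(recording), logspec(template)
            # match_template returns the normalized cross-correlation at every offset;
            # its maximum serves as the template-specific detection score.
            ncc = match_template(S_rec, S_tpl)
            return float(ncc.max())

    A template-specific binary classifier of the kind the abstract describes could then condition on this score, possibly alongside texture or learned features.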