6 research outputs found

    Smart sampling of environmental audio recordings for biodiversity monitoring

    No full text
    This thesis contributes to the field of acoustic environmental monitoring by developing novel semi-automated methods of processing long audio recordings to conduct species richness surveys efficiently. These methods allow a machine to select a rich subset of the recordings through estimations of acoustic variety, which can then be presented to a human listener for species identification. This work represents a step towards more effective biodiversity monitoring of vocal species, performed at a larger scale than is possible with traditional methods.
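
    As a rough illustration of the idea, the sketch below greedily picks the segments that are most acoustically distinct from one another (farthest-point sampling). The feature vectors, segment counts, and the specific selection rule are illustrative assumptions, not the method described in the thesis.

    import numpy as np

    def diverse_subset(features, k):
        """Farthest-point sampling: indices of k mutually distant segments."""
        # Start from the segment farthest from the centroid of all segments.
        start = int(np.argmax(np.linalg.norm(features - features.mean(0), axis=1)))
        chosen = [start]
        dist = np.linalg.norm(features - features[start], axis=1)
        while len(chosen) < k:
            nxt = int(np.argmax(dist))   # farthest from everything chosen so far
            chosen.append(nxt)
            dist = np.minimum(dist, np.linalg.norm(features - features[nxt], axis=1))
        return chosen

    # e.g. 1440 one-minute segments, each summarised by 10 acoustic features
    rng = np.random.default_rng(0)
    segments = rng.normal(size=(1440, 10))
    print(diverse_subset(segments, k=5))   # segments to present to the listener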

    Clustering and visualization of long-duration audio recordings for rapid exploration avian surveys

    No full text
    Acoustic recordings have been shown to be an effective way to conduct avian species surveys, whereby a trained expert listens to the audio and records observations, a task that can be very time consuming. In practice, most identifications of species are first made by visual inspection of the spectrogram, with listening then performed for verification. This paper presents an approach that lets a surveyor rapidly scan long-duration recordings of environmental audio by automatically filtering out parts with low activity and repetitions of the same call types. Recordings are segmented into fixed-length, non-overlapping one-second clips. A classifier filters out segments of low activity using features that are robust to different levels of background noise. The non-silent segments are then clustered using a feature representation derived from Time-domain Cepstral Coefficients, calculated from the discrete Fourier transform of downsampled spectrogram rows. This time-invariant feature representation allows for arbitrary segmentation, which is advantageous because segmenting complex audio soundscapes into individual events is difficult and prone to errors. A visualization tool displays a representative segment from each cluster, grouped hierarchically, allowing an ecological researcher to rapidly scan through the entire variety of audio events that occurred throughout the long recording, without wasting time on silent portions or repetitions of the same call type. This tool provides functionality missing from both time-consuming audio players and black-box pattern recognizers, allowing conservation scientists to visually explore the entirety of their recordings.
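
    A minimal sketch of the pipeline as described: segment into one-second clips, drop low-activity clips, build a row-wise DFT feature loosely in the spirit of the Time-domain Cepstral Coefficients, and cluster hierarchically. The energy threshold, spectrogram parameters, and coefficient counts are assumptions for illustration, not the authors' settings.

    import numpy as np
    from scipy.signal import spectrogram
    from scipy.cluster.hierarchy import linkage, fcluster

    def segment_features(audio, sr, n_coeffs=16, energy_floor=1e-4):
        """Split audio into 1-s clips, drop low-activity clips, return features."""
        seg = sr
        feats, kept = [], []
        for i in range(len(audio) // seg):
            clip = audio[i * seg:(i + 1) * seg]
            if np.mean(clip ** 2) < energy_floor:     # crude low-activity filter
                continue
            _, _, spec = spectrogram(clip, fs=sr, nperseg=256)
            rows = spec[::4]                          # downsample frequency rows
            # DFT magnitude along time per row gives time-invariant coefficients
            coeffs = np.abs(np.fft.rfft(rows, axis=1))[:, :n_coeffs]
            feats.append(np.log1p(coeffs).ravel())
            kept.append(i)
        return np.array(feats), kept

    rng = np.random.default_rng(0)
    audio = rng.normal(scale=0.05, size=60 * 22050)   # stand-in for a recording
    feats, kept = segment_features(audio, sr=22050)
    labels = fcluster(linkage(feats, method="ward"), t=8, criterion="maxclust")
    # Pick one representative clip per cluster for the visualization tool.
    reps = {c: kept[int(np.argmax(labels == c))] for c in np.unique(labels)}
    print(reps)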

    Acoustic auto-encoders for biodiversity assessment

    No full text
    Continuous audio recordings are playing an ever more important role in conservation and biodiversity monitoring; however, listening to these recordings is often infeasible, as they can be thousands of hours long. Automating analysis using machine learning algorithms requires a feature representation. In this paper we propose a technique for learning a general feature representation from unlabelled audio using auto-encoders, which can be used for analysing environmental audio on a small timescale. We start by segmenting the audio data into non-overlapping 1-s long chunks and generating audio spectrograms. These audio spectrograms are then used to train a basic auto-encoder, with the output of the encoder network being used to generate the feature representation. We have found that at a 1-s timescale, our feature representation offers marginal improvements over “acoustic indices”, a common representation for analysing environmental audio.
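
    The sketch below shows the basic auto-encoder idea on flattened 1-s spectrograms, with the encoder output serving as the feature representation. Layer sizes, input shape and training settings are assumptions; the abstract does not specify them.

    import torch
    import torch.nn as nn

    class SpectrogramAE(nn.Module):
        def __init__(self, n_pixels=128 * 64, n_features=32):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(n_pixels, 256), nn.ReLU(),
                                         nn.Linear(256, n_features))
            self.decoder = nn.Sequential(nn.Linear(n_features, 256), nn.ReLU(),
                                         nn.Linear(256, n_pixels))

        def forward(self, x):
            z = self.encoder(x)              # z is the learned feature vector
            return self.decoder(z), z

    model = SpectrogramAE()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    specs = torch.rand(64, 128 * 64)         # stand-in batch of flattened spectrograms
    for _ in range(5):                       # reconstruction-only training loop
        recon, _ = model(specs)
        loss = nn.functional.mse_loss(recon, specs)
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        features = model.encoder(specs)      # per-second features for analysis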

    Analyzing big environmental audio with frequency preserving autoencoders

    No full text
    Continuous audio recordings are playing an ever more important role in conservation and biodiversity monitoring; however, listening to these recordings is often infeasible, as they can be thousands of hours long. Automating analysis using machine learning is in high demand, but these algorithms require a feature representation. Several methods for generating feature representations for these data have been developed, using techniques such as domain-specific features and deep learning. However, domain-specific features are unlikely to be an ideal representation of the data, and deep learning methods often require extensively labeled data. In this paper, we propose a method for generating a frequency-preserving, autoencoder-based feature representation for unlabeled ecological audio. We evaluate multiple frequency-preserving autoencoder-based feature representations on a sample hierarchical clustering task, comparing them to a basic autoencoder feature representation, MFCCs, and spectral acoustic indices. Experimental results show that some of these non-square autoencoder architectures compare well to these existing feature representations. This novel method for generating a feature representation for unlabeled ecological audio offers a fast, general way for ecologists to generate a feature representation of their audio that does not require extensively labeled data.
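
    A hedged sketch of what a frequency-preserving (non-square) autoencoder can look like: the convolutions and pooling act only along the time axis, so frequency resolution survives into the bottleneck. Kernel sizes, channel counts and input shape are assumptions for illustration, not the authors' architecture.

    import torch
    import torch.nn as nn

    class FreqPreservingAE(nn.Module):
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(1, 8, kernel_size=(1, 5), padding=(0, 2)), nn.ReLU(),
                nn.MaxPool2d(kernel_size=(1, 4)),     # pool time, keep frequency
                nn.Conv2d(8, 4, kernel_size=(1, 5), padding=(0, 2)), nn.ReLU(),
                nn.MaxPool2d(kernel_size=(1, 4)),
            )
            self.decoder = nn.Sequential(
                nn.Upsample(scale_factor=(1, 4)),
                nn.Conv2d(4, 8, kernel_size=(1, 5), padding=(0, 2)), nn.ReLU(),
                nn.Upsample(scale_factor=(1, 4)),
                nn.Conv2d(8, 1, kernel_size=(1, 5), padding=(0, 2)),
            )

        def forward(self, x):
            z = self.encoder(x)              # (batch, 4, n_freq, n_time / 16)
            return self.decoder(z), z

    model = FreqPreservingAE()
    spec = torch.rand(2, 1, 128, 64)         # (batch, channel, frequency, time)
    recon, z = model(spec)
    print(recon.shape, z.flatten(1).shape)   # features for hierarchical clustering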

    A Convolutional Neural Network Bird Species Recognizer Built From Little Data by Iteratively Training, Detecting, and Labeling

    No full text
    Automatically detecting the calls of species of interest in audio recordings is a common but often challenging exercise in ecoacoustics. This challenge is increasingly being tackled with deep neural networks, which generally require a rich set of training data. Often, the available training data might not be from the same geographical region as the study area and so may contain important differences. This mismatch between training and deployment datasets can impact accuracy at deployment, mainly due to confusing sounds absent from the training data generating false positives, as well as some variation in call types. We have developed a multiclass convolutional neural network classifier for seven target bird species to track the presence/absence of these species over time in cotton-growing regions. We started with no training data from cotton regions, but we did have an unbalanced library of calls from other locations. Due to the relative scarcity of calls in recordings from cotton regions, manually scanning and labeling the recordings was prohibitively time consuming. In this paper we describe our process of overcoming this data mismatch to develop a recognizer that performs well on the cotton recordings for most classes. The recognizer was trained on recordings from outside the cotton regions and then applied to unlabeled cotton recordings. Based on the resulting outputs, a verification set was chosen to be manually tagged and incorporated into the training set. By iterating this process, we gradually built up the training set of cotton audio examples. Through this process, we were able to increase the average class F1 score (the harmonic mean of precision and recall) of the recognizer on target recordings from 0.45 in the first iteration to 0.74.
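
    The iterative train-detect-verify loop might be sketched as follows; `train`, `predict`, and `manually_verify` are hypothetical placeholders standing in for the authors' CNN training, inference on unlabeled cotton recordings, and human verification of top-scoring detections.

    def train(labeled):                       # placeholder: fit the CNN on labeled clips
        return {"n_examples": len(labeled)}

    def predict(model, clips):                # placeholder: score unlabeled clips
        return [(clip, 0.9) for clip in clips]

    def manually_verify(detections, budget):  # placeholder: human tags top outputs
        top = sorted(detections, key=lambda d: -d[1])[:budget]
        return [(clip, "verified_label") for clip, _ in top]

    labeled = [("source_region_clip_%d" % i, "species_a") for i in range(100)]
    unlabeled = ["cotton_clip_%d" % i for i in range(1000)]

    for iteration in range(3):                # each pass grows the cotton training set
        model = train(labeled)
        detections = predict(model, unlabeled)
        verified = manually_verify(detections, budget=50)
        labeled += verified                   # fold verified cotton examples back in
        done = {clip for clip, _ in verified}
        unlabeled = [c for c in unlabeled if c not in done]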

    The Australian Acoustic Observatory

    No full text
    Fauna surveys are traditionally manual, and hence limited in scale, expensive and labour-intensive. Low-cost hardware and storage mean that acoustic recording now has the potential to efficiently build scale in terrestrial fauna surveys, both spatially and temporally. With this aim, we have constructed the Australian Acoustic Observatory. It provides a direct and permanent record of terrestrial soundscapes through continuous recording across Australian ecoregions, including those periodically subject to fire and flood, when manual surveys are dangerous or impossible. The observatory comprises 360 permanent listening stations deployed across Australia. Groups of four sensors are deployed at each of 90 sites, placed strategically across ecoregions, to provide representative datasets of soundscapes. Each station continuously records sound, resulting in year-round data collection. All data are made freely available under an open access licence. The Australian Acoustic Observatory is the world's first terrestrial acoustic observatory of this size. It provides continental-scale environmental monitoring of unparalleled spatial extent, temporal resolution and archival stability. It enables new approaches to understanding ecosystems, long-term environmental change, data visualization and acoustic science that will only increase in scientific value over time, particularly as others replicate the design in other parts of the world.
