14,124 research outputs found

    Estimating Avian Populations with Passive Acoustic Technology and Song Behavior

    The need for improvements in avian wildlife monitoring efficiency, accuracy, and scope has led to the use of new technologies such as autonomous recording units (ARUs). As a monitoring tool, passive acoustic recording has numerous benefits, but it is still limited to human-accessible areas, leaving a need for monitoring technologies in areas that cannot be reached on foot. Military installations, which host a disproportionately large number of threatened, endangered, and at-risk species compared to other federal lands, exemplify this accessibility problem, with sizeable impact areas that are too hazardous for humans to enter. This thesis introduces the Balloon Aerial Recording System (BARS), a novel technology that fuses acoustic and aerial strategies to address the problem of ground-based land accessibility. The primary objectives of this thesis were to create models that could be used to predict male songbird abundance from song cue-count data and to establish and implement an analytical pathway for bird population estimation from acoustic data recorded with the BARS. ARUs were used to study the song rates and behaviors of Prairie Warbler (Setophaga discolor), Bachman’s Sparrow (Peucaea aestivalis), Field Sparrow (Spizella pusilla), Grasshopper Sparrow (Ammodramus savannarum), and Henslow’s Sparrow (Ammodramus henslowii) across three military installations. Point-count and line-transect field tests were implemented to directly compare BARS data with that of human-observer techniques in both real-bird communities and simulated-bird communities (with known populations). Both thesis objectives were met for each focal species except Grasshopper Sparrow. Based on negative binomial regression models, song activity was positively related to male abundance and negatively related to either day of breeding season or time of day; for some species, song activity was also influenced by temperature, wind speed, or atmospheric pressure. The BARS analytical method successfully predicted densities of Prairie Warbler, Bachman’s Sparrow, and Henslow’s Sparrow. Field tests of the BARS with simulated-bird communities revealed that species-specific footprints of detection are needed to further improve density estimates. Through this study, the BARS system has been validated and shown to be useful for documenting presence/absence of rare species, relative abundance of more common species, and, in some cases, actual estimation of densities.
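    As a concrete illustration of the modeling step, the following is a minimal sketch, in Python with statsmodels, of fitting a negative binomial regression of song-cue counts on male abundance and survey covariates. The file and column names are hypothetical; the thesis's actual model specification is not reproduced here.

    ```python
    # Sketch: negative binomial regression of song-cue counts on abundance
    # and conditions, mirroring the predictors named in the abstract.
    # "song_counts.csv" and its column names are hypothetical.
    import pandas as pd
    import statsmodels.formula.api as smf

    surveys = pd.read_csv("song_counts.csv")  # one row per survey period

    # The reported relationships would appear as a positive coefficient on
    # males and negative coefficients on day_of_season / time_of_day.
    model = smf.negativebinomial(
        "songs ~ males + day_of_season + time_of_day + temperature + wind_speed",
        data=surveys,
    ).fit()
    print(model.summary())
    ```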

    Robust sound event detection in bioacoustic sensor networks

    Bioacoustic sensors, sometimes known as autonomous recording units (ARUs), can record sounds of wildlife over long periods of time in scalable and minimally invasive ways. Deriving per-species abundance estimates from these sensors requires detection, classification, and quantification of animal vocalizations as individual acoustic events. Yet variability in ambient noise, both over time and across sensors, hinders the reliability of current automated systems for sound event detection (SED), such as convolutional neural networks (CNNs) in the time-frequency domain. In this article, we develop, benchmark, and combine several machine listening techniques to improve the generalizability of SED models across heterogeneous acoustic environments. As a case study, we consider the problem of detecting avian flight calls from a ten-hour recording of nocturnal bird migration, recorded by a network of six ARUs in the presence of heterogeneous background noise. Starting from a CNN yielding state-of-the-art accuracy on this task, we introduce two noise adaptation techniques, respectively integrating short-term (60 milliseconds) and long-term (30 minutes) context. First, we apply per-channel energy normalization (PCEN) in the time-frequency domain, which applies short-term automatic gain control to every subband in the mel-frequency spectrogram. Second, we replace the last dense layer in the network with a context-adaptive neural network (CA-NN) layer. Combining them yields state-of-the-art results that are unmatched by artificial data augmentation alone. We release a pre-trained version of our best-performing system under the name BirdVoxDetect, a ready-to-use detector of avian flight calls in field recordings. Comment: 32 pages, in English. Submitted to PLOS ONE in February 2019; revised August 2019; published October 2019.
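    The short-term adaptation step is available off the shelf; below is a minimal sketch of applying PCEN to a mel-frequency spectrogram with librosa. The input file is hypothetical, and librosa's default PCEN parameters stand in for those tuned in the paper.

    ```python
    # Sketch: per-channel energy normalization (PCEN) on a mel spectrogram.
    # PCEN replaces static log compression with per-subband automatic gain
    # control, which suppresses slowly varying background noise.
    import librosa

    y, sr = librosa.load("field_recording.wav", sr=22050)  # hypothetical file
    S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128, hop_length=512)
    S_pcen = librosa.pcen(S, sr=sr, hop_length=512)        # default parameters
    ```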

    ORCA-SPOT: An Automatic Killer Whale Sound Detection Toolkit Using Deep Learning

    Large bioacoustic archives of wild animals are an important source for identifying reappearing communication patterns, which can then be related to recurring behavioral patterns to advance the current understanding of intra-specific communication in non-human animals. A main challenge remains that most large-scale bioacoustic archives contain only a small percentage of animal vocalizations and a large amount of environmental noise, which makes it extremely difficult to manually retrieve sufficient vocalizations for further analysis – particularly important for species with advanced social systems and complex vocalizations. In this study, deep neural networks were trained on 11,509 killer whale (Orcinus orca) signals and 34,848 noise segments. The resulting toolkit, ORCA-SPOT, was tested on a large-scale bioacoustic repository – the Orchive – comprising roughly 19,000 hours of killer whale underwater recordings. An automated segmentation of the entire Orchive (about 2.2 years of audio) took approximately 8 days. It achieved a time-based precision, or positive predictive value (PPV), of 93.2% and an area under the curve (AUC) of 0.9523. This approach enables automated annotation of large bioacoustic databases to extract killer whale sounds, which are essential for subsequent identification of significant communication patterns. The code will be publicly available in October 2019 to support the application of deep learning to bioacoustic research. ORCA-SPOT can be adapted to other animal species.
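    For reference, the two reported figures of merit can be computed as follows; this is a toy sketch with placeholder labels and scores, not Orchive data.

    ```python
    # Sketch: time-based precision (PPV) and AUC from per-segment ground
    # truth (1 = killer whale, 0 = noise) and detector scores.
    import numpy as np
    from sklearn.metrics import precision_score, roc_auc_score

    y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
    y_score = np.array([0.9, 0.2, 0.8, 0.7, 0.4, 0.1, 0.6, 0.3])
    y_pred = (y_score >= 0.5).astype(int)   # assumed detection threshold

    print("PPV:", precision_score(y_true, y_pred))
    print("AUC:", roc_auc_score(y_true, y_score))
    ```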

    Polyphonic Sound Event Detection by using Capsule Neural Networks

    Artificial sound event detection (SED) aims to mimic the human ability to perceive and understand what is happening in the surroundings. Deep learning now offers valuable techniques for this goal, such as convolutional neural networks (CNNs). The Capsule Neural Network (CapsNet) architecture was recently introduced in the image processing field with the intent to overcome some known limitations of CNNs, specifically their scarce robustness to affine transformations (i.e., perspective, size, orientation) and their difficulty detecting overlapped images. This motivated the authors to employ CapsNets for the polyphonic-SED task, in which multiple sound events occur simultaneously. Specifically, we propose to exploit the capsule units to represent a set of distinctive properties for each individual sound event. Capsule units are connected through a so-called "dynamic routing" mechanism that encourages learning part-whole relationships and improves detection performance in a polyphonic context. This paper reports extensive evaluations carried out on three publicly available datasets, showing that the CapsNet-based algorithm not only outperforms standard CNNs but also achieves the best results compared with state-of-the-art algorithms.
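    To make the part-whole agreement idea concrete, here is a minimal NumPy sketch of dynamic routing between capsule layers (after Sabour et al., 2017); capsule counts, dimensions, and the iteration number are illustrative, not the paper's configuration.

    ```python
    # Sketch: routing-by-agreement between a lower and an upper capsule layer.
    import numpy as np

    def squash(s, eps=1e-8):
        """Bound each capsule vector's length to [0, 1), keeping its direction."""
        norm2 = np.sum(s ** 2, axis=-1, keepdims=True)
        return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)

    def dynamic_routing(u_hat, n_iters=3):
        """u_hat: predictions from lower capsules, shape (n_in, n_out, dim)."""
        b = np.zeros(u_hat.shape[:2])                             # routing logits
        for _ in range(n_iters):
            c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # couplings
            v = squash((c[..., None] * u_hat).sum(axis=0))        # upper capsules
            b += (u_hat * v[None]).sum(axis=-1)                   # reward agreement
        return v

    v = dynamic_routing(np.random.randn(32, 10, 8))
    print(v.shape)  # (10, 8): one pose vector per output capsule
    ```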

    Stacked Convolutional and Recurrent Neural Networks for Bird Audio Detection

    This paper studies the detection of bird calls in audio segments using stacked convolutional and recurrent neural networks. Data augmentation by blocks mixing and domain adaptation using a novel method of test mixing are proposed and evaluated for their ability to make the method robust to unseen data. The contributions of two kinds of acoustic features (dominant frequency and log mel-band energy) and their combinations are studied in the context of bird audio detection. Our best model achieves an AUC of 95.5% on five cross-validations of the development data and 88.1% on the unseen evaluation data. Comment: Accepted for the European Signal Processing Conference 2017.
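    A minimal sketch of the blocks-mixing idea follows: overlay two spectrogram excerpts so the network sees superimposed scenes. The mixing-weight range and the rule for combining labels are assumptions, not the paper's exact recipe.

    ```python
    # Sketch: augment training data by mixing two spectrogram blocks.
    import numpy as np

    def block_mix(spec_a, label_a, spec_b, label_b, rng=None):
        """Overlay two same-shaped magnitude-spectrogram blocks."""
        if rng is None:
            rng = np.random.default_rng()
        w = rng.uniform(0.3, 0.7)                # assumed weight range
        mixed = w * spec_a + (1.0 - w) * spec_b
        return mixed, max(label_a, label_b)      # bird-positive if either is

    a, b = np.abs(np.random.randn(2, 40, 500))   # stand-ins for two mel blocks
    mixed_spec, mixed_label = block_mix(a, 1, b, 0)
    ```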

    A toolbox for animal call recognition

    Monitoring the natural environment is increasingly important as habitat degradation and climate change reduce the world’s biodiversity. We have developed software tools and applications to assist ecologists with the collection and analysis of acoustic data at large spatial and temporal scales. One of our key objectives is automated animal call recognition, and our approach has three novel attributes. First, we work with raw environmental audio, contaminated by noise and artefacts and containing calls that vary greatly in volume depending on the animal’s proximity to the microphone. Second, initial experimentation suggested that no single recognizer could deal with the enormous variety of calls. Therefore, we developed a toolbox of generic recognizers to extract invariant features for each call type. Third, many species are cryptic and offer little data with which to train a recognizer. Many popular machine learning methods require large volumes of training and validation data and considerable time and expertise to prepare. Consequently, we adopt bootstrap techniques that can be initiated with little data and refined subsequently. In this paper, we describe our recognition tools and present results for real ecological problems.
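    To make the "generic recognizer" idea concrete, here is a minimal sketch of one such recognizer: a band-limited energy detector that could be bootstrapped from a handful of verified calls. The frequency band, threshold, and names are illustrative assumptions, not the toolbox's implementation.

    ```python
    # Sketch: flag frames whose in-band energy rises above a robust noise floor.
    import numpy as np
    from scipy import signal

    def detect_events(y, sr, band=(2000.0, 6000.0), thresh_db=10.0):
        f, t, S = signal.spectrogram(y, fs=sr, nperseg=512, noverlap=256)
        in_band = (f >= band[0]) & (f <= band[1])
        energy_db = 10 * np.log10(S[in_band].sum(axis=0) + 1e-12)
        floor = np.median(energy_db)             # robust noise-floor estimate
        return t[energy_db > floor + thresh_db]  # times of candidate calls

    # Bootstrapping, conceptually: run the detector, hand-verify a few hits,
    # retune the band and threshold, and repeat as verified calls accumulate.
    events = detect_events(np.random.randn(10 * 22050), 22050)
    ```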

    Acoustic behavior of melon-headed whales varies on a diel cycle.

    Many terrestrial and marine species have a diel activity pattern, and their acoustic signaling follows their current behavioral state. Whistles and echolocation clicks produced by melon-headed whales (Peponocephala electra) at Palmyra Atoll, captured in long-term recordings, indicated that these signals were used selectively during different phases of the day, strengthening the idea of nighttime foraging and daytime resting with afternoon socializing for this species. Spectral features of their echolocation clicks changed from day to night, shifting the median center frequency up. Additionally, click received levels increased with increasing ambient noise during both day and night. Ambient noise over a wide frequency band was on average higher at night. The diel adjustment of click features might be a reaction to acoustic masking caused by these nighttime sounds; similar adaptations have been documented for numerous taxa in response to noise. Alternatively, it could reflect an unrelated increase in biosonar source levels, and with it a shift in center frequency, to enhance detection distances during foraging at night. Call modifications in intensity, directionality, frequency, and duration according to echolocation task are well established for bats. This finding indicates that melon-headed whales have flexibility in their acoustic behavior and that they collectively and repeatedly adapt their signals from day- to nighttime circumstances.
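    The day/night comparison implied above can be sketched in a few lines: pool each click's center frequency by solar phase and compare the two distributions. The values below are simulated placeholders, not Palmyra Atoll measurements.

    ```python
    # Sketch: test whether nighttime click center frequencies shift upward.
    import numpy as np
    from scipy.stats import mannwhitneyu

    rng = np.random.default_rng(0)
    day_cf = rng.normal(24.0, 2.0, 500)      # simulated center frequencies (kHz)
    night_cf = rng.normal(26.0, 2.0, 500)    # simulated nighttime clicks

    stat, p = mannwhitneyu(night_cf, day_cf, alternative="greater")
    print(f"median day {np.median(day_cf):.1f} kHz, "
          f"night {np.median(night_cf):.1f} kHz, p = {p:.3g}")
    ```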

    Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection

    Sound events often occur in unstructured environments where they exhibit wide variations in their frequency content and temporal structure. Convolutional neural networks (CNNs) are able to extract higher-level features that are invariant to local spectral and temporal variations. Recurrent neural networks (RNNs) are powerful in learning the longer-term temporal context in audio signals. CNNs and RNNs as classifiers have recently shown improved performance over established methods in various sound recognition tasks. We combine these two approaches in a Convolutional Recurrent Neural Network (CRNN) and apply it to a polyphonic sound event detection task. We compare the performance of the proposed CRNN method with CNN, RNN, and other established methods, and observe a considerable improvement on four different datasets consisting of everyday sound events. Comment: Accepted for IEEE Transactions on Audio, Speech and Language Processing, Special Issue on Sound Scene and Event Analysis.
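    A minimal PyTorch sketch of the CRNN pattern described here follows: convolutional blocks for local spectro-temporal invariance, a recurrent layer for longer-term context, and per-frame sigmoid outputs for multi-label (polyphonic) activity. All layer sizes are illustrative, not the paper's configuration.

    ```python
    # Sketch: CNN front-end -> bidirectional GRU -> per-frame class activities.
    import torch
    import torch.nn as nn

    class CRNN(nn.Module):
        def __init__(self, n_mels=40, n_classes=6):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d((2, 1)),            # pool frequency, keep time
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                nn.MaxPool2d((2, 1)),
            )
            self.gru = nn.GRU(64 * (n_mels // 4), 64,
                              batch_first=True, bidirectional=True)
            self.head = nn.Linear(128, n_classes)

        def forward(self, x):                    # x: (batch, 1, mels, frames)
            h = self.conv(x)                     # (batch, 64, mels/4, frames)
            h = h.permute(0, 3, 1, 2).flatten(2) # (batch, frames, features)
            h, _ = self.gru(h)
            return torch.sigmoid(self.head(h))   # multi-label activity per frame

    probs = CRNN()(torch.randn(2, 1, 40, 128))
    print(probs.shape)                           # (2, 128, 6)
    ```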