Adaptive Representations of Sound for Automatic Insect Recognition
Insect population numbers and biodiversity have been rapidly declining with
time, and monitoring these trends has become increasingly important for
conservation measures to be effectively implemented. However, monitoring methods
are often invasive, time- and resource-intensive, and prone to various biases. Many
insect species produce characteristic sounds that can easily be detected and
recorded without large cost or effort. Using deep learning methods, insect
sounds from field recordings could be automatically detected and classified to
monitor biodiversity and species distribution ranges. We apply machine learning
methods to recently published datasets of insect sounds (Orthoptera and
Cicadidae) and evaluate their potential for acoustic insect monitoring. We
compare the performance of the conventional spectrogram-based
audio representation against LEAF, a new adaptive and waveform-based frontend.
LEAF achieved better classification performance than the mel-spectrogram
frontend by adapting its feature extraction parameters during training. This
result is encouraging for future implementations of deep learning technology
for automatic insect sound recognition, especially as larger datasets become
available.
Comment: 35 pages, 11 figures. arXiv admin note: substantial text overlap with
arXiv:2211.0950
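To make the baseline in this comparison concrete, the following is a minimal sketch of a fixed log-mel-spectrogram frontend in NumPy. It is not the authors' code: the function names, the frame length (`n_fft`), hop size (`hop`), number of bands (`n_mels`), and the log offset are illustrative assumptions. The point is that every one of these parameters is chosen by hand and stays fixed, whereas an adaptive frontend such as LEAF adjusts its feature extraction parameters during training.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style mel scale formula.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters whose edges are equally spaced on the mel scale.
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(x, sr, n_fft=512, hop=256, n_mels=40):
    # All parameter values here are hypothetical defaults, fixed before training.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T
    return np.log(mel + 1e-6)  # small offset avoids log(0)
```

Calling `log_mel_spectrogram(x, sr)` on a one-dimensional waveform `x` returns an array of shape `(n_frames, n_mels)` that a classifier could consume in place of the raw waveform.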
Long-distance Detection of Bioacoustic Events with Per-channel Energy Normalization
This paper proposes to perform unsupervised detection of bioacoustic events by pooling the magnitudes of spectrogram frames after per-channel energy normalization (PCEN). Although PCEN was originally developed for speech recognition, it also has beneficial effects in enhancing animal vocalizations, despite the presence of atmospheric absorption and intermittent noise. We prove that PCEN generalizes logarithm-based spectral flux, yet with a tunable time scale for background noise estimation. In comparison with a pointwise logarithm, PCEN reduces the false alarm rate by 50x in the near field and 5x in the far field, on both avian and marine bioacoustic datasets. Such improvements come at moderate computational cost and require no human intervention, thus heralding a promising future for PCEN in bioacoustics.
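As an illustration of the idea, here is a minimal NumPy sketch of PCEN in its commonly stated form: each channel is divided by a smoothed estimate of its own background energy, then compressed with an offset root. This is a sketch under stated assumptions, not the paper's implementation. The parameter names and values (`s`, `alpha`, `delta`, `r`, `eps`) are illustrative, and summing over channels is only one possible way to pool the normalized frames into a detection function. The smoothing coefficient `s` plays the role of the tunable time scale for background noise estimation.

```python
import numpy as np

def pcen(E, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Apply PCEN to a non-negative spectrogram E of shape (n_frames, n_channels).

    Parameter values are illustrative assumptions, not settings from the paper.
    """
    E = np.asarray(E, dtype=float)
    # First-order IIR smoother per channel: M[t] = (1 - s) * M[t-1] + s * E[t].
    M = np.empty_like(E)
    M[0] = E[0]
    for t in range(1, E.shape[0]):
        M[t] = (1.0 - s) * M[t - 1] + s * E[t]
    # Adaptive gain control followed by root compression.
    return (E / (eps + M) ** alpha + delta) ** r - delta ** r

def detection_function(E, **kwargs):
    # Hypothetical pooling step: sum PCEN magnitudes across channels per frame.
    return pcen(E, **kwargs).sum(axis=1)
```

A frame whose energy rises sharply above the smoothed background produces a peak in `detection_function(E)`, while a slowly varying background is largely normalized away. A smaller `s` averages the background over a longer time span.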