Deep learning techniques for biological signal processing: Automatic detection of dolphin sounds

Abstract

Given the heterogeneity of underwater acoustic transmission, detecting and distinguishing the vocalizations of cetaceans is a challenging problem that has attracted recent interest. Machine learning algorithms offer a promising avenue for improving current detection systems. In particular, Convolutional Neural Networks (CNNs) are considered one of the most promising deep learning techniques, since they have already excelled in problems involving the automatic processing of biological sounds. Human-annotated spectrograms can be used to teach CNNs to discriminate between patterns in the time-frequency domain, thus enabling the detection and classification of marine mammal sounds. However, despite these promising capabilities, machine learning suffers from a lack of labeled data, which calls for the adoption of transfer learning to create accurate models even when the availability of human annotators is limited. In this thesis, we developed a dolphin whistle detection framework based on deep learning models. In particular, we investigated the performance of a large-scale pre-trained model (VGG16) and compared it with the performance of a vanilla Convolutional Neural Network and several baselines (logistic regression and Support Vector Machines). The pre-trained VGG16 model achieved the best detection performance, with an accuracy of 98.9% on a held-out test dataset.
