3,523 research outputs found

    ORCA-SPOT: An Automatic Killer Whale Sound Detection Toolkit Using Deep Learning

    Large bioacoustic archives of wild animals are an important source for identifying recurring communication patterns, which can then be related to recurring behavioral patterns to advance the current understanding of intra-specific communication of non-human animals. A main challenge remains that most large-scale bioacoustic archives contain only a small percentage of animal vocalizations and a large amount of environmental noise, which makes it extremely difficult to manually retrieve sufficient vocalizations for further analysis. This matters particularly for species with advanced social systems and complex vocalizations. In this study, deep neural networks were trained on 11,509 killer whale (Orcinus orca) signals and 34,848 noise segments. The resulting toolkit, ORCA-SPOT, was tested on a large-scale bioacoustic repository – the Orchive – comprising roughly 19,000 hours of killer whale underwater recordings. An automated segmentation of the entire Orchive recordings (about 2.2 years) took approximately 8 days. It achieved a time-based precision or positive-predictive-value (PPV) of 93.2% and an area-under-the-curve (AUC) of 0.9523. This approach enables an automated annotation procedure for large bioacoustic databases to extract killer whale sounds, which are essential for the subsequent identification of significant communication patterns. The code will be publicly available in October 2019 to support the application of deep learning to bioacoustic research. ORCA-SPOT can be adapted to other animal species.
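As a rough check on the figures quoted in this abstract, the sketch below converts the archive length to years, estimates how much faster than real time the segmentation ran, and shows one common way to define a time-based PPV. The function names and the example durations passed to `time_based_ppv` are hypothetical and are not taken from ORCA-SPOT; the abstract does not spell out how its time-based precision was computed, so the formula here is an assumption.

```python
HOURS_PER_YEAR = 24 * 365  # ignores leap years; fine for a rough check


def archive_years(audio_hours: float) -> float:
    """Convert a recording length in hours to years of continuous audio."""
    return audio_hours / HOURS_PER_YEAR


def realtime_factor(audio_hours: float, processing_days: float) -> float:
    """How many hours of audio were processed per hour of wall-clock time."""
    return audio_hours / (processing_days * 24)


def time_based_ppv(true_positive_s: float, false_positive_s: float) -> float:
    """Assumed definition: the share of detected time that is truly signal.

    PPV = TP / (TP + FP), with TP and FP measured in seconds of audio.
    """
    detected = true_positive_s + false_positive_s
    if detected == 0:
        raise ValueError("no detections, PPV is undefined")
    return true_positive_s / detected


if __name__ == "__main__":
    print(f"archive length: {archive_years(19_000):.2f} years")
    print(f"speed-up over real time: {realtime_factor(19_000, 8):.0f}x")
    # Hypothetical durations chosen only to illustrate the formula.
    print(f"PPV: {time_based_ppv(932.0, 68.0):.3f}")
```

Under these assumptions, roughly 19,000 hours of audio processed in about 8 days corresponds to a throughput on the order of a hundred times real time.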

    Automatic Recognition of Mammal Genera on Camera-Trap Images using Multi-Layer Robust Principal Component Analysis and Mixture Neural Networks

    The segmentation and classification of animals from camera-trap images is a difficult task because of the conditions under which the images are taken. This work presents a method for classifying and segmenting mammal genera from camera-trap images. Our method uses Multi-Layer Robust Principal Component Analysis (RPCA) for segmenting, Convolutional Neural Networks (CNNs) for extracting features, Least Absolute Shrinkage and Selection Operator (LASSO) for selecting features, and Artificial Neural Networks (ANNs) or Support Vector Machines (SVMs) for classifying mammal genera present in the Colombian forest. We evaluated our method with the camera-trap images from the Alexander von Humboldt Biological Resources Research Institute. We obtained an accuracy of 92.65% classifying 8 mammal genera and a False Positive (FP) class, using automatically segmented images. On the other hand, we reached an accuracy of 90.32% classifying 10 mammal genera, using ground-truth images only. Unlike almost all previous works, we address both animal segmentation and genus classification in camera-trap recognition. This method offers a new approach toward fully automatic detection of animals from camera-trap images.

    Interspecies Knowledge Transfer for Facial Keypoint Detection

    We present a method for localizing facial keypoints on animals by transferring knowledge gained from human faces. Instead of directly finetuning a network trained to detect keypoints on human faces to animal faces (which is sub-optimal since human and animal faces can look quite different), we propose to first adapt the animal images to the pre-trained human detection network by correcting for the differences in animal and human face shape. We first find the nearest human neighbors for each animal image using an unsupervised shape matching method. We use these matches to train a thin plate spline warping network to warp each animal face to look more human-like. The warping network is then jointly finetuned with a pre-trained human facial keypoint detection network using an animal dataset. We demonstrate state-of-the-art results on both horse and sheep facial keypoint detection, and significant improvement over simple finetuning, especially when training data is scarce. Additionally, we present a new dataset with 3717 images with horse face and facial keypoint annotations. Comment: CVPR 2017 Camera Read