Dynamic Batch Norm Statistics Update for Natural Robustness
DNNs trained on natural clean samples have been shown to perform poorly on
corrupted samples, such as noisy or blurry images. Various data augmentation
methods have been recently proposed to improve DNNs' robustness against common
corruptions. Despite their success, they require computationally expensive
training and cannot be applied to off-the-shelf trained models. Recently, it
has been shown that updating BatchNorm (BN) statistics of an off-the-shelf
model on a single corruption improves its accuracy on that corruption
significantly. However, applying this idea at inference time, when the corruption
type is unknown and changing, reduces the method's effectiveness.
In this paper, we harness the Fourier domain to detect the corruption type, a
challenging task in the image domain. We propose a unified framework consisting
of a corruption-detection model and BN statistics update that improves the
corruption accuracy of any off-the-shelf trained model. We benchmark our
framework on different models and datasets. Our results demonstrate about 8%
and 4% accuracy improvement on CIFAR10-C and ImageNet-C, respectively.
Furthermore, our framework can also improve the accuracy of state-of-the-art
robust models, such as AugMix and DeepAug.
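The BN statistics update mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the re-estimation of the mean and variance for a single channel and omits the Fourier-domain corruption detector entirely. The function names, the stored source statistics, and the activation values are all hypothetical.

```python
from statistics import fmean, pvariance


def batch_norm(values, mean, var, eps=1e-5):
    """Normalize one channel's activations with the given BN statistics."""
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


def updated_stats(values):
    """Re-estimate BN mean and variance from a batch of (possibly corrupted) inputs."""
    return fmean(values), pvariance(values)


# Hypothetical statistics stored in the model after training on clean data.
source_mean, source_var = 0.0, 1.0

# Hypothetical activations for one channel under a corruption that shifts and scales them.
corrupted = [2.0, 3.0, 4.0, 5.0, 6.0]

# With the stale source statistics, the normalized output is far from zero mean.
stale = batch_norm(corrupted, source_mean, source_var)

# With statistics re-estimated on the corrupted batch, the output is re-centered.
new_mean, new_var = updated_stats(corrupted)
adapted = batch_norm(corrupted, new_mean, new_var)
```

In this toy case the adapted activations return to roughly zero mean and unit variance, which is the property the downstream layers were trained to expect.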
A deep active learning system for species identification and counting in camera trap images
1. A typical camera trap survey may produce millions of images that require slow, expensive manual review. Consequently, critical conservation questions may be answered too slowly to support decision-making. Recent studies demonstrated the potential for computer vision to dramatically increase efficiency in image-based biodiversity surveys; however, the literature has focused on projects with a large set of labeled training images, and hence many projects with a smaller set of labeled images cannot benefit from existing machine learning techniques. Furthermore, even sizable projects have struggled to adopt computer vision methods because classification models overfit to specific image backgrounds (i.e., camera locations).
2. In this paper, we combine the power of machine intelligence and human intelligence via a novel active learning system to minimize the manual work required to train a computer vision model. Furthermore, we utilize object detection models and transfer learning to prevent overfitting to camera locations. To our knowledge, this is the first work to apply an active learning approach to camera trap images.
3. Our proposed scheme can match state-of-the-art accuracy on a 3.2 million image dataset with as few as 14,100 manual labels, which means decreasing manual labeling effort by over 99.5%. Our trained models are also less dependent on background pixels, since they operate only on cropped regions around animals.
4. The proposed active deep learning scheme can significantly reduce the manual labor required to extract information from camera trap images. Automation of information extraction will not only benefit existing camera trap projects, but can also catalyze the deployment of larger camera trap arrays
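As a rough illustration of how an active learning system chooses which images to send to human labelers, the sketch below uses least-confidence sampling. This is an assumption: the abstract does not say which query strategy the authors use, and the function name and probability values are hypothetical. The last lines check the labeling-effort figure quoted in the abstract.

```python
def least_confident(probabilities, k):
    """Return indices of the k samples whose top class probability is lowest."""
    ranked = sorted(range(len(probabilities)), key=lambda i: max(probabilities[i]))
    return ranked[:k]


# Hypothetical softmax outputs for four unlabeled animal crops over three species.
pool = [
    [0.98, 0.01, 0.01],
    [0.40, 0.35, 0.25],
    [0.70, 0.20, 0.10],
    [0.34, 0.33, 0.33],
]

# The two crops the model is least sure about are queued for manual labeling.
to_label = least_confident(pool, k=2)

# Figures from the abstract: 14,100 manual labels for a 3.2 million image dataset.
label_fraction = 14_100 / 3_200_000
reduction = 1 - label_fraction
```

Here `label_fraction` is about 0.44%, so `reduction` is about 99.56%, consistent with the "over 99.5%" stated above.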
Insights and approaches using deep learning to classify wildlife.
The implementation of intelligent software to identify and classify objects and individuals in visual fields is a technology of growing importance to operatives in many fields, including wildlife conservation and management. To non-experts, the methods can be abstruse and the results mystifying. Here, in the context of applying cutting edge methods to classify wildlife species from camera-trap data, we shed light on the methods themselves and types of features these methods extract to make efficient identifications and reliable classifications. The current state of the art is to employ convolutional neural networks (CNN) encoded within deep-learning algorithms. We outline these methods and present results obtained in training a CNN to classify 20 African wildlife species with an overall accuracy of 87.5% from a dataset containing 111,467 images. We demonstrate the application of a gradient-weighted class-activation-mapping (Grad-CAM) procedure to extract the most salient pixels in the final convolution layer. We show that these pixels highlight features in particular images that in some cases are similar to those used to train humans to identify these species. Further, we used mutual information methods to identify the neurons in the final convolution layer that consistently respond most strongly across a set of images of one particular species. We then interpret the features in the image where the strongest responses occur, and present dataset biases that were revealed by these extracted features. We also used hierarchical clustering of feature vectors (i.e., the state of the final fully-connected layer in the CNN) associated with each image to produce a visual similarity dendrogram of identified species. Finally, we evaluated the relative unfamiliarity of images that were not part of the training set when these images were one of the 20 species "known" to our CNN in contrast to images of the species that were "unknown" to our CNN
Data from: Machine learning to classify animal species in camera trap images: applications in ecology
Motion-activated cameras ("camera traps") are increasingly used in ecological and management studies for remotely observing wildlife and are amongst the most powerful tools for wildlife research. However, studies involving camera traps result in millions of images that need to be analysed, typically by visually observing each image, in order to extract data that can be used in ecological analyses.
We trained machine learning models using convolutional neural networks with the ResNet-18 architecture and 3,367,383 images to automatically classify wildlife species from camera trap images obtained from five states across the United States. We tested our model on an independent subset of images not seen during training from the United States and on an out-of-sample (or "out-of-distribution" in the machine learning literature) dataset of ungulate images from Canada. We also tested the ability of our model to distinguish empty images from those with animals in another out-of-sample dataset from Tanzania, containing a faunal community that was novel to the model.
The trained model classified approximately 2,000 images per minute on a laptop computer with 16 gigabytes of RAM. The trained model achieved 98% accuracy at identifying species in the United States, the highest accuracy of such a model to date. Out-of-sample validation from Canada achieved 82% accuracy, and the model correctly identified 94% of images containing an animal in the dataset from Tanzania. We provide an R package (Machine Learning for Wildlife Image Classification) that allows the users to (a) use the trained model presented here and (b) train their own model using classified images of wildlife from their studies.
The use of machine learning to rapidly and accurately classify wildlife in camera trap images can facilitate non-invasive sampling designs in ecological studies by reducing the burden of manually analysing images. Our R package makes these methods accessible to ecologists.
Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning
Having accurate, detailed, and up-to-date information about the location and behavior of animals in the wild would improve our ability to study and conserve ecosystems. We investigate the ability to automatically, accurately, and inexpensively collect such data, which could help catalyze the transformation of many fields of ecology, wildlife biology, zoology, conservation biology, and animal behavior into "big data" sciences. Motion-sensor "camera traps" enable collecting wildlife pictures inexpensively, unobtrusively, and frequently. However, extracting information from these pictures remains an expensive, time-consuming, manual task. We demonstrate that such information can be automatically extracted by deep learning, a cutting-edge type of artificial intelligence. We train deep convolutional neural networks to identify, count, and describe the behaviors of 48 species in the 3.2 million-image Snapshot Serengeti dataset. Our deep neural networks automatically identify animals with >93.8% accuracy, and we expect that number to improve rapidly in years to come. More importantly, if our system classifies only images it is confident about, our system can automate animal identification for 99.3% of the data while still performing at the same 96.6% accuracy as that of crowdsourced teams of human volunteers, saving >8.4 y (i.e., >17,000 h at 40 h/wk) of human labeling effort on this 3.2 million-image dataset. Those efficiency gains highlight the importance of using deep neural networks to automate data extraction from camera-trap images, reducing a roadblock for this widely used technology. Our results suggest that deep learning could enable the inexpensive, unobtrusive, high-volume, and even real-time collection of a wealth of information about vast numbers of animals in the wild.
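The idea of classifying only the images the system is confident about can be sketched as a confidence threshold that trades coverage for accuracy. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name, the threshold value, and the toy predictions are hypothetical.

```python
def selective_accuracy(predictions, threshold):
    """Return (coverage, accuracy on kept images) for a confidence threshold.

    predictions is a list of (confidence, is_correct) pairs, one per image.
    """
    kept = [correct for conf, correct in predictions if conf >= threshold]
    coverage = len(kept) / len(predictions)
    accuracy = sum(kept) / len(kept) if kept else 0.0
    return coverage, accuracy


# Hypothetical model outputs: top-class confidence and whether the label was right.
preds = [(0.99, True), (0.95, True), (0.90, True), (0.60, False), (0.55, True)]

# Images below the threshold would be left for human volunteers to label.
coverage, accuracy = selective_accuracy(preds, threshold=0.8)
```

Raising the threshold typically lowers coverage and raises accuracy on the retained images; the abstract reports one such operating point (99.3% of the data at 96.6% accuracy).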