    Sequence Information Channel Concatenation for Improving Camera Trap Image Burst Classification

    Camera traps are extensively used to observe wildlife in their natural habitat without disturbing the ecosystem. This can help in the early detection of natural or human threats to animals and support ecological conservation. A massive number of such camera traps have been deployed at ecological conservation areas around the world, collecting data for decades, which calls for automation to detect images containing animals. Existing systems classify whether an image contains an animal by considering a single image. However, in challenging scenes with animals camouflaged in their natural habitat, it can be difficult to identify the presence of animals from merely a single image. We hypothesize that a short burst of images instead of a single image, assuming that the animal moves, makes it much easier for both a human and a machine to detect the presence of animals. In this work, we explore a variety of approaches and measure the impact of using short image sequences (bursts of 3 images) on improving camera trap image classification. We show that concatenating masks containing sequence information with the images from the 3-image burst across channels improves the ROC AUC by 20% on a test set from unseen camera sites, compared to an equivalent model that learns from a single image. Comment: 8 pages, 4 figures, 2 tables. Git repository can be found at: https://github.com/bhuvi3/camera_trap_animal_classificatio
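The channel-concatenation idea described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the burst frames are RGB arrays of equal size and that the sequence-information mask is a single-channel array, so the concatenated input has 3 × 3 + 1 = 10 channels.

```python
import numpy as np

def concat_burst_with_mask(burst, mask):
    """Concatenate a 3-image burst and a sequence-information mask
    along the channel axis, producing one multi-channel input tensor.

    burst: list of three HxWx3 float arrays (the image burst)
    mask:  HxWx1 float array encoding sequence/motion information
    Returns an HxWx10 array suitable as input to a standard CNN.
    """
    return np.concatenate(burst + [mask], axis=-1)

# Toy example: three 64x64 RGB frames plus a 1-channel mask.
burst = [np.random.rand(64, 64, 3).astype(np.float32) for _ in range(3)]
mask = np.random.rand(64, 64, 1).astype(np.float32)
x = concat_burst_with_mask(burst, mask)
print(x.shape)
```

Because the burst and mask are merged into the channel dimension, an off-the-shelf image classifier only needs its first convolutional layer widened from 3 to 10 input channels.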

    Improving robustness of image recognition through artificial image augmentation

    Deep learning based computer vision technologies can offer a number of advantages over manual inspection methods, such as reduced operational costs and improved efficiency. However, they are known to be unreliable in certain situations, especially when input images contain augmentations such as occlusion or distortion that computer vision models have not been trained on. While some augmentations can be mitigated by controlling the environment, this is not always possible, especially outdoors. To address this issue, one common approach is supplemental robustness training using augmented training data, which involves training models on images containing the expected augmentations to improve performance. However, this approach requires collecting a substantial volume of augmented images for each expected augmentation, making it time-consuming and costly depending on how difficult each augmentation is to reproduce. This thesis explores the viability of using artificially rendered augmentations on unaugmented images as a substitute for the manual collection and preparation of naturally augmented data for image recognition and object detection models. Specifically, this thesis recreates nine environmental augmentations that commonly occur in outdoor environments and evaluates their impact on model performance on three datasets. The findings indicate potential for using artificially generated augmentations as substitutes for naturally occurring ones. It is anticipated that further research in this area will enable more reliable image recognition and object detection in less controllable environments, thus improving the results of these technologies in uncertain situations.
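The artificial-augmentation approach can be illustrated with one of the simplest cases mentioned above, occlusion. The sketch below is a hypothetical example, not one of the thesis's nine renderers: it paints a filled rectangle onto a copy of an unaugmented image, simulating an obstruction, so that such synthetic images can be mixed into training data instead of collecting naturally occluded photographs.

```python
import numpy as np

def add_occlusion(image, top, left, height, width, fill=0.0):
    """Render an artificial occlusion (a filled rectangle) onto a
    copy of the image. The original image is left untouched, so the
    same source photo can yield many augmented variants."""
    out = image.copy()
    out[top:top + height, left:left + width, :] = fill
    return out

# Toy example: occlude an 8x8 patch of a uniform 32x32 image.
img = np.ones((32, 32, 3), dtype=np.float32)
occluded = add_occlusion(img, top=8, left=8, height=8, width=8)
```

In practice such renderers would be applied randomly (varying position, size, and fill) during training, analogous to standard data-augmentation pipelines, with more elaborate renderers for effects like rain, fog, or lens distortion.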