
    Land cover mapping at very high resolution with rotation equivariant CNNs: towards small yet accurate models

    In remote sensing images, the absolute orientation of objects is arbitrary. Depending on an object's orientation and on a sensor's flight path, objects of the same semantic class can be observed in different orientations in the same image. Equivariance to rotation, understood in this context as responding with a rotated semantic label map when the input image is rotated, is therefore a very desirable property, in particular for high-capacity models such as Convolutional Neural Networks (CNNs). If rotation equivariance is encoded in the network, the model faces a simpler task and does not need to learn specific (and redundant) weights to address rotated versions of the same object class. In this work we propose a CNN architecture called Rotation Equivariant Vector Field Network (RotEqNet) to encode rotation equivariance in the network itself. By using rotating convolutions as building blocks and passing only the values corresponding to the maximally activating orientation through the network, in the form of orientation-encoding vector fields, RotEqNet treats rotated versions of the same object with the same filter bank and therefore achieves state-of-the-art performance even when using very small architectures trained from scratch. We test RotEqNet on two challenging sub-decimeter resolution semantic labeling problems and show that it outperforms a standard CNN while requiring one order of magnitude fewer parameters.
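
    Below is a minimal sketch of the rotating-convolution idea described in the abstract, written in PyTorch; the use of torchvision for filter rotation and all tensor sizes are illustrative assumptions, not the authors' implementation.

        import torch
        import torch.nn.functional as F
        import torchvision.transforms.functional as TF

        def rotating_conv(x, weight, n_orientations=8):
            # Convolve x with rotated copies of the same filter bank, then keep,
            # per pixel, the maximal activation and the angle that produced it.
            responses = []
            for k in range(n_orientations):
                angle = 360.0 * k / n_orientations
                w_rot = TF.rotate(weight, angle)                    # rotate the filters
                responses.append(F.conv2d(x, w_rot, padding=weight.shape[-1] // 2))
            stack = torch.stack(responses)                          # (O, B, C, H, W)
            magnitude, idx = stack.max(dim=0)                       # winning orientation
            theta = idx.float() * (2.0 * torch.pi / n_orientations)
            # Encode the result as a 2D vector field (magnitude plus orientation).
            return magnitude * torch.cos(theta), magnitude * torch.sin(theta)

        x = torch.randn(1, 3, 64, 64)   # dummy RGB tile
        w = torch.randn(16, 3, 7, 7)    # 16 learnable 7x7 filters
        u, v = rotating_conv(x, w)      # two (1, 16, 64, 64) vector-field components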

    Learning deep structured active contours end-to-end

    The world is covered with millions of buildings, and precisely knowing each instance's position and extents is vital to a multitude of applications. Recently, automated building footprint segmentation models have shown superior detection accuracy thanks to the use of Convolutional Neural Networks (CNNs). However, even the latest evolutions struggle to precisely delineate borders, which often leads to geometric distortions and the inadvertent fusion of adjacent building instances. We propose to overcome this issue by exploiting the distinct geometric properties of buildings. To this end, we present Deep Structured Active Contours (DSAC), a novel framework that integrates priors and constraints into the segmentation process, such as continuous boundaries, smooth edges, and sharp corners. To do so, DSAC employs Active Contour Models (ACMs), a family of constraint- and prior-based polygonal models. We learn ACM parameterizations per instance using a CNN, and show how to incorporate all components in a structured output model, making DSAC trainable end-to-end. We evaluate DSAC on three challenging building instance segmentation datasets, where it compares favorably against the state of the art. Code will be made available. Comment: To appear, CVPR 2018.
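
    As a rough illustration of the ACM component, the sketch below evaluates a classical snake energy for a closed polygon. In DSAC the data term and the alpha/beta weights are per-pixel maps predicted by the CNN; here they are fixed scalars for simplicity, so this is the generic recipe rather than the paper's exact formulation.

        import numpy as np

        def snake_energy(vertices, data_map, alpha=0.1, beta=0.05):
            # vertices: (N, 2) closed polygon in (x, y) pixel coordinates
            v = np.asarray(vertices, dtype=float)
            v_prev = np.roll(v, 1, axis=0)
            v_next = np.roll(v, -1, axis=0)
            # Data term: image evidence sampled at the contour (a CNN output in DSAC).
            data = data_map[v[:, 1].astype(int), v[:, 0].astype(int)].sum()
            # Internal terms: membrane (length) and thin-plate (curvature) penalties.
            membrane = alpha * ((v_next - v) ** 2).sum()
            curvature = beta * ((v_prev - 2.0 * v + v_next) ** 2).sum()
            return data + membrane + curvature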

    Counting using deep learning regression gives value to ecological surveys

    Many ecological studies rely on count data and involve manual counting of objects of interest, which is time-consuming and especially disadvantageous when time in the field or lab is limited. However, an increasing number of works use digital imagery, which opens opportunities to automatise counting tasks. In this study, we use machine learning to automate counting objects of interest without the need to label individual objects. By leveraging already existing image-level annotations, this approach can also give value to historical data that were collected and annotated over longer time series (typical for many ecological studies) without the aim of deep learning applications. We demonstrate deep learning regression on two fundamentally different counting tasks: (i) daily growth rings from microscopic images of fish otoliths (i.e., hearing stones) and (ii) hauled-out seals from highly variable aerial imagery. On the otolith images, our deep learning-based regressor yields an RMSE of 3.40 day-rings and an R² of 0.92. Initial performance on the seal images is lower (RMSE of 23.46 seals and R² of 0.72), which can be attributed to a lack of images with a high number of seals in the initial training set compared to the test set. We then show how to improve performance substantially (RMSE of 19.03 seals and R² of 0.77) by carefully selecting and relabelling just 100 additional training images based on initial model prediction discrepancy. The regression-based approach used here returns accurate counts (R² of 0.92 and 0.77 for the rings and seals, respectively), directly usable in ecological research.
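
    A minimal sketch of the regression setup, assuming a ResNet-18 backbone and an MSE loss (common choices for count regression; the paper's exact architecture may differ). The network maps a whole image to a single scalar, so only image-level count labels are needed, exactly as described above.

        import torch
        import torch.nn as nn
        import torchvision.models as models

        net = models.resnet18(weights=None)
        net.fc = nn.Linear(net.fc.in_features, 1)   # one scalar output: the count
        optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)

        def train_step(images, counts):
            # counts: (B,) image-level labels; no per-object annotation required
            optimizer.zero_grad()
            pred = net(images).squeeze(1)
            loss = nn.functional.mse_loss(pred, counts.float())
            loss.backward()
            optimizer.step()
            return loss.item()

        def rmse_r2(pred, true):
            # The two metrics reported in the abstract
            rmse = ((pred - true) ** 2).mean().sqrt()
            r2 = 1.0 - ((pred - true) ** 2).sum() / ((true - true.mean()) ** 2).sum()
            return rmse.item(), r2.item()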

    Fine-grained Population Mapping from Coarse Census Counts and Open Geodata

    Fine-grained population maps are needed in several domains, like urban planning, environmental monitoring, public health, and humanitarian operations. Unfortunately, in many countries only aggregate census counts over large spatial units are collected; moreover, these are not always up to date. We present POMELO, a deep learning model that employs coarse census counts and open geodata to estimate fine-grained population maps with 100 m ground sampling distance. Moreover, the model can also estimate population numbers when no census counts at all are available, by generalizing across countries. In a series of experiments for several countries in sub-Saharan Africa, the maps produced with POMELO are in good agreement with the most detailed available reference counts: disaggregation of coarse census counts reaches R² values of 85-89%; unconstrained prediction in the absence of any counts reaches 48-69%.
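
    The disaggregation step can be pictured as standard dasymetric mapping: a model predicts a per-pixel weight surface, and each census unit's count is distributed over its pixels in proportion to those weights. A sketch under that assumption (function and variable names are illustrative, not POMELO's code):

        import numpy as np

        def disaggregate(weights, zones, census):
            # weights: (H, W) model-predicted occupancy weights
            # zones:   (H, W) census-unit id per pixel
            # census:  dict mapping unit id -> total population count
            pop = np.zeros_like(weights, dtype=float)
            for zone_id, total in census.items():
                mask = zones == zone_id
                w = weights[mask]
                pop[mask] = total * w / w.sum()   # pixel shares sum to the unit total
            return pop                            # fine-grained population map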

    Perspectives in machine learning for wildlife conservation

    Data acquisition in animal ecology is rapidly accelerating due to inexpensive and accessible sensors such as smartphones, drones, satellites, audio recorders and bio-logging devices. These new technologies and the data they generate hold great potential for large-scale environmental monitoring and understanding, but are limited by current data processing approaches, which are inefficient in how they ingest, digest, and distill data into relevant information. We argue that machine learning, and especially deep learning approaches, can meet this analytic challenge to enhance our understanding, monitoring capacity, and conservation of wildlife species. Incorporating machine learning into ecological workflows could improve inputs for population and behavior models and eventually lead to integrated hybrid modeling tools, with ecological models acting as constraints for machine learning models and the latter providing data-supported insights. In essence, by combining new machine learning approaches with ecological domain knowledge, animal ecologists can capitalize on the abundance of data generated by modern sensor technologies in order to reliably estimate population abundances, study animal behavior and mitigate human-wildlife conflicts. To succeed, this approach will require close collaboration and cross-disciplinary education between the computer science and animal ecology communities, in order to ensure the quality of machine learning approaches and to train a new generation of data scientists in ecology and conservation.

    Interactive machine vision for wildlife conservation

    The loss rate of endangered animal species has reached levels critical enough for our time to be called the sixth mass extinction. Families of vertebrates and large mammals, such as Rhinocerotidae, are likely to become extinct within a few years unless countermeasures are taken. Before countermeasures can be taken, however, it is imperative to assess current animal population sizes through wildlife censuses. Furthermore, conservation efforts require animal populations to be monitored over time, which implies conducting census repetitions over multiple years. Recent developments in technology have paved the way for animal census efforts of unprecedented accuracy and scale, predominantly through the employment of Unmanned Aerial Vehicles (UAVs). UAVs allow for acquiring aerial imagery of vast areas, e.g. over a wildlife reserve, and thereby provide evidence of the abundance and location of individuals in a safe manner. Hitherto, the main challenge of UAV-based animal censuses has been the stage of manual photo-interpretation, in which animals have to be tediously identified and annotated by hand in potentially tens of thousands of aerial images. Here, automated image understanding through Machine Learning (ML) and Computer Vision (CV) provides exciting potential for accelerating applications that rely on large-scale datasets, such as image-based aerial animal censuses. Employing machines to detect animals could greatly reduce the effort required by humans, and therefore lead to vastly increased efficiency in the census process overall. This thesis aims at advancing wildlife conservation efforts by means of automated machine vision methodologies. In a first step, this entails finding new ways to optimize CV algorithms for the task of animal detection in UAV imagery. In a second step, it requires procedures to reuse such detection models on new image data in the context of census repetitions for population monitoring. However, the benefit of machine vision reaches beyond a mere automation of photo-interpretation: a recurrent key principle of this thesis is the concept of interactivity, where CV models and humans work hand in hand by reinforcing each other. The result is a census monitoring environment for UAV images, in which machine vision technology actively assists humans in the process. Effectively, when all methodologies proposed throughout this thesis are combined, human annotation efforts are reduced to a fraction and further simplified in complexity.

    Chapter 2 addresses the challenges of employing state-of-the-art CV models, known as Convolutional Neural Networks (CNNs), for aerial wildlife detection. Multiple heuristics are presented to train such models properly, each targeting a different obstacle of the model training process. Experiments show a significant increase in animal prediction quality when a CNN is optimized appropriately.

    Chapter 3 employs this CNN for reuse on new data acquisitions, e.g. in a census monitoring setting. Simply running the CNN over a new dataset to predict animals directly is often not possible, due to differences in characteristics between the datasets, known as domain shifts. This chapter presents methodologies to adapt CNNs to new datasets with as little effort as possible, and involves humans interactively in the process. Results show that less than half a percent of the images need to be reviewed by humans to find more than 80% of the animals in the new campaign.

    In Chapter 4, human annotation efforts themselves are addressed and reduced in complexity. Traditional settings require human annotators to draw bounding boxes around animals, which may become prohibitively expensive for large image datasets. This chapter instead explores the concept of weakly-supervised object detection, where only simple presence/absence information of animals per image is requested from the annotators. Unlike a bounding box, an image-wide annotation can be provided in a second. It was found that a CNN trained on this simpler information alone is already able to localize animals by itself to a certain degree. However, if spatial bounding boxes are added for just three training images, the CNN predicts animals with the same accuracy as its fully-supervised sibling from Chapter 2. One generic way to realise such weak supervision is sketched below.

    Finally, Chapter 5 combines all findings and models into an integrated census software environment, the Annotation Interface for Data-driven Ecology (AIDE). To the best of the author's knowledge, AIDE is the first software solution that explicitly integrates machine vision technology into the labeling process in an interactive manner: in AIDE, CNNs are used to predict animals in a large set of unlabeled data, and further learn directly from annotations provided by humans on the images. The result is a positive feedback loop where humans and machine reinforce each other. A user study shows that machine vision support provides a four-fold increase in the number of animals found in a given time, compared to an unassisted annotation setting on the same dataset. At the time of writing, AIDE is actively employed by conservation agencies in Tanzania and under consideration by other organisations around the globe. This thesis highlights the importance of interactive machine vision for wildlife conservation, and provides solutions that not only advance the field in a scientific context, but also have a direct impact on wildlife conservation through population monitoring.
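
    For the weakly-supervised setting of Chapter 4, one standard way to obtain localisation from presence/absence labels alone is a class activation map (CAM, Zhou et al. 2016). The PyTorch sketch below is that generic recipe under a hypothetical ResNet-18 classifier, not necessarily the thesis' exact method.

        import torch
        import torch.nn as nn
        import torchvision.models as models

        net = models.resnet18(weights=None)
        net.fc = nn.Linear(net.fc.in_features, 2)   # absent / present classifier

        def class_activation_map(images):
            # Spatial features before global pooling: (B, 512, H/32, W/32)
            feats = nn.Sequential(*list(net.children())[:-2])(images)
            w_present = net.fc.weight[1]            # weights of the 'present' class
            cam = torch.einsum('bchw,c->bhw', feats, w_present)
            return cam                              # peaks indicate likely animals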

    Detecting mammals in UAV images: Best practices to address a substantially imbalanced dataset with deep learning

    Knowledge of the number of animals in large wildlife reserves is a vital necessity for park rangers in their efforts to protect endangered species. Manual animal censuses are dangerous and expensive; hence, Unmanned Aerial Vehicles (UAVs) with consumer-level digital cameras are becoming a popular alternative tool to estimate animal numbers. Several works have been proposed that semi-automatically process UAV images to detect animals, some of which employ Convolutional Neural Networks (CNNs), a recent family of deep learning algorithms that have proved very effective for object detection on large computer vision datasets. However, the majority of works related to wildlife focus only on small datasets (typically subsets of UAV campaigns), which might be detrimental when presented with the sheer scale of real study areas for large mammal censuses: methods may yield thousands of false alarms in such cases. In this paper, we study how to scale CNNs to large wildlife census tasks and present a number of recommendations for training a CNN on a large UAV dataset. We further introduce novel evaluation protocols that are tailored to censuses and to model suitability for subsequent human verification of detections. Using our recommendations, we are able to train a CNN that reduces the number of false positives by an order of magnitude compared to the previous state of the art. Setting the requirement at 90% recall, our CNN reduces the amount of data required for manual verification by a factor of three, thus making it possible for rangers to screen all the acquired data efficiently and to detect almost all animals in the reserve automatically.
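
    Two standard remedies for this kind of class imbalance are class-weighted losses and hard negative mining. The sketch below illustrates the flavour of such recommendations rather than reproducing the paper's exact recipe; the weight and ratio values are illustrative.

        import torch
        import torch.nn.functional as F

        def weighted_loss(logits, targets, animal_weight=50.0):
            # Up-weight the rare 'animal' class against the abundant background.
            w = torch.tensor([1.0, animal_weight], device=logits.device)
            return F.cross_entropy(logits, targets, weight=w)

        def hard_negative_loss(logits, targets, ratio=3):
            # Keep only the hardest background samples, at a fixed
            # negative:positive ratio, so easy background does not dominate.
            losses = F.cross_entropy(logits, targets, reduction='none')
            pos = targets == 1
            k = max(ratio * int(pos.sum()), 1)
            hard_neg = losses[~pos].topk(min(k, int((~pos).sum()))).values
            return losses[pos].sum() + hard_neg.sum()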

    Learning class- and location-specific priors for urban semantic labeling with CNNs

    Full text link
    This paper addresses the problem of semantic labeling of urban remote sensing images into land cover maps. We exploit the prior knowledge that cities are composed of comparable spatial arrangements of urban objects, such as buildings. To do so, we cluster OpenStreetMap (OSM) building footprints into groups with similar local statistics, corresponding to different types of urban zones. We use the per-cluster expected building fraction to correct for the over- and underrepresentation of classes predicted by a Convolutional Neural Network (CNN), using a Conditional Random Field (CRF). Results indicate a substantial improvement in both the numerical and visual accuracy of the labeled maps.
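
    The zoning step can be sketched as follows, assuming per-tile statistics derived from OSM footprints; the feature choice and the scikit-learn usage are illustrative assumptions, not the paper's code.

        import numpy as np
        from sklearn.cluster import KMeans

        def cluster_urban_zones(tile_stats, n_zones=4):
            # tile_stats: (n_tiles, n_features) local OSM statistics per tile,
            # e.g. column 0 = built-up fraction, then mean footprint area, density.
            km = KMeans(n_clusters=n_zones, n_init=10).fit(tile_stats)
            # Expected building fraction per zone, usable as a class prior to
            # rebalance CNN scores (done inside a CRF in the paper).
            priors = np.array([tile_stats[km.labels_ == z, 0].mean()
                               for z in range(n_zones)])
            return km.labels_, priors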

    Fast animal detection in UAV images using convolutional neural networks

    Illegal wildlife poaching poses a severe threat to the environment. Measures to stem poaching have met with only limited success, mainly due to the effort required to keep track of wildlife stocks and to track individual animals. Recent developments in remote sensing have led to low-cost Unmanned Aerial Vehicles (UAVs), facilitating quick and repeated image acquisitions over vast areas. In parallel, progress in object detection in computer vision has yielded unprecedented performance improvements, partially attributable to algorithms like Convolutional Neural Networks (CNNs). We present an object detection method tailored to detect large animals in UAV images. We achieve a substantial increase in precision over a robust state-of-the-art model on a dataset acquired over the Kuzikus wildlife reserve park in Namibia. Furthermore, our model processes data at over 72 images per second, as opposed to 3 for the baseline, allowing for real-time applications.
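
    Throughput figures like the ones above are typically measured along the lines of the generic timing sketch below; the model argument is a stand-in, not the paper's detector.

        import time
        import torch

        @torch.no_grad()
        def images_per_second(model, batch, n_iters=100):
            model.eval()
            if torch.cuda.is_available():
                torch.cuda.synchronize()      # finish pending GPU work first
            t0 = time.perf_counter()
            for _ in range(n_iters):
                model(batch)
            if torch.cuda.is_available():
                torch.cuda.synchronize()
            return n_iters * batch.shape[0] / (time.perf_counter() - t0)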

    AIDE: Accelerating image-based ecological surveys with interactive machine learning

    Ecological surveys increasingly rely on large-scale image datasets, typically terabytes of imagery for a single survey. The ability to collect this volume of data allows surveys of unprecedented scale, at the cost of expansive volumes of photo-interpretation labour. We present the Annotation Interface for Data-driven Ecology (AIDE), an open-source web framework designed to alleviate the task of image annotation for ecological surveys. AIDE employs an easy-to-use and customisable labelling interface that supports multiple users, database storage and scalability to the cloud and/or multiple machines. Moreover, AIDE closely integrates users and machine learning models into a feedback loop, where user-provided annotations are employed to re-train the model, and the latter is applied over unlabelled images to, e.g., identify wildlife. These predictions are then presented to the users in an optimised order, according to a customisable active learning criterion. AIDE has a number of deep learning models built in, but also accepts custom model implementations. AIDE has the potential to greatly accelerate annotation tasks for a wide range of research employing image data. AIDE is open-source and can be downloaded for free at https://github.com/microsoft/aerial_wildlife_detection.
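
    The "optimised order" mentioned above is an active-learning ranking. A minimal sketch using predictive entropy as the criterion, one of many possibilities and not necessarily AIDE's default:

        import torch

        def rank_by_entropy(probs):
            # probs: (n_images, n_classes) softmax outputs for unlabelled images.
            # Returns indices from most to least uncertain, i.e. the order in
            # which images would be shown to annotators.
            entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
            return torch.argsort(entropy, descending=True)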