Reducing the Burden of Aerial Image Labelling Through Human-in-the-Loop Machine Learning Methods
This dissertation presents an introduction to human-in-the-loop deep learning methods for remote sensing applications. It is motivated by the need to decrease the time spent by volunteers on semantic segmentation of remote sensing imagery. We look at two human-in-the-loop approaches to speeding up the labelling of remote sensing data: interactive segmentation and active learning. We develop these methods specifically in response to the needs of disaster relief organisations, which require accurately labelled maps of disaster-stricken regions quickly in order to respond to the needs of the affected communities. To begin, we survey the current approaches used within the field. We analyse the shortcomings of these models, which include outputs ill-suited for uploading to mapping databases and an inability to label new regions well when the new regions differ from the regions trained on. The methods developed then address these shortcomings. We first develop an interactive segmentation algorithm. Interactive segmentation aims to segment objects with a supervisory signal from a user to assist the model. Work within interactive segmentation has focused largely on segmenting one or a few objects within an image. We make a few adaptations to allow an existing method to scale to remote sensing applications, where there are tens of objects within a single image that need to be segmented. We show quantitative improvements of up to 18% in mean intersection over union, as well as qualitative improvements. The algorithm works well when labelling new regions, and the qualitative improvements show outputs more suitable for uploading to mapping databases. We then investigate active learning in the context of remote sensing. Active learning aims to reduce the number of labelled samples required by a model to achieve an acceptable performance level.
Within the context of deep learning, the utility of the various active learning strategies developed is uncertain, with conflicting results within the literature. We evaluate and compare a variety of sample acquisition strategies on semantic segmentation tasks in scenarios relevant to disaster relief mapping. Our results show that all active learning strategies evaluated provide minimal performance increases over a simple random sample acquisition strategy. However, we present an analysis of the results illustrating how the various strategies work, along with intuition about when certain active learning strategies might be preferred. This analysis could be used to inform future research. We conclude by providing examples of the synergies of these two approaches, and indicate how this work on reducing the burden of aerial image labelling for the disaster relief mapping community can be further extended.
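To make the comparison in the abstract above concrete, the following is a minimal sketch of two sample acquisition strategies for segmentation: an uncertainty-based one (mean per-pixel predictive entropy) and the random baseline. It is an illustration only, not the dissertation's implementation; the function names, the array layout, and the choice of entropy as the uncertainty measure are all assumptions.

```python
import numpy as np


def entropy_scores(probs: np.ndarray) -> np.ndarray:
    """Mean per-pixel predictive entropy for each unlabelled image.

    probs has the assumed shape (n_images, n_classes, height, width) and
    holds softmax outputs, so it sums to 1 along the class axis.
    """
    eps = 1e-12  # avoids log(0) for fully confident pixels
    pixel_entropy = -(probs * np.log(probs + eps)).sum(axis=1)
    return pixel_entropy.mean(axis=(1, 2))


def select_by_entropy(probs: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k images the model is least certain about."""
    scores = entropy_scores(probs)
    return np.argsort(scores)[::-1][:k]


def select_random(n_images: int, k: int, seed: int = 0) -> np.ndarray:
    """Random acquisition baseline: k distinct image indices."""
    rng = np.random.default_rng(seed)
    return rng.choice(n_images, size=k, replace=False)
```

In an active learning loop, the selected indices would be sent to annotators, added to the training set, and the model retrained before the next round of scoring.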
Map Generation from Large Scale Incomplete and Inaccurate Data Labels
Accurately and globally mapping human infrastructure is an important and
challenging task with applications in routing, regulation compliance
monitoring, and natural disaster response management. In this paper we
present progress in developing an algorithmic pipeline and distributed compute
system that automates the process of map creation using high resolution aerial
images. Unlike previous studies, most of which use datasets that are available
only in a few cities across the world, we utilize publicly available imagery
and map data, both of which cover the contiguous United States (CONUS). We
approach the technical challenge of inaccurate and incomplete training data by
adopting state-of-the-art convolutional neural network architectures such as
the U-Net and the CycleGAN to incrementally generate maps with increasingly
more accurate and more complete labels of man-made infrastructure such as roads
and houses. Since scaling the mapping task to CONUS calls for parallelization,
we then adopted an asynchronous distributed stochastic parallel gradient
descent training scheme to distribute the computational workload onto a cluster
of GPUs with nearly linear speed-up. Comment: This paper is accepted by KDD 202
Enabling Decision-Support Systems through Automated Cell Tower Detection
Cell phone coverage and high-speed service gaps persist in rural areas in
sub-Saharan Africa, impacting public access to mobile-based financial,
educational, and humanitarian services. Improving maps of telecommunications
infrastructure can help inform strategies to eliminate gaps in mobile coverage.
Deep neural networks, paired with remote sensing images, can be used for object
detection of cell towers and eliminate the need for inefficient and burdensome
manual mapping to find objects over large geographic regions. In this study, we
demonstrate a partially automated workflow to train an object detection model
to locate cell towers using OpenStreetMap (OSM) features and high-resolution
Maxar imagery. For model fine-tuning and evaluation, we curated a diverse
dataset of over 6,000 unique images of cell towers in 26 countries in eastern,
southern, and central Africa using automatically generated annotations from OSM
points. Our model achieves an average precision at 50% Intersection over Union
(IoU) (AP@50) of 81.2 with good performance across different geographies and
out-of-sample testing. Accurate localization of cell towers can yield more
accurate cell coverage maps, in turn enabling improved delivery of digital
services for decision-support applications.
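As a reading aid for the AP@50 figure quoted above: under that metric, a predicted box is conventionally counted as a correct detection only if its Intersection over Union with a ground-truth box is at least 0.5. The sketch below shows that overlap test for axis-aligned boxes. The function names and the (x_min, y_min, x_max, y_max) box layout are assumptions for illustration, and the full average-precision computation (ranking by confidence and integrating the precision-recall curve) is not shown.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if any.
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, truth, threshold=0.5):
    """True when the predicted box overlaps the ground truth enough."""
    return box_iou(pred, truth) >= threshold
```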
Large-scale Weakly Supervised Learning for Road Extraction from Satellite Imagery
Automatic road extraction from satellite imagery using deep learning is a
viable alternative to traditional manual mapping. Therefore it has received
considerable attention recently. However, most of the existing methods are
supervised and require pixel-level labeling, which is tedious and error-prone.
To make matters worse, the earth has a diverse range of terrain, vegetation,
and man-made objects. It is well known that models trained in one area
generalize poorly to other areas. Various shooting conditions such as light and
angle, as well as different image processing techniques, further complicate the
issue. It is impractical to develop training data to cover all image styles.
This paper proposes to leverage OpenStreetMap road data as weak labels and
large scale satellite imagery to pre-train semantic segmentation models. Our
extensive experimental results show that the prediction accuracy increases with
the amount of the weakly labeled data, as well as the road density in the areas
chosen for training. Using as much as 100 times more data than the widely used
DeepGlobe road dataset, our model with the D-LinkNet architecture and the
ResNet-50 backbone exceeds the top performer of the current DeepGlobe
leaderboard. Furthermore, due to large-scale pre-training, our model
generalizes much better than those trained with only the curated datasets,
implying great application potential.
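One way to picture the weak labels described above is to burn a road centreline into a binary mask of fixed width, accepting that the true road may be wider, narrower, or slightly offset. The sketch below does this with NumPy only. It is a simplified assumption of how such labels could be produced, not the paper's pipeline: the function name, the pixel-coordinate input, and the fixed `half_width` are hypothetical, and real OSM geometries would first need projecting from geographic to image coordinates.

```python
import numpy as np


def rasterize_polyline(points, shape, half_width=1):
    """Burn a polyline into a binary mask as a weak road label.

    points: list of (row, col) vertices in pixel coordinates (assumed input).
    shape: (height, width) of the output mask.
    half_width: pixels painted on each side of the centreline.
    """
    mask = np.zeros(shape, dtype=np.uint8)
    for (r0, c0), (r1, c1) in zip(points[:-1], points[1:]):
        # Sample at least one point per pixel step along the segment.
        n = int(max(abs(r1 - r0), abs(c1 - c0))) + 1
        rows = np.rint(np.linspace(r0, r1, n)).astype(int)
        cols = np.rint(np.linspace(c0, c1, n)).astype(int)
        for r, c in zip(rows, cols):
            # Paint a small square around each sample, clipped to the image.
            r_lo, r_hi = max(r - half_width, 0), min(r + half_width + 1, shape[0])
            c_lo, c_hi = max(c - half_width, 0), min(c + half_width + 1, shape[1])
            mask[r_lo:r_hi, c_lo:c_hi] = 1
    return mask
```

A mask produced this way can be paired with the matching image tile as a noisy training target for a segmentation model.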
Intelligent Data Analytics using Deep Learning for Data Science
Data science attracts the interest of academics and practitioners because it can assist in the extraction of significant insights from massive amounts of data. From 2018 to 2025, the Global Datasphere is expected to rise from 33 Zettabytes to 175 Zettabytes, according to the International Data Corporation. This dissertation proposes an intelligent data analytics framework that uses deep learning to tackle several difficulties in implementing a data science application. These difficulties include dealing with high inter-class similarity, the availability and quality of hand-labeled data, and designing a feasible approach for modeling significant correlations in features gathered from various data sources. The proposed intelligent data analytics framework employs a novel strategy for improving data representation learning by incorporating supplemental data from various sources and structures. First, the research presents a multi-source fusion approach that utilizes confident learning techniques to improve the data quality from many noisy sources. Meta-learning methods based on advanced techniques such as the mixture of experts and differential evolution combine the predictive capacity of individual learners with a gating mechanism, ensuring that only the most trustworthy features or predictions are integrated to train the model. Then, a Multi-Level Convolutional Fusion is presented to train a model on the correspondence between local-global deep feature interactions to identify easily confused samples of different classes. The convolutional fusion is further enhanced with the power of Graph Transformers, aggregating the relevant neighboring features in graph-based input data structures and achieving state-of-the-art performance on a large-scale building damage dataset.
Finally, weakly-supervised strategies, noise regularization, and label propagation are proposed to train a model on sparse input labeled data, ensuring the model's robustness to errors and supporting the automatic expansion of the training set. The suggested approaches outperformed competing strategies in effectively training a model on a large-scale dataset of 500k photos, with only about 7% of the images annotated by a human. The proposed framework's capabilities have benefited various data science applications, including fluid dynamics, geometric morphometrics, building damage classification from satellite pictures, disaster scene description, and storm-surge visualization.
Detecting natural disasters, damage, and incidents in the wild
Responding to natural disasters, such as earthquakes, floods, and wildfires,
is a laborious task performed by on-the-ground emergency responders and
analysts. Social media has emerged as a low-latency data source to quickly
understand disaster situations. While most studies on social media are limited
to text, images offer more information for understanding disaster and incident
scenes. However, no large-scale image datasets for incident detection exist.
In this work, we present the Incidents Dataset, which contains 446,684 images
annotated by humans that cover 43 incidents across a variety of scenes. We
employ a baseline classification model that mitigates false-positive errors and
we perform image filtering experiments on millions of social media images from
Flickr and Twitter. Through these experiments, we show how the Incidents
Dataset can be used to detect images with incidents in the wild. Code, data,
and models are available online at http://incidentsdataset.csail.mit.edu. Comment: ECCV 202
Development after Displacement: Evaluating the Utility of OpenStreetMap Data for Monitoring Sustainable Development Goal Progress in Refugee Settlements
In 2015, 193 countries declared their commitment to “leave no one behind” in pursuit of 17 Sustainable Development Goals (SDGs). However, the world’s refugees have been routinely excluded from national censuses and representative surveys and, as a result, have broadly been overlooked in SDG evaluations. In this study, we examine the potential of OpenStreetMap (OSM) data for monitoring SDG progress in refugee settlements. We collected all available OSM data in 28 refugee and 26 nearby non-refugee settlements in the major refugee-hosting country of Uganda. We created a novel SDG-OSM data model, measured the spatial and temporal coverages of SDG-relevant OSM data across refugee settlements, and compared these results to non-refugee settlements. We found 11 different SDGs represented across 92% (21,950) of OSM data in refugee settlements, compared to 78% (1919 nodes) in non-refugee settlements. However, most data were created three years after refugee arrival, and 81% of OSM data in refugee settlements were never edited, both of which limit the potential for long-term monitoring of SDG progress. In light of our findings, we offer suggestions for improving OSM-driven SDG monitoring in refugee settlements that have relevance for development and humanitarian practitioners and research communities alike.
Deep Learning for Building Footprint Generation from Optical Imagery
Deep learning-based methods have shown promising results for the task of building footprint generation, but they have two inherent limitations. First, the extracted buildings exhibit blurred boundaries and blob-like shapes. Second, massive pixel-level annotations are required for network training. This dissertation develops a series of methods to address these problems. In addition, the developed methods are put into practical applications.