Building Extraction from Remote Sensing Images via an Uncertainty-Aware Network
Building extraction aims to segment building pixels from remote sensing
images and plays an essential role in many applications, such as city planning
and urban dynamic monitoring. Over the past few years, deep learning methods
with encoder-decoder architectures have achieved remarkable performance due to
their powerful feature representation capability. Nevertheless, owing to the
varying scales and styles of buildings, conventional deep learning models
often suffer from uncertain predictions and cannot accurately distinguish the
complete footprints of buildings from the complex distribution of ground
objects, leading to a large degree of omission and commission error. In this paper,
we recognize the importance of uncertain predictions and propose a novel and
straightforward Uncertainty-Aware Network (UANet) to alleviate this problem. To
verify the performance of our proposed UANet, we conduct extensive experiments
on three public building datasets, including the WHU building dataset, the
Massachusetts building dataset, and the Inria aerial image dataset. Results
demonstrate that the proposed UANet outperforms other state-of-the-art
algorithms by a large margin
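The abstract does not spell out how UANet quantifies uncertainty, but a common, generic way to derive a per-pixel uncertainty map from a segmentation network is the entropy of its softmax output. The sketch below illustrates that idea only; the function name and setup are assumptions, not UANet's actual design:

```python
import numpy as np

def pixel_uncertainty(probs, eps=1e-12):
    """Per-pixel predictive entropy of a softmax output.

    probs: array of shape (C, H, W) with class probabilities per pixel.
    Returns an (H, W) map; higher values mean less confident predictions.
    """
    p = np.clip(probs, eps, 1.0)          # avoid log(0)
    return -(p * np.log(p)).sum(axis=0)   # entropy over the class axis

# Two-class toy example: a confident building pixel and an ambiguous one.
probs = np.array([[[0.99, 0.5]],
                  [[0.01, 0.5]]])         # shape (2, 1, 2)
u = pixel_uncertainty(probs)
# The ambiguous 0.5/0.5 pixel attains the two-class maximum, ln 2.
```

Such a map can then be used to down-weight or re-examine pixels that the network is unsure about, which is broadly the motivation the abstract describes.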
Optimization of Rooftop Delineation from Aerial Imagery with Deep Learning
High-definition (HD) maps of building rooftops or footprints are important for urban applications and disaster management. Rapid creation of such HD maps through rooftop delineation at the city scale, using high-resolution satellite and aerial images with deep learning methods, has become feasible and has drawn much attention. In the context of rooftop delineation, end-to-end Deep Convolutional Neural Networks (DCNNs) have demonstrated remarkable performance in accurately delineating rooftops from aerial imagery. However, several challenges still exist in this task, which are addressed in this thesis. These challenges include: (1) the generalization issues of models when test data differ from the training data, (2) the scale-variance issues in rooftop delineation, and (3) the high cost of annotating accurate rooftop boundaries.
To address the challenges mentioned above, this thesis proposes three novel deep learning-based methods. Firstly, a super-resolution network named Momentum and Spatial-Channel Attention Residual Feature Aggregation Network (MSCA-RFANet) is proposed to tackle the generalization issue. The proposed super-resolution network shows better performance compared to its baseline and other state-of-the-art methods. In addition, data composition with MSCA-RFANet performs well in dealing with the generalization issues. Secondly, an end-to-end rooftop delineation network named Higher Resolution Network with Dynamic Scale Training (HigherNet-DST) is developed to mitigate the scale-variance issue. The experimental results on publicly available building datasets demonstrate that HigherNet-DST achieves competitive performance in rooftop delineation, particularly excelling in accurately delineating small buildings. Lastly, a weakly supervised deep learning network named Box2Boundary is developed to reduce the annotation cost. The experimental results show that Box2Boundary with post-processing deals effectively with the annotation cost issue while achieving decent performance. Consequently, the research on these three sub-topics, and the three resulting papers, is thought to hold potential implications for various practical applications
Automated Building Information Extraction and Evaluation from High-resolution Remotely Sensed Data
The two-dimensional (2D) footprints and three-dimensional (3D) structures of buildings are of great importance to city planning, natural disaster management, and virtual environmental simulation. As traditional manual methodologies for collecting 2D and 3D building information are often both time consuming and costly, automated methods are required for efficient large area mapping. It is challenging to extract building information from remotely sensed data, considering the complex nature of urban environments and their associated intricate building structures.
Most 2D evaluation methods are focused on classification accuracy, while other dimensions of extraction accuracy are ignored. To assess 2D building extraction methods, a multi-criteria evaluation system has been designed. The proposed system consists of matched rate, shape similarity, and positional accuracy. Experimentation with four methods demonstrates that the proposed multi-criteria system is more comprehensive and effective, in comparison with traditional accuracy assessment metrics.
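The exact formulas behind matched rate, shape similarity, and positional accuracy are not given in the abstract. As a rough illustration of what a multi-criteria comparison between a reference footprint and an extracted one can look like, one might compute something along these lines; the metric definitions below are assumptions for illustration, not the thesis's:

```python
import numpy as np

def building_metrics(ref_mask, ext_mask):
    """Illustrative per-building comparison metrics on binary masks:
    overlap (IoU), a simple area-ratio shape similarity, and positional
    accuracy as the centroid distance in pixels."""
    ref = ref_mask.astype(bool)
    ext = ext_mask.astype(bool)
    inter = np.logical_and(ref, ext).sum()
    union = np.logical_or(ref, ext).sum()
    iou = inter / union if union else 0.0
    shape_sim = min(ref.sum(), ext.sum()) / max(ref.sum(), ext.sum())
    ref_c = np.argwhere(ref).mean(axis=0)   # (row, col) centroid
    ext_c = np.argwhere(ext).mean(axis=0)
    pos_err = float(np.linalg.norm(ref_c - ext_c))
    return iou, shape_sim, pos_err

# A 4x4 reference square vs. the same square shifted down by one pixel.
ref = np.zeros((8, 8), int); ref[1:5, 1:5] = 1
ext = np.zeros((8, 8), int); ext[2:6, 1:5] = 1
iou, sim, err = building_metrics(ref, ext)
```

The point of such a system is that the three numbers disagree in informative ways: here the shapes are identical (similarity 1.0) while the overlap and position scores expose the misalignment, which a single classification-accuracy metric would blur together.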
Building height is critical for building 3D structure extraction. As data sources for height estimation, digital surface models (DSMs) derived from stereo images using existing software typically provide low-accuracy results in terms of rooftop elevations. Therefore, a new image matching method is proposed that adds building footprint maps as constraints. Validation demonstrates that the proposed matching method can estimate building rooftop elevation with one third of the error encountered when using current commercial software.
With an ideal input DSM, building height can be estimated from the elevation contrast inside and outside a building footprint. However, occlusions and shadows cause indistinct building edges in DSMs generated from stereo images. Therefore, a “building-ground elevation difference model” (EDM) has been designed, which describes the trend of the elevation difference between a building and its neighbours, in order to find elevation values at bare ground. Experiments show that this novel approach estimates building height with a 1.5 m residual, which outperforms conventional filtering methods.
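As a minimal sketch of the inside/outside elevation-contrast idea (not the thesis's actual EDM), building height can be estimated as the rooftop elevation inside the footprint minus a ground elevation taken from a ring of neighbouring pixels, with a low percentile used to suppress adjacent elevated objects; all parameter choices below are illustrative assumptions:

```python
import numpy as np

def dilate(mask, iterations=1):
    """4-connected binary dilation using only NumPy."""
    m = mask.copy()
    for _ in range(iterations):
        m = (m | np.roll(m, 1, 0) | np.roll(m, -1, 0)
               | np.roll(m, 1, 1) | np.roll(m, -1, 1))
    return m

def building_height(dsm, footprint, ring=2, ground_pct=10):
    """Rooftop elevation (median inside the footprint) minus a ground
    elevation estimated as a low percentile over a ring of neighbouring
    pixels around the footprint."""
    fp = footprint.astype(bool)
    neighbourhood = dilate(fp, ring) & ~fp     # ring just outside the building
    roof = np.median(dsm[fp])
    ground = np.percentile(dsm[neighbourhood], ground_pct)
    return float(roof - ground)

# Toy DSM: flat ground at 100 m with a 12 m building on it.
dsm = np.full((10, 10), 100.0)
fp = np.zeros((10, 10), bool); fp[3:7, 3:7] = True
dsm[fp] = 112.0
h = building_height(dsm, fp)
```

The low percentile plays the role that the EDM's elevation-difference trend plays in the thesis: it biases the ground estimate toward bare-earth pixels even when some neighbours are occluded or elevated.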
Finally, 3D buildings are digitally reconstructed and evaluated. Current 3D evaluation methods do not capture the differences between 2D and 3D evaluation well, and wall accuracy has traditionally been ignored. To address these problems, this thesis designs an evaluation system with three components: volume, surface, and point. As such, the resultant multi-criteria system provides an improved evaluation method for building reconstruction
IM2ELEVATION: Building Height Estimation from Single-View Aerial Imagery
Estimation of the Digital Surface Model (DSM) and building heights from single-view aerial
imagery is a challenging, inherently ill-posed problem that we address in this paper by resorting to
machine learning. We propose an end-to-end trainable convolutional-deconvolutional deep neural
network architecture that learns a mapping from a single aerial image to a DSM for
analysis of urban scenes. We perform multisensor fusion of aerial optical and aerial light detection
and ranging (Lidar) data to prepare the training data for our pipeline. The dataset quality is key to
successful estimation performance. Typically, a substantial number of misregistration artifacts are
present due to georeferencing/projection errors, sensor calibration inaccuracies, and scene changes
between acquisitions. To overcome these issues, we propose a registration procedure that improves Lidar
and optical data alignment by relying on Mutual Information, followed by a Hough transform-based
validation step to adjust misregistered image patches. We validate our building height estimation
model on a high-resolution dataset captured over central Dublin, Ireland: Lidar point cloud of
2015 and optical aerial images from 2017. These data allow us to validate the proposed registration
procedure and perform 3D model reconstruction from single-view aerial imagery. We also report
state-of-the-art performance of our proposed architecture on several popular DSM estimation datasets
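The registration step described in this abstract relies on Mutual Information (MI). As a minimal sketch of MI-based alignment, assuming a simple exhaustive search over integer pixel shifts (the paper's actual procedure, including the Hough-transform validation of patches, is more involved):

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Mutual information of two equally sized images, computed from
    their joint histogram; higher values indicate better alignment."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = hist / hist.sum()
    px = p.sum(axis=1, keepdims=True)          # marginal of a
    py = p.sum(axis=0, keepdims=True)          # marginal of b
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

def best_shift(ref, mov, max_shift=3):
    """Exhaustively try integer (dy, dx) shifts of `mov` and keep the one
    that maximises MI against `ref` over a central crop."""
    best, score = (0, 0), -np.inf
    c = slice(max_shift, -max_shift)           # crop away shifted borders
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(mov, dy, 0), dx, 1)
            mi = mutual_information(ref[c, c], shifted[c, c])
            if mi > score:
                best, score = (dy, dx), mi
    return best

# Recover a known misregistration on a synthetic image:
rng = np.random.default_rng(0)
ref = rng.random((32, 32))
mov = np.roll(np.roll(ref, -2, 0), 1, 1)       # ref shifted by (-2, +1)
dy, dx = best_shift(ref, mov)                  # should undo the shift
```

MI is attractive for Lidar-versus-optical alignment because, unlike correlation, it does not assume the two modalities have linearly related intensities, only a statistical dependence.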
Machine learning for improved detection and segmentation of building boundary
The first step in rescuing and mitigating the losses from natural or man-made disasters is to assess damaged assets, including services, utilities and infrastructure,
such as buildings. However, manual visual analysis of the affected buildings can be time-consuming and labour-intensive. Automatic detection of buildings, on the other hand, has the potential to overcome the limitations of conventional approaches. This thesis reviews the existing methods for the automated detection of objects using multi-source geospatial data and presents two novel post-processing techniques. Effective building segmentation and recognition techniques are also investigated. Artificial intelligence techniques have been used to identify building
boundaries in automated building-detection applications. Compared with other neural network models, convolutional neural network (CNN) architectures based on supervised and unsupervised approaches provide better results by treating image details as spatial information about the entities in the frame. This research combines the improved semantic detection ability of the Mask Region-based Convolutional Neural Network (Mask R-CNN) with the segmentation-refining capability of conditional random fields (CRFs). Mask R-CNN uses a pre-trained network to recognise the bounding boxes around buildings. It also provides contour key points around buildings that are masked in satellite images. This thesis proposes two novel post-processing techniques that operate by modifying and detecting the building’s relative orientation properties and by combining the key points predicted by the two network heads to modify the predicted contour with the help of the proposed novel snap algorithms. The results show significant improvements in boundary-detection accuracy over state-of-the-art techniques of 2.5%, 4.6%, and 1% for F1-score, Intersection over Union (IoU, also known as the Jaccard coefficient), and overall pixel accuracy, respectively.
CNNs have proven to be powerful tools for a wide range of image processing tasks where they can be used to automatically learn mid-level and high-level concepts
from raw data, such as images. Finally, the results highlight the potential of applying these approaches to further applications, such as infrastructure planning
Recent Advances in Image Restoration with Applications to Real World Problems
In the past few decades, imaging hardware has improved tremendously in terms of resolution, enabling the widespread use of images in many diverse applications in Earth and planetary missions. However, practical issues associated with image acquisition still affect image quality. Some of these issues, such as blurring, measurement noise, mosaicing artifacts, and low spatial or spectral resolution, can seriously affect the accuracy of the aforementioned applications. This book intends to provide the reader with a glimpse of the latest developments and recent advances in image restoration, including image super-resolution, image fusion to enhance spatial, spectral, and temporal resolution, and the generation of synthetic images using deep learning techniques. Some practical applications are also included
Deep Learning for Building Footprint Generation from Optical Imagery
Deep learning-based methods have shown promising results for the task of building footprint generation, but they have two inherent limitations. First, the extracted buildings exhibit blurred boundaries and blob-like shapes. Second, massive pixel-level annotations are required for network training. This dissertation has developed a series of methods to address the problems mentioned above. In addition, the developed methods are translated into practical applications
Reducing the Burden of Aerial Image Labelling Through Human-in-the-Loop Machine Learning Methods
This dissertation presents an introduction to human-in-the-loop deep learning methods for remote sensing applications. It is motivated by the need to decrease the time spent by volunteers on semantic segmentation of remote sensing imagery. We look at two human-in-the-loop approaches to speeding up the labelling of remote sensing data: interactive segmentation and active learning. We develop these methods specifically in response to the needs of disaster relief organisations, who require accurately labelled maps of disaster-stricken regions quickly in order to respond to the needs of the affected communities. To begin, we survey the current approaches used within the field. We analyse the shortcomings of these models, which include outputs ill-suited for uploading to mapping databases and an inability to label new regions well when the new regions differ from the regions trained on. The methods developed then address these shortcomings. We first develop an interactive segmentation algorithm. Interactive segmentation aims to segment objects with a supervisory signal from a user to assist the model. Work within interactive segmentation has focused largely on segmenting one or a few objects within an image. We make a few adaptations to allow an existing method to scale to remote sensing applications, where there are tens of objects within a single image that need to be segmented. We show quantitative improvements of up to 18% in mean intersection over union, as well as qualitative improvements. The algorithm works well when labelling new regions, and the qualitative improvements show outputs more suitable for uploading to mapping databases. We then investigate active learning in the context of remote sensing. Active learning looks at reducing the number of labelled samples required by a model to achieve an acceptable performance level.
Within the context of deep learning, the utility of the various active learning strategies developed is uncertain, with conflicting results within the literature. We evaluate and compare a variety of sample acquisition strategies on the semantic segmentation tasks in scenarios relevant to disaster relief mapping. Our results show that all active learning strategies evaluated provide minimal performance increases over a simple random sample acquisition strategy. However, we present analysis of the results illustrating how the various strategies work and intuition of when certain active learning strategies might be preferred. This analysis could be used to inform future research. We conclude by providing examples of the synergies of these two approaches, and indicate how this work, on reducing the burden of aerial image labelling for the disaster relief mapping community, can be further extended
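As a minimal illustration of what a sample acquisition strategy looks like, the sketch below contrasts uncertainty (entropy) sampling with the random baseline that the dissertation finds so competitive; the function names and setup are illustrative, not the dissertation's code:

```python
import numpy as np

def entropy_acquire(probs, k):
    """Pick the k unlabelled samples whose predicted class distributions
    have the highest entropy (an uncertainty-sampling strategy)."""
    p = np.clip(probs, 1e-12, 1.0)
    ent = -(p * np.log(p)).sum(axis=-1)     # per-sample entropy
    return np.argsort(ent)[::-1][:k]        # most uncertain first

def random_acquire(n, k, seed=0):
    """The baseline strategy: uniform random selection without replacement."""
    return np.random.default_rng(seed).choice(n, size=k, replace=False)

# Four samples with two-class predictions; the most ambiguous are chosen.
probs = np.array([[0.90, 0.10],
                  [0.50, 0.50],
                  [0.60, 0.40],
                  [0.99, 0.01]])
picked = entropy_acquire(probs, 2)          # indices of the 2 least certain
```

In a full active-learning loop, the selected samples would be sent to human labellers, added to the training set, and the model retrained; the dissertation's finding is that, for these segmentation tasks, elaborate choices of the acquisition function add little over `random_acquire`.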