12 research outputs found

    Semantic Segmentation of Remote Sensing Images with Sparse Annotations

    Training Convolutional Neural Networks (CNNs) for very high resolution images requires a large quantity of high-quality pixel-level annotations, which is extremely labor- and time-consuming to produce. Moreover, professional photo interpreters might have to be involved to guarantee the correctness of annotations. To alleviate this burden, we propose a framework for semantic segmentation of aerial images based on incomplete annotations, where annotators are asked to label a few pixels with easy-to-draw scribbles. To exploit these sparse scribbled annotations, we propose the FEature and Spatial relaTional regulArization (FESTA) method, which complements the supervised task with an unsupervised learning signal that accounts for neighbourhood structure in both the spatial and the feature domain.
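    The abstract names the mechanism but not its form. As a rough illustration, the sketch below combines a spatial-neighbourhood consistency term with a feature-space nearest-neighbour term in the spirit of FESTA; the function name, sampling scheme, similarity measure, and loss form are assumptions, not the published method.

```python
# Hedged sketch of a FESTA-style relational regularizer: encourage each
# pixel's feature to agree with (a) a spatial neighbour and (b) its nearest
# neighbour in feature space. All details are illustrative assumptions.
import torch
import torch.nn.functional as F

def festa_like_loss(feats, num_samples=256):
    """feats: (B, C, H, W) feature map from the segmentation backbone."""
    B, C, H, W = feats.shape
    flat = F.normalize(feats.permute(0, 2, 3, 1).reshape(B, H * W, C), dim=-1)

    # Subsample pixel positions to keep the pairwise search cheap.
    idx = torch.randint(0, H * W, (num_samples,), device=feats.device)
    sampled = flat[:, idx]                        # (B, N, C)

    # Spatial term: pull towards the right-hand pixel, a stand-in for the
    # full 4/8-neighbourhood (row wrap-around is ignored in this sketch).
    spatial_nbr = flat[:, torch.clamp(idx + 1, max=H * W - 1)]
    loss_spatial = (1 - (sampled * spatial_nbr).sum(-1)).mean()

    # Feature term: pull towards the nearest feature-space neighbour.
    sim = sampled @ sampled.transpose(1, 2)       # (B, N, N) cosine similarity
    sim.diagonal(dim1=1, dim2=2).fill_(-2.0)      # exclude self-matches
    loss_feature = (1 - sim.max(dim=-1).values).mean()

    return loss_spatial + loss_feature
```

    In training, a term of this kind would be added to a supervised loss computed on the scribbled pixels only.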

    Optimization of Rooftop Delineation from Aerial Imagery with Deep Learning

    High-definition (HD) maps of building rooftops or footprints are important for urban applications and disaster management. Rapid creation of such HD maps through city-scale rooftop delineation, using high-resolution satellite and aerial images with deep learning methods, has become feasible and has drawn much attention. In the context of rooftop delineation, end-to-end Deep Convolutional Neural Networks (DCNNs) have demonstrated remarkable performance in accurately delineating rooftops from aerial imagery. However, several challenges remain, which this thesis addresses: (1) the generalization issues of models when test data differ from training data, (2) the scale-variance issues in rooftop delineation, and (3) the high cost of annotating accurate rooftop boundaries. To address these challenges, this thesis proposes three novel deep learning-based methods. Firstly, a super-resolution network named Momentum and Spatial-Channel Attention Residual Feature Aggregation Network (MSCA-RFANet) is proposed to tackle the generalization issue; it outperforms its baseline and other state-of-the-art methods, and data composition with MSCA-RFANet performs well in addressing the generalization issue. Secondly, an end-to-end rooftop delineation network named Higher Resolution Network with Dynamic Scale Training (HigherNet-DST) is developed to mitigate the scale-variance issue; experimental results on publicly available building datasets demonstrate that HigherNet-DST achieves competitive performance in rooftop delineation, particularly excelling at accurately delineating small buildings. Lastly, a weakly supervised deep learning network named Box2Boundary is developed to reduce the annotation cost; experimental results show that Box2Boundary with post-processing deals effectively with the annotation cost issue while achieving decent performance. Consequently, the research on these three sub-topics, and the three resulting papers, is expected to be relevant to various practical applications.
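    Of the three methods, the "Dynamic Scale Training" component lends itself to a compact illustration. The sketch below randomly rescales each training batch so rooftops appear at varying sizes; the scale range and interpolation modes are illustrative assumptions, not the settings used in HigherNet-DST.

```python
# Hedged sketch of dynamic scale training for segmentation: rescale image
# and mask together by a random factor each step. Values are illustrative.
import random
import torch
import torch.nn.functional as F

def dynamic_scale(image, mask, scale_range=(0.5, 2.0)):
    """image: (B, 3, H, W) float tensor; mask: (B, H, W) long tensor."""
    s = random.uniform(*scale_range)
    size = (int(image.shape[2] * s), int(image.shape[3] * s))
    image = F.interpolate(image, size=size, mode="bilinear", align_corners=False)
    # Nearest-neighbour interpolation keeps class labels discrete.
    mask = F.interpolate(mask.unsqueeze(1).float(), size=size, mode="nearest")
    return image, mask.squeeze(1).long()
```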

    Reducing the Burden of Aerial Image Labelling Through Human-in-the-Loop Machine Learning Methods

    This dissertation presents an introduction to human-in-the-loop deep learning methods for remote sensing applications. It is motivated by the need to decrease the time volunteers spend on semantic segmentation of remote sensing imagery. We look at two human-in-the-loop approaches to speeding up the labelling of remote sensing data: interactive segmentation and active learning. We develop these methods specifically in response to the needs of disaster relief organisations, which require accurately labelled maps of disaster-stricken regions quickly in order to respond to the needs of the affected communities. To begin, we survey the current approaches used within the field. We analyse the shortcomings of these models, which include outputs ill-suited for uploading to mapping databases and an inability to label new regions well when those regions differ from the regions trained on. The methods we develop then address these shortcomings. We first develop an interactive segmentation algorithm. Interactive segmentation aims to segment objects with a supervisory signal from a user to assist the model. Work within interactive segmentation has focused largely on segmenting one or a few objects within an image. We make a few adaptations to allow an existing method to scale to remote sensing applications, where there are tens of objects within a single image that need to be segmented. We show quantitative improvements of up to 18% in mean intersection over union, as well as qualitative improvements. The algorithm works well when labelling new regions, and the qualitative improvements show outputs more suitable for uploading to mapping databases. We then investigate active learning in the context of remote sensing. Active learning aims to reduce the number of labelled samples required by a model to achieve an acceptable performance level. Within the context of deep learning, the utility of the various active learning strategies developed is uncertain, with conflicting results in the literature. We evaluate and compare a variety of sample acquisition strategies on semantic segmentation tasks in scenarios relevant to disaster relief mapping. Our results show that all active learning strategies evaluated provide minimal performance increases over a simple random sample acquisition strategy. However, we present analysis of the results illustrating how the various strategies work, and intuition as to when certain active learning strategies might be preferred. This analysis could be used to inform future research. We conclude by providing examples of the synergies of these two approaches, and indicate how this work, on reducing the burden of aerial image labelling for the disaster relief mapping community, can be further extended.
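    For concreteness, one widely used acquisition strategy of the kind compared against the random baseline is uncertainty sampling by predictive entropy. The sketch below scores unlabelled images by mean pixel-wise entropy and selects the most uncertain for annotation; the dissertation's exact strategies and scoring functions are not reproduced here.

```python
# Hedged sketch of entropy-based sample acquisition for segmentation.
# The random baseline it would be compared against is simply
# random.sample(range(len(pool)), budget).
import torch

def entropy_acquisition(model, pool, budget=10):
    """pool: list of (3, H, W) image tensors; returns indices to annotate.
    Assumes model(x) returns per-class logits of shape (1, K, H, W)."""
    model.eval()
    scores = []
    with torch.no_grad():
        for img in pool:
            probs = torch.softmax(model(img.unsqueeze(0)), dim=1)
            entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=1)
            scores.append(entropy.mean().item())
    # Highest mean entropy first: the model's most uncertain images.
    ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    return ranked[:budget]
```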

    Deep Learning for Aerial Scene Understanding in High Resolution Remote Sensing Imagery from the Lab to the Wild

    This thesis presents the application of deep learning to aerial scene understanding, e.g. aerial scene recognition, multi-label object classification, and semantic segmentation. Beyond training deep networks under laboratory conditions, this thesis also offers learning strategies for practical scenarios, e.g. where data are collected without constraints or annotations are scarce.

    A review of technical factors to consider when designing neural networks for semantic segmentation of Earth Observation imagery

    Semantic segmentation (classification) of Earth Observation imagery is a crucial task in remote sensing. This paper presents a comprehensive review of technical factors to consider when designing neural networks for this purpose. The review focuses on Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Generative Adversarial Networks (GANs), and transformer models, discussing prominent design patterns for these ANN families and their implications for semantic segmentation. Common pre-processing techniques for ensuring optimal data preparation are also covered. These include methods for image normalization and chipping, strategies for addressing data imbalance in training samples, and techniques for overcoming limited data, including augmentation, transfer learning, and domain adaptation. By encompassing both the technical aspects of neural network design and the data-related considerations, this review provides researchers and practitioners with a comprehensive and up-to-date understanding of the factors involved in designing effective neural networks for semantic segmentation of Earth Observation imagery.
    Comment: 145 pages with 32 figures
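    Among the pre-processing steps listed, chipping is the most mechanical: a scene too large for GPU memory is cut into fixed-size, optionally overlapping tiles. The sketch below shows one plausible implementation; the chip size and stride are arbitrary illustrative values, not recommendations from the review.

```python
# Hedged sketch of image chipping for segmentation pipelines. Edge strips
# narrower than a full chip are ignored here; real pipelines typically pad.
import numpy as np

def chip_image(image, chip=256, stride=192):
    """image: (H, W, C) array; yields (row, col, chip_array) tuples."""
    h, w = image.shape[:2]
    for r in range(0, h - chip + 1, stride):
        for c in range(0, w - chip + 1, stride):
            yield r, c, image[r:r + chip, c:c + chip]
```

    A stride smaller than the chip size produces overlap, which lets predictions be blended across chip seams at inference time.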

    A Review of Landcover Classification with Very-High Resolution Remotely Sensed Optical Images—Analysis Unit, Model Scalability and Transferability

    As an important application in remote sensing, landcover classification remains one of the most challenging tasks in very-high-resolution (VHR) image analysis. As a rapidly increasing number of Deep Learning (DL)-based landcover methods and training strategies are claimed to be state-of-the-art, the already fragmented technical landscape of landcover mapping has become further complicated. Although a plethora of review articles attempt to guide researchers in making an informed choice of landcover mapping methods, they either focus on applications in a specific area or revolve around general deep learning models, and thus lack a systematic view of the ever-advancing landcover mapping methods. In addition, issues related to training samples and model transferability have become more critical than ever in an era dominated by data-driven approaches, yet they were addressed to a lesser extent in previous reviews of remote sensing classification. Therefore, in this paper, we present a systematic overview of existing methods, starting from learning paradigms and the varying basic analysis units used in landcover mapping tasks, and moving to challenges and solutions concerning scalability and transferability on three aspects: (1) sparsity and imbalance of data; (2) domain gaps across different geographical regions; and (3) multi-source and multi-view fusion. We discuss each of these categories of methods in detail, draw concluding remarks on these developments, and recommend potential directions for the continued endeavor.

    Scalable Surface Water Mapping up to Fine-scale using Geometric Features of Water from Topographic Airborne LiDAR Data

    Despite substantial technological advancements, the comprehensive mapping of surface water, particularly smaller bodies (<1 ha), continues to be a challenge due to a lack of robust, scalable methods. Standard methods require either training labels or site-specific parameter tuning, which complicates automated mapping and introduces biases related to training data and parameters. The reliance on water's reflectance properties, including LiDAR intensity, further complicates the matter, as higher-resolution images inherently produce more noise. To mitigate these difficulties, we propose a unique method that focuses on the geometric characteristics of water instead of its variable reflectance properties. Unlike preceding approaches, ours relies entirely on 3D coordinate observations from airborne LiDAR data, taking advantage of the principle that connected surface water remains flat due to gravity. By harnessing this natural law in conjunction with connectivity, our method can accurately and scalably identify small water bodies, eliminating the need for training labels or repetitive parameter tuning. Consequently, our approach enables the creation of comprehensive 3D topographic maps that include both water and terrain, all performed in an unsupervised manner using only airborne laser scanning data, potentially enhancing the process of generating reliable 3D topographic maps. We validated our method across extensive and diverse landscapes, comparing it to highly competitive Normalized Difference Water Index (NDWI)-based methods and assessing it against a reference surface water map. In conclusion, our method offers a new approach to persistent difficulties in robust, scalable surface water mapping and 3D topographic mapping, using solely airborne LiDAR data.
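    The flatness-plus-connectivity principle is simple enough to sketch. Below, a LiDAR-derived elevation grid is scanned for locally flat cells, which are then grouped into connected components; the rasterization step, flatness tolerance, and minimum region size are illustrative assumptions rather than the paper's actual algorithm.

```python
# Hedged sketch: connected surface water is flat, so keep large connected
# regions whose local elevation range is near zero. Thresholds illustrative.
import numpy as np
from scipy import ndimage

def flat_connected_water(dem, tol=0.05, min_cells=50):
    """dem: (H, W) elevation grid in metres; returns a boolean water mask."""
    # Local flatness: elevation range within a 3x3 window below tol metres.
    flat = (ndimage.maximum_filter(dem, size=3)
            - ndimage.minimum_filter(dem, size=3)) < tol

    # Connectivity: keep flat components large enough to be water bodies.
    labels, n = ndimage.label(flat)
    sizes = ndimage.sum(flat, labels, index=range(1, n + 1))
    keep = [i + 1 for i, s in enumerate(sizes) if s >= min_cells]
    return np.isin(labels, keep)
```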

    Deep Learning of Unified Region, Edge, and Contour Models for Automated Image Segmentation

    Image segmentation is a fundamental and challenging problem in computer vision, with applications spanning multiple areas such as medical imaging, remote sensing, and autonomous vehicles. Recently, convolutional neural networks (CNNs) have gained traction in the design of automated segmentation pipelines. Although CNN-based models are adept at learning abstract features from raw image data, their performance depends on the availability and size of suitable training datasets. Additionally, these models are often unable to capture the details of object boundaries and generalize poorly to unseen classes. In this thesis, we devise novel methodologies that address these issues and establish robust representation learning frameworks for fully automatic semantic segmentation in medical imaging and mainstream computer vision. In particular, our contributions include (1) state-of-the-art 2D and 3D image segmentation networks for computer vision and medical image analysis, (2) an end-to-end trainable image segmentation framework that unifies CNNs and active contour models with learnable parameters for fast and robust object delineation, (3) a novel approach for disentangling edge and texture processing in segmentation networks, and (4) a novel few-shot learning model, in both supervised and semi-supervised settings, where synergies between latent and image spaces are leveraged to learn to segment images given limited training data.
    Comment: PhD dissertation, UCLA, 202
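    Contribution (2) couples a CNN with an active contour energy. As a rough illustration of the general idea (not the thesis's formulation), the sketch below attaches a Chan-Vese-style loss to a network's soft mask output: a total-variation term approximating contour length plus piecewise-constant region terms.

```python
# Hedged sketch of an active-contour-style loss on a soft mask: the common
# length-plus-region form from the active contour literature, not the
# thesis's exact energy.
import torch

def active_contour_loss(pred, image, w_length=1.0, w_region=1.0):
    """pred: (B, 1, H, W) soft mask in [0, 1]; image: (B, 1, H, W) intensities."""
    # Length term: total variation of the mask approximates contour length.
    length = (torch.abs(pred[:, :, 1:, :] - pred[:, :, :-1, :]).mean()
              + torch.abs(pred[:, :, :, 1:] - pred[:, :, :, :-1]).mean())

    # Region terms: intensities inside/outside should match their means
    # (piecewise-constant assumption, as in the Chan-Vese model).
    c_in = (pred * image).sum() / (pred.sum() + 1e-8)
    c_out = ((1 - pred) * image).sum() / ((1 - pred).sum() + 1e-8)
    region = ((pred * (image - c_in) ** 2).mean()
              + ((1 - pred) * (image - c_out) ** 2).mean())

    return w_length * length + w_region * region
```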