204 research outputs found

    Expediting Building Footprint Segmentation from High-resolution Remote Sensing Images via progressive lenient supervision

    The efficacy of building footprint segmentation from remotely sensed images has been hindered by model transfer effectiveness. Many existing building segmentation methods are built on the encoder-decoder architecture of U-Net, in which the encoder is fine-tuned from newly developed backbone networks pre-trained on ImageNet. However, the heavy computational burden of existing decoder designs hampers the successful transfer of these modern encoder networks to remote sensing tasks. Even the widely adopted deep supervision strategy fails to mitigate these challenges because its loss is invalid in hybrid regions where foreground and background pixels are intermixed. In this paper, we conduct a comprehensive evaluation of existing decoder network designs for building footprint segmentation and propose an efficient framework, denoted BFSeg, to enhance learning efficiency and effectiveness. Specifically, we propose a densely connected coarse-to-fine feature fusion decoder network that facilitates easy and fast feature fusion across scales. Moreover, considering the invalidity of hybrid regions in the down-sampled ground truth during the deep supervision process, we present a lenient deep supervision and distillation strategy that enables the network to learn proper knowledge from deep supervision. Building upon these advancements, we have developed a new family of building segmentation networks, which consistently surpass prior works in performance and efficiency across a wide range of newly developed encoder networks. The code will be released at https://github.com/HaonanGuo/BFSeg-Efficient-Building-Footprint-Segmentation-Framework.
    Comment: 13 pages, 8 figures. Submitted to IEEE Transactions on Neural Networks and Learning Systems
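
    The abstract above notes that down-sampled ground truth contains hybrid cells mixing foreground and background, where a deep supervision loss is invalid. The sketch below illustrates one possible reading of that idea, not the BFSeg implementation: it assumes "lenient" means ignoring mixed cells, and the names `IGNORE`, `lenient_downsample`, and `masked_bce` are hypothetical.

```python
import numpy as np

IGNORE = 255  # hypothetical label for hybrid (mixed) cells


def lenient_downsample(mask: np.ndarray, factor: int) -> np.ndarray:
    """Down-sample a binary mask by block averaging.

    Pure-background blocks become 0, pure-foreground blocks become 1,
    and mixed blocks are marked IGNORE (assumption: they are skipped
    by the auxiliary loss).
    """
    h, w = mask.shape
    assert h % factor == 0 and w % factor == 0, "shape must divide evenly"
    blocks = mask.reshape(h // factor, factor, w // factor, factor)
    frac = blocks.mean(axis=(1, 3))  # foreground fraction per block
    out = np.full(frac.shape, IGNORE, dtype=np.uint8)
    out[frac == 0.0] = 0
    out[frac == 1.0] = 1
    return out


def masked_bce(prob: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Binary cross-entropy averaged over non-ignored cells only."""
    valid = target != IGNORE
    if not valid.any():
        return 0.0
    p = np.clip(prob[valid], eps, 1.0 - eps)
    t = target[valid].astype(np.float64)
    return float(-(t * np.log(p) + (1.0 - t) * np.log(1.0 - p)).mean())


mask = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 0]], dtype=np.uint8)
coarse = lenient_downsample(mask, 2)  # bottom-left block is mixed
loss = masked_bce(np.array([[0.9, 0.1], [0.5, 0.1]]), coarse)
```

    In this toy case the mixed bottom-left block contributes nothing to the loss, so the prediction of 0.5 there is neither rewarded nor penalised.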

    PiCoCo: Pixelwise Contrast and Consistency Learning for Semisupervised Building Footprint Segmentation

    Building footprint segmentation from high-resolution remote sensing (RS) images plays a vital role in urban planning, disaster response, and population density estimation. Convolutional neural networks (CNNs) have recently been used as a workhorse for effectively generating building footprints. However, to fully exploit the predictive power of CNNs, large-scale pixel-level annotations are required. Most state-of-the-art CNN-based methods focus on the design of network architectures for improving the prediction of building footprints with full annotations, while little work has been done on building footprint segmentation with limited annotations. In this article, we propose a novel semisupervised learning method for building footprint segmentation, which can effectively predict building footprints with a network trained on few annotations (e.g., only 0.0324 km² out of a 2.25 km² area is labeled). The proposed method is based on investigating the contrast between building and background pixels in latent space and the consistency of predictions obtained from the CNN models when the input RS images are perturbed. Thus, we term the proposed semisupervised learning framework for building footprint segmentation PiCoCo, which is based on the enforcement of Pixelwise Contrast and Consistency during the learning phase. Our experiments, conducted on two benchmark building segmentation datasets, validate the effectiveness of our proposed framework compared with several state-of-the-art building footprint extraction and semisupervised semantic segmentation methods.
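
    The consistency term described above can be illustrated on unlabeled images with a short sketch. This is not the PiCoCo implementation: the `predict` callable, the Gaussian-noise perturbation, its standard deviation, and the mean-squared-error penalty are all assumptions chosen for illustration, and the contrast term is not covered.

```python
import numpy as np


def consistency_loss(predict, image, rng, noise_std=0.05):
    """Mean squared difference between predictions on an image and on a
    noise-perturbed copy of it (hypothetical perturbation and penalty)."""
    noise = rng.normal(0.0, noise_std, size=image.shape)
    p_clean = predict(image)
    p_noisy = predict(image + noise)
    return float(np.mean((p_clean - p_noisy) ** 2))


def toy_predict(x):
    """Stand-in for a segmentation network: a per-pixel sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))


image = np.random.default_rng(0).random((4, 4))
loss = consistency_loss(toy_predict, image, np.random.default_rng(1))
```

    A predictor whose output does not change under the perturbation gives a loss of zero, which is the behaviour the consistency term encourages.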

    Deep Learning for Building Footprint Generation from Optical Imagery

    Deep learning-based methods have shown promising results for the task of building footprint generation, but they have two inherent limitations. First, the extracted buildings exhibit blurred boundaries and blob-like shapes. Second, massive pixel-level annotations are required for network training. This dissertation develops a series of methods to address these problems. In addition, the developed methods are put into practical applications.

    Reducing the Burden of Aerial Image Labelling Through Human-in-the-Loop Machine Learning Methods

    This dissertation presents an introduction to human-in-the-loop deep learning methods for remote sensing applications. It is motivated by the need to decrease the time spent by volunteers on semantic segmentation of remote sensing imagery. We look at two human-in-the-loop approaches to speeding up the labelling of remote sensing data: interactive segmentation and active learning. We develop these methods specifically in response to the needs of disaster relief organisations, who require accurately labelled maps of disaster-stricken regions quickly in order to respond to the needs of the affected communities. To begin, we survey the current approaches used within the field. We analyse the shortcomings of these models, which include outputs ill-suited for uploading to mapping databases and an inability to label new regions well when the new regions differ from the regions trained on. The methods developed then address these shortcomings. We first develop an interactive segmentation algorithm. Interactive segmentation aims to segment objects with a supervisory signal from a user to assist the model. Work within interactive segmentation has focused largely on segmenting one or a few objects within an image. We make a few adaptations to allow an existing method to scale to remote sensing applications where there are tens of objects within a single image that need to be segmented. We show quantitative improvements of up to 18% in mean intersection over union, as well as qualitative improvements. The algorithm works well when labelling new regions, and the qualitative improvements show outputs more suitable for uploading to mapping databases. We then investigate active learning in the context of remote sensing. Active learning aims to reduce the number of labelled samples required by a model to achieve an acceptable performance level. Within the context of deep learning, the utility of the various active learning strategies developed is uncertain, with conflicting results within the literature. We evaluate and compare a variety of sample acquisition strategies on semantic segmentation tasks in scenarios relevant to disaster relief mapping. Our results show that all active learning strategies evaluated provide minimal performance increases over a simple random sample acquisition strategy. However, we present an analysis of the results illustrating how the various strategies work and offering intuition for when certain active learning strategies might be preferred. This analysis could be used to inform future research. We conclude by providing examples of the synergies between these two approaches, and indicate how this work, on reducing the burden of aerial image labelling for the disaster relief mapping community, can be further extended.
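
    To make the comparison between acquisition strategies concrete, the sketch below contrasts an uncertainty-based strategy with the random baseline. The abstract does not name the strategies it evaluates, so mean pixel entropy is an assumed example, and the function names are hypothetical.

```python
import numpy as np


def mean_entropy(prob):
    """Mean per-pixel binary entropy of a foreground-probability map."""
    p = np.clip(prob, 1e-7, 1.0 - 1e-7)
    return float(np.mean(-(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))))


def select_by_entropy(prob_maps, k):
    """Indices of the k unlabelled images with the most uncertain predictions."""
    scores = np.array([mean_entropy(p) for p in prob_maps])
    return np.argsort(-scores)[:k].tolist()


def select_random(n, k, rng):
    """Baseline: k distinct indices drawn uniformly from n unlabelled images."""
    return rng.choice(n, size=k, replace=False).tolist()


# Toy pool: image 0 is maximally uncertain, image 1 is confident.
pool = [np.full((2, 2), 0.5), np.full((2, 2), 0.99), np.full((2, 2), 0.7)]
picked = select_by_entropy(pool, 2)
baseline = select_random(len(pool), 2, np.random.default_rng(0))
```

    Both functions return indices into the unlabelled pool, so either can be dropped into the same labelling loop when comparing strategies.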

    Building Section Instance Segmentation with Combined Classical and Deep Learning Methods

    In big cities, the complexity of urban infrastructure is very high. In city centers, one construction can consist of several building sections of different heights or roof geometries. Most existing approaches detect those buildings as a single construction, either in the form of binary building segmentation maps or as one instance in object-oriented segmentation. However, reconstructing complex buildings consisting of several parts requires a higher level of detail. In this work, we present a methodology for individual building section instance segmentation on satellite imagery. We show that fully convolutional networks (FCNs) can tackle the issue much better than the state-of-the-art Mask R-CNN. A ground truth raster image with pixel value 1 for building sections and 2 for their touching borders was generated to train models to predict both classes as a semantic output. The semantic outputs were then post-processed with morphology and watershed labeling to generate segmentation at the instance level. The combination of a deep learning-based approach and a classical image processing algorithm allowed us to fulfill the segmentation task at the instance level and reach high-quality results with an mAP of up to 42%.
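
    The step from a three-class semantic map (0 background, 1 building section, 2 touching border) to instances can be sketched as follows. This is a simplified stand-in rather than the authors' pipeline: instead of morphology and watershed, it labels connected section interiors as seeds and assigns each border pixel to its nearest seed. The function name is hypothetical.

```python
import numpy as np
from scipy import ndimage


def sections_to_instances(sem):
    """Turn a semantic map (0 background, 1 section, 2 touching border)
    into an instance map by nearest-seed assignment.

    Note: this replaces the watershed step with a simpler rule.
    """
    # One seed per connected section interior (4-connectivity by default).
    seeds, n = ndimage.label(sem == 1)
    if n == 0:
        return seeds, 0
    # For every pixel, the coordinates of the nearest seed pixel.
    _, idx = ndimage.distance_transform_edt(seeds == 0, return_indices=True)
    nearest = seeds[idx[0], idx[1]]
    inst = seeds.copy()
    border = sem == 2
    inst[border] = nearest[border]  # border pixels join the closest section
    return inst, n


sem = np.array([[1, 1, 2, 2, 1, 1],
                [1, 1, 2, 2, 1, 1],
                [0, 0, 0, 0, 0, 0]])
inst, n = sections_to_instances(sem)
```

    The border class is what keeps adjacent sections apart: without it, the two sections in this toy map would form one connected component and receive a single label.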