159 research outputs found

    BuildMapper: A Fully Learnable Framework for Vectorized Building Contour Extraction

    Deep learning based methods have significantly boosted the study of automatic building extraction from remote sensing images. However, delineating vectorized and regular building contours as a human does remains very challenging, due to the difficulty of the methodology, the diversity of building structures, and imperfect imaging conditions. In this paper, we propose the first end-to-end learnable building contour extraction framework, named BuildMapper, which can directly and efficiently delineate building polygons just as a human does. BuildMapper consists of two main components: 1) a contour initialization module that generates initial building contours; and 2) a contour evolution module that performs both contour vertex deformation and reduction, removing the need for the complex empirical post-processing used in existing methods. In both components, we provide new ideas, including a learnable contour initialization method to replace empirical ones, dynamic pairing of predicted and ground-truth vertices to address the static vertex-correspondence problem, and a lightweight encoder for vertex information extraction and aggregation, which benefit contour-based methods in general; and a well-designed vertex classification head for building corner-vertex detection, which casts light on direct structured building contour extraction. We also built a suitable large-scale building dataset, the WHU-Mix (vector) building dataset, to benefit the study of contour-based building extraction methods. Extensive experiments conducted on the WHU-Mix (vector) dataset, the WHU dataset, and the CrowdAI dataset verified that BuildMapper achieves state-of-the-art performance, with a higher mask average precision (AP) and boundary AP than both segmentation-based and contour-based methods.
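The two-stage idea in the abstract (deform contour vertices, then reduce them to corner vertices) can be illustrated with a minimal, non-learned sketch. The offsets and the angle-threshold rule below are hypothetical stand-ins: in BuildMapper both steps are learned (a vertex encoder predicts offsets, and a classification head detects corner vertices).

```python
import math

def evolve_contour(vertices, offsets):
    """Deformation step: apply per-vertex offsets to the initial contour.
    In the paper the offsets are predicted by a learned encoder; here
    they are supplied directly as a hypothetical stand-in."""
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(vertices, offsets)]

def reduce_vertices(vertices, angle_thresh_deg=15.0):
    """Reduction step: keep only 'corner' vertices whose turning angle
    exceeds a threshold, mimicking corner-vertex classification with a
    simple geometric rule (the paper uses a learned head instead)."""
    kept = []
    n = len(vertices)
    for i in range(n):
        px, py = vertices[i - 1]           # previous vertex (wraps around)
        cx, cy = vertices[i]
        nx, ny = vertices[(i + 1) % n]     # next vertex (wraps around)
        a1 = math.atan2(cy - py, cx - px)  # incoming edge direction
        a2 = math.atan2(ny - cy, nx - cx)  # outgoing edge direction
        turn = abs(math.degrees(a2 - a1)) % 360.0
        turn = min(turn, 360.0 - turn)     # smallest turning angle
        if turn > angle_thresh_deg:
            kept.append((cx, cy))
    return kept

# A rectangle with one redundant collinear vertex at (1, 0):
contour = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
corners = reduce_vertices(contour)  # the collinear vertex is dropped
```

The point of the reduction step is that a polygon delineated "as a human does" uses only structural corners, so collinear or near-collinear vertices are removed rather than cleaned up by post-processing.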

    Using Deep Neural Networks for Automatic Building Extraction with Boundary Regularization from Satellite Images

    Building footprints from satellite images play a significant role in a massive range of applications, many of which demand footprints with regularized boundaries, which are challenging to acquire. Recently, deep learning has made remarkable accomplishments in the remote sensing community. In this study, we formulate the major problems as spatial learning, semantic learning, and geometric learning, and propose a deep learning based framework to accomplish building footprint extraction with boundary regularization. Our first two models, Post-Shape and the Binary Space Partitioning Pooling Network (BSPPN), integrate a polygon shape prior into neural networks. The third, the Region-based Polygon GCN (R-PolyGCN), exploits graph convolutional networks to learn geometric polygon features. Extensive experiments show that our models can achieve object localization, recognition, semantic labeling, and geometric shape extraction simultaneously. The model performances are competitive with the state-of-the-art baseline model, Mask R-CNN. Our R-PolyGCN in particular consistently outperforms the others in all aspects.
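The R-PolyGCN idea of treating a building polygon as a graph, where each vertex aggregates information from its chain neighbors, can be sketched with one hand-rolled graph-convolution step. The mixing weights below are illustrative constants, not the learned parameters of the paper's GCN layers.

```python
def polygon_gcn_step(features, w_self=0.5, w_neigh=0.25):
    """One simplified graph-convolution step on a closed polygon:
    each vertex feature is a weighted mix of itself and its two
    chain neighbors (previous and next vertex on the contour).
    Weights are illustrative constants, not learned parameters."""
    n = len(features)
    out = []
    for i in range(n):
        prev_f = features[i - 1]           # wraps to last vertex at i = 0
        next_f = features[(i + 1) % n]
        out.append([w_self * s + w_neigh * (p + q)
                    for s, p, q in zip(features[i], prev_f, next_f)])
    return out

# A 4-vertex polygon with a 1-D feature per vertex; the feature at
# vertex 0 diffuses to its neighbors after one step.
feats = [[1.0], [0.0], [0.0], [0.0]]
smoothed = polygon_gcn_step(feats)
```

Stacking several such steps lets geometric information (e.g. corner evidence) propagate along the contour, which is the intuition behind learning polygon features with a GCN rather than per-vertex regression alone.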

    Superpixel-Based Attention Graph Neural Network for Semantic Segmentation in Aerial Images

    Semantic segmentation is one of the significant tasks in understanding aerial images with high spatial resolution. Recently, Graph Neural Networks (GNNs) and attention mechanisms have achieved excellent performance in semantic segmentation of general images and have been applied to aerial images. In this paper, we propose a novel Superpixel-based Attention Graph Neural Network (SAGNN) for semantic segmentation of high spatial resolution aerial images. Our network constructs a K-Nearest Neighbor (KNN) graph for each image, where each node corresponds to a superpixel and is associated with a hidden representation vector. The hidden representation vector is initialized with the appearance feature extracted from the image by a unary Convolutional Neural Network (CNN). Relying on the attention mechanism and recursive functions, each node then updates its hidden representation according to its current state and the incoming information from its neighbors. The final representation of each node is used to predict the semantic class of its superpixel. The attention mechanism enables graph nodes to aggregate neighbor information differentially, which extracts higher-quality features. Furthermore, the superpixels not only save computational resources but also preserve object boundaries, yielding more accurate predictions. The accuracy of our model on the public Potsdam and Vaihingen datasets exceeds all benchmark approaches, reaching 90.23% and 89.32%, respectively.
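The pipeline described above, building a KNN graph over superpixels and then letting each node aggregate neighbor states with attention weights, can be sketched in plain Python. This is a minimal stand-in: the dot-product attention and the fixed update rule below replace SAGNN's learned attention and recursive update functions, and superpixels are reduced to their centroids.

```python
import math

def knn_graph(centroids, k=2):
    """Build a k-nearest-neighbor graph over superpixel centroids
    (Euclidean distance); returns a neighbor index list per node."""
    nbrs = []
    for i, (xi, yi) in enumerate(centroids):
        dists = sorted(((math.hypot(xi - xj, yi - yj), j)
                        for j, (xj, yj) in enumerate(centroids) if j != i))
        nbrs.append([j for _, j in dists[:k]])
    return nbrs

def attention_update(h, nbrs):
    """One attention-weighted aggregation step: each node combines its
    neighbors' hidden vectors with softmax weights derived from
    dot-product scores, then mixes the result with its own state.
    A hypothetical stand-in for SAGNN's learned attention."""
    new_h = []
    for i, neigh in enumerate(nbrs):
        scores = [sum(a * b for a, b in zip(h[i], h[j])) for j in neigh]
        m = max(scores)                       # stabilize the softmax
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        agg = [sum(w * h[j][d] for w, j in zip(weights, neigh))
               for d in range(len(h[i]))]
        # Fixed 50/50 mix of own state and aggregated neighbor state.
        new_h.append([0.5 * (s + a) for s, a in zip(h[i], agg)])
    return new_h

# Four superpixel centroids; node 3 is spatially isolated.
centroids = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
graph = knn_graph(centroids, k=2)
hidden = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
hidden = attention_update(hidden, graph)
```

Repeating `attention_update` corresponds to the recursive message passing in the abstract: node states converge toward features that blend local CNN appearance with neighborhood context, while the superpixel graph keeps the node count far below the pixel count.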