2 research outputs found

    Deep Learning Guided Building Reconstruction from Satellite Imagery-derived Point Clouds

    Full text link
    3D urban reconstruction of buildings from remotely sensed imagery has drawn significant attention during the past two decades. While aerial imagery and LiDAR provide higher resolution, satellite imagery is cheaper and more efficient to acquire for large scale need. However, the high, orbital altitude of satellite observation brings intrinsic challenges, like unpredictable atmospheric effect, multi view angles, significant radiometric differences due to the necessary multiple views, diverse land covers and urban structures in a scene, small base-height ratio or narrow field of view, all of which may degrade 3D reconstruction quality. To address these major challenges, we present a reliable and effective approach for building model reconstruction from the point clouds generated from multi-view satellite images. We utilize multiple types of primitive shapes to fit the input point cloud. Specifically, a deep-learning approach is adopted to distinguish the shape of building roofs in complex and yet noisy scenes. For points that belong to the same roof shape, a multi-cue, hierarchical RANSAC approach is proposed for efficient and reliable segmenting and reconstructing the building point cloud. Experimental results over four selected urban areas (0.34 to 2.04 sq km in size) demonstrate the proposed method can generate detailed roof structures under noisy data environments. The average successful rate for building shape recognition is 83.0%, while the overall completeness and correctness are over 70% with reference to ground truth created from airborne lidar. As the first effort to address the public need of large scale city model generation, the development is deployed as open source software

    Towards Single Stage Weakly Supervised Semantic Segmentation

    Full text link
    The costly process of obtaining semantic segmentation labels has driven research towards weakly supervised semantic segmentation (WSSS) methods, using only image-level, point, or box labels. The lack of dense scene representation requires methods to increase complexity to obtain additional semantic information about the scene, often done through multiple stages of training and refinement. Current state-of-the-art (SOTA) models leverage image-level labels to produce class activation maps (CAMs) which go through multiple stages of refinement before they are thresholded to make pseudo-masks for supervision. The multi-stage approach is computationally expensive, and dependency on image-level labels for CAMs generation lacks generalizability to more complex scenes. In contrary, our method offers a single-stage approach generalizable to arbitrary dataset, that is trainable from scratch, without any dependency on pre-trained backbones, classification, or separate refinement tasks. We utilize point annotations to generate reliable, on-the-fly pseudo-masks through refined and filtered features. While our method requires point annotations that are only slightly more expensive than image-level annotations, we are to demonstrate SOTA performance on benchmark datasets (PascalVOC 2012), as well as significantly outperform other SOTA WSSS methods on recent real-world datasets (CRAID, CityPersons, IAD)
    corecore