
    Multi-Path Region Mining For Weakly Supervised 3D Semantic Segmentation on Point Clouds

    Point clouds provide intrinsic geometric information and surface context for scene understanding. Existing methods for point cloud segmentation require a large amount of fully labeled data. With advanced depth sensors, collecting large-scale 3D datasets is no longer a cumbersome process. However, manually producing point-level labels on large-scale datasets is time- and labor-intensive. In this paper, we propose a weakly supervised approach to predict point-level results using weak labels on 3D point clouds. We introduce our multi-path region mining module to generate pseudo point-level labels from a classification network trained with weak labels. It mines the localization cues for each class from various aspects of the network features using different attention modules. Then, we use the point-level pseudo labels to train a point cloud segmentation network in a fully supervised manner. To the best of our knowledge, this is the first method that uses cloud-level weak labels on raw 3D space to train a point cloud semantic segmentation network. In our setting, the 3D weak labels only indicate the classes that appear in our input sample. We discuss both scene- and subcloud-level weak labels on raw 3D point cloud data and perform in-depth experiments on them. On the ScanNet dataset, our result trained with subcloud-level labels is comparable to some fully supervised methods. Comment: Accepted by CVPR202
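
    The pseudo-labeling idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' multi-path region mining module: it assumes per-point class activation scores are already available from some classification network, and it simply takes the highest-scoring class among those that the cloud-level weak label marks as present. The function name `pseudo_labels` and the parameters `ignore_index` and `min_score` are hypothetical, introduced only for illustration.

```python
import numpy as np

def pseudo_labels(point_scores, cloud_labels, ignore_index=-1, min_score=0.0):
    """Assign a pseudo label to every point of one cloud (illustrative only).

    point_scores: (N, C) per-point class activation scores (assumed given).
    cloud_labels: (C,) binary vector, 1 if the class appears in the cloud.
    """
    scores = np.asarray(point_scores, dtype=float)
    present = np.asarray(cloud_labels, dtype=bool)

    # Classes absent from the cloud-level label can never be assigned.
    masked = np.where(present[None, :], scores, -np.inf)

    labels = masked.argmax(axis=1)
    best = masked.max(axis=1)

    # Points with no sufficiently strong activation are left unlabeled.
    labels[best <= min_score] = ignore_index
    return labels

# Three points, three classes; the weak label says only classes 0 and 1 occur.
scores = [[0.9, 0.1, 0.8],
          [0.2, 0.7, 0.95],
          [0.0, 0.0, 0.0]]
print(pseudo_labels(scores, [1, 1, 0]))
```

    Points marked with `ignore_index` would be excluded from the loss when the resulting pseudo labels are used to train a segmentation network in a fully supervised manner.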

    Imbalance Knowledge-Driven Multi-modal Network for Land-Cover Semantic Segmentation Using Images and LiDAR Point Clouds

    Despite the good results that have been achieved in unimodal segmentation, the inherent limitations of individual data sources make further breakthroughs in performance difficult. For that reason, multi-modal learning is increasingly being explored within the field of remote sensing. Present multi-modal methods usually map high-dimensional features to low-dimensional spaces as a preprocessing step before feature extraction to address the non-negligible domain gap, which inevitably leads to information loss. To address this issue, in this paper we present our novel Imbalance Knowledge-Driven Multi-modal Network (IKD-Net) to extract features from raw multi-modal heterogeneous data directly. IKD-Net is capable of mining imbalance information across modalities while utilizing a strong modality to drive the feature map refinement of the weaker ones from the global and categorical perspectives by way of two sophisticated plug-and-play modules: the Global Knowledge-Guided (GKG) and Class Knowledge-Guided (CKG) gated modules. The whole network is then optimized using a holistic loss function. While we were developing IKD-Net, we also established a new dataset called the National Agriculture Imagery Program and 3D Elevation Program Combined dataset in California (N3C-California), which provides a particular benchmark for multi-modal joint segmentation tasks. In our experiments, IKD-Net outperformed the benchmarks and state-of-the-art methods on both the N3C-California and the small-scale ISPRS Vaihingen datasets. IKD-Net was ranked first on the real-time leaderboard for the GRSS DFC 2018 challenge evaluation as of this paper's submission.

    Review of Automatic Processing of Topography and Surface Feature Identification LiDAR Data Using Machine Learning Techniques

    Machine Learning (ML) applications on Light Detection And Ranging (LiDAR) data have provided promising results, and thus this topic has been widely addressed in the literature during the last few years. This paper reviews the essential and the more recently completed studies in the topography and surface feature identification domain. Four areas, with respect to the suggested approaches, have been analyzed and discussed: the input data, the concepts of point cloud structure for applying ML, the ML techniques used, and the applications of ML on LiDAR data. Then, an overview is provided to underline the advantages and the disadvantages of this research axis. Despite the training data labelling problem, the calculation cost, and the undesirable shortcutting due to data downsampling, most of the proposed methods use supervised ML concepts to classify the downsampled LiDAR data. Furthermore, despite the occasional highly accurate results, in most cases the results still require filtering. In fact, a considerable number of adopted approaches use the same data structure concepts employed in image processing to profit from available informatics tools. Given that LiDAR point clouds represent rich 3D data, more effort is needed to develop specialized processing tools.