3 research outputs found

    Selecting Optimal Combination of Data Channels for Semantic Segmentation in City Information Modelling (CIM)

    Get PDF
    Over the last decade, 3D reconstruction techniques have been developed to capture up-to-date as-is information about various objects and to build city information models. Meanwhile, deep learning-based approaches are employed to add semantic information to these models. Studies have shown that model accuracy can be improved by combining multiple data channels (e.g., XYZ, Intensity, Depth (D), and RGB). Nevertheless, redundant data channels in large-scale datasets may incur high computational cost and long processing times. Few researchers have addressed the question of which combination of channels is optimal in terms of overall accuracy (OA) and mean intersection over union (mIoU). Therefore, a framework is proposed to explore an efficient data fusion approach for semantic segmentation by selecting an optimal combination of data channels. In the framework, a total of 13 channel combinations are investigated for data pre-processing, and an encoder-to-decoder structure is utilised for the network permutations. A case study is carried out to investigate the efficiency of the proposed approach by adopting a city-level benchmark dataset and applying nine networks. It is found that the combination of IRGB channels provides the best OA performance, while the IRGBD channels provide the best mIoU performance.
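
    The sketch below is not from the paper itself; it is a minimal illustration of the two ideas the abstract relies on: stacking selected data channels (e.g., IRGB or IRGBD) into a network input, and scoring a segmentation with OA and mIoU. The array names ("rgb", "intensity", "depth") and both helper functions are hypothetical.

```python
import numpy as np

def stack_channels(rgb, intensity=None, depth=None):
    """Stack selected data channels into an (H, W, C) input, e.g., IRGB or IRGBD."""
    channels = []
    if intensity is not None:
        channels.append(intensity[..., None])   # I channel
    channels.append(rgb)                        # RGB channels
    if depth is not None:
        channels.append(depth[..., None])       # D channel
    return np.concatenate(channels, axis=-1)

def oa_and_miou(pred, gt, num_classes):
    """Compute overall accuracy and mean IoU from predicted and ground-truth label maps."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for c_gt in range(num_classes):
        for c_pred in range(num_classes):
            conf[c_gt, c_pred] = np.sum((gt == c_gt) & (pred == c_pred))
    oa = np.trace(conf) / conf.sum()
    tp = np.diag(conf).astype(float)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    iou = np.where(union > 0, tp / np.maximum(union, 1), np.nan)  # skip absent classes
    return oa, np.nanmean(iou)
```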

    Image-based Semantic Segmentation of Large-scale Terrestrial Laser Scanning Point Clouds

    Get PDF
    Large-scale point cloud data acquired using terrestrial laser scanning (TLS) often need to be semantically segmented to support many applications. To this end, various three-dimensional (3D) methods and two-dimensional (i.e., image-based) methods have been developed. For large-scale point cloud data, 3D methods often require extensive computational effort. In contrast, image-based methods are favourable from the perspective of computational efficiency. However, the semantic segmentation accuracy achieved by existing image-based methods is significantly lower than that achieved by 3D methods. On this basis, the aim of this PhD thesis is to improve the accuracy of image-based semantic segmentation methods for TLS point cloud data while maintaining their relatively high efficiency. In this thesis, the optimal combination of commonly used features was first identified, and an efficient manual feature selection method was proposed. It was found that existing image-based methods are highly dependent on colour information and do not provide an effective means of representing and utilising the geometric features of scenes in images. To address this problem, an image enhancement method was developed to reveal the local geometric features in images derived from the projection of point cloud coordinates. Subsequently, to better utilise neural network models that are pre-trained on three-channel (i.e., RGB) image datasets, a feature extraction method (LC-Net) and a feature selection method (OSTA) were developed to reduce the dimensionality of image-based features to three. Finally, a stacking-based semantic segmentation (SBSS) framework was developed to further improve segmentation accuracy. By integrating SBSS, the dimension-reduction method (i.e., OSTA) and locally enhanced geometric features, a mean Intersection over Union (mIoU) of 76.6% and an Overall Accuracy (OA) of 93.8% were achieved on the Semantic3D (Reduced-8) benchmark. This sets a new state of the art (SOTA) for the semantic segmentation accuracy of image-based methods and is very close to the SOTA accuracy of 3D methods (i.e., 77.8% mIoU and 94.3% OA). Meanwhile, the integrated method took less than 10% of the processing time (52.64 s versus 563.6 s) of the fastest SOTA 3D method.
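
    As a rough illustration of the projection step that image-based methods depend on, the sketch below maps scanner-centred TLS points to an RGB panorama and a range (depth) image. The spherical projection, image resolution, and function name are assumptions made for illustration, not the exact procedure used in the thesis.

```python
import numpy as np

def spherical_project(points, colors, height=1024, width=2048):
    """Project N x 3 scanner-centred points (with N x 3 uint8 colours) to an RGB panorama and a range image."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)                              # range from the scanner
    azimuth = np.arctan2(y, x)                                      # [-pi, pi]
    elevation = np.arcsin(np.clip(z / np.maximum(r, 1e-9), -1.0, 1.0))

    # Map angles to pixel coordinates.
    u = ((azimuth + np.pi) / (2 * np.pi) * (width - 1)).astype(int)
    v = ((np.pi / 2 - elevation) / np.pi * (height - 1)).astype(int)

    rgb_image = np.zeros((height, width, 3), dtype=np.uint8)
    range_image = np.full((height, width), np.inf)

    # Draw far points first so nearer surfaces overwrite (occlude) them.
    order = np.argsort(-r)
    rgb_image[v[order], u[order]] = colors[order]
    range_image[v[order], u[order]] = r[order]
    return rgb_image, range_image
```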

    Demosaicing for RGBZ sensor

    No full text