
    SPU-Net: Self-Supervised Point Cloud Upsampling by Coarse-to-Fine Reconstruction with Self-Projection Optimization

    The task of point cloud upsampling aims to acquire dense and uniform point sets from sparse and irregular point sets. Although significant progress has been made with deep learning models, they require ground-truth dense point sets as supervision, so they can only be trained on synthetic paired data and are not suitable for training on real-scanned sparse data. Moreover, it is expensive and tedious to obtain large-scale paired sparse-dense point sets from real scans. To address this problem, we propose a self-supervised point cloud upsampling network, named SPU-Net, to capture the inherent upsampling patterns of points lying on the underlying object surface. Specifically, we propose a coarse-to-fine reconstruction framework with two main components: point feature extraction and point feature expansion. In point feature extraction, we integrate a self-attention module with a graph convolutional network (GCN) to simultaneously capture context information inside and among local regions. In point feature expansion, we introduce a hierarchically learnable folding strategy to generate the upsampled point sets with learnable 2D grids. Moreover, to further optimize the noisy points in the generated point sets, we propose a novel self-projection optimization, combined with uniform and reconstruction terms into a joint loss, to facilitate self-supervised point cloud upsampling. We conduct experiments on both synthetic and real-scanned datasets, and the results demonstrate that we achieve performance comparable to state-of-the-art supervised methods.
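    To make the self-projection idea above concrete, the sketch below projects each generated point onto a local plane fitted to its nearest neighbours in the sparse input cloud. The function name `self_project`, the use of NumPy, and the plane-fitting choice are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def self_project(upsampled, sparse_input, k=8):
    """Project each upsampled point onto a local plane fitted to its
    k nearest neighbours in the sparse input cloud (illustrative sketch)."""
    projected = np.empty_like(upsampled)
    for i, p in enumerate(upsampled):
        # k nearest neighbours of p in the sparse input
        d = np.linalg.norm(sparse_input - p, axis=1)
        nbrs = sparse_input[np.argsort(d)[:k]]
        centroid = nbrs.mean(axis=0)
        # plane normal = eigenvector of the smallest covariance eigenvalue
        cov = np.cov((nbrs - centroid).T)
        normal = np.linalg.eigh(cov)[1][:, 0]
        # remove the off-surface component of (p - centroid)
        projected[i] = p - np.dot(p - centroid, normal) * normal
    return projected

# toy usage: duplicate a sparse cloud, perturb it, then project it back
sparse = np.random.rand(256, 3)
dense = np.repeat(sparse, 4, axis=0) + 0.01 * np.random.randn(1024, 3)
print(self_project(dense, sparse, k=8).shape)  # (1024, 3)
```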

    Part2Word: Learning Joint Embedding of Point Clouds and Text by Matching Parts to Words

    It is important to learn a joint embedding of 3D shapes and text for shape understanding tasks such as shape-text matching, retrieval, and shape captioning. Current multi-view based methods learn a mapping from multiple rendered views to text. However, these methods cannot analyze 3D shapes well due to self-occlusion and the limitations of learning manifolds. To resolve this issue, we propose a method to learn a joint embedding of point clouds and text by matching parts from shapes to words from sentences in a common space. Specifically, we first learn a segmentation prior to segment point clouds into parts. Then, we map parts and words into an optimized space where they can be matched with each other. In this space, we represent a part by aggregating the features of all points within the part and represent each word with its context information, and we train our network to minimize a triplet ranking loss. Moreover, we introduce cross-modal attention to capture part-word relationships during matching, which further enhances joint embedding learning. Our experimental results outperform the state of the art in multi-modal retrieval on the widely used benchmark.
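    The bidirectional triplet ranking loss mentioned above can be sketched roughly as follows. The helper `triplet_ranking_loss`, the cosine-similarity formulation, and the margin value are assumptions made for illustration rather than the paper's exact loss.

```python
import numpy as np

def triplet_ranking_loss(part_emb, word_emb, margin=0.2):
    """Bidirectional triplet ranking loss over part/word embeddings.
    Row i of each matrix is assumed to be a matching part-word pair;
    every other row serves as a negative (illustrative sketch)."""
    # cosine similarity matrix: sim[i, j] = <part_i, word_j>
    p = part_emb / np.linalg.norm(part_emb, axis=1, keepdims=True)
    w = word_emb / np.linalg.norm(word_emb, axis=1, keepdims=True)
    sim = p @ w.T
    pos = np.diag(sim)                              # matched-pair similarities
    # part -> word direction: every non-matching word is a negative
    cost_pw = np.maximum(0, margin - pos[:, None] + sim)
    # word -> part direction: every non-matching part is a negative
    cost_wp = np.maximum(0, margin - pos[None, :] + sim)
    np.fill_diagonal(cost_pw, 0)
    np.fill_diagonal(cost_wp, 0)
    return cost_pw.mean() + cost_wp.mean()

parts = np.random.randn(16, 128)
words = np.random.randn(16, 128)
print(triplet_ranking_loss(parts, words))
```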

    Unsupervised Point Cloud Pre-Training via Occlusion Completion

    We describe a simple pre-training approach for point clouds. It works in three steps: 1. Mask all points occluded in a camera view; 2. Learn an encoder-decoder model to reconstruct the occluded points; 3. Use the encoder weights as initialisation for downstream point cloud tasks. We find that even when we construct a single pre-training dataset (from ModelNet40), this pre-training method improves accuracy across different datasets and encoders, on a wide range of downstream tasks. Specifically, we show that our method outperforms previous pre-training methods in object classification, and in both part-based and semantic segmentation tasks. We study the pre-trained features and find that they lead to wide downstream minima, have high transformation invariance, and have activations that are highly correlated with part labels. Code and data are available at: https://github.com/hansen7/OcCo. Comment: synced with the ICCV camera-ready version.
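    A minimal sketch of the occlusion-masking step (step 1) is given below, using a coarse depth buffer along a fixed viewing direction. The function `occlusion_mask`, the grid resolution, and the fixed +z view are assumptions for illustration, not the released OcCo code.

```python
import numpy as np

def occlusion_mask(points, grid=32):
    """Return a boolean mask marking points hidden behind others when the
    cloud is viewed along +z, using a coarse depth buffer (illustrative;
    the paper's exact masking procedure may differ)."""
    xy = points[:, :2]
    # map x, y to integer pixel indices of a grid x grid depth buffer
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    pix = ((xy - lo) / (hi - lo + 1e-9) * (grid - 1)).astype(int)
    keys = pix[:, 0] * grid + pix[:, 1]
    depth = points[:, 2]
    visible = np.zeros(len(points), dtype=bool)
    for key in np.unique(keys):
        idx = np.where(keys == key)[0]
        visible[idx[np.argmin(depth[idx])]] = True  # closest point per pixel
    return ~visible  # True = occluded, i.e. the reconstruction target

pts = np.random.rand(2048, 3)
mask = occlusion_mask(pts)
print(mask.sum(), "of", len(pts), "points are treated as occluded")
```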

    Snowflake Point Deconvolution for Point Cloud Completion and Generation with Skip-Transformer

    Most existing point cloud completion methods suffer from the discrete nature of point clouds and the unstructured prediction of points in local regions, which makes it difficult to reveal fine local geometric details. To resolve this issue, we propose SnowflakeNet with snowflake point deconvolution (SPD) to generate complete point clouds. SPD models the generation of a point cloud as the snowflake-like growth of points, where child points are generated progressively by splitting their parent points after each SPD step. Our insight into detailed geometry is to introduce a skip-transformer in the SPD to learn the point splitting patterns that best fit local regions. The skip-transformer leverages an attention mechanism to summarize the splitting patterns used in the previous SPD layer and to produce the splitting in the current layer. The locally compact and structured point clouds generated by SPD precisely reveal the structural characteristics of the 3D shape in local patches, which enables us to predict highly detailed geometries. Moreover, since SPD is a general operation not limited to completion, we explore its applications in other generative tasks, including point cloud auto-encoding, generation, single-image reconstruction, and upsampling. Our experimental results outperform state-of-the-art methods on widely used benchmarks. Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022. This work is a journal extension of our ICCV 2021 paper arXiv:2108.04444. The first two authors contributed equally.
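    The snowflake-like splitting can be pictured with the toy step below, in which every parent point spawns several children displaced by small offsets. In SnowflakeNet these offsets are predicted by the skip-transformer from local features; the random offsets and the helper `snowflake_split` here are placeholders for illustration only.

```python
import numpy as np

def snowflake_split(parents, up_ratio=2, noise=0.02, rng=None):
    """One illustrative 'snowflake' splitting step: every parent point is
    duplicated up_ratio times and each child is moved by a small offset.
    Random offsets stand in for the learned, feature-conditioned offsets."""
    rng = np.random.default_rng(rng)
    children = np.repeat(parents, up_ratio, axis=0)        # duplicate parents
    offsets = noise * rng.standard_normal(children.shape)  # placeholder offsets
    return children + offsets

# progressive coarse-to-fine growth over two splitting levels
coarse = np.random.rand(512, 3)
level1 = snowflake_split(coarse, up_ratio=2)   # 1024 points
level2 = snowflake_split(level1, up_ratio=2)   # 2048 points
print(level1.shape, level2.shape)
```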

    Deep Learning-Based Point Upsampling for Edge Enhancement of 3D-Scanned Data and Its Application to Transparent Visualization

    Large-scale 3D-scanned point clouds enable the accurate and easy recording of complex 3D objects in the real world. The acquired point clouds often describe both the surficial and internal 3D structure of the scanned objects. The recently proposed edge-highlighted transparent visualization method is effective for recognizing the whole 3D structure of such point clouds. This visualization adjusts the opacity to highlight edges of the 3D-scanned objects, enabling clear transparent viewing of the entire 3D structure. However, for 3D-scanned point clouds, the quality of any edge-highlighting visualization depends on the distribution of the extracted edge points: insufficient density, sparseness, or partial defects in the edge points can lead to unclear edge visualization. Therefore, in this paper, we propose a deep learning-based upsampling method that focuses on the edge regions of 3D-scanned point clouds to generate more edge points during the 3D-edge upsampling task. The proposed upsampling network substantially improves the density, uniformity, and connectivity of points in the edge regions. Results on synthetic and scanned edge data show that our method improves the percentage of edge points by more than 15% compared to an existing point cloud upsampling network. Our upsampling network works well for both sharp and soft edges, and it also works well in combination with a noise-eliminating filter. We demonstrate the effectiveness of our upsampling network by applying it to various real 3D-scanned point clouds, and we show that the improved edge point distribution improves the visibility of edge-highlighted transparent visualizations of complex 3D-scanned objects.
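    One generic way to quantify the share of edge points, in the same spirit as the edge-point percentage reported above, is a local-covariance surface-variation test. The helper `edge_point_ratio`, the neighbourhood size, and the threshold below are illustrative assumptions and not the paper's edge extractor.

```python
import numpy as np

def edge_point_ratio(points, k=16, thresh=0.05):
    """Estimate the fraction of 'edge' points: a point counts as an edge
    point when the smallest eigenvalue of its neighbourhood covariance is
    relatively large (high surface variation). Generic heuristic only."""
    n = len(points)
    is_edge = np.zeros(n, dtype=bool)
    for i in range(n):
        d = np.linalg.norm(points - points[i], axis=1)
        nbrs = points[np.argsort(d)[:k]]
        evals = np.linalg.eigvalsh(np.cov((nbrs - nbrs.mean(axis=0)).T))
        surface_variation = evals[0] / (evals.sum() + 1e-12)
        is_edge[i] = surface_variation > thresh
    return is_edge.mean()

pts = np.random.rand(1024, 3)
print(f"edge-point ratio: {edge_point_ratio(pts):.2%}")
```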