869 research outputs found
Snowflake Point Deconvolution for Point Cloud Completion and Generation with Skip-Transformer
Most existing point cloud completion methods suffer from the discrete nature
of point clouds and the unstructured prediction of points in local regions,
which makes it difficult to reveal fine local geometric details. To resolve
this issue, we propose SnowflakeNet with snowflake point deconvolution (SPD) to
generate complete point clouds. SPD models the generation of point clouds as
the snowflake-like growth of points, where child points are generated
progressively by splitting their parent points after each SPD. Our insight into
the detailed geometry is to introduce a skip-transformer in the SPD to learn
the point splitting patterns that can best fit the local regions. The
skip-transformer leverages an attention mechanism to summarize the splitting
patterns used in the previous SPD layer to produce the splitting in the current
layer. The locally compact and structured point clouds generated by SPD
precisely reveal the structural characteristics of the 3D shape in local
patches, which enables us to predict highly detailed geometries. Moreover,
since SPD is a general operation that is not limited to completion, we explore
its applications in other generative tasks, including point cloud
auto-encoding, generation, single image reconstruction, and upsampling. Our
experimental results outperform state-of-the-art methods under widely used
benchmarks.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence
(TPAMI), 2022. This work is a journal extension of our ICCV 2021 paper
arXiv:2108.04444. The first two authors contributed equally.
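
The snowflake-like growth described in this abstract, where each parent point splits into child points, can be illustrated with a minimal sketch. It assumes each child is the parent position plus a displacement; the function name `split_points` and the hand-picked offsets are hypothetical, and in SnowflakeNet the splitting is learned by the network with a skip-transformer, which is not modeled here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def split_points(parents: Sequence[Point],
                 offsets: Sequence[Sequence[Point]]) -> List[Point]:
    """Replace every parent by one child per offset: child = parent + offset."""
    children: List[Point] = []
    for parent, parent_offsets in zip(parents, offsets):
        for off in parent_offsets:
            children.append((parent[0] + off[0],
                             parent[1] + off[1],
                             parent[2] + off[2]))
    return children


# Two splitting steps with hand-picked offsets (assumption: a trained
# network would predict these displacements instead).
seed = [(0.0, 0.0, 0.0)]
level1 = split_points(seed, [[(-0.5, 0.0, 0.0), (0.5, 0.0, 0.0)]])
level2 = split_points(
    level1, [[(0.0, -0.25, 0.0), (0.0, 0.25, 0.0)]] * len(level1)
)
```

Each step doubles the point count here (1, 2, 4), and the children stay close to their parents, which is what keeps the generated points locally structured.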
FBNet: Feedback Network for Point Cloud Completion
The rapid development of point cloud learning has driven point cloud
completion into a new era. However, the information flows of most existing
completion methods are solely feedforward, and high-level information is rarely
reused to improve low-level feature learning. To this end, we propose a novel
Feedback Network (FBNet) for point cloud completion, in which present features
are efficiently refined by rerouting subsequent fine-grained ones. Firstly,
partial inputs are fed to a Hierarchical Graph-based Network (HGNet) to
generate coarse shapes. Then, we cascade several Feedback-Aware Completion
(FBAC) Blocks and unfold them across time recurrently. Feedback connections
between two adjacent time steps exploit fine-grained features to improve
present shape generations. The main challenge of building feedback connections
is the dimension mismatching between present and subsequent features. To
address this, the elaborately designed point Cross Transformer exploits
efficient information from feedback features via a cross-attention strategy and
then refines present features with the enhanced feedback features. Quantitative
and qualitative experiments on several datasets demonstrate the superiority of
the proposed FBNet over state-of-the-art methods on the point cloud completion task.
Comment: The first two authors contributed equally to this work. The source
code and model are available at
https://github.com/hikvision-research/3DVision/. Accepted to ECCV 2022 as
oral presentation.
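
To illustrate why cross attention can bridge a mismatch between present and feedback features, here is a minimal sketch of plain scaled dot-product cross attention. It is not the paper's point Cross Transformer: the function names are hypothetical and the learned projections are omitted. The point it shows is that the output always has one row per query, however many key/value rows the feedback side provides.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries: Sequence[Sequence[float]],
                    keys: Sequence[Sequence[float]],
                    values: Sequence[Sequence[float]]) -> List[List[float]]:
    """For each query row, return a softmax-weighted average of the value rows."""
    dim = len(queries[0])
    out_dim = len(values[0])
    output = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        output.append([sum(w * v[j] for w, v in zip(weights, values))
                       for j in range(out_dim)])
    return output


# 3 "present" points query 5 "feedback" points (toy numbers, not real features).
present = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
feedback_keys = [[0.1 * i, 1.0 - 0.1 * i] for i in range(5)]
feedback_values = [[1.0, 2.0, 3.0, 4.0] for _ in range(5)]
refined = cross_attention(present, feedback_keys, feedback_values)
```

Because all value rows are identical in this toy input, every output row equals that row, which makes the weighted-average behavior easy to check.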
SPU-Net: Self-Supervised Point Cloud Upsampling by Coarse-to-Fine Reconstruction with Self-Projection Optimization
The task of point cloud upsampling aims to acquire dense and uniform point
sets from sparse and irregular point sets. Although significant progress has
been made with deep learning models, they require ground-truth dense point sets
as supervision, so they can only be trained on synthetic paired
training data and are not suitable for training on real-scanned sparse data.
However, it is expensive and tedious to obtain large scale paired sparse-dense
point sets for training from real scanned sparse data. To address this problem,
we propose a self-supervised point cloud upsampling network, named SPU-Net, to
capture the inherent upsampling patterns of points lying on the underlying
object surface. Specifically, we propose a coarse-to-fine reconstruction
framework, which contains two main components: point feature extraction and
point feature expansion. In the point feature extraction, we
integrate a self-attention module with a graph convolution network (GCN) to
simultaneously capture context information inside and among local regions. In
the point feature expansion, we introduce a hierarchically learnable folding
strategy to generate the upsampled point sets with learnable 2D grids.
Moreover, to further optimize the noisy points in the generated point sets, we
propose a novel self-projection optimization associated with uniform and
reconstruction terms, as a joint loss, to facilitate the self-supervised point
cloud upsampling. We conduct various experiments on both synthetic and
real-scanned datasets, and the results demonstrate that we achieve comparable
performance to the state-of-the-art supervised methods.
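
The folding idea mentioned in this abstract, expanding point features with 2D grids, can be sketched as follows. This is a simplified illustration under stated assumptions: the grid is fixed rather than learnable, the names `make_grid` and `expand_features` are hypothetical, and the network that would map each expanded row to a 3D point is left out.

```python
from typing import List, Sequence, Tuple


def make_grid(side: int) -> List[Tuple[float, float]]:
    """Return side*side 2D codes evenly spaced in [-1, 1] x [-1, 1]."""
    if side == 1:
        return [(0.0, 0.0)]
    step = 2.0 / (side - 1)
    coords = [-1.0 + i * step for i in range(side)]
    return [(u, v) for u in coords for v in coords]


def expand_features(features: Sequence[Sequence[float]],
                    grid: Sequence[Tuple[float, float]]) -> List[List[float]]:
    """Copy each feature once per grid code and append that code to the copy."""
    return [list(f) + [u, v] for f in features for (u, v) in grid]


# 3 sparse points with 4-dimensional toy features, upsampled by a factor of 4.
sparse_features = [[0.1, 0.2, 0.3, 0.4],
                   [0.5, 0.6, 0.7, 0.8],
                   [0.9, 1.0, 1.1, 1.2]]
grid = make_grid(2)
dense_features = expand_features(sparse_features, grid)
```

The distinct grid codes are what let identical copies of a feature decode to different positions on the surface.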
MFM-Net: Unpaired Shape Completion Network with Multi-stage Feature Matching
Unpaired 3D object completion aims to predict a complete 3D shape from an
incomplete input without knowing the correspondence between the complete and
incomplete shapes during training. To build the correspondence between two data
modalities, previous methods usually apply adversarial training to match the
global shape features extracted by the encoder. However, this ignores the
correspondence between multi-scaled geometric information embedded in the
pyramidal hierarchy of the decoder, which makes previous methods struggle to
generate high-quality complete shapes. To address this problem, we propose a
novel unpaired shape completion network, named MFM-Net, using multi-stage
feature matching, which decomposes the learning of geometric correspondence
into multiple stages throughout the hierarchical generation process in the point
cloud decoder. Specifically, MFM-Net adopts a dual path architecture to
establish multiple feature matching channels in different layers of the
decoder, which is then combined with adversarial learning to merge the
distribution of features from complete and incomplete modalities. In addition,
a refinement is applied to enhance the details. As a result, MFM-Net makes use
of a more comprehensive understanding to establish the geometric correspondence
between complete and incomplete shapes in a local-to-global perspective, which
enables more detailed geometric inference for generating high-quality complete
shapes. We conduct comprehensive experiments on several datasets, and the
results show that our method outperforms previous methods of unpaired point
cloud completion by a large margin.
L2G Auto-encoder: Understanding Point Clouds by Local-to-Global Reconstruction with Hierarchical Self-Attention
The auto-encoder is an important architecture for understanding point clouds
through an encoding and decoding procedure of self-reconstruction. Current
auto-encoders mainly focus on learning global structure by global shape
reconstruction, while ignoring the learning of local structures. To resolve
this issue, we propose Local-to-Global auto-encoder (L2G-AE) to simultaneously
learn the local and global structure of point clouds by local to global
reconstruction. Specifically, L2G-AE employs an encoder to encode the geometry
information of multiple scales in a local region at the same time. In addition,
we introduce a novel hierarchical self-attention mechanism to highlight the
important points, scales and regions at different levels in the information
aggregation of the encoder. Simultaneously, L2G-AE employs a recurrent neural
network (RNN) as the decoder to reconstruct a sequence of scales in a local region,
based on which the global point cloud is incrementally reconstructed. Our
superior results in shape classification, retrieval and upsampling show
that L2G-AE can understand point clouds better than state-of-the-art methods.
CasFusionNet: A Cascaded Network for Point Cloud Semantic Scene Completion by Dense Feature Fusion
Semantic scene completion (SSC) aims to complete a partial 3D scene and
predict its semantics simultaneously. Most existing works adopt voxel
representations and thus suffer from growing memory and computation costs
as the voxel resolution increases. Though a few works attempt to solve SSC from
the perspective of 3D point clouds, they have not fully exploited the
correlation and complementarity between the two tasks of scene completion and
semantic segmentation. In our work, we present CasFusionNet, a novel cascaded
network for point cloud semantic scene completion by dense feature fusion.
Specifically, we design (i) a global completion module (GCM) to produce an
upsampled and completed but coarse point set, (ii) a semantic segmentation
module (SSM) to predict the per-point semantic labels of the completed points
generated by GCM, and (iii) a local refinement module (LRM) to further refine
the coarse completed points and the associated labels from a local perspective.
We organize the above three modules via dense feature fusion in each level, and
cascade a total of four levels, where we also employ feature fusion between
each level for sufficient information usage. Both quantitative and qualitative
results on our two compiled point-based datasets validate the effectiveness and
superiority of our CasFusionNet compared to state-of-the-art methods in terms
of both scene completion and semantic segmentation. The code and datasets are
available at: https://github.com/JinfengX/CasFusionNet
Shape Completion with Points in the Shadow
Single-view point cloud completion aims to recover the full geometry of an
object based on only limited observation, which is extremely hard due to the
data sparsity and occlusion. The core challenge is to generate plausible
geometries to fill the unobserved part of the object based on a partial scan,
which is under-constrained and suffers from a huge solution space. Inspired by
the classic shadow volume technique in computer graphics, we propose a new
method to reduce the solution space effectively. Our method considers the
camera a light source that casts rays toward the object. Such light rays build
a reasonably constrained but sufficiently expressive basis for completion. The
completion process is then formulated as a point displacement optimization
problem. Points are initialized at the partial scan and then moved to their
goal locations with two types of movements for each point: directional
movements along the light rays and constrained local movement for shape
refinement. We design neural networks to predict the ideal point movements to
get the completion results. We demonstrate that our method is accurate, robust,
and generalizable through exhaustive evaluation and comparison. Moreover, it
outperforms state-of-the-art methods qualitatively and quantitatively on the MVP
datasets.
Comment: SIGGRAPH Asia 2022 Conference Paper
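
The directional movement along light rays described in this abstract can be illustrated with a small geometric sketch. It assumes the camera is a point light source and that a point is displaced along the ray from the camera through its current position; the function name `move_along_ray` is hypothetical, and in the paper the movement amounts are predicted by neural networks rather than supplied by hand.

```python
import math
from typing import Sequence, Tuple


def move_along_ray(camera: Sequence[float],
                   point: Sequence[float],
                   distance: float) -> Tuple[float, ...]:
    """Move `point` by `distance` along the ray from `camera` through `point`."""
    direction = [p - c for p, c in zip(point, camera)]
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("point coincides with the camera; ray is undefined")
    return tuple(p + distance * d / norm for p, d in zip(point, direction))


# A point observed 2 units in front of the camera is pushed 1.5 units
# further into the occluded ("shadow") region behind it.
camera_position = (0.0, 0.0, 0.0)
observed_point = (0.0, 0.0, 2.0)
moved_point = move_along_ray(camera_position, observed_point, 1.5)
```

Restricting the main movement to one scalar per point along its ray is what shrinks the solution space relative to free 3D displacement.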
SVDFormer: Complementing Point Cloud via Self-view Augmentation and Self-structure Dual-generator
In this paper, we propose a novel network, SVDFormer, to tackle two specific
challenges in point cloud completion: understanding faithful global shapes from
incomplete point clouds and generating high-accuracy local structures. Current
methods either perceive shape patterns using only 3D coordinates or import
extra images with well-calibrated intrinsic parameters to guide the geometry
estimation of the missing parts. However, these approaches do not always fully
leverage the cross-modal self-structures available for accurate and
high-quality point cloud completion. To this end, we first design a Self-view
Fusion Network that leverages multiple-view depth image information to observe
incomplete self-shape and generate a compact global shape. To reveal highly
detailed structures, we then introduce a refinement module, called
Self-structure Dual-generator, in which we incorporate learned shape priors and
geometric self-similarities for producing new points. By perceiving the
incompleteness of each point, the dual-path design disentangles refinement
strategies conditioned on the structural type of each point. SVDFormer absorbs
the wisdom of self-structures, avoiding any additional paired information such
as color images with precisely calibrated camera intrinsic parameters.
Comprehensive experiments indicate that our method achieves state-of-the-art
performance on widely-used benchmarks. Code will be available at
https://github.com/czvvd/SVDFormer.
Comment: Accepted by ICCV 202