DeepICP: An End-to-End Deep Neural Network for 3D Point Cloud Registration
We present DeepICP - a novel end-to-end learning-based 3D point cloud
registration framework that achieves comparable registration accuracy to prior
state-of-the-art geometric methods. Unlike other keypoint-based methods,
which usually need a RANSAC procedure, we use various deep neural network
structures to establish an end-to-end trainable network. Our keypoint detector
is trained through this end-to-end structure, which enables the system to avoid
interference from dynamic objects, leverage sufficiently salient features on
stationary objects, and, as a result, achieve high robustness. Our key
contribution is that, rather than searching for corresponding points among
existing points, we generate them based on learned matching probabilities among
a group of candidates, which can boost the registration accuracy. Our loss
function incorporates both local similarity and global geometric constraints so
that all of the above network designs converge in the right direction. We
comprehensively validate the
effectiveness of our approach using both the KITTI dataset and the
Apollo-SouthBay dataset. Results demonstrate that our method achieves
comparable or better performance than the state-of-the-art geometry-based
methods. Detailed ablation and visualization analyses are included to further
illustrate the behavior and insights of our network. The low registration error
and high robustness of our method make it attractive for substantial
applications relying on the point cloud registration task.
Comment: 10 pages, 6 figures, 3 tables, typos corrected, experimental results
updated, accepted by ICCV 201
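The idea of generating a corresponding point from matching probabilities, rather than picking an existing point, can be illustrated with a minimal sketch. It assumes the similarity scores are already given (in the paper they come from the learned network) and uses a plain softmax to turn them into probabilities. The function name `soft_correspondence` and the toy inputs are hypothetical and not taken from the DeepICP code.

```python
import math

def soft_correspondence(candidates, scores):
    """Return the probability-weighted average of candidate points.

    candidates: list of (x, y, z) tuples (hypothetical candidate positions)
    scores: one similarity score per candidate (assumed given; higher = better)
    """
    # Softmax turns raw scores into matching probabilities that sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The generated point is the expectation of the candidates under probs,
    # so it need not coincide with any existing point.
    point = tuple(
        sum(p * c[axis] for p, c in zip(probs, candidates))
        for axis in range(3)
    )
    return point, probs

candidates = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
uniform_point, uniform_probs = soft_correspondence(candidates, [0.0, 0.0, 0.0])
peaked_point, _ = soft_correspondence(candidates, [0.0, 50.0, 0.0])
```

With equal scores the generated point is the centroid of the candidates; with one dominant score it collapses onto that candidate.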
Precision Enhancement of 3D Surfaces from Multiple Compressed Depth Maps
In texture-plus-depth representation of a 3D scene, depth maps from different
camera viewpoints are typically lossily compressed via the classical transform
coding / coefficient quantization paradigm. In this paper we propose to reduce
distortion of the decoded depth maps due to quantization. The key observation
is that depth maps from different viewpoints constitute multiple descriptions
(MD) of the same 3D scene. Considering the MD jointly, we perform a POCS-like
iterative procedure to project a reconstructed signal from one depth map to the
other and back, so that the converged depth maps have higher precision than the
original quantized versions.
Comment: This work was accepted as ongoing work paper in IEEE MMSP'201
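A minimal sketch of the alternating-projection idea follows, under strong simplifying assumptions: a single scalar depth value, two descriptions that are already aligned (the paper projects between camera viewpoints, which is omitted here), and uniform quantizers that differ only by a hypothetical offset. The helper names `bin_interval`, `clip` and `pocs_refine` are illustrative and do not come from the paper.

```python
import math

def bin_interval(value, step, offset=0.0):
    """Return the (low, high) quantization bin that contains value."""
    index = math.floor((value - offset) / step)
    low = offset + index * step
    return low, low + step

def clip(x, interval):
    """Project x onto a closed interval (a convex set)."""
    low, high = interval
    return min(max(x, low), high)

def pocs_refine(interval_a, interval_b, iterations=10):
    """Alternate projections onto the two bins, starting from bin A's midpoint."""
    x = 0.5 * (interval_a[0] + interval_a[1])
    for _ in range(iterations):
        x = clip(x, interval_b)  # consistent with description B
        x = clip(x, interval_a)  # and back to description A
    return x

true_depth = 10.9  # toy value, only used to build the bins and check the error
interval_a = bin_interval(true_depth, step=1.0)              # bin seen by view A
interval_b = bin_interval(true_depth, step=1.0, offset=0.7)  # bin seen by view B
start = 0.5 * (interval_a[0] + interval_a[1])
refined = pocs_refine(interval_a, interval_b)
```

The refined value lies in both bins, so its worst-case error is bounded by the width of their intersection rather than by the full quantization step.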
Heavy-quark potential in Gribov-Zwanziger approach around deconfinement phase transition
The interaction potential between a pair of heavy quarks is calculated with
the resummed perturbation method in the Gribov-Zwanziger approach at finite
temperature. The resummed loop correction makes the potential complex. While
the real part is, as expected, screened and becomes short-ranged in a hot
medium, the strength of the imaginary part increases with temperature and is
comparable with the real part, which is very different from the previous
calculation in the HTL approach. This means that both color screening and
Landau damping play important roles in the dissociation of heavy flavor hadrons
in a hot medium.
Comment: 6 pages, 7 figure
Spin-group symmetry in magnetic materials with negligible spin-orbit coupling
Symmetry formulated by group theory plays an essential role with respect to
the laws of nature, from fundamental particles to condensed matter systems.
Here, by combining symmetry analysis and tight-binding model calculations, we
elucidate that the crystallographic symmetries of a vast number of magnetic
materials with light elements, in which the neglect of relativistic spin-orbit
coupling (SOC) is an appropriate approximation, are considerably larger than
the conventional magnetic groups. Thus, a symmetry description that involves
partially decoupled spin and spatial rotations, dubbed the spin group, is
required. The spin group permits more symmetry operations and thus more energy
degeneracies that are disallowed by the magnetic groups. One consequence of the
spin group is the new anti-unitary symmetries that protect SOC-free Z_2
topological phases with unprecedented surface node structures. Our work not
only manifests the physical reality of materials with weak SOC, but also sheds
light on the understanding of all solids with and without SOC by a unified
group theory.
Comment: To appear in Phys. Rev. X; main text includes 34 pages, 1 table and 5
figure
Multi-Modal Face Stylization with a Generative Prior
In this work, we introduce a new approach for artistic face stylization.
Despite existing methods achieving impressive results in this task, there is
still room for improvement in generating high-quality stylized faces with
diverse styles and accurate facial reconstruction. Our proposed framework,
MMFS, supports multi-modal face stylization by leveraging the strengths of
StyleGAN and integrating it into an encoder-decoder architecture. Specifically,
we use the mid-resolution and high-resolution layers of StyleGAN as the decoder
to generate high-quality faces, while aligning its low-resolution layer with
the encoder to extract and preserve input facial details. We also introduce a
two-stage training strategy, where we train the encoder in the first stage to
align the feature maps with StyleGAN and enable a faithful reconstruction of
input faces. In the second stage, the entire network is fine-tuned with
artistic data for stylized face generation. To enable the fine-tuned model to
be applied in zero-shot and one-shot stylization tasks, we train an additional
mapping network from the large-scale Contrastive-Language-Image-Pre-training
(CLIP) space to a latent space of fine-tuned StyleGAN. Qualitative and
quantitative experiments show that our framework achieves superior face
stylization performance in both one-shot and zero-shot stylization tasks,
outperforming state-of-the-art methods by a large margin.
DVIS: Decoupled Video Instance Segmentation Framework
Video instance segmentation (VIS) is a critical task with diverse
applications, including autonomous driving and video editing. Existing methods
often underperform on complex and long videos in the real world, primarily due to
two factors. Firstly, offline methods are limited by the tightly-coupled
modeling paradigm, which treats all frames equally and disregards the
interdependencies between adjacent frames. Consequently, this leads to the
introduction of excessive noise during long-term temporal alignment. Secondly,
online methods suffer from inadequate utilization of temporal information. To
tackle these challenges, we propose a decoupling strategy for VIS by dividing
it into three independent sub-tasks: segmentation, tracking, and refinement.
The efficacy of the decoupling strategy relies on two crucial elements: 1)
attaining precise long-term alignment outcomes via frame-by-frame association
during tracking, and 2) the effective utilization of temporal information
predicated on the aforementioned accurate alignment outcomes during refinement.
We introduce a novel referring tracker and temporal refiner to construct the
Decoupled VIS framework (DVIS). DVIS achieves new
SOTA performance in both VIS and VPS, surpassing the current SOTA methods by
7.3 AP and 9.6 VPQ on the OVIS and VIPSeg datasets, which are the most
challenging and realistic benchmarks. Moreover, thanks to the decoupling
strategy, the referring tracker and temporal refiner are super light-weight
(only 1.69% of the segmenter FLOPs), allowing for efficient training and
inference on a single GPU with 11G memory. The code is available at
https://github.com/zhang-tao-whu/DVIS
Snowflake Point Deconvolution for Point Cloud Completion and Generation with Skip-Transformer
Most existing point cloud completion methods suffer from the discrete nature
of point clouds and the unstructured prediction of points in local regions,
which makes it difficult to reveal fine local geometric details. To resolve
this issue, we propose SnowflakeNet with snowflake point deconvolution (SPD) to
generate complete point clouds. SPD models the generation of point clouds as
the snowflake-like growth of points, where child points are generated
progressively by splitting their parent points after each SPD. Our insight into
the detailed geometry is to introduce a skip-transformer in the SPD to learn
the point splitting patterns that can best fit the local regions. The
skip-transformer leverages an attention mechanism to summarize the splitting
patterns used in the previous SPD layer to produce the splitting in the current
layer. The locally compact and structured point clouds generated by SPD
precisely reveal the structural characteristics of the 3D shape in local
patches, which enables us to predict highly detailed geometries. Moreover,
since SPD is a general operation that is not limited to completion, we explore
its applications in other generative tasks, including point cloud
auto-encoding, generation, single image reconstruction, and upsampling. Our
experimental results outperform state-of-the-art methods under widely used
benchmarks.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence
(TPAMI), 2022. This work is a journal extension of our ICCV 2021 paper
arXiv:2108.04444. The first two authors contributed equall
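The parent-to-child splitting described in the abstract can be pictured with a small sketch. It assumes the per-child displacements are simply supplied as input; in SnowflakeNet they are predicted by the network with the skip-transformer, which is not modelled here. The function name `split_points` and the toy coordinates are hypothetical.

```python
def split_points(parents, displacements):
    """Split every parent point into children by adding displacements.

    parents: list of (x, y, z) tuples
    displacements: for each parent, a list of (dx, dy, dz) offsets,
        one per child (assumed given; learned in the actual method)
    """
    children = []
    for parent, offsets in zip(parents, displacements):
        for offset in offsets:
            # Each child stays near its parent, shifted by one offset.
            children.append(tuple(p + d for p, d in zip(parent, offset)))
    return children

parents = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
displacements = [
    [(0.5, 0.0, 0.0), (-0.5, 0.0, 0.0)],
    [(0.0, 0.5, 0.0), (0.0, -0.5, 0.0)],
]
children = split_points(parents, displacements)
# Feeding `children` back in as parents gives the next, denser level.
```

Repeating the split on its own output doubles the point count at each level in this toy setting, which mirrors the progressive growth the abstract describes.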