Visibility Constrained Generative Model for Depth-based 3D Facial Pose Tracking
In this paper, we propose a generative framework that unifies depth-based 3D
facial pose tracking and face model adaptation on-the-fly, in the unconstrained
scenarios with heavy occlusions and arbitrary facial expression variations.
Specifically, we introduce a statistical 3D morphable model that flexibly
describes the distribution of points on the surface of the face model, with an
efficient switchable online adaptation that gradually captures the identity of
the tracked subject and rapidly constructs a suitable face model when the
subject changes. Moreover, unlike prior art that employed ICP-based facial pose
estimation, to improve robustness to occlusions, we propose a ray visibility
constraint that regularizes the pose based on the face model's visibility with
respect to the input point cloud. Ablation studies and experimental results on
Biwi and ICT-3DHP datasets demonstrate that the proposed framework is effective
and outperforms competing state-of-the-art depth-based methods.
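The ray visibility idea described above can be illustrated with a minimal sketch: a model point contributes to the pose residual only if it is not occluded along its camera ray by the observed point cloud, judged by comparing its depth against the sensor depth at the pixel it projects to. All names, the pinhole intrinsics, and the tolerance here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def visible_mask(model_pts, depth_map, fx, fy, cx, cy, tol=0.02):
    """Flag model points that are visible with respect to an observed depth map.

    A point is treated as visible when its depth along the camera ray does
    not exceed the measured depth at its projected pixel (plus a tolerance),
    or when the pixel has no measurement (depth 0). Occluded points would be
    excluded from the pose residual. Illustrative sketch only.
    """
    h, w = depth_map.shape
    z = model_pts[:, 2]
    # project with a pinhole model (assumed intrinsics fx, fy, cx, cy)
    u = np.round(fx * model_pts[:, 0] / z + cx).astype(int)
    v = np.round(fy * model_pts[:, 1] / z + cy).astype(int)
    in_img = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    vis = np.zeros(len(model_pts), dtype=bool)
    idx = np.where(in_img)[0]
    obs = depth_map[v[idx], u[idx]]
    # visible if at or in front of the observed surface, or if no measurement
    vis[idx] = (obs == 0) | (z[idx] <= obs + tol)
    return vis
```

Under such a mask, an occlusion (e.g. a hand in front of the face) pushes the occluding geometry in front of the model surface, so the affected model points are dropped rather than dragging the pose estimate toward the occluder, which is the robustness the constraint aims for.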
2D-3D Pose Tracking with Multi-View Constraints
Camera localization in 3D LiDAR maps has gained increasing attention due to
its promising ability to handle complex scenarios, surpassing the limitations
of visual-only localization methods. However, existing methods mostly focus on
addressing the cross-modal gaps, estimating camera poses frame by frame without
considering the relationship between adjacent frames, which makes the pose
tracking unstable. To alleviate this, we propose to couple the 2D-3D
correspondences between adjacent frames using the 2D-2D feature matching,
establishing the multi-view geometrical constraints for simultaneously
estimating multiple camera poses. Specifically, we propose a new 2D-3D pose
tracking framework consisting of a front-end hybrid flow estimation network
for consecutive frames and a back-end pose optimization module. We further
design a cross-modal consistency-based loss to incorporate the multi-view
constraints during the training and inference process. We evaluate our proposed
framework on the KITTI and Argoverse datasets. Experimental results demonstrate
its superior performance compared to existing frame-by-frame 2D-3D pose
tracking methods and state-of-the-art vision-only pose tracking algorithms.
More online pose tracking videos are available at
\url{https://youtu.be/yfBRdg7gw5M}

Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
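The multi-view coupling described above amounts to optimizing several camera poses jointly: 2D-2D feature matches link observations of the same 3D map point across adjacent frames, so one stacked reprojection residual constrains all poses at once instead of solving each frame independently. The following sketch shows such a joint residual; the function names and the shared-landmark setup are assumptions for illustration, not the paper's network or loss.

```python
import numpy as np

def project(R, t, pts3d, K):
    """Project 3D points into an image with pose (R, t) and intrinsics K."""
    pc = pts3d @ R.T + t          # transform into the camera frame
    uv = pc[:, :2] / pc[:, 2:3]   # perspective divide
    return uv * np.array([K[0, 0], K[1, 1]]) + np.array([K[0, 2], K[1, 2]])

def multiview_residual(poses, pts3d, obs2d, K):
    """Stacked reprojection residual over adjacent frames.

    poses : list of (R, t) camera poses, one per frame
    pts3d : (N, 3) map points shared across frames (linked by 2D-2D matches)
    obs2d : list of (N, 2) pixel observations, one array per frame

    Because the same landmarks appear in every frame's residual, minimizing
    the stacked vector estimates all poses simultaneously. Illustrative
    sketch of the geometric coupling only.
    """
    res = [project(R, t, pts3d, K) - o for (R, t), o in zip(poses, obs2d)]
    return np.concatenate([r.ravel() for r in res])
```

In practice such a residual would be fed to a nonlinear least-squares solver (e.g. Levenberg-Marquardt) with the poses parameterized on SE(3); the key point is only that adjacent frames share residual terms, which is what stabilizes frame-to-frame tracking.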