17,478 research outputs found
Detect-and-Track: Efficient Pose Estimation in Videos
This paper addresses the problem of estimating and tracking human body
keypoints in complex, multi-person video. We propose an extremely lightweight
yet highly effective approach that builds upon the latest advancements in human
detection and video understanding. Our method operates in two-stages: keypoint
estimation in frames or short clips, followed by lightweight tracking to
generate keypoint predictions linked over the entire video. For frame-level
pose estimation we experiment with Mask R-CNN, as well as our own proposed 3D
extension of this model, which leverages temporal information over small clips
to generate more robust frame predictions. We conduct extensive ablative
experiments on the newly released multi-person video pose estimation benchmark,
PoseTrack, to validate various design choices of our model. Our approach
achieves an accuracy of 55.2% on the validation and 51.8% on the test set using
the Multi-Object Tracking Accuracy (MOTA) metric, and achieves state of the art
performance on the ICCV 2017 PoseTrack keypoint tracking challenge.Comment: In CVPR 2018. Ranked first in ICCV 2017 PoseTrack challenge (keypoint
tracking in videos). Code: https://github.com/facebookresearch/DetectAndTrack
and webpage: https://rohitgirdhar.github.io/DetectAndTrack
Plane-Based Optimization of Geometry and Texture for RGB-D Reconstruction of Indoor Scenes
We present a novel approach to reconstruct RGB-D indoor scene with plane
primitives. Our approach takes as input a RGB-D sequence and a dense coarse
mesh reconstructed by some 3D reconstruction method on the sequence, and
generate a lightweight, low-polygonal mesh with clear face textures and sharp
features without losing geometry details from the original scene. To achieve
this, we firstly partition the input mesh with plane primitives, simplify it
into a lightweight mesh next, then optimize plane parameters, camera poses and
texture colors to maximize the photometric consistency across frames, and
finally optimize mesh geometry to maximize consistency between geometry and
planes. Compared to existing planar reconstruction methods which only cover
large planar regions in the scene, our method builds the entire scene by
adaptive planes without losing geometry details and preserves sharp features in
the final mesh. We demonstrate the effectiveness of our approach by applying it
onto several RGB-D scans and comparing it to other state-of-the-art
reconstruction methods.Comment: in International Conference on 3D Vision 2018; Models and Code: see
https://github.com/chaowang15/plane-opt-rgbd. arXiv admin note: text overlap
with arXiv:1905.0885
- …