Harvesting Multiple Views for Marker-less 3D Human Pose Annotations
Recent advances with Convolutional Networks (ConvNets) have shifted the
bottleneck for many computer vision tasks to annotated data collection. In this
paper, we present a geometry-driven approach to automatically collect
annotations for human pose prediction tasks. Starting from a generic ConvNet
for 2D human pose, and assuming a multi-view setup, we describe an automatic
way to collect accurate 3D human pose annotations. We capitalize on constraints
offered by the 3D geometry of the camera setup and the 3D structure of the
human body to probabilistically combine per-view 2D ConvNet predictions into a
globally optimal 3D pose. This 3D pose is used as the basis for harvesting
annotations. The benefit of the annotations produced automatically with our
approach is demonstrated in two challenging settings: (i) fine-tuning a generic
ConvNet-based 2D pose predictor to capture the discriminative aspects of a
subject's appearance (i.e., "personalization"), and (ii) training a ConvNet from
scratch for single-view 3D human pose prediction without leveraging 3D pose
ground truth. The proposed multi-view pose estimator achieves state-of-the-art
results on standard benchmarks, demonstrating the effectiveness of our method
in exploiting the available multi-view information.
Comment: CVPR 2017 Camera Read
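As a rough illustration of how per-view 2D predictions can be lifted to 3D using the camera geometry, the sketch below uses confidence-weighted linear triangulation (DLT). This is a simpler stand-in, not the probabilistic combination described in the abstract. The names `project` and `triangulate_joint`, the idea of weighting each view by a detector confidence, and the toy cameras are all assumptions made for the example.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3x4 projection matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[0] / x[2], x[1] / x[2]

def triangulate_joint(projections, points_2d, confidences):
    """Weighted linear (DLT) triangulation of a single joint.

    projections: 3x4 camera matrices, assumed known from calibration
    points_2d:   one (u, v) prediction per view
    confidences: one weight per view (hypothetical, e.g. a heat map peak value)
    """
    rows = []
    for P, (u, v), w in zip(projections, points_2d, confidences):
        P = np.asarray(P, dtype=float)
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(w * (u * P[2] - P[0]))
        rows.append(w * (v * P[2] - P[1]))
    A = np.stack(rows)
    # The least-squares solution is the right singular vector
    # belonging to the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

# Toy setup: two cameras one unit apart along the x axis.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])
views = [project(P1, X_true), project(P2, X_true)]
X_est = triangulate_joint([P1, P2], views, [0.9, 0.6])
```

With noise-free 2D inputs, `X_est` matches `X_true` up to floating-point error. With noisy predictions, the weights let more confident views dominate the estimate.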
Learning 3D Human Pose from Structure and Motion
3D human pose estimation from a single image is a challenging problem,
especially for in-the-wild settings due to the lack of 3D annotated data. We
propose two anatomically inspired loss functions and use them with a
weakly-supervised learning framework to jointly learn from large-scale
in-the-wild 2D and indoor/synthetic 3D data. We also present a simple temporal
network that exploits temporal and structural cues present in predicted pose
sequences to temporally harmonize the pose estimations. We carefully analyze
the proposed contributions through loss surface visualizations and sensitivity
analysis to facilitate deeper understanding of their working mechanism. Our
complete pipeline improves the state-of-the-art by 11.8% and 12% on Human3.6M
and MPI-INF-3DHP, respectively, and runs at 30 FPS on a commodity graphics
card.
Comment: ECCV 2018. Project page: https://www.cse.iitb.ac.in/~rdabral/3DPose
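The abstract does not spell out the two anatomically inspired losses. Purely as an illustration of what such a loss can look like, the sketch below penalizes left/right bone-length asymmetry in a predicted 3D pose. The toy 7-joint skeleton, the joint indices, and the names `BONE_PAIRS` and `symmetry_loss` are hypothetical and not taken from the paper.

```python
import numpy as np

# Hypothetical toy skeleton: 0 neck, 1-3 left shoulder/elbow/wrist,
# 4-6 right shoulder/elbow/wrist.
BONE_PAIRS = [
    ((1, 2), (4, 5)),  # left upper arm vs. right upper arm
    ((2, 3), (5, 6)),  # left forearm vs. right forearm
]

def bone_length(pose, bone):
    """Euclidean length of the bone joining two joint indices."""
    a, b = bone
    return float(np.linalg.norm(pose[a] - pose[b]))

def symmetry_loss(pose):
    """Sum of absolute length differences between mirrored bones."""
    pose = np.asarray(pose, dtype=float)
    return sum(
        abs(bone_length(pose, left) - bone_length(pose, right))
        for left, right in BONE_PAIRS
    )

symmetric_pose = np.array([
    [0, 0, 0],
    [1, 0, 0], [1, -1, 0], [1, -2, 0],
    [-1, 0, 0], [-1, -1, 0], [-1, -2, 0],
], dtype=float)

stretched_pose = symmetric_pose.copy()
stretched_pose[3] = [1, -3, 0]  # left forearm is now twice as long as the right
```

A loss of this kind needs no 3D labels, since it depends only on the predicted pose, which is one reason constraints like it fit a weakly supervised setting.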
Integral Human Pose Regression
State-of-the-art human pose estimation methods are based on heat map
representation. Despite its good performance, the representation has a few
inherent issues, such as being non-differentiable and introducing quantization error. This work
shows that a simple integral operation relates and unifies the heat map
representation and joint regression, thus avoiding the above issues. It is
differentiable, efficient, and compatible with any heat-map-based method. Its
effectiveness is convincingly validated via comprehensive ablation experiments
under various settings, specifically on 3D pose estimation, for the first time
- …
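A minimal sketch of the integral operation mentioned in the abstract above, under the common reading that the joint location is the expectation of pixel coordinates under a softmax-normalized heat map (often called soft-argmax). The function name `integral_joint_location` and the toy heat map are assumptions made for the example.

```python
import numpy as np

def integral_joint_location(heatmap):
    """Return the expected (x, y) location under the normalized heat map."""
    h = np.asarray(heatmap, dtype=float)
    # Softmax over all pixels; subtracting the max keeps exp() stable.
    p = np.exp(h - h.max())
    p /= p.sum()
    ys, xs = np.mgrid[0:h.shape[0], 0:h.shape[1]]
    return float((p * xs).sum()), float((p * ys).sum())

# Toy heat map with two equally strong responses at x=1 and x=2 on row y=2.
toy = np.full((4, 4), -50.0)
toy[2, 1] = 0.0
toy[2, 2] = 0.0
x, y = integral_joint_location(toy)
```

Taking the argmax of `toy` would snap to an integer pixel, whereas the expectation lands between the two peaks at roughly x = 1.5, and every step is differentiable with respect to the heat map values.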