Search CORE

44 research outputs found

VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera

Author: Casas Dan
Mehta Dushyant
Rhodin Helge
Seidel Hans-Peter
Shafiei Mohammad
Sotnychenko Oleksandr
Sridhar Srinath
Theobalt Christian
Xu Weipeng
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017
Field of study

We present the first real-time method to capture the full global 3D skeletal pose of a human in a stable, temporally consistent manner using a single RGB camera. Our method combines a new convolutional neural network (CNN) based pose regressor with kinematic skeleton fitting. Our novel fully-convolutional pose formulation regresses 2D and 3D joint positions jointly in real time and does not require tightly cropped input frames. A real-time kinematic skeleton fitting method uses the CNN output to yield temporally stable 3D global pose reconstructions on the basis of a coherent kinematic skeleton. This makes our approach the first monocular RGB method usable in real-time applications such as 3D character control---thus far, the only monocular methods for such applications employed specialized RGB-D cameras. Our method's accuracy is quantitatively on par with the best offline 3D monocular RGB pose estimation methods. Our results are qualitatively comparable to, and sometimes better than, results from monocular RGB-D approaches, such as the Kinect. However, we show that our approach is more broadly applicable than RGB-D solutions, i.e. it works for outdoor scenes, community videos, and low quality commodity RGB cameras.Comment: Accepted to SIGGRAPH 201

arXiv.org e-Print Archive

MPG.PuRe

Deep Autoencoder for Combined Human Pose Estimation and body Model Upscaling

Author: C Dong
C Ionescu
M Loper
M Sanzari
P Felzenszwalb
P Huang
S Abrahamsson
S Hochreiter
T Marcard von
U Schmidt
WT Freeman
Publication venue
Publication date: 04/07/2018
Field of study

We present a method for simultaneously estimating 3D human pose and body shape from a sparse set of wide-baseline camera views. We train a symmetric convolutional autoencoder with a dual loss that enforces learning of a latent representation that encodes skeletal joint positions, and at the same time learns a deep representation of volumetric body shape. We harness the latter to up-scale input volumetric data by a factor of

4 \times

, whilst recovering a 3D estimate of joint positions with equal or greater accuracy than the state of the art. Inference runs in real-time (25 fps) and has the potential for passive human behaviour monitoring where there is a requirement for high fidelity estimation of human body shape and pose

arXiv.org e-Print Archive

Crossref

University of Surrey

Surrey Research Insight

Learning 3D Human Pose from Structure and Motion

Author: A Newell
C Ionescu
C Sminchisescu
D Mehta
F Bogo
J Chen
L Herda
M Loper
MJ Park
N Sarafianos
R Urtasun
S Li
T Alldieck
V Ramakrishna
X Wei
X Zhou
Z-H Zhou
Publication venue
Publication date: 03/07/2018
Field of study

3D human pose estimation from a single image is a challenging problem, especially for in-the-wild settings due to the lack of 3D annotated data. We propose two anatomically inspired loss functions and use them with a weakly-supervised learning framework to jointly learn from large-scale in-the-wild 2D and indoor/synthetic 3D data. We also present a simple temporal network that exploits temporal and structural cues present in predicted pose sequences to temporally harmonize the pose estimations. We carefully analyze the proposed contributions through loss surface visualizations and sensitivity analysis to facilitate deeper understanding of their working mechanism. Our complete pipeline improves the state-of-the-art by 11.8% and 12% on Human3.6M and MPI-INF-3DHP, respectively, and runs at 30 FPS on a commodity graphics card.Comment: ECCV 2018. Project page: https://www.cse.iitb.ac.in/~rdabral/3DPose

arXiv.org e-Print Archive

Crossref