HeadOn: Real-time Reenactment of Human Portrait Videos
We propose HeadOn, the first real-time source-to-target reenactment approach
for complete human portrait videos that enables transfer of torso and head
motion, face expression, and eye gaze. Given a short RGB-D video of the target
actor, we automatically construct a personalized geometry proxy that embeds a
parametric head, eye, and kinematic torso model. A novel real-time reenactment
algorithm employs this proxy to photo-realistically map the captured motion
from the source actor to the target actor. On top of the coarse geometric
proxy, we propose a video-based rendering technique that composites the
modified target portrait video via view- and pose-dependent texturing, and
creates photo-realistic imagery of the target actor under novel torso and head
poses, facial expressions, and gaze directions. To this end, we propose a
robust tracking of the face and torso of the source actor. We extensively
evaluate our approach and show significant improvements in enabling much
greater flexibility in creating realistic reenacted output videos.
Comment: Video: https://www.youtube.com/watch?v=7Dg49wv2c_g Presented at
Siggraph'1
TEMPO: Efficient Multi-View Pose Estimation, Tracking, and Forecasting
Existing volumetric methods for 3D human pose estimation are
accurate, but computationally expensive and optimized for single time-step
prediction. We present TEMPO, an efficient multi-view pose estimation model
that learns a robust spatiotemporal representation, improving pose accuracy
while also tracking and forecasting human pose. We significantly reduce
computation compared to the state-of-the-art by recurrently computing
per-person 2D pose features, fusing both spatial and temporal information into
a single representation. In doing so, our model is able to use spatiotemporal
context to predict more accurate human poses without sacrificing efficiency. We
further use this representation to track human poses over time as well as
predict future poses. Finally, we demonstrate that our model is able to
generalize across datasets without scene-specific fine-tuning. TEMPO achieves
10% better MPJPE with a 33× improvement in FPS compared to TesseTrack
on the challenging CMU Panoptic Studio dataset.
Comment: Accepted at ICCV 202
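The MPJPE figure quoted in the TEMPO abstract refers to mean per-joint position error, conventionally the average Euclidean distance between predicted and ground-truth 3D joint positions. The following is a minimal sketch of that metric under its usual definition; the function name `mpjpe` and the joint coordinates are illustrative assumptions, not code or data from the paper.

```python
import math


def mpjpe(predicted, ground_truth):
    """Mean per-joint position error: average Euclidean distance per joint.

    Both arguments are sequences of (x, y, z) joint positions in the same
    unit (commonly millimetres). The name and signature are hypothetical.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)


# Made-up example: three joints, each prediction offset from the truth.
ground_truth = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 0.0)]
predicted = [(3.0, 4.0, 0.0), (100.0, 0.0, 5.0), (0.0, 200.0, 0.0)]

error = mpjpe(predicted, ground_truth)  # per-joint distances: 5, 5, 0
```

Lower values are better, so a relative improvement in MPJPE means a smaller average distance between predicted and true joints.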