LiveCap: Real-time Human Performance Capture from Monocular Video
We present the first real-time human performance capture approach that
reconstructs dense, space-time coherent deforming geometry of entire humans in
general everyday clothing from just a single RGB video. We propose a novel
two-stage analysis-by-synthesis optimization whose formulation and
implementation are designed for high performance. In the first stage, a skinned
template model is jointly fitted to background subtracted input video, 2D and
3D skeleton joint positions found using a deep neural network, and a set of
sparse facial landmark detections. In the second stage, dense non-rigid 3D
deformations of skin and even loose apparel are captured based on a novel
real-time capable algorithm for non-rigid tracking using dense photometric and
silhouette constraints. Our novel energy formulation leverages automatically
identified material regions on the template to model the differing non-rigid
deformation behavior of skin and apparel. The two resulting per-frame
non-linear optimization problems are solved with specially tailored
data-parallel Gauss-Newton solvers. In order to achieve real-time performance
of over 25Hz, we design a pipelined parallel architecture using the CPU and two
commodity GPUs. Our method is the first real-time monocular approach for
full-body performance capture. It yields accuracy comparable to off-line
performance capture techniques while being orders of magnitude faster.
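The Gauss-Newton scheme behind the per-frame optimization can be illustrated in miniature. The sketch below is a plain CPU implementation on a toy curve-fitting problem, not the paper's data-parallel GPU solver; all function names are illustrative.

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, iters=10):
    """Minimize 0.5 * ||r(x)||^2 by repeated linearization.

    Illustrative CPU sketch; LiveCap uses specially tailored
    data-parallel GPU solvers over dense photometric and
    silhouette residuals.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = residual(x)          # residual vector r(x)
        J = jacobian(x)          # Jacobian dr/dx
        # Normal equations: (J^T J) dx = -J^T r
        dx = np.linalg.solve(J.T @ J, -J.T @ r)
        x = x + dx
    return x

# Toy example: fit y = a * exp(b * t) to noiseless samples.
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)

def residual(p):
    a, b = p
    return a * np.exp(b * t) - y

def jacobian(p):
    a, b = p
    e = np.exp(b * t)
    return np.stack([e, a * t * e], axis=1)

p = gauss_newton(residual, jacobian, x0=[1.0, 0.0])
```

Because the data term is a sum of squares, the Gauss-Newton approximation drops second-order residual terms, which is what makes each iteration a cheap linear solve and the whole loop amenable to data-parallel execution.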
HeadOn: Real-time Reenactment of Human Portrait Videos
We propose HeadOn, the first real-time source-to-target reenactment approach
for complete human portrait videos that enables transfer of torso and head
motion, face expression, and eye gaze. Given a short RGB-D video of the target
actor, we automatically construct a personalized geometry proxy that embeds a
parametric head, eye, and kinematic torso model. A novel real-time reenactment
algorithm employs this proxy to photo-realistically map the captured motion
from the source actor to the target actor. On top of the coarse geometric
proxy, we propose a video-based rendering technique that composites the
modified target portrait video via view- and pose-dependent texturing, and
creates photo-realistic imagery of the target actor under novel torso and head
poses, facial expressions, and gaze directions. To this end, we propose
robust tracking of the face and torso of the source actor. Our extensive
evaluation shows that the approach enables much greater flexibility in
creating realistic reenacted output videos. Comment: Video:
https://www.youtube.com/watch?v=7Dg49wv2c_g Presented at Siggraph'1
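The view- and pose-dependent texturing step can be pictured as blending the captured per-view textures by how close each capture direction is to the novel query view. The weighting below is a hypothetical stand-in for illustration only; HeadOn's actual video-based rendering is considerably more involved.

```python
import numpy as np

def blend_textures(textures, view_dirs, query_dir, sharpness=8.0):
    """Blend per-view texture images by angular proximity to the
    query viewing direction.

    Hypothetical weighting scheme for illustration; not the
    paper's actual view- and pose-dependent texturing.
    """
    view_dirs = view_dirs / np.linalg.norm(view_dirs, axis=1, keepdims=True)
    q = query_dir / np.linalg.norm(query_dir)
    # Cosine similarity between each captured view and the query view.
    sim = view_dirs @ q
    # Softmax-style weights favoring the closest views.
    w = np.exp(sharpness * sim)
    w = w / w.sum()
    # Weighted average over the candidate texture images.
    return np.tensordot(w, textures, axes=1)

# Two 2x2 RGB "textures" captured from +x and +z; query halfway between.
tex = np.stack([np.zeros((2, 2, 3)), np.ones((2, 2, 3))])
dirs = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
out = blend_textures(tex, dirs, np.array([1.0, 0.0, 1.0]))
```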
DeepDeform: Learning Non-rigid RGB-D Reconstruction with Semi-supervised Data
Applying data-driven approaches to non-rigid 3D reconstruction has been difficult, which we believe can be attributed to the lack of a large-scale training corpus. One recent approach proposes self-supervision based on non-rigid reconstruction. Unfortunately, this method fails for important cases such as highly non-rigid deformations. We first address this problem of lack of data by introducing a novel semi-supervised strategy to obtain dense inter-frame correspondences from a sparse set of annotations. This way, we obtain a large dataset of 400 scenes, over 390,000 RGB-D frames, and 2,537 densely aligned frame pairs; in addition, we provide a test set along with several metrics for evaluation. Based on this corpus, we introduce a data-driven non-rigid feature matching approach, which we integrate into an optimization-based reconstruction pipeline. Here, we propose a new neural network that operates on RGB-D frames, while maintaining robustness under large non-rigid deformations and producing accurate predictions. Our approach significantly outperforms both existing non-rigid reconstruction methods that do not use learned data terms, as well as learning-based approaches that only use self-supervision.
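Turning sparse annotations into dense inter-frame correspondences can be sketched, in its simplest form, as interpolating the annotated displacements out to unannotated points. The inverse-distance weighting below is a hypothetical stand-in for the paper's semi-supervised strategy; every name here is illustrative.

```python
import numpy as np

def densify_correspondences(sparse_src, sparse_dst, query_pts, eps=1e-8):
    """Propagate sparse point correspondences to dense query points
    by inverse-distance-weighted interpolation of the displacements.

    A hypothetical stand-in for DeepDeform's semi-supervised
    strategy, which derives dense inter-frame matches from a
    sparse set of annotations.
    """
    disp = sparse_dst - sparse_src                # annotated motion vectors
    # Pairwise distances from each query to each annotated source point.
    d = np.linalg.norm(query_pts[:, None] - sparse_src[None], axis=2)
    w = 1.0 / (d + eps)
    w = w / w.sum(axis=1, keepdims=True)
    # Each query inherits a distance-weighted blend of the displacements.
    return query_pts + w @ disp

# Two annotated points translated by (1, 0); a query halfway between.
src = np.array([[0.0, 0.0], [2.0, 0.0]])
dst = src + np.array([1.0, 0.0])
pred = densify_correspondences(src, dst, np.array([[1.0, 0.0]]))
```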
Investigating Cardiac Motion Patterns Using Synthetic High-Resolution 3D Cardiovascular Magnetic Resonance Images and Statistical Shape Analysis
Diagnosis of ventricular dysfunction in congenital heart disease is increasingly based on medical imaging, which allows investigation of abnormal cardiac morphology and correlated abnormal function. Although analysis of 2D images represents the clinical standard, novel tools performing automatic processing of 3D images are becoming available, providing more detailed and comprehensive information than simple 2D morphometry. Among these, statistical shape analysis (SSA) allows a consistent and quantitative description of a population of complex shapes, as a way to detect novel biomarkers, ultimately improving diagnosis and pathology understanding. The aim of this study is to describe the implementation of a SSA method for the investigation of 3D left ventricular shape and motion patterns and to test it on a small sample of 4 patients with repaired congenital aortic stenosis and 4 age-matched healthy volunteers to demonstrate its potential. The advantage of this method is the capability of analyzing subject-specific motion patterns separately from the individual morphology, visually and quantitatively, as a way to identify functional abnormalities related to both dynamics and shape. Specifically, we combined 3D, high-resolution whole heart data with 2D, temporal information provided by cine cardiovascular magnetic resonance images, and we used an SSA approach to analyze 3D motion per se. Preliminary results of this pilot study showed that using this method, some differences in end-diastolic and end-systolic ventricular shapes could be captured, but it was not possible to clearly separate the two cohorts based on shape information alone. However, further analyses on ventricular motion allowed us to qualitatively identify differences between the two populations.
Moreover, by describing shape and motion with a small number of principal components, this method offers a fully automated process to obtain visually intuitive and numerical information on cardiac shape and motion, which could be, once validated on a larger sample size, easily integrated into the clinical workflow. To conclude, in this preliminary work, we have implemented state-of-the-art automatic segmentation and SSA methods, and we have shown how they could improve our understanding of ventricular kinetics by visually and potentially quantitatively highlighting aspects that are usually not picked up by traditional approaches.
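The core of a statistical shape analysis like the one described above is a PCA over a population of corresponded shapes: each shape is flattened to a vector of vertex coordinates, and the principal components become deformation modes. A minimal sketch, assuming shapes are already aligned and in point-to-point correspondence:

```python
import numpy as np

def shape_pca(shapes, n_modes=2):
    """PCA over a population of shapes, each flattened to a vector
    of vertex coordinates; returns the mean shape, the principal
    deformation modes, and per-subject scores.

    Minimal sketch of the SSA step; assumes shapes are aligned and
    in point-to-point correspondence.
    """
    X = shapes.reshape(len(shapes), -1)
    mean = X.mean(axis=0)
    # SVD of the centered data matrix yields the principal modes.
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    modes = Vt[:n_modes]
    scores = (X - mean) @ modes.T    # shape coefficients per subject
    return mean, modes, scores

# Toy population: 5 "shapes" of 4 2-D points varying along one axis.
rng = np.random.default_rng(0)
base = rng.normal(size=(4, 2))
pop = np.stack([base + t * np.array([1.0, 0.0]) for t in (-2, -1, 0, 1, 2)])
mean, modes, scores = shape_pca(pop, n_modes=1)
```

Describing each subject by its low-dimensional score vector is what makes the population-level comparison in the study both visually intuitive and quantitative.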
Capturing Hands in Action using Discriminative Salient Points and Physics Simulation
Hand motion capture is a popular research field, recently gaining more
attention due to the ubiquity of RGB-D sensors. However, even the most recent
approaches focus on the case of a single isolated hand. In this work, we focus
on hands that interact with other hands or objects and present a framework that
successfully captures motion in such interaction scenarios for both rigid and
articulated objects. Our framework combines a generative model with
discriminatively trained salient points to achieve a low tracking error and
with collision detection and physics simulation to achieve physically plausible
estimates even in case of occlusions and missing visual data. Since all
components are unified in a single objective function which is almost
everywhere differentiable, it can be optimized with standard optimization
techniques. Our approach works for monocular RGB-D sequences as well as setups
with multiple synchronized RGB cameras. For a qualitative and quantitative
evaluation, we captured 29 sequences with a large variety of interactions and
up to 150 degrees of freedom. Comment: Accepted for publication by the International Journal of Computer
Vision (IJCV) on 16.02.2016 (submitted on 17.10.14). A combination into a
single framework of an ECCV'12 multicamera-RGB and a monocular-RGBD GCPR'14
hand tracking paper with several extensions, additional experiments and
detail
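The idea of a single, almost-everywhere-differentiable objective that combines a data term with physical plausibility can be sketched on a toy problem: two points pulled toward detections while a soft collision penalty keeps them apart, optimized with plain gradient descent. Everything here (the energy, weights, and helper names) is a hypothetical illustration, not the paper's actual formulation.

```python
import numpy as np

def objective(x, targets, min_dist=0.5):
    """Data term pulling two 2-D points to their detections, plus a
    soft collision penalty keeping them at least min_dist apart.

    A toy stand-in for the paper's unified objective, which combines
    salient-point data terms with collision and physics terms.
    """
    pts = x.reshape(2, 2)
    data = np.sum((pts - targets) ** 2)
    d = np.linalg.norm(pts[0] - pts[1])
    collision = max(0.0, min_dist - d) ** 2   # vanishes once separated
    return data + 10.0 * collision

def numeric_grad(f, x, h=1e-6):
    """Central-difference gradient, standing in for analytic derivatives."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# Two detections closer together than the collision distance allows.
targets = np.array([[0.0, 0.0], [0.2, 0.0]])
x = np.zeros(4)
for _ in range(1000):
    x = x - 0.02 * numeric_grad(lambda v: objective(v, targets), x)
```

Because the penalty is differentiable almost everywhere, the whole energy can be handed to standard gradient-based optimizers; the equilibrium trades off fidelity to the detections against physical plausibility.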