Search CORE

539 research outputs found

Learning to Reconstruct People in Clothing from a Single RGB Camera

Author: Alldieck T.
Bhatnagar B.
Magnor M.
Pons-Moll G.
Theobalt C.
Publication venue
Publication date: 01/01/2019
Field of study

We present a learning-based model to infer the personalized 3D shape of people from a few frames (1-8) of a monocular video in which the person is moving, in less than 10 seconds with a reconstruction accuracy of 5mm. Our model learns to predict the parameters of a statistical body model and instance displacements that add clothing and hair to the shape. The model achieves fast and accurate predictions based on two key design choices. First, by predicting shape in a canonical T-pose space, the network learns to encode the images of the person into pose-invariant latent codes, where the information is fused. Second, based on the observation that feed-forward predictions are fast but do not always align with the input images, we predict using both, bottom-up and top-down streams (one per view) allowing information to flow in both directions. Learning relies only on synthetic 3D data. Once learned, the model can take a variable number of frames as input, and is able to reconstruct shapes even from a single image with an accuracy of 6mm. Results on 3 different datasets demonstrate the efficacy and accuracy of our approach

MPG.PuRe

MonoPerfCap: Human Performance Capture from Monocular Video

Author: Chatterjee Avishek
Mehta Dushyant
Rhodin Helge
Seidel Hans-Peter
Theobalt Christian
Xu Weipeng
Zollhöfer Michael
Publication venue
Publication date: 01/01/2018
Field of study

We present the first marker-less approach for temporally coherent 3D performance capture of a human with general clothing from monocular video. Our approach reconstructs articulated human skeleton motion as well as medium-scale non-rigid surface deformations in general scenes. Human performance capture is a challenging problem due to the large range of articulation, potentially fast motion, and considerable non-rigid deformations, even from multi-view data. Reconstruction from monocular video alone is drastically more challenging, since strong occlusions and the inherent depth ambiguity lead to a highly ill-posed reconstruction problem. We tackle these challenges by a novel approach that employs sparse 2D and 3D human pose detections from a convolutional neural network using a batch-based pose estimation strategy. Joint recovery of per-batch motion allows to resolve the ambiguities of the monocular reconstruction problem based on a low dimensional trajectory subspace. In addition, we propose refinement of the surface geometry based on fully automatically extracted silhouettes to enable medium-scale non-rigid alignment. We demonstrate state-of-the-art performance capture results that enable exciting applications such as video editing and free viewpoint video, previously infeasible from monocular video. Our qualitative and quantitative evaluation demonstrates that our approach significantly outperforms previous monocular methods in terms of accuracy, robustness and scene complexity that can be handled.Comment: Accepted to ACM TOG 2018, to be presented on SIGGRAPH 201

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

MPG.PuRe

LiveCap: Real-time Human Performance Capture from Monocular Video

Author: Habermann Marc
Pons-Moll Gerard
Theobalt Christian
Xu Weipeng
Zollhoefer Michael
Publication venue
Publication date: 01/01/2019
Field of study

We present the first real-time human performance capture approach that reconstructs dense, space-time coherent deforming geometry of entire humans in general everyday clothing from just a single RGB video. We propose a novel two-stage analysis-by-synthesis optimization whose formulation and implementation are designed for high performance. In the first stage, a skinned template model is jointly fitted to background subtracted input video, 2D and 3D skeleton joint positions found using a deep neural network, and a set of sparse facial landmark detections. In the second stage, dense non-rigid 3D deformations of skin and even loose apparel are captured based on a novel real-time capable algorithm for non-rigid tracking using dense photometric and silhouette constraints. Our novel energy formulation leverages automatically identified material regions on the template to model the differing non-rigid deformation behavior of skin and apparel. The two resulting non-linear optimization problems per-frame are solved with specially-tailored data-parallel Gauss-Newton solvers. In order to achieve real-time performance of over 25Hz, we design a pipelined parallel architecture using the CPU and two commodity GPUs. Our method is the first real-time monocular approach for full-body performance capture. Our method yields comparable accuracy with off-line performance capture techniques, while being orders of magnitude faster

arXiv.org e-Print Archive

MPG.PuRe

Video Based Reconstruction of 3D People Models

Author: Alldieck Thiemo
Magnor Marcus
Pons-Moll Gerard
Theobalt Christian
Xu Weipeng
Publication venue
Publication date: 01/01/2018
Field of study

This paper describes how to obtain accurate 3D body models and texture of arbitrary people from a single, monocular video in which a person is moving. Based on a parametric body model, we present a robust processing pipeline achieving 3D model fits with 5mm accuracy also for clothed people. Our main contribution is a method to nonrigidly deform the silhouette cones corresponding to the dynamic human silhouettes, resulting in a visual hull in a common reference frame that enables surface reconstruction. This enables efficient estimation of a consensus 3D shape, texture and implanted animation skeleton based on a large number of frames. We present evaluation results for a number of test subjects and analyze overall performance. Requiring only a smartphone or webcam, our method enables everyone to create their own fully animatable digital double, e.g., for social VR applications or virtual try-on for online fashion shopping.Comment: CVPR 2018 Spotlight, IEEE Conference on Computer Vision and Pattern Recognition 2018 (CVPR

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Tex2Shape: Detailed Full Human Body Geometry From a Single Image

Author: Alldieck T.
Magnor M.
Pons-Moll G.
Theobalt C.
Publication venue
Publication date: 01/01/2019
Field of study

We present a simple yet effective method to infer detailed full human body shape from only a single photograph. Our model can infer full-body shape including face, hair, and clothing including wrinkles at interactive frame-rates. Results feature details even on parts that are occluded in the input image. Our main idea is to turn shape regression into an aligned image-to-image translation problem. The input to our method is a partial texture map of the visible region obtained from off-the-shelf methods. From a partial texture, we estimate detailed normal and vector displacement maps, which can be applied to a low-resolution smooth body model to add detail and clothing. Despite being trained purely with synthetic data, our model generalizes well to real-world photographs. Numerous results demonstrate the versatility and robustness of our method

MPG.PuRe

Tex2Shape: Detailed Full Human Body Geometry From a Single Image

Author: Alldieck Thiemo
Magnor Marcus
Pons-Moll Gerard
Theobalt Christian
Publication venue
Publication date: 01/01/2019
Field of study

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Learning to Reconstruct People in Clothing from a Single RGB Camera

Author: Alldieck Thiemo
Bhatnagar Bharat Lal
Magnor Marcus
Pons-Moll Gerard
Theobalt Christian
Publication venue
Publication date: 01/01/2019
Field of study

arXiv.org e-Print Archive

Crossref

MPG.PuRe