Search CORE

12,303 research outputs found

HeadOn: Real-time Reenactment of Human Portrait Videos

Author: Nießner Matthias
Stamminger Marc
Theobalt Christian
Thies Justus
Zollhöfer Michael
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018
Field of study

We propose HeadOn, the first real-time source-to-target reenactment approach for complete human portrait videos that enables transfer of torso and head motion, face expression, and eye gaze. Given a short RGB-D video of the target actor, we automatically construct a personalized geometry proxy that embeds a parametric head, eye, and kinematic torso model. A novel real-time reenactment algorithm employs this proxy to photo-realistically map the captured motion from the source actor to the target actor. On top of the coarse geometric proxy, we propose a video-based rendering technique that composites the modified target portrait video via view- and pose-dependent texturing, and creates photo-realistic imagery of the target actor under novel torso and head poses, facial expressions, and gaze directions. To this end, we propose a robust tracking of the face and torso of the source actor. We extensively evaluate our approach and show significant improvements in enabling much greater flexibility in creating realistic reenacted output videos.Comment: Video: https://www.youtube.com/watch?v=7Dg49wv2c_g Presented at Siggraph'1

arXiv.org e-Print Archive

MPG.PuRe

RGB-D datasets using microsoft kinect or similar sensors: a survey

Author: Galili
Guan
Hu
Kolner
Mulvad
Nakazawa
Palushani
Palushani
Publication venue: Springer
Publication date: 01/01/2015
Field of study

RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It takes the advantages of the color image that provides appearance information of an object and also the depth image that is immune to the variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance to benchmark the state-of-the-art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D-simultaneous localization and mapping, and pose estimation. We provide the insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description about the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms

Northumbria Research Link

Crossref

Springer - Publisher Connector

Online Research Database In Technology

Cascaded 3D Full-body Pose Regression from Single Depth Image at 100 FPS

Author: Su Le
Xia Shihong
Zhang Zihao
Publication venue
Publication date: 25/01/2018
Field of study

There are increasing real-time live applications in virtual reality, where it plays an important role in capturing and retargetting 3D human pose. But it is still challenging to estimate accurate 3D pose from consumer imaging devices such as depth camera. This paper presents a novel cascaded 3D full-body pose regression method to estimate accurate pose from a single depth image at 100 fps. The key idea is to train cascaded regressors based on Gradient Boosting algorithm from pre-recorded human motion capture database. By incorporating hierarchical kinematics model of human pose into the learning procedure, we can directly estimate accurate 3D joint angles instead of joint positions. The biggest advantage of this model is that the bone length can be preserved during the whole 3D pose estimation procedure, which leads to more effective features and higher pose estimation accuracy. Our method can be used as an initialization procedure when combining with tracking methods. We demonstrate the power of our method on a wide range of synthesized human motion data from CMU mocap database, Human3.6M dataset and real human movements data captured in real time. In our comparison against previous 3D pose estimation methods and commercial system such as Kinect 2017, we achieve the state-of-the-art accuracy

arXiv.org e-Print Archive

Crossref

LiveCap: Real-time Human Performance Capture from Monocular Video

Author: Habermann Marc
Pons-Moll Gerard
Theobalt Christian
Xu Weipeng
Zollhoefer Michael
Publication venue
Publication date: 01/01/2019
Field of study

We present the first real-time human performance capture approach that reconstructs dense, space-time coherent deforming geometry of entire humans in general everyday clothing from just a single RGB video. We propose a novel two-stage analysis-by-synthesis optimization whose formulation and implementation are designed for high performance. In the first stage, a skinned template model is jointly fitted to background subtracted input video, 2D and 3D skeleton joint positions found using a deep neural network, and a set of sparse facial landmark detections. In the second stage, dense non-rigid 3D deformations of skin and even loose apparel are captured based on a novel real-time capable algorithm for non-rigid tracking using dense photometric and silhouette constraints. Our novel energy formulation leverages automatically identified material regions on the template to model the differing non-rigid deformation behavior of skin and apparel. The two resulting non-linear optimization problems per-frame are solved with specially-tailored data-parallel Gauss-Newton solvers. In order to achieve real-time performance of over 25Hz, we design a pipelined parallel architecture using the CPU and two commodity GPUs. Our method is the first real-time monocular approach for full-body performance capture. Our method yields comparable accuracy with off-line performance capture techniques, while being orders of magnitude faster

arXiv.org e-Print Archive

MPG.PuRe

Unobtrusive and pervasive video-based eye-gaze tracking

Author: Camilleri Kenneth P.
Cristina Stefania
Publication venue: 'Elsevier BV'
Publication date: 01/05/2018
Field of study

Eye-gaze tracking has long been considered a desktop technology that finds its use inside the traditional office setting, where the operating conditions may be controlled. Nonetheless, recent advancements in mobile technology and a growing interest in capturing natural human behaviour have motivated an emerging interest in tracking eye movements within unconstrained real-life conditions, referred to as pervasive eye-gaze tracking. This critical review focuses on emerging passive and unobtrusive video-based eye-gaze tracking methods in recent literature, with the aim to identify different research avenues that are being followed in response to the challenges of pervasive eye-gaze tracking. Different eye-gaze tracking approaches are discussed in order to bring out their strengths and weaknesses, and to identify any limitations, within the context of pervasive eye-gaze tracking, that have yet to be considered by the computer vision community.peer-reviewe

OAR@UM

Expressive Body Capture: 3D Hands, Face, and Body from a Single Image

Author: Black Michael J.
Bolkart Timo
Choutas Vasileios
Ghorbani Nima
Osman Ahmed A. A.
Pavlakos Georgios
Tzionas Dimitrios
Publication venue
Publication date: 01/01/2019
Field of study

To facilitate the analysis of human actions, interactions and emotions, we compute a 3D model of human body pose, hand pose, and facial expression from a single monocular image. To achieve this, we use thousands of 3D scans to train a new, unified, 3D model of the human body, SMPL-X, that extends SMPL with fully articulated hands and an expressive face. Learning to regress the parameters of SMPL-X directly from images is challenging without paired images and 3D ground truth. Consequently, we follow the approach of SMPLify, which estimates 2D features and then optimizes model parameters to fit the features. We improve on SMPLify in several significant ways: (1) we detect 2D features corresponding to the face, hands, and feet and fit the full SMPL-X model to these; (2) we train a new neural network pose prior using a large MoCap dataset; (3) we define a new interpenetration penalty that is both fast and accurate; (4) we automatically detect gender and the appropriate body models (male, female, or neutral); (5) our PyTorch implementation achieves a speedup of more than 8x over Chumpy. We use the new method, SMPLify-X, to fit SMPL-X to both controlled images and images in the wild. We evaluate 3D accuracy on a new curated dataset comprising 100 images with pseudo ground-truth. This is a step towards automatic expressive human capture from monocular RGB data. The models, code, and data are available for research purposes at https://smpl-x.is.tue.mpg.de.Comment: To appear in CVPR 201

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Low-cost,stand-off, 2D+3D face imaging for biometric identification using Fourier transform profilometry –Update

Author: D\u27Alberto Tiffanie
Grow Taylor
Hinckley Michael
Hoening Michael
Höft Thomas
Kubala Kelli
McCumber Paul
Novotny John
Redman Brian
Rogers Nathan
Rudd Van
Shald Scott
Sibell Russell
Uberna Radek
Wheeler Frederick W.
Woody Nathan
Publication venue: UST Research Online
Publication date: 01/09/2010
Field of study

Lockheed Martin Coherent Technologies is developing laser-based technologies for stand-off 2D+3D face imaging for biometric identification. Among other potential industrial, commercial, and governmental users, the Department of Homeland Security (DHS) and the Department of Defense (DoD) desire the ability to capture biometric data from minimally cooperative subjects with a minimally invasive system at stand-off distances. The initial applications are fixed installations for relatively large volume access points such as security check points and transportation gateways for which minimal cooperation, stand-off operation, and real-time operation are desired so that the biometric identification process will have little impact on traffic flow. Last year we presented a paper on the development and testing of a 2D+3D face imager breadboard based on th

University of St. Thomas, Minnesota