5,789 research outputs found
HeadOn: Real-time Reenactment of Human Portrait Videos
We propose HeadOn, the first real-time source-to-target reenactment approach
for complete human portrait videos that enables transfer of torso and head
motion, face expression, and eye gaze. Given a short RGB-D video of the target
actor, we automatically construct a personalized geometry proxy that embeds a
parametric head, eye, and kinematic torso model. A novel real-time reenactment
algorithm employs this proxy to photo-realistically map the captured motion
from the source actor to the target actor. On top of the coarse geometric
proxy, we propose a video-based rendering technique that composites the
modified target portrait video via view- and pose-dependent texturing, and
creates photo-realistic imagery of the target actor under novel torso and head
poses, facial expressions, and gaze directions. To this end, we propose a
robust tracking of the face and torso of the source actor. We extensively
evaluate our approach and show significant improvements in enabling much
greater flexibility in creating realistic reenacted output videos.Comment: Video: https://www.youtube.com/watch?v=7Dg49wv2c_g Presented at
Siggraph'1
HIGH QUALITY HUMAN 3D BODY MODELING, TRACKING AND APPLICATION
Geometric reconstruction of dynamic objects is a fundamental task of computer vision and graphics, and modeling human body of high fidelity is considered to be a core of this problem. Traditional human shape and motion capture techniques require an array of surrounding cameras or subjects wear reflective markers, resulting in a limitation of working space and portability. In this dissertation, a complete process is designed from geometric modeling detailed 3D human full body and capturing shape dynamics over time using a flexible setup to guiding clothes/person re-targeting with such data-driven models. As the mechanical movement of human body can be considered as an articulate motion, which is easy to guide the skin animation but has difficulties in the reverse process to find parameters from images without manual intervention, we present a novel parametric model, GMM-BlendSCAPE, jointly taking both linear skinning model and the prior art of BlendSCAPE (Blend Shape Completion and Animation for PEople) into consideration and develop a Gaussian Mixture Model (GMM) to infer both body shape and pose from incomplete observations. We show the increased accuracy of joints and skin surface estimation using our model compared to the skeleton based motion tracking. To model the detailed body, we start with capturing high-quality partial 3D scans by using a single-view commercial depth camera. Based on GMM-BlendSCAPE, we can then reconstruct multiple complete static models of large pose difference via our novel non-rigid registration algorithm. With vertex correspondences established, these models can be further converted into a personalized drivable template and used for robust pose tracking in a similar GMM framework. Moreover, we design a general purpose real-time non-rigid deformation algorithm to accelerate this registration. Last but not least, we demonstrate a novel virtual clothes try-on application based on our personalized model utilizing both image and depth cues to synthesize and re-target clothes for single-view videos of different people
VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera
We present the first real-time method to capture the full global 3D skeletal
pose of a human in a stable, temporally consistent manner using a single RGB
camera. Our method combines a new convolutional neural network (CNN) based pose
regressor with kinematic skeleton fitting. Our novel fully-convolutional pose
formulation regresses 2D and 3D joint positions jointly in real time and does
not require tightly cropped input frames. A real-time kinematic skeleton
fitting method uses the CNN output to yield temporally stable 3D global pose
reconstructions on the basis of a coherent kinematic skeleton. This makes our
approach the first monocular RGB method usable in real-time applications such
as 3D character control---thus far, the only monocular methods for such
applications employed specialized RGB-D cameras. Our method's accuracy is
quantitatively on par with the best offline 3D monocular RGB pose estimation
methods. Our results are qualitatively comparable to, and sometimes better
than, results from monocular RGB-D approaches, such as the Kinect. However, we
show that our approach is more broadly applicable than RGB-D solutions, i.e. it
works for outdoor scenes, community videos, and low quality commodity RGB
cameras.Comment: Accepted to SIGGRAPH 201
3D Human Face Reconstruction and 2D Appearance Synthesis
3D human face reconstruction has been an extensive research for decades due to its wide applications, such as animation, recognition and 3D-driven appearance synthesis. Although commodity depth sensors are widely available in recent years, image based face reconstruction are significantly valuable as images are much easier to access and store.
In this dissertation, we first propose three image-based face reconstruction approaches according to different assumption of inputs.
In the first approach, face geometry is extracted from multiple key frames of a video sequence with different head poses. The camera should be calibrated under this assumption.
As the first approach is limited to videos, we propose the second approach then focus on single image. This approach also improves the geometry by adding fine grains using shading cue. We proposed a novel albedo estimation and linear optimization algorithm in this approach.
In the third approach, we further loose the constraint of the input image to arbitrary in the wild images. Our proposed approach can robustly reconstruct high quality model even with extreme expressions and large poses.
We then explore the applicability of our face reconstructions on four interesting applications: video face beautification, generating personalized facial blendshape from image sequences, face video stylizing and video face replacement. We demonstrate great potentials of our reconstruction approaches on these real-world applications. In particular, with the recent surge of interests in VR/AR, it is increasingly common to see people wearing head-mounted displays. However, the large occlusion on face is a big obstacle for people to communicate in a face-to-face manner. Our another application is that we explore hardware/software solutions for synthesizing the face image with presence of HMDs. We design two setups (experimental and mobile) which integrate two near IR cameras and one color camera to solve this problem. With our algorithm and prototype, we can achieve photo-realistic results.
We further propose a deep neutral network to solve the HMD removal problem considering it as a face inpainting problem. This approach doesn\u27t need special hardware and run in real-time with satisfying results
{3D} Morphable Face Models -- Past, Present and Future
In this paper, we provide a detailed survey of 3D Morphable Face Models over the 20 years since they were first proposed. The challenges in building and applying these models, namely capture, modeling, image formation, and image analysis, are still active research topics, and we review the state-of-the-art in each of these areas. We also look ahead, identifying unsolved challenges, proposing directions for future research and highlighting the broad range of current and future applications
Recent trends, technical concepts and components of computer-assisted orthopedic surgery systems: A comprehensive review
Computer-assisted orthopedic surgery (CAOS) systems have become one of the most important and challenging types of system in clinical orthopedics, as they enable precise treatment of musculoskeletal diseases, employing modern clinical navigation systems and surgical tools. This paper brings a comprehensive review of recent trends and possibilities of CAOS systems. There are three types of the surgical planning systems, including: systems based on the volumetric images (computer tomography (CT), magnetic resonance imaging (MRI) or ultrasound images), further systems utilize either 2D or 3D fluoroscopic images, and the last one utilizes the kinetic information about the joints and morphological information about the target bones. This complex review is focused on three fundamental aspects of CAOS systems: their essential components, types of CAOS systems, and mechanical tools used in CAOS systems. In this review, we also outline the possibilities for using ultrasound computer-assisted orthopedic surgery (UCAOS) systems as an alternative to conventionally used CAOS systems.Web of Science1923art. no. 519
LiveCap: Real-time Human Performance Capture from Monocular Video
We present the first real-time human performance capture approach that
reconstructs dense, space-time coherent deforming geometry of entire humans in
general everyday clothing from just a single RGB video. We propose a novel
two-stage analysis-by-synthesis optimization whose formulation and
implementation are designed for high performance. In the first stage, a skinned
template model is jointly fitted to background subtracted input video, 2D and
3D skeleton joint positions found using a deep neural network, and a set of
sparse facial landmark detections. In the second stage, dense non-rigid 3D
deformations of skin and even loose apparel are captured based on a novel
real-time capable algorithm for non-rigid tracking using dense photometric and
silhouette constraints. Our novel energy formulation leverages automatically
identified material regions on the template to model the differing non-rigid
deformation behavior of skin and apparel. The two resulting non-linear
optimization problems per-frame are solved with specially-tailored
data-parallel Gauss-Newton solvers. In order to achieve real-time performance
of over 25Hz, we design a pipelined parallel architecture using the CPU and two
commodity GPUs. Our method is the first real-time monocular approach for
full-body performance capture. Our method yields comparable accuracy with
off-line performance capture techniques, while being orders of magnitude
faster
- …