input depth map & 2D features data-driven tracking our tracking data-driven retargeting our retargeting Figure 1: Our adaptive tracking model conforms to the input expressions on-the-fly, producing a better fit to the user than state-of-the-art data driven techniques [Weise et al. 2011] which are confined to learned motion priors and generate plausible but not accurate tracking. We introduce a real-time and calibration-free facial performance capture framework based on a sensor with video and depth input. In this framework, we develop an adaptive PCA model using shape correctives that adjust on-the-fly to the actor’s expressions through incremental PCA-based learning. Since the fitting of the adaptive model progressively improves during the performance, we do not require an extra capture or training session to build this model. As a result, the system is highly deployable and easy to use: it can faithfully track any individual, starting from just a single face scan of the subject in a neutral pose. Like many real-time methods, we use a linear subspace to cope with incomplete input data and fast motion. To boost the training of our tracking model with reliable samples, we use a well-trained 2D facial feature tracker on the input video and an efficient mesh deformation algorithm to snap the result of the previous step to high frequency details in visible depth map regions. We show that the combination of dense depth maps and texture features around eyes and lips is essential in capturing natural dialogues and nuanced actor-specific emotions. We demonstrate that using an adaptive PCA model not only improves the fitting accuracy for tracking but also increases the expressiveness of the retargeted character
To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.