
    HeadOn: Real-time Reenactment of Human Portrait Videos

    We propose HeadOn, the first real-time source-to-target reenactment approach for complete human portrait videos that enables transfer of torso and head motion, face expression, and eye gaze. Given a short RGB-D video of the target actor, we automatically construct a personalized geometry proxy that embeds a parametric head, eye, and kinematic torso model. A novel real-time reenactment algorithm employs this proxy to photo-realistically map the captured motion from the source actor to the target actor. On top of the coarse geometric proxy, we propose a video-based rendering technique that composites the modified target portrait video via view- and pose-dependent texturing, and creates photo-realistic imagery of the target actor under novel torso and head poses, facial expressions, and gaze directions. To this end, we propose robust tracking of the face and torso of the source actor. We extensively evaluate our approach and show significant improvements, enabling much greater flexibility in creating realistic reenacted output videos. Comment: Presented at SIGGRAPH 2018; video: https://www.youtube.com/watch?v=7Dg49wv2c_g
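    The view- and pose-dependent texturing step can be illustrated with a small, hedged sketch: the idea (simplified here, not HeadOn's exact scheme) is to blend the textures captured from several views of the target actor, weighting each by how closely its viewing direction matches the novel output view. All names and the weighting function below are assumptions for illustration.

    import numpy as np

    def blend_view_dependent_textures(textures, view_dirs, target_dir, sharpness=8.0):
        # textures   : list of HxWx3 float arrays, one per captured view of the target actor
        # view_dirs  : list of unit 3-vectors, viewing direction of each captured texture
        # target_dir : unit 3-vector, viewing direction of the novel output pose
        dirs = np.stack(view_dirs)                                   # K x 3
        # Weight each captured view by its cosine similarity to the target view,
        # sharpened so that well-aligned views dominate the blend.
        cos = np.clip(dirs @ np.asarray(target_dir), 0.0, 1.0)
        weights = cos ** sharpness
        weights = weights / (weights.sum() + 1e-8)
        blended = np.zeros_like(textures[0])
        for w, tex in zip(weights, textures):
            blended += w * tex
        return blended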

    GazeDirector: Fully Articulated Eye Gaze Redirection in Video

    We present GazeDirector, a new approach for eye gaze redirection that uses model-fitting. Our method first tracks the eyes by fitting a multi-part eye region model to video frames using analysis-by-synthesis, thereby recovering eye region shape, texture, pose, and gaze simultaneously. It then redirects gaze by 1) warping the eyelids from the original image using a model-derived flow field, and 2) rendering and compositing synthesized 3D eyeballs onto the output image in a photorealistic manner. GazeDirector allows us to change where people are looking without person-specific training data, and with full articulation, i.e., we can precisely specify new gaze directions in 3D. Quantitatively, we evaluate both model-fitting and gaze synthesis, with experiments for gaze estimation and redirection on the Columbia gaze dataset. Qualitatively, we compare GazeDirector against recent work on gaze redirection, showing better results especially for large redirection angles. Finally, we demonstrate gaze redirection on YouTube videos by introducing new 3D gaze targets and by manipulating visual behavior.
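    A minimal sketch of how the two redirection steps could be composed for a single frame, assuming a precomputed model-derived flow field and a rendered eyeball layer with alpha; the function and argument names are illustrative, not GazeDirector's actual API.

    import cv2
    import numpy as np

    def redirect_gaze_frame(frame, flow, eyeball_rgba):
        # frame        : HxWx3 uint8 input image
        # flow         : HxWx2 float32 flow field (dx, dy); each output pixel samples the
        #                input at (x + dx, y + dy)
        # eyeball_rgba : HxWx4 float32 render of the synthetic eyeballs, alpha in [0, 1]
        h, w = frame.shape[:2]
        grid_x, grid_y = np.meshgrid(np.arange(w, dtype=np.float32),
                                     np.arange(h, dtype=np.float32))
        # Step 1: warp the eyelid region according to the flow field.
        warped = cv2.remap(frame, grid_x + flow[..., 0], grid_y + flow[..., 1],
                           interpolation=cv2.INTER_LINEAR)
        # Step 2: alpha-composite the rendered 3D eyeballs on top of the warped frame.
        alpha = eyeball_rgba[..., 3:4]
        out = alpha * eyeball_rgba[..., :3] + (1.0 - alpha) * warped.astype(np.float32)
        return out.astype(np.uint8)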

    CUDA-GR: Controllable Unsupervised Domain Adaptation for Gaze Redirection

    The aim of gaze redirection is to manipulate the gaze in an image towards a desired direction. However, existing methods are inadequate at generating perceptually reasonable images. Advances in generative adversarial networks have shown excellent results in generating photo-realistic images, but they still lack the ability to provide finer control over different image attributes. To enable such fine-grained control, one needs ground-truth annotations for the training data, which can be very expensive to obtain. In this paper, we propose an unsupervised domain adaptation framework, called CUDA-GR, that learns to disentangle gaze representations from the labeled source domain and transfers them to an unlabeled target domain. Our method enables fine-grained control over gaze directions while preserving the appearance information of the person. We show that the generated image-label pairs in the target domain are effective for knowledge transfer and can boost the performance of downstream tasks. Extensive experiments on benchmark datasets show that the proposed method outperforms state-of-the-art techniques in both quantitative and qualitative evaluation.
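    As a rough, assumed illustration of how such generated image-label pairs could boost a downstream task, the snippet below trains a gaze estimator on target-domain images synthesized for assigned gaze labels; the generator and estimator interfaces are hypothetical placeholders, not the CUDA-GR code.

    import torch
    import torch.nn.functional as F

    def train_step_on_generated(gaze_estimator, generator, appearance_imgs, assigned_gazes, optimizer):
        # generator(img, gaze) is assumed to return a target-domain image that keeps the
        # person's appearance from `img` but shows the assigned gaze direction.
        with torch.no_grad():
            synthetic = generator(appearance_imgs, assigned_gazes)
        pred = gaze_estimator(synthetic)                     # predicted (pitch, yaw) per image
        loss = F.l1_loss(pred, assigned_gazes)               # supervise with the assigned labels
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()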

    High-Fidelity Eye Animatable Neural Radiance Fields for Human Face

    Face rendering using neural radiance fields (NeRF) is a rapidly developing research area in computer vision. While recent methods primarily focus on controlling facial attributes such as identity and expression, they often overlook the crucial aspect of modeling eyeball rotation, which is important for various downstream tasks. In this paper, we aim to learn a face NeRF model that is sensitive to eye movements from multi-view images. We address two key challenges in eye-aware face NeRF learning: how to effectively capture eyeball rotation for training and how to construct a manifold for representing eyeball rotation. To accomplish this, we first fit FLAME, a well-established parametric face model, to the multi-view images considering multi-view consistency. Subsequently, we introduce a new Dynamic Eye-aware NeRF (DeNeRF). DeNeRF transforms 3D points from different views into a canonical space to learn a unified face NeRF model. We design an eye deformation field for the transformation, including rigid transformation, e.g., eyeball rotation, and non-rigid transformation. Through experiments conducted on the ETH-XGaze dataset, we demonstrate that our model is capable of generating high-fidelity images with accurate eyeball rotation and non-rigid periocular deformation, even under novel viewing angles. Furthermore, we show that utilizing the rendered images can effectively enhance gaze estimation performance. Comment: Under review.
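    To make the eye deformation idea concrete, here is a hedged toy sketch (not the paper's actual field): 3D sample points near the eyeball are mapped into the canonical space by undoing the observed eyeball rotation about the eyeball center, and this rigid part is blended with the untouched points so it fades smoothly into the non-rigid surroundings. All names and the blending scheme are assumptions.

    import numpy as np

    def to_canonical(points, eye_center, R_gaze, blend_weight):
        # points       : Nx3 sample points in the deformed (observed) space
        # eye_center   : 3-vector, eyeball rotation center
        # R_gaze       : 3x3 rotation matrix of the observed eyeball rotation
        # blend_weight : N-vector in [0, 1], 1 inside the eyeball, falling to 0 outside
        # Undo the eyeball rotation about its center: right-multiplying row vectors by
        # R_gaze applies R_gaze.T, i.e. the inverse rotation.
        rigid = (points - eye_center) @ R_gaze + eye_center
        # Blend the rigidly-undone points with the original points.
        w = blend_weight[:, None]
        return w * rigid + (1.0 - w) * points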

    A Differential Approach for Gaze Estimation

    Non-invasive gaze estimation methods usually regress gaze directions directly from a single face or eye image. However, due to important variabilities in eye shapes and inner eye structures amongst individuals, universal models obtain limited accuracy, and their outputs usually exhibit high variance as well as subject-dependent biases. Therefore, accuracy is usually increased through calibration, allowing gaze predictions for a subject to be mapped to his/her actual gaze. In this paper, we introduce a novel image differential method for gaze estimation. We propose to directly train a differential convolutional neural network to predict the gaze differences between two eye input images of the same subject. Then, given a set of subject-specific calibration images, we can use the inferred differences to predict the gaze direction of a novel eye sample. The assumption is that by allowing the comparison between two eye images, annoyance factors (alignment, eyelid closing, illumination perturbations) which usually plague single-image prediction methods can be much reduced, allowing better prediction altogether. Experiments on three public datasets validate our approach, which consistently outperforms state-of-the-art methods even when using only one calibration sample or when the latter methods are followed by subject-specific gaze adaptation. Comment: Extension of our paper "A differential approach for gaze estimation with calibration" (BMVC 2018); submitted to PAMI on Aug. 7, 2018; accepted as a PAMI short paper in Dec. 2019, in IEEE Transactions on Pattern Analysis and Machine Intelligence.
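    A minimal sketch (with hypothetical names) of how such a differential predictor could be used at inference time: given calibration images with known gaze and a trained network that outputs the gaze difference between two eye crops of the same subject, the gaze of a new sample is obtained by adding the predicted difference to each calibration label and averaging.

    import numpy as np

    def predict_gaze(diff_net, query_img, calib_imgs, calib_gazes):
        # diff_net(img_a, img_b) is assumed to return gaze(img_b) - gaze(img_a)
        # as a length-2 array of (pitch, yaw) in radians.
        estimates = []
        for img, gaze in zip(calib_imgs, calib_gazes):
            delta = diff_net(img, query_img)          # predicted gaze(query) - gaze(calib)
            estimates.append(np.asarray(gaze) + delta)
        # Average the per-calibration-sample estimates; with one calibration
        # sample this reduces to a single shifted prediction.
        return np.mean(estimates, axis=0)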

    Effects of Character Guide in Immersive Virtual Reality Stories

    Bringing cinematic experiences from traditional film screens into Virtual Reality (VR) has become an increasingly popular form of entertainment in recent years. VR provides viewers an unprecedented film experience that allows them to freely explore the environment and even interact with virtual props and characters. For the audience, this kind of experience raises their sense of presence in a different world, and may even stimulate their full immersion in story scenarios. However, unlike traditional film-making, where the audience is completely passive and follows the director's storytelling decisions, the greater freedom in VR might cause viewers to get lost halfway through watching the series of events that builds up the story. Therefore, striking a balance between user interaction and narrative progression is a big challenge for filmmakers. To assist in organizing the research space, we presented a media review and a resulting framework to characterize the primary differences among variations of film, media, games, and VR storytelling. The evaluation in particular provided us with knowledge that was closely associated with story-progression strategies and gaze redirection methods for interactive content in the commercial domain. Following the existing VR storytelling framework, we then approached the problem of guiding the audience through the major events of a story by introducing a virtual character as a travel companion who assists in directing the viewer's focus to the target scenes. The presented research explored a new technique that allowed a separate virtual character to be overlaid on top of an existing 360-degree video such that the added character reacts based on head-tracking data to help indicate to the viewer the core focal content of the story. The motivation behind this research is to assist directors in using a virtual guiding character to increase the effectiveness of VR storytelling, ensuring that viewers fully understand the story by completing a sequence of events, and possibly realize a rich literary experience. To assess the effectiveness of this technique, we performed a controlled experiment by applying the method in three immersive narrative experiences, each with a control condition that was free from guidance. The experiment compared three variations of the character guide: 1) no guide; 2) a guide with an art style similar to the style of the video design; and 3) a character guide with a dissimilar style. All participants viewed the narrative experiences to test whether a similar art style led to better gaze behaviors, with a higher likelihood of falling on the intended focus regions of the 360-degree range of the Virtual Environment (VE). From the experiment, we concluded that adding a virtual character that was independent from the narrative had limited effects on users' gaze performance when watching an interactive story in VR. Furthermore, the implemented character's art style made very little difference to users' gaze performance as well as their level of viewing satisfaction. The primary reason could be limitations of the implementation design. Besides this, the guiding body language designed for an animal character caused some confusion for a number of participants viewing the stories.
    In the end, the character guide approaches still provided insights for future directors and designers into how to draw the viewers' attention to a target point within a narrative VE, including what can work well and what should be avoided.

    ReDirTrans: Latent-to-Latent Translation for Gaze and Head Redirection

    Learning-based gaze estimation methods require large amounts of training data with accurate gaze annotations. Facing such demanding requirements of gaze data collection and annotation, several image synthesis methods were proposed, which successfully redirected gaze directions precisely given the assigned conditions. However, these methods focused on changing gaze directions of images that only include eyes or restricted ranges of faces with low resolution (less than 128×128) to largely reduce interference from other attributes such as hair, which limits application scenarios. To cope with this limitation, we proposed a portable network, called ReDirTrans, achieving latent-to-latent translation for redirecting gaze directions and head orientations in an interpretable manner. ReDirTrans projects input latent vectors into aimed-attribute embeddings only and redirects these embeddings with assigned pitch and yaw values. Then both the initial and edited embeddings are projected back (deprojected) to the initial latent space as residuals to modify the input latent vectors by subtraction and addition, representing old status removal and new status addition. The projection of aimed attributes only and the subtraction-addition operations for status replacement essentially mitigate impacts on other attributes and on the distribution of latent vectors. Thus, by combining ReDirTrans with a pretrained, fixed e4e-StyleGAN pair, we created ReDirTrans-GAN, which enables accurately redirecting gaze in full-face images at 1024×1024 resolution while preserving other attributes such as identity, expression, and hairstyle. Furthermore, we presented improvements on the downstream learning-based gaze estimation task, using redirected samples for dataset augmentation.
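    The subtraction-addition edit in latent space can be summarized with a hedged sketch; the module names (project, deproject, rotate) are placeholders standing in for the paper's learned components, not its actual interface.

    import torch

    def redirect_latent(z: torch.Tensor, project, deproject, rotate, pitch, yaw):
        # z          : input latent code from a pretrained encoder (e.g. e4e)
        # project    : maps z to the gaze/head-pose embedding only
        # deproject  : maps an embedding back into the latent space as a residual
        # rotate     : applies the assigned pitch/yaw to an embedding
        e_old = project(z)                     # current gaze/head-pose embedding
        e_new = rotate(e_old, pitch, yaw)      # embedding with the desired direction
        # Remove the old status and add the new one as residuals in latent space,
        # leaving the other attributes encoded in z untouched.
        return z - deproject(e_old) + deproject(e_new)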