GazeDirector: Fully articulated eye gaze redirection in video
We present GazeDirector, a new approach for eye gaze redirection that uses model-fitting. Our method first tracks the eyes by fitting a multi-part eye region model to video frames using analysis-by-synthesis, thereby recovering eye region shape, texture, pose, and gaze simultaneously. It then redirects gaze by 1) warping the eyelids from the original image using a model-derived flow field, and 2) rendering and compositing synthesized 3D eyeballs onto the output image in a photorealistic manner. GazeDirector allows us to change where people are looking without person-specific training data, and with full articulation, i.e. we can precisely specify new gaze directions in 3D. Quantitatively, we evaluate both model-fitting and gaze synthesis, with experiments for gaze estimation and redirection on the Columbia gaze dataset. Qualitatively, we compare GazeDirector against recent work on gaze redirection, showing better results, especially for large redirection angles. Finally, we demonstrate gaze redirection on YouTube videos by introducing new 3D gaze targets and by manipulating visual behavior.
ReDirTrans: Latent-to-Latent Translation for Gaze and Head Redirection
Learning-based gaze estimation methods require large amounts of training data
with accurate gaze annotations. Facing such demanding requirements of gaze data
collection and annotation, several image synthesis methods were proposed, which
successfully redirected gaze directions precisely given the assigned
conditions. However, these methods focused on changing gaze directions of the
images that only include eyes or restricted ranges of faces with low resolution
(less than ) to largely reduce interference from other attributes
such as hair, which limits application scenarios. To cope with this
limitation, we proposed a portable network, called ReDirTrans, achieving
latent-to-latent translation for redirecting gaze directions and head
orientations in an interpretable manner. ReDirTrans projects input latent
vectors into aimed-attribute embeddings only and redirects these embeddings
with assigned pitch and yaw values. Then both the initial and edited embeddings
are projected back (deprojected) to the initial latent space as residuals to
modify the input latent vectors by subtraction and addition, representing old
status removal and new status addition. The projection of aimed attributes only
and subtraction-addition operations for status replacement essentially mitigate
impacts on other attributes and the distribution of latent vectors. Thus, by
combining ReDirTrans with a pretrained fixed e4e-StyleGAN pair, we created
ReDirTrans-GAN, which enables accurately redirecting gaze in full-face images
with resolution while preserving other attributes such as
identity, expression, and hairstyle. Furthermore, we presented improvements for
the downstream learning-based gaze estimation task, using redirected samples as
dataset augmentation.
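The projection, redirection, and subtraction-addition steps described in the ReDirTrans abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the authors' network: the 5-D latent vector, the fixed linear `PROJECT` and `DEPROJECT` matrices (ReDirTrans learns its projector and deprojector), and the choice to redirect the embedding by a plain 3D rotation built from pitch and yaw.

```python
import math

# Hypothetical toy setup: a 5-D latent whose first three coordinates carry gaze.
PROJECT = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
]
# Deprojector taken as the transpose of the projector (an assumption).
DEPROJECT = [list(col) for col in zip(*PROJECT)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rotation(pitch, yaw):
    """3x3 rotation R_x(pitch) @ R_y(yaw), angles in radians."""
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy, 0.0, sy],
        [sp * sy, cp, -sp * cy],
        [-cp * sy, sp, cp * cy],
    ]


def redirect(latent, pitch, yaw):
    """Edit only the projected attribute, then patch the latent with residuals."""
    embedding = matvec(PROJECT, latent)                # project aimed attribute
    edited = matvec(rotation(pitch, yaw), embedding)   # redirect the embedding
    old_residual = matvec(DEPROJECT, embedding)        # old status, to subtract
    new_residual = matvec(DEPROJECT, edited)           # new status, to add
    return [w - old + new
            for w, old, new in zip(latent, old_residual, new_residual)]


latent = [0.2, -0.5, 1.0, 3.0, -4.0]
print(redirect(latent, pitch=0.3, yaw=-0.2))
```

In this toy setup the last two latent coordinates are never touched, which mirrors the abstract's point that projecting only the aimed attributes limits the impact on everything else.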
CUDA-GR: Controllable Unsupervised Domain Adaptation for Gaze Redirection
The aim of gaze redirection is to manipulate the gaze in an image to the
desired direction. However, existing methods are inadequate in generating
perceptually reasonable images. Advances in generative adversarial networks
have shown excellent results in generating photo-realistic images. However, they
still lack the ability to provide finer control over different image
attributes. To enable such fine-tuned control, one needs to obtain ground truth
annotations for the training data which can be very expensive. In this paper,
we propose an unsupervised domain adaptation framework, called CUDA-GR, that
learns to disentangle gaze representations from the labeled source domain and
transfers them to an unlabeled target domain. Our method enables fine-grained
control over gaze directions while preserving the appearance information of the
person. We show that the generated image-label pairs in the target domain are
effective in knowledge transfer and can boost the performance of the downstream
tasks. Extensive experiments on benchmark datasets show that the
proposed method can outperform state-of-the-art techniques in both quantitative
and qualitative evaluation.
GazeNeRF: 3D-Aware Gaze Redirection with Neural Radiance Fields
We propose GazeNeRF, a 3D-aware method for the task of gaze redirection.
Existing gaze redirection methods operate on 2D images and struggle to generate
3D consistent results. Instead, we build on the intuition that the face region
and eyeballs are separate 3D structures that move in a coordinated yet
independent fashion. Our method leverages recent advancements in conditional
image-based neural radiance fields and proposes a two-stream architecture that
predicts volumetric features for the face and eye regions separately. Rigidly
transforming the eye features via a 3D rotation matrix provides fine-grained
control over the desired gaze angle. The final, redirected image is then
attained via differentiable volume compositing. Our experiments show that this
architecture outperforms naively conditioned NeRF baselines as well as previous
state-of-the-art 2D gaze redirection methods in terms of redirection accuracy
and identity preservation.
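As a rough illustration of the volume compositing step mentioned in the GazeNeRF abstract, the sketch below implements the standard NeRF-style quadrature along a single ray. It is not GazeNeRF's two-stream architecture: the function name `composite_ray`, the use of RGB colors in place of volumetric features, and the sample values are all assumptions chosen to keep the example small.

```python
import math


def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one ray, front to back.

    densities: per-sample volume density (sigma >= 0)
    colors:    per-sample RGB triples
    deltas:    distance between consecutive samples
    Returns the composited RGB and the remaining transmittance.
    """
    transmittance = 1.0  # fraction of light not yet absorbed
    out = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha          # contribution of this sample
        out = [o + weight * c for o, c in zip(out, color)]
        transmittance *= 1.0 - alpha
    return out, transmittance


# Hypothetical samples: a thin red layer in front of a dense green one.
rgb, remaining = composite_ray(
    densities=[0.5, 20.0],
    colors=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    deltas=[0.1, 0.1],
)
print(rgb, remaining)
```

Every operation here is a smooth function of the densities and colors, which is why the same computation, written in an autodiff framework, lets gradients flow back to whatever predicts those values.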