Video face replacement
We present a method for replacing facial performances in video. Our approach accounts for differences in identity, visual appearance, speech, and timing between source and target videos. Unlike prior work, it does not require substantial manual operation or complex acquisition hardware, only single-camera video. We use a 3D multilinear model to track the facial performance in both videos. Using the corresponding 3D geometry, we warp the source to the target face and retime the source to match the target performance. We then compute an optimal seam through the video volume that maintains temporal consistency in the final composite. We showcase the use of our method on a variety of examples and present the result of a user study that suggests our results are difficult to distinguish from real video footage. National Science Foundation (U.S.) (Grant PHY-0835713); National Science Foundation (U.S.) (Grant DMS-0739255)
Applications of Face Analysis and Modeling in Media Production
Facial expressions play an important role in day-to-day communication as well as media production. This article surveys automatic facial analysis and modeling methods using computer vision techniques and their applications for media production. The authors give a brief overview of the psychology of face perception and then describe some of the applications of computer vision and pattern recognition applied to face recognition in media production. This article also covers the automatic generation of face models, which are used in movie and TV productions for special effects in order to manipulate people's faces or combine real actors with computer graphics.
Multi-Scale Capture of Facial Geometry and Motion
We present a novel multi-scale representation and acquisition method for the animation of high-resolution facial geometry and wrinkles. We first acquire a static scan of the face including reflectance data at the highest possible quality. We then augment a traditional marker-based facial motion-capture system with two synchronized video cameras to track expression wrinkles. The resulting model consists of high-resolution geometry, motion-capture data, and expression wrinkles in 2D parametric form. This combination represents the facial shape and its salient features at multiple scales. During motion synthesis the motion-capture data deforms the high-resolution geometry using a linear shell-based mesh-deformation method. The wrinkle geometry is added to the facial base mesh using nonlinear energy optimization. We present the results of our approach for performance replay as well as for wrinkle editing. Engineering and Applied Science
Surface Normal Deconvolution: Photometric Stereo for Optically Thick Translucent Objects
Computer Vision – ECCV 2014, 13th European Conference, Zurich, Switzerland, September 6-12, 2014
This paper presents a photometric stereo method that works for optically thick translucent objects exhibiting subsurface scattering. Our method builds on previous studies showing that subsurface scattering can be approximated as convolution with a blurring kernel. We extend this observation and show that the original surface normal convolved with the scattering kernel corresponds to the blurred surface normal that can be obtained by a conventional photometric stereo technique. Based on this observation, we cast the photometric stereo problem for optically thick translucent objects as a deconvolution problem, and develop a method to recover accurate surface normals. Experimental results on both synthetic and real-world scenes show the effectiveness of the proposed method.
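As an illustration of the "conventional photometric stereo technique" this abstract refers to, the following is a minimal sketch of the textbook Lambertian case for a single pixel, assuming exactly three known, linearly independent light directions and no shadows. It is not the paper's deconvolution method; the function names `solve_3x3` and `photometric_stereo_pixel` are hypothetical and chosen only for this example.

```python
import math


def solve_3x3(a, b):
    """Solve the 3x3 linear system a @ x = b with Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(a)
    if abs(d) < 1e-12:
        raise ValueError("light directions must be linearly independent")
    x = []
    for col in range(3):
        # Replace one column of the matrix with the right-hand side.
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        x.append(det(m) / d)
    return x


def photometric_stereo_pixel(lights, intensities):
    """Recover (albedo, unit normal) for one Lambertian pixel.

    Assumed image model: intensity_k = albedo * dot(lights[k], normal).
    lights is a list of three unit direction vectors (assumption: known).
    """
    g = solve_3x3(lights, intensities)  # g = albedo * normal
    albedo = math.sqrt(sum(c * c for c in g))
    if albedo == 0.0:
        raise ValueError("pixel is black under all lights; normal undefined")
    normal = [c / albedo for c in g]
    return albedo, normal


if __name__ == "__main__":
    # Synthetic check: render a pixel from a known normal, then recover it.
    lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
    true_normal = [0.36, 0.48, 0.8]
    true_albedo = 0.5
    observed = [true_albedo * sum(l * n for l, n in zip(light, true_normal))
                for light in lights]
    print(photometric_stereo_pixel(lights, observed))
```

For a translucent object, the normals recovered this way would be the blurred normals described in the abstract; recovering the sharp normals is the deconvolution step that the paper itself addresses.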
FaceVR: Real-Time Facial Reenactment and Eye Gaze Control in Virtual Reality
We introduce FaceVR, a novel method for gaze-aware facial reenactment in the Virtual Reality (VR) context. The key component of FaceVR is a robust algorithm to perform real-time facial motion capture of an actor who is wearing a head-mounted display (HMD), as well as a new data-driven approach for eye tracking from monocular videos. In addition to these face reconstruction components, FaceVR incorporates photo-realistic re-rendering in real time, thus allowing artificial modifications of face and eye appearances. For instance, we can alter facial expressions, change gaze directions, or remove the VR goggles in realistic re-renderings. In a live setup with a source and a target actor, we apply these newly introduced algorithmic components. We assume that the source actor is wearing a VR device, and we capture his facial expressions and eye movement in real time. For the target video, we mimic a similar tracking process; however, we use the source input to drive the animations of the target video, thus enabling gaze-aware facial reenactment. To render the modified target video on a stereo display, we augment our capture and reconstruction process with stereo data. In the end, FaceVR produces compelling results for a variety of applications, such as gaze-aware facial reenactment, reenactment in virtual reality, removal of VR goggles, and re-targeting of somebody's gaze direction in a video conferencing call.
Separable Subsurface Scattering
In this paper, we propose two real-time models for simulating subsurface scattering for a large variety of translucent materials, which need under 0.5 ms per frame to execute. This makes them a practical option for real-time production scenarios. Current state-of-the-art real-time approaches simulate subsurface light transport by approximating the radially symmetric non-separable diffusion kernel with a sum of separable Gaussians, which requires multiple (up to 12) 1D convolutions. In this work we relax the requirement of radial symmetry to approximate a 2D diffuse reflectance profile by a single separable kernel. We first show that low-rank approximations based on matrix factorization outperform previous approaches, but they still need several passes to get good results. To solve this, we present two different separable models: the first one yields a high-quality diffusion simulation, while the second one offers an attractive trade-off between physical accuracy and artistic control. Both allow rendering of subsurface scattering using only two 1D convolutions, reducing both execution time and memory consumption, while delivering results comparable to techniques with higher cost. Using our importance-sampling and jittering strategies, only seven samples per pixel are required. Our methods can be implemented as simple post-processing steps without intrusive changes to existing rendering pipelines.
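The claim that a separable kernel needs "only two 1D convolutions" can be checked with a small sketch. The code below filters an image with a horizontal pass followed by a vertical pass and compares the result against one full 2D convolution with the outer-product kernel. It assumes a plain normalized Gaussian as a stand-in kernel and clamped borders; it does not reproduce the paper's two scattering models, and all function names are hypothetical.

```python
import math


def gaussian_1d(radius, sigma):
    """Normalized 1D Gaussian taps for offsets -radius..radius."""
    taps = [math.exp(-(i * i) / (2.0 * sigma * sigma))
            for i in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def convolve_rows(image, taps):
    """1D convolution along each row, clamping reads at the borders."""
    radius = len(taps) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(taps):
                xs = min(max(x + k - radius, 0), width - 1)
                acc += w * image[y][xs]
            out[y][x] = acc
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def separable_filter(image, taps):
    """Horizontal pass, then vertical pass: two 1D convolutions."""
    horizontal = convolve_rows(image, taps)
    return transpose(convolve_rows(transpose(horizontal), taps))


def full_2d_filter(image, taps):
    """Reference: a single 2D convolution with the kernel taps[j] * taps[i]."""
    radius = len(taps) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for j, wy in enumerate(taps):
                ys = min(max(y + j - radius, 0), height - 1)
                for i, wx in enumerate(taps):
                    xs = min(max(x + i - radius, 0), width - 1)
                    acc += wy * wx * image[ys][xs]
            out[y][x] = acc
    return out


if __name__ == "__main__":
    taps = gaussian_1d(radius=2, sigma=1.0)
    image = [[0.0] * 7 for _ in range(7)]
    image[3][3] = 1.0  # a single lit pixel
    two_pass = separable_filter(image, taps)
    reference = full_2d_filter(image, taps)
    worst = max(abs(two_pass[y][x] - reference[y][x])
                for y in range(7) for x in range(7))
    print("largest difference between the two results:", worst)
```

For a kernel of width n, the two-pass version reads 2n samples per pixel instead of n², which is the source of the cost saving; a kernel that is not an exact outer product would first need a rank-1 approximation, and that approximation step is not shown here.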
Applications of face analysis and modelling in media production: Overview of the state of the art
Facial expressions play an important role in day-to-day communication as well as media production. This article surveys automatic facial analysis and modeling methods using computer vision techniques and their applications for media production. The authors give a brief overview of the psychology of face perception and then describe some of the applications of computer vision and pattern recognition applied to face recognition in media production. This article also covers the automatic generation of face models, which are used in movie and TV productions for special effects in order to manipulate people's faces or combine real actors with computer graphics.
Text-Guided Generation and Editing of Compositional 3D Avatars
Our goal is to create a realistic 3D facial avatar with hair and accessories
using only a text description. While this challenge has attracted significant
recent interest, existing methods either lack realism, produce unrealistic
shapes, or do not support editing, such as modifications to the hairstyle. We
argue that existing methods are limited because they employ a monolithic
modeling approach, using a single representation for the head, face, hair, and
accessories. Our observation is that the hair and face, for example, have very
different structural qualities that benefit from different representations.
Building on this insight, we generate avatars with a compositional model, in
which the head, face, and upper body are represented with traditional 3D
meshes, and the hair, clothing, and accessories with neural radiance fields
(NeRF). The model-based mesh representation provides a strong geometric prior
for the face region, improving realism while enabling editing of the person's
appearance. By using NeRFs to represent the remaining components, our method is
able to model and synthesize parts with complex geometry and appearance, such
as curly hair and fluffy scarves. Our novel system synthesizes these
high-quality compositional avatars from text descriptions. The experimental
results demonstrate that our method, Text-guided generation and Editing of
Compositional Avatars (TECA), produces avatars that are more realistic than
those of recent methods while being editable because of their compositional
nature. For example, our TECA enables the seamless transfer of compositional
features like hairstyles, scarves, and other accessories between avatars. This
capability supports applications such as virtual try-on. Comment: Home page: https://yfeng95.github.io/tec