Improving Facial Analysis and Performance Driven Animation through Disentangling Identity and Expression
We present techniques for improving performance driven facial animation,
emotion recognition, and facial key-point or landmark prediction using learned
identity invariant representations. Established approaches to these problems
can work well if sufficient examples and labels for a particular identity are
available and factors of variation are highly controlled. However, labeled
examples of facial expressions, emotions and key-points for new individuals are
difficult and costly to obtain. In this paper we improve the ability of
techniques to generalize to new and unseen individuals by explicitly modeling
previously seen variations related to identity and expression. We use a
weakly-supervised approach in which identity labels are used to learn the
different factors of variation linked to identity separately from factors
related to expression. We show how probabilistic modeling of these sources of
variation allows one to learn identity-invariant representations for
expressions which can then be used to identity-normalize various procedures for
facial expression analysis and animation control. We also show how to extend
the widely used techniques of active appearance models and constrained local
models through replacing the underlying point distribution models which are
typically constructed using principal component analysis with
identity-expression factorized representations. We present a wide variety of
experiments in which we consistently improve performance on emotion
recognition, markerless performance-driven facial animation and facial
key-point tracking. Comment: to appear in Image and Vision Computing Journal (IMAVIS)
Conditional Regressive Random Forest Stereo-based Hand Depth Recovery
This paper introduces Conditional Regressive Random Forest (CRRF), a novel method that combines a closed-form Conditional Random Field (CRF), using learned weights, and a Regressive Random Forest (RRF) that employs adaptively selected expert trees. CRRF is used to estimate a depth image of a hand given stereo RGB inputs. CRRF uses a novel superpixel-based regression framework that takes advantage of the smoothness of the hand’s depth surface. An RRF unary term adaptively selects different stereo-matching measures as it implicitly determines matching pixels in a coarse-to-fine manner. CRRF also includes a pair-wise term that encourages smoothness between similar adjacent superpixels. Experimental results show that CRRF can produce high-quality depth maps, even using an inexpensive RGB stereo camera, and that it produces state-of-the-art results for hand depth estimation.
Tex2Shape: Detailed Full Human Body Geometry From a Single Image
We present a simple yet effective method to infer detailed full human body shape from only a single photograph. Our model can infer full-body shape including face, hair, and clothing including wrinkles at interactive frame-rates. Results feature details even on parts that are occluded in the input image. Our main idea is to turn shape regression into an aligned image-to-image translation problem. The input to our method is a partial texture map of the visible region obtained from off-the-shelf methods. From a partial texture, we estimate detailed normal and vector displacement maps, which can be applied to a low-resolution smooth body model to add detail and clothing. Despite being trained purely with synthetic data, our model generalizes well to real-world photographs. Numerous results demonstrate the versatility and robustness of our method.
Recovering facial shape using a statistical model of surface normal direction
In this paper, we show how a statistical model of facial shape can be embedded within a shape-from-shading algorithm. We describe how facial shape can be captured using a statistical model of variations in surface normal direction. To construct this model, we make use of the azimuthal equidistant projection to map the distribution of surface normals from the polar representation on a unit sphere to Cartesian points on a local tangent plane. The distribution of surface normal directions is captured using the covariance matrix for the projected point positions. The eigenvectors of the covariance matrix define the modes of shape-variation in the fields of transformed surface normals. We show how this model can be trained using surface normal data acquired from range images and how to fit the model to intensity images of faces using constraints on the surface normal direction provided by Lambert's law. We demonstrate that the combination of a global statistical constraint and local irradiance constraint yields an efficient and accurate approach to facial shape recovery and is capable of recovering fine local surface details. We assess the accuracy of the technique on a variety of images with ground truth and real-world images.
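The azimuthal equidistant projection mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `aep_forward` and `aep_inverse` are hypothetical, and the projection is centered at the pole (0, 0, 1) for simplicity, whereas the abstract only says a local tangent plane is used. The sketch maps a unit surface normal to a 2D point whose distance from the origin equals the angle between the normal and the center direction, and then maps it back.

```python
import math

def aep_forward(n):
    """Map a unit normal (nx, ny, nz) to a 2D point on the tangent plane at (0, 0, 1).

    Assumption: the projection center is the pole (0, 0, 1).
    """
    nx, ny, nz = n
    # Angular distance from the pole, clamped against rounding error.
    theta = math.acos(max(-1.0, min(1.0, nz)))
    # For a unit vector this equals sin(theta).
    s = math.hypot(nx, ny)
    if s < 1e-12:
        # The normal is at the pole. (The antipode (0, 0, -1) has no unique
        # image under this projection and is not handled by this sketch.)
        return (0.0, 0.0)
    # Keep the azimuth, set the radius to the angular distance.
    return (theta * nx / s, theta * ny / s)

def aep_inverse(p):
    """Map a tangent-plane point back to a unit normal on the sphere."""
    x, y = p
    r = math.hypot(x, y)  # radius equals the angle from the pole
    if r < 1e-12:
        return (0.0, 0.0, 1.0)
    return (math.sin(r) * x / r, math.sin(r) * y / r, math.cos(r))

# Round trip for one example normal.
normal = (0.6, 0.0, 0.8)
point = aep_forward(normal)
recovered = aep_inverse(point)
```

Because the projected points live in an ordinary Cartesian plane, their mean and covariance can be computed with standard linear statistics, which is the property the abstract relies on when it builds a covariance matrix over the projected positions.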