DeepSketch2Face: A Deep Learning Based Sketching System for 3D Face and Caricature Modeling
Face modeling has received much attention in the field of visual computing.
There exist many scenarios, including cartoon characters, avatars for social
media, 3D face caricatures, as well as face-related art and design, where
low-cost interactive face modeling is a popular approach, especially among
amateur users. In this paper, we propose a deep learning based sketching system
for 3D face and caricature modeling. This system has a labor-efficient
sketching interface that allows the user to draw freehand, imprecise yet
expressive 2D lines representing the contours of facial features. A novel
CNN-based deep regression network is designed for inferring 3D face models from 2D
sketches. Our network fuses both CNN and shape based features of the input
sketch, and has two independent branches of fully connected layers generating
independent subsets of coefficients for a bilinear face representation. Our
system also supports gesture-based interactions for users to further manipulate
initial face models. Both user studies and numerical results indicate that our
sketching system can help users create face models quickly and effectively. A
significantly expanded face database with diverse identities, expressions and
levels of exaggeration is constructed to promote further research and
evaluation of face modeling techniques.
Comment: 12 pages, 16 figures, to appear in SIGGRAPH 201
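The described pipeline (fused CNN and shape features feeding two independent fully connected branches, whose coefficient subsets contract a bilinear face representation) can be sketched numerically. All sizes and weights below are illustrative stand-ins, not the paper's trained network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): a 64-D fused sketch feature,
# 50 identity and 25 expression coefficients, 300 mesh vertices.
D_FEAT, N_ID, N_EXP, N_VERT = 64, 50, 25, 300

# Fused feature: CNN feature concatenated with a shape-based feature.
fused = rng.standard_normal(D_FEAT)

# Two independent fully connected branches, one per coefficient subset.
W_id = rng.standard_normal((N_ID, D_FEAT)) * 0.1
W_exp = rng.standard_normal((N_EXP, D_FEAT)) * 0.1
w_id = W_id @ fused    # identity coefficients
w_exp = W_exp @ fused  # expression coefficients

# Bilinear face representation: a core tensor contracted with both
# coefficient vectors yields the 3D vertex positions of the face.
core = rng.standard_normal((3 * N_VERT, N_ID, N_EXP)) * 0.01
verts = np.einsum('vij,i,j->v', core, w_id, w_exp).reshape(N_VERT, 3)
```

Because the two branches are independent, identity and expression coefficients can be regressed (and later manipulated) separately.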
3D Face Reconstruction And Emotion Analytics With Part-Based Morphable Models
3D face reconstruction and facial expression analytics using 3D facial data are new
and active research topics in computer graphics and computer vision. In this proposal, we first
review the background for emotion analytics using 3D morphable face models, including
geometry feature-based methods, statistical model-based methods, and more advanced
deep learning-based methods. Then, we introduce a novel 3D face modeling and reconstruction
solution that robustly and accurately acquires 3D face models from a pair of images
captured by a single smartphone camera. Two selfie photos of a subject, taken from the
front and side are used to guide our Non-Negative Matrix Factorization (NMF) induced
part-based face model to iteratively reconstruct an initial 3D face of the subject. Then, an
iterative detail-updating method is applied to the initially generated 3D face to reconstruct
facial details by optimizing lighting parameters and local depths. Our iterative 3D
face reconstruction method permits fully automatic registration of a part-based face representation
to the acquired face data and the detailed 2D/3D features to build a high-quality
3D face model. The NMF part-based face representation, learned from a 3D face database,
alternately facilitates effective global fitting and adaptive local-detail fitting. Our system
is flexible and allows users to conduct the capture in any uncontrolled environment. We
demonstrate the capability of our method by allowing users to capture and reconstruct their
3D faces by themselves.
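The part-based representation rests on non-negative matrix factorization, which factors a non-negative data matrix into a non-negative basis (the parts) and coefficients. A minimal numpy sketch with Lee-Seung multiplicative updates, on toy data of illustrative size rather than the thesis's face database:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "face database": each column is a flattened non-negative face
# sample; 200 dims x 40 faces and 8 parts are illustrative sizes.
V = np.abs(rng.standard_normal((200, 40)))
k = 8

W = np.abs(rng.standard_normal((200, k)))  # part basis
H = np.abs(rng.standard_normal((k, 40)))   # per-face coefficients

err0 = np.linalg.norm(V - W @ H)
eps = 1e-9
for _ in range(100):
    # Lee-Seung multiplicative updates keep both factors non-negative
    # and monotonically reduce the Frobenius reconstruction error.
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)
err1 = np.linalg.norm(V - W @ H)
```

The non-negativity constraint is what makes the learned basis vectors behave like additive face parts, which in turn enables the local, part-wise detail fitting described above.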
Based on the 3D face model reconstruction, we can analyze the facial expression and
the related emotion in 3D space. We present a novel approach to analyzing facial expressions
in images, together with a quantitative information-visualization scheme for exploring this
type of visual data. From the result reconstructed with the NMF part-based morphable 3D face
model, basis parameters and a displacement map are extracted as features for facial emotion
analysis and visualization. Based upon the features, two Support Vector Regressions (SVRs)
are trained to determine the fuzzy Valence-Arousal (VA) values to quantify the emotions.
The continuously changing emotion status can be intuitively analyzed by visualizing the
VA values in VA-space. Our emotion analysis and visualization system, based on 3D NMF
morphable face model, detects expressions robustly from various head poses, face sizes and
lighting conditions, and fully automatically computes the VA values from images or video
sequences with various facial expressions. To evaluate our novel method, we test our
system on publicly available databases and evaluate the emotion analysis and visualization
results. We also apply our method to quantifying emotion changes during motivational interviews.
These experiments and applications demonstrate the effectiveness and accuracy of
our method.
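To make the feature-to-VA mapping concrete, here is a small numpy sketch that uses kernel ridge regression as a lightweight stand-in for the two SVRs; the features, labels, and kernel parameters are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented features standing in for the NMF basis parameters plus a
# displacement-map descriptor; VA labels squashed into [-1, 1].
X = rng.standard_normal((120, 16))
valence = np.tanh(X[:, 0] + 0.5 * X[:, 1])
arousal = np.tanh(X[:, 2] - 0.3 * X[:, 3])

def rbf(A, B, gamma=0.1):
    # Radial basis function kernel between two sets of feature vectors.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit(X, y, lam=1e-2):
    # Closed-form kernel ridge fit (stand-in for training an SVR).
    K = rbf(X, X)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

alpha_v, alpha_a = fit(X, valence), fit(X, arousal)

# Predict the (valence, arousal) pair for a new face feature vector;
# plotting such pairs over time traces a path through VA-space.
x_new = rng.standard_normal((1, 16))
k_new = rbf(x_new, X)
va = np.array([(k_new @ alpha_v)[0], (k_new @ alpha_a)[0]])
```

Two separate regressors are kept, as in the described system, so that valence and arousal can be predicted independently.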
In order to improve expression recognition accuracy, we present a facial expression
recognition approach with a 3D Mesh Convolutional Neural Network (3DMCNN) and a visual
analytics-guided 3DMCNN design and optimization scheme. The geometric properties of the
surface are computed using the 3D face model of a subject with facial expressions. Instead of
using regular Convolutional Neural Network (CNN) to learn intensities of the facial images,
we convolve the geometric properties on the surface of the 3D model using 3DMCNN. We
design a geodesic distance-based convolution method to overcome the difficulties arising from
the irregular sampling of the face surface mesh. We further present an interactive visual
analytics scheme for designing and modifying the network, which analyzes the learned
features and clusters similar nodes in the 3DMCNN. By removing low-activity nodes in the network,
the performance of the network is greatly improved. We compare our method with the regular CNN-based method by interactively visualizing each layer of the networks and
analyzing the effectiveness of our method through representative cases. Testing on public
datasets, our method achieves a higher recognition accuracy than traditional image-based
CNN and other 3D CNNs. The presented framework, including the 3DMCNN and interactive
visual analytics of the CNN, can be extended to other applications.
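The idea behind a geodesic distance-based convolution can be sketched on a toy mesh: approximate geodesic distances with Dijkstra over the edge graph, then form a distance-weighted average of vertex features within a geodesic radius. The connectivity, features, and kernel below are all hypothetical, not the paper's actual operator:

```python
import heapq
import numpy as np

# Toy mesh connectivity as an adjacency list with edge lengths; each
# vertex carries a scalar feature (e.g. a curvature value).
edges = {0: [(1, 1.0), (2, 1.0)],
         1: [(0, 1.0), (3, 1.0)],
         2: [(0, 1.0), (3, 1.5)],
         3: [(1, 1.0), (2, 1.5), (4, 1.0)],
         4: [(3, 1.0)]}
feat = np.array([0.2, 0.5, 0.1, 0.9, 0.3])

def geodesic_dists(src, edges, n):
    # Dijkstra over the edge graph approximates geodesic distance.
    dist = np.full(n, np.inf)
    dist[src] = 0.0
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in edges[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

def geo_conv(center, radius, edges, feat, kernel):
    # Distance-weighted average of features in a geodesic neighborhood,
    # mimicking one convolution tap on the irregularly sampled mesh.
    d = geodesic_dists(center, edges, len(feat))
    mask = d <= radius
    w = kernel(d[mask])
    return (w * feat[mask]).sum() / w.sum()

out = geo_conv(0, 1.5, edges, feat, lambda d: np.exp(-d))
```

Using geodesic rather than Euclidean distance keeps the neighborhood on the surface, which is what lets the convolution cope with the mesh's irregular sampling.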
GANFIT: Generative adversarial network fitting for high fidelity 3D face reconstruction
In the past few years, a lot of work has been done towards reconstructing the 3D facial structure from single images by capitalizing on the power of Deep Convolutional Neural Networks (DCNNs). In the most recent works, differentiable renderers were employed in order to learn the relationship between the facial identity features and the parameters of a 3D morphable model for shape and texture. The texture features either correspond to components of a linear texture space or are learned by auto-encoders directly from in-the-wild images. In all cases, the facial texture reconstruction quality of the state-of-the-art methods is still not capable of modeling textures in high fidelity. In this paper, we take a radically different approach and harness the power of Generative Adversarial Networks (GANs) and DCNNs in order to reconstruct the facial texture and shape from single images. That is, we utilize GANs to train a very powerful generator of facial texture in UV space. Then, we revisit the original 3D Morphable Model (3DMM) fitting approaches, making use of non-linear optimization to find the optimal latent parameters that best reconstruct the test image, but under a new perspective. We optimize the parameters with the supervision of pretrained deep identity features through our end-to-end differentiable framework. We demonstrate excellent results in photorealistic and identity-preserving 3D face reconstructions and achieve, for the first time to the best of our knowledge, facial texture reconstruction with high-frequency details.
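The fitting idea, optimizing latent parameters under the supervision of pretrained identity features, can be illustrated with linear stand-ins for the generator and the identity network. GANFIT's actual components are deep networks plus a differentiable renderer; this toy only shows the shape of the optimization loop:

```python
import numpy as np

rng = np.random.default_rng(3)

# Linear stand-ins (hypothetical): G maps a 32-D latent code to a 256-D
# "image"; F is a frozen "identity network" extracting 64-D features.
G = rng.standard_normal((256, 32)) * 0.1
F = rng.standard_normal((64, 256)) * 0.1

z_true = rng.standard_normal(32)
target_feat = F @ (G @ z_true)   # identity features of the test image

# Gradient descent on the latent parameters so that the generated
# image's identity features match the target's, mirroring the
# end-to-end differentiable fitting framework.
z = np.zeros(32)
lr = 0.1
for _ in range(2000):
    resid = F @ (G @ z) - target_feat
    grad = G.T @ (F.T @ resid)   # gradient of 0.5 * ||resid||^2
    z -= lr * grad

err = np.linalg.norm(F @ (G @ z) - target_feat)
```

The key design point is that the identity network stays frozen and only the latent parameters are updated, so gradients flow through the (here linear, in GANFIT deep and rendered) generation pipeline.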
Editing faces in videos
Editing faces in movies is of interest in the special effects industry. We aim at
producing effects such as the addition of accessories interacting correctly with
the face or replacing the face of a stuntman with the face of the main actor.
The system introduced in this thesis is based on a 3D generative face model.
Using a 3D model makes it possible to edit the face in the semantic space of pose,
expression, and identity instead of pixel space, and due to its 3D nature allows
a modelling of the light interaction. In our system we first reconstruct, for all frames
of a monocular input video, the 3D face, which deforms due to expressions and speech,
together with the lighting and the camera. The face is then edited by
substituting expressions or identities with those of another video sequence or by
adding virtual objects into the scene. The manipulated 3D scene is rendered back
into the original video, correctly simulating the interaction of the light with the
deformed face and virtual objects.
We describe all steps necessary to build and apply the system. This includes
registration of training faces to learn a generative face model, semi-automatic
annotation of the input video, fitting of the face model to the input video, editing
of the fit, and rendering of the resulting scene.
While describing the application we introduce a host of new methods, each
of which is of interest on its own. We start with a new method to register 3D
face scans to use as training data for the face model. For video preprocessing a
new interest point tracking and 2D Active Appearance Model fitting technique
is proposed. For robust fitting we introduce background modelling, model-based
stereo techniques, and a more accurate light model.
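Editing in the semantic space of pose, expression, and identity amounts to swapping parameter blocks between two reconstructed sequences; a toy sketch with invented parameter dimensions:

```python
import numpy as np

rng = np.random.default_rng(5)
frames = 30

# Per-frame semantic parameters of a generative face model, split into
# identity / expression / pose blocks (all dimensions are invented).
seq_a = {"id": rng.standard_normal(15),             # constant per clip
         "exp": rng.standard_normal((frames, 12)),
         "pose": rng.standard_normal((frames, 6))}
seq_b = {"id": rng.standard_normal(15),
         "exp": rng.standard_normal((frames, 12)),
         "pose": rng.standard_normal((frames, 6))}

# Expression substitution: keep actor A's identity and pose, drive the
# face with actor B's expressions; the edited parameters would then be
# rendered back into the original video with the recovered lighting.
edited = {"id": seq_a["id"], "exp": seq_b["exp"], "pose": seq_a["pose"]}
```

A pixel-space edit would have to repaint every frame; the semantic-space edit is a single block swap, with rendering and lighting handled by the model.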
AFFECT-PRESERVING VISUAL PRIVACY PROTECTION
The prevalence of wireless networks and the convenience of mobile cameras enable many new video applications beyond security and entertainment. From behavioral diagnosis to wellness monitoring, cameras are increasingly used for observation in various educational and medical settings. Videos collected for such applications are considered protected health information under privacy laws in many countries. Visual privacy protection techniques, such as blurring or object removal, can be used to mitigate privacy concerns, but they also obliterate important visual cues of affect and social behavior that are crucial for the target applications. In this dissertation, we propose to balance privacy protection and the utility of the data by preserving privacy-insensitive information, such as pose and expression, which is useful in many applications involving visual understanding.
The Intellectual Merits of the dissertation include a novel framework for visual privacy protection by manipulating the facial image and body shape of individuals, which: (1) conceals the identity of individuals; (2) provides a way to preserve the utility of the data, such as expression and pose information; and (3) balances the utility of the data against the capacity of the privacy protection.
The Broader Impacts of the dissertation focus on the significance of privacy protection for visual data and the inadequacy of current privacy-enhancing technologies in preserving the affective and behavioral attributes of visual content, which are highly useful for behavior observation in educational and medical settings. The work in this dissertation represents one of the first attempts at achieving both goals simultaneously.