Improving Facial Analysis and Performance Driven Animation through Disentangling Identity and Expression
We present techniques for improving performance driven facial animation,
emotion recognition, and facial key-point or landmark prediction using learned
identity invariant representations. Established approaches to these problems
can work well if sufficient examples and labels for a particular identity are
available and factors of variation are highly controlled. However, labeled
examples of facial expressions, emotions and key-points for new individuals are
difficult and costly to obtain. In this paper we improve the ability of
techniques to generalize to new and unseen individuals by explicitly modeling
previously seen variations related to identity and expression. We use a
weakly-supervised approach in which identity labels are used to learn the
different factors of variation linked to identity separately from factors
related to expression. We show how probabilistic modeling of these sources of
variation allows one to learn identity-invariant representations for
expressions which can then be used to identity-normalize various procedures for
facial expression analysis and animation control. We also show how to extend
the widely used techniques of active appearance models and constrained local
models by replacing the underlying point distribution models, which are
typically constructed using principal component analysis, with
identity-expression factorized representations. We present a wide variety of
experiments in which we consistently improve performance on emotion
recognition, markerless performance-driven facial animation and facial
key-point tracking. Comment: to appear in the Image and Vision Computing Journal (IMAVIS).
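As a rough illustration of the identity-expression factorization described above, the sketch below fits a point distribution model whose variation is split into an identity subspace and an expression subspace. It is a minimal, deterministic stand-in for the paper's probabilistic model; the function names, subspace dimensions, and the simple per-subject-mean construction are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the paper's exact model): split landmark variation into an
# identity subspace and an expression subspace instead of a single PCA basis.
import numpy as np

def fit_factorized_pdm(shapes, identities, n_id=10, n_expr=10):
    """shapes: (N, 2K) aligned landmark vectors; identities: (N,) subject labels."""
    mean = shapes.mean(axis=0)
    centered = shapes - mean

    # Identity factor: per-subject mean shapes capture between-subject variation.
    _, inv = np.unique(identities, return_inverse=True)
    id_means = np.stack([centered[inv == k].mean(axis=0) for k in range(inv.max() + 1)])
    _, _, Vid = np.linalg.svd(id_means, full_matrices=False)
    B_id = Vid[:n_id].T                        # identity basis, shape (2K, n_id)

    # Expression factor: residuals left after removing each subject's mean shape.
    residuals = centered - id_means[inv]
    _, _, Vex = np.linalg.svd(residuals, full_matrices=False)
    B_expr = Vex[:n_expr].T                    # expression basis, shape (2K, n_expr)
    return mean, B_id, B_expr

def reconstruct(mean, B_id, B_expr, c_id, c_expr):
    # A shape is the global mean plus an identity offset plus an expression offset;
    # holding c_id fixed while varying c_expr gives identity-normalized expressions.
    return mean + B_id @ c_id + B_expr @ c_expr
```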
Breathing Life into Faces: Speech-driven 3D Facial Animation with Natural Head Pose and Detailed Shape
The creation of lifelike speech-driven 3D facial animation requires a natural
and precise synchronization between audio input and facial expressions.
However, existing works still fail to render shapes with flexible head poses
and natural facial details (e.g., wrinkles). This limitation is mainly due to
two aspects: 1) Collecting a training set with detailed 3D facial shapes is
highly expensive. This scarcity of detailed shape annotations hinders the
training of models with expressive facial animation. 2) Compared to mouth
movement, head pose is much less correlated with speech content. Consequently,
modeling mouth movement and head pose jointly limits the controllability of the
resulting facial motion. To address these challenges, we
introduce VividTalker, a new framework designed to facilitate speech-driven 3D
facial animation characterized by flexible head pose and natural facial
details. Specifically, we explicitly disentangle facial animation into head
pose and mouth movement and encode them separately into discrete latent spaces.
Then, these attributes are generated through an autoregressive process
leveraging a window-based Transformer architecture. To augment the richness of
3D facial animation, we construct a new 3D dataset with detailed shapes and
learn to synthesize facial details in line with speech content. Extensive
quantitative and qualitative experiments demonstrate that VividTalker
outperforms state-of-the-art methods, resulting in vivid and realistic
speech-driven 3D facial animation.
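To make the disentanglement idea concrete, here is a minimal sketch (not VividTalker's released code) of encoding head pose and mouth movement into separate discrete, vector-quantized latent spaces; window-based autoregressive generation would then operate on the resulting token streams. All module names, dimensions, and the codebook size are illustrative assumptions.

```python
# Minimal sketch: two independent vector-quantized branches, one for head pose
# and one for mouth movement, so each stream has its own discrete latent space.
import torch
import torch.nn as nn

class VQBranch(nn.Module):
    def __init__(self, in_dim, latent_dim=64, codebook_size=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                     nn.Linear(128, latent_dim))
        self.codebook = nn.Embedding(codebook_size, latent_dim)

    def forward(self, x):                      # x: (batch, frames, in_dim)
        z = self.encoder(x)                    # continuous per-frame latents
        # Nearest-codebook-entry lookup turns each frame into a discrete token.
        dists = ((z.unsqueeze(-2) - self.codebook.weight) ** 2).sum(-1)
        tokens = dists.argmin(dim=-1)          # (batch, frames) token ids
        z_q = self.codebook(tokens)            # quantized latents
        return tokens, z_q

pose_branch = VQBranch(in_dim=6)        # e.g. head rotation/translation per frame
mouth_branch = VQBranch(in_dim=1500)    # e.g. flattened mouth-region offsets (illustrative)
```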
3D Computer Modeling Of Human Mandible Motion With Application To Human Facial Motion
Computer facial modeling and animation has been an interest of computer graphics researchers for many years. This is not only because the face itself is an interesting object, but also because facial animation finds application in many other disciplines (for example, entertainment, medical education, telecommunication, psychology, medicine, and linguistics). Because mandible motion plays a major role in modeling facial motion, its study is of significance to each of those disciplines as well. In addition, the mandible itself is an object of study in the area of clinical science.

Current facial movement models in computer animation have difficulty dealing with facial movements that are strongly determined by the mandible, such as chewing. This thesis proposes new computer models of the mandible that address this problem. First, a geometric mandible model is proposed to simulate typical motion features of the mandible such as opening, closing, protruding, and lateral shifting. While this model is simple and successful, it has drawbacks when applied to motions as complicated as chewing. Therefore, a physically-based model is also proposed to deal with these drawbacks. This physically-based mandible model is then integrated with a spring-based physical facial model to automatically simulate motions such as chewing.
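As a minimal sketch of the spring-based physical facial modeling the thesis couples to its mandible model, the explicit-Euler mass-spring step below is illustrative only; the stiffness, damping, step size, and treatment of driven nodes are assumptions, not the thesis' parameters.

```python
# One explicit-Euler step of a simple mass-spring mesh with unit masses.
import numpy as np

def step_springs(pos, vel, edges, rest_len, k=50.0, damping=0.9, dt=0.01, pinned=()):
    """pos, vel: (N, 3) node positions/velocities; edges: list of (i, j) pairs;
    rest_len: rest length per edge; pinned: node indices held in place
    (e.g. attachments driven directly by the mandible motion)."""
    force = np.zeros_like(pos)
    for (i, j), l0 in zip(edges, rest_len):
        d = pos[j] - pos[i]
        length = np.linalg.norm(d) + 1e-9
        f = k * (length - l0) * d / length     # Hooke's law along the spring
        force[i] += f
        force[j] -= f
    vel = damping * (vel + dt * force)
    vel[list(pinned)] = 0.0                    # pinned nodes do not move freely
    return pos + dt * vel, vel
```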
Final Report to NSF of the Standards for Facial Animation Workshop
The human face is an important and complex communication channel. It is a very familiar and sensitive object of human perception. The facial animation field has increased greatly in the past few years as fast computer graphics workstations have made the modeling and real-time animation of hundreds of thousands of polygons affordable and almost commonplace. Many applications have been developed such as teleconferencing, surgery, information assistance systems, games, and entertainment. To solve these different problems, different approaches for both animation control and modeling have been developed
Face modeling and animation language for MPEG-4 XMT framework
This paper proposes FML, an XML-based face modeling and animation language. FML provides a structured content description method for multimedia presentations based on face animation. The language can be used as direct input to compatible players, or be compiled within the MPEG-4 XMT framework to create MPEG-4 presentations. The language allows parallel and sequential action description, decision-making and dynamic event-based scenarios, model configuration, and behavioral template definition. Facial actions include talking, expressions, head movements, and low-level MPEG-4 FAPs. The ShowFace and iFACE animation frameworks are also reviewed as example FML-based animation systems.
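To illustrate the kind of structured, parallel/sequential description such an XML-based language enables, the snippet below builds a small document with Python's ElementTree. The element and attribute names are guesses used only for illustration; the actual FML schema is defined in the paper.

```python
# Hypothetical FML-like document: a sequence whose first step runs speech and a
# smile in parallel, followed by a head movement. Tag names are illustrative.
import xml.etree.ElementTree as ET

doc = ET.Element("fml")
story = ET.SubElement(doc, "seq")              # children run one after another
greet = ET.SubElement(story, "par")            # children run at the same time
ET.SubElement(greet, "talk").text = "Hello!"
ET.SubElement(greet, "expression", type="smile", intensity="0.7")
ET.SubElement(story, "head-movement", axis="yaw", amount="-10")

print(ET.tostring(doc, encoding="unicode"))
```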
DualTalker: A Cross-Modal Dual Learning Approach for Speech-Driven 3D Facial Animation
In recent years, audio-driven 3D facial animation has gained significant
attention, particularly in applications such as virtual reality, gaming, and
video conferencing. However, accurately modeling the intricate and subtle
dynamics of facial expressions remains a challenge. Most existing studies
approach the facial animation task as a single regression problem, a
formulation that often fails to capture the intrinsic inter-modal relationship
between speech signals and 3D facial animation and overlooks their inherent
consistency. Moreover, because 3D audio-visual datasets are scarce, models
trained on small samples generalize poorly, which degrades performance. To
address these issues, we propose a cross-modal dual-learning framework, termed
DualTalker, which aims to improve data-usage efficiency and to model
cross-modal dependencies. The framework is
trained jointly with the primary task (audio-driven facial animation) and its
dual task (lip reading) and shares common audio/motion encoder components. Our
joint training framework facilitates more efficient data usage by leveraging
information from both tasks and explicitly capitalizing on the complementary
relationship between facial motion and audio to improve performance.
Furthermore, we introduce an auxiliary cross-modal consistency loss to mitigate
the over-smoothing that can arise in the cross-modal complementary
representations, improving the mapping of subtle facial expression dynamics.
Through extensive experiments and a perceptual user study conducted on the VOCA
and BIWI datasets, we demonstrate that our approach outperforms current
state-of-the-art methods both qualitatively and quantitatively. We have made
our code and video demonstrations available at
https://github.com/sabrina-su/iadf.git
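A minimal sketch of the dual-learning setup described above, assuming shared recurrent encoders, a primary head that regresses facial motion from audio, a dual head that predicts lip-reading targets from motion, and an auxiliary consistency loss tying the paired latents together. Dimensions, loss weights, and the use of regression for the lip-reading branch are simplifying assumptions, not DualTalker's released implementation.

```python
# Joint training sketch: primary task (audio -> facial motion), dual task
# (motion -> lip-reading targets), and a cross-modal consistency term.
import torch
import torch.nn as nn

audio_enc  = nn.GRU(input_size=80, hidden_size=128, batch_first=True)   # shared audio encoder
motion_enc = nn.GRU(input_size=70, hidden_size=128, batch_first=True)   # shared motion encoder
to_motion  = nn.Linear(128, 70)    # primary head: audio latent -> per-frame face motion
to_lip     = nn.Linear(128, 40)    # dual head: motion latent -> lip-reading features

def joint_loss(audio, motion, lip_targets, w_dual=0.5, w_cons=0.1):
    za, _ = audio_enc(audio)                   # (batch, frames, 128)
    zm, _ = motion_enc(motion)
    primary = nn.functional.mse_loss(to_motion(za), motion)
    dual    = nn.functional.mse_loss(to_lip(zm), lip_targets)
    # Cross-modal consistency: latents of paired audio/motion should agree,
    # which counteracts over-smoothed, averaged-out facial dynamics.
    consistency = nn.functional.mse_loss(za, zm)
    return primary + w_dual * dual + w_cons * consistency
```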
Fully Automatic Facial Deformation Transfer
Facial animation is a serious and ongoing challenge for the computer graphics industry.
Because diverse and complex emotions must be expressed through different facial deformations
and animations, copying facial deformations from an existing character to another is widely
needed in both industry and academia, since it reduces the time-consuming and repetitive
manual modeling required to create 3D shape sequences for every new character. However,
transferring realistic facial animations between two 3D models remains limited and
inconvenient for general use: most modern deformation transfer methods require a
correspondence mapping, which is tedious to obtain. In this paper, we present a fast
approach that transfers deformations between facial mesh models by obtaining 3D point-wise
correspondences automatically. The key idea is to estimate correspondences between
different facial meshes by projecting each 3D model to a 2D image and applying a robust
facial landmark detection method. Experiments show that, without any manual labelling
effort, our method detects reliable correspondences faster and more simply than the
state-of-the-art automatic deformation transfer method on facial models.
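A minimal sketch of the correspondence idea described above: project each mesh's vertices to a 2D view, run an off-the-shelf 2D facial landmark detector on a rendering of that view, and map each detected landmark back to its nearest projected vertex. The detector is passed in as a callable because the concrete library is an implementation choice; the function names here are illustrative.

```python
# Landmark-driven point-wise correspondences between two facial meshes.
import numpy as np

def landmark_vertex_ids(vertices, render_and_detect):
    """vertices: (N, 3) mesh vertices in a roughly frontal pose.
    render_and_detect: callable returning (L, 2) detected 2D landmarks in the
    same image coordinates as the orthographic projection below."""
    proj = vertices[:, :2]                               # orthographic frontal projection
    landmarks_2d = render_and_detect(vertices)           # (L, 2) detected key-points
    d = np.linalg.norm(proj[None, :, :] - landmarks_2d[:, None, :], axis=-1)
    return d.argmin(axis=1)                              # nearest vertex per landmark

def correspondences(src_vertices, tgt_vertices, render_and_detect):
    # Matching landmark k on both meshes yields a sparse source->target vertex map
    # that can seed deformation transfer without manual labelling.
    return list(zip(landmark_vertex_ids(src_vertices, render_and_detect),
                    landmark_vertex_ids(tgt_vertices, render_and_detect)))
```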