4,668 research outputs found
Relating Objective and Subjective Performance Measures for AAM-based Visual Speech Synthesizers
We compare two approaches for synthesizing visual speech using Active Appearance Models (AAMs): one that utilizes acoustic features as input, and one that utilizes a phonetic transcription as input. Both synthesizers are trained using the same data and the performance is measured using both objective and subjective testing. We investigate the impact of likely sources of error in the synthesized visual speech by introducing typical errors into real visual speech sequences and subjectively measuring the perceived degradation. When only a small region (e.g. a single syllable) of ground-truth visual speech is incorrect, we find that the subjective score for the entire sequence is lower than that of sequences generated by our synthesizers. This observation motivates further consideration of an often ignored issue: to what extent are subjective measures correlated with objective measures of performance? Significantly, we find that the most commonly used objective measures of performance are not necessarily the best indicator of viewer perception of quality. We empirically evaluate alternatives and show that the cost of a dynamic time warp of synthesized visual speech parameters to the respective ground-truth parameters is a better indicator of subjective quality.
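The dynamic time warp cost mentioned in this abstract can be illustrated with a minimal sketch. It assumes each frame is a fixed-length vector of parameters (for example AAM parameters), uses Euclidean distance as the per-frame cost, and applies the textbook three-way step pattern. The function name `dtw_cost` and these choices are illustrative assumptions, not the authors' exact formulation.

```python
import math


def dtw_cost(synth, truth):
    """Return the cumulative cost of the best DTW alignment.

    synth, truth: sequences of equal-length parameter vectors (one per frame).
    The per-frame cost is Euclidean distance (an assumption for this sketch).
    """
    n, m = len(synth), len(truth)
    # acc[i][j] holds the minimum cumulative cost of aligning
    # the first i synthesized frames with the first j ground-truth frames.
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(synth[i - 1], truth[j - 1])
            # Extend the cheapest of: repeat a truth frame, repeat a synth
            # frame, or advance both sequences together.
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


if __name__ == "__main__":
    truth = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]
    stretched = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]
    print(dtw_cost(stretched, truth))
```

Because the warp absorbs timing differences, a sequence that is merely stretched in time scores the same as a perfectly aligned one, whereas a frame-by-frame error measure would penalize it. That property is one plausible reason a warp cost could track viewer judgments more closely.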
Towards a comprehensive 3D dynamic facial expression database
Human faces play an important role in everyday life, including the expression of person identity,
emotion and intentionality, along with a range of biological functions. The human face has also become the
subject of considerable research effort, and there has been a shift towards understanding it using stimuli of
increasingly realistic formats. In the current work, we outline progress made in the production of a
database of facial expressions in arguably the most realistic format, 3D dynamic. A suitable architecture for
capturing such 3D dynamic image sequences is described and then used to record seven expressions (fear,
disgust, anger, happiness, surprise, sadness and pain) by 10 actors at 3 levels of intensity (mild, normal and
extreme). We also present details of a psychological experiment that was used to formally evaluate the
accuracy of the expressions in a 2D dynamic format. The result is an initial, validated database for researchers
and practitioners. The goal is to scale up the work with more actors and expression types.
Reverse Engineering Psychologically Valid Facial Expressions of Emotion into Social Robots
Social robots are now part of human society, destined for schools, hospitals, and homes to perform a variety of tasks. To engage their human users, social robots must be equipped with the essential social skill of facial expression communication. Yet, even state-of-the-art social robots are limited in this ability because they often rely on a restricted set of facial expressions derived from theory with well-known limitations such as lacking naturalistic dynamics. With no agreed methodology to objectively engineer a broader variance of more psychologically impactful facial expressions into the social robots' repertoire, human-robot interactions remain restricted. Here, we address this generic challenge with new methodologies that can reverse-engineer dynamic facial expressions into a social robot head. Our data-driven, user-centered approach, which combines human perception with psychophysical methods, produced highly recognizable and human-like dynamic facial expressions of the six classic emotions that generally outperformed state-of-the-art social robot facial expressions. Our data demonstrate the feasibility of our method applied to social robotics and highlight the benefits of using a data-driven approach that puts human users as central to deriving facial expressions for social robots. We also discuss future work to reverse-engineer a wider range of socially relevant facial expressions including conversational messages (e.g., interest, confusion) and personality traits (e.g., trustworthiness, attractiveness). Together, our results highlight the key role that psychology must continue to play in the design of social robots.
Toward a Motor Theory of Sign Language Perception
Research on signed languages still strongly dissociates linguistic issues
related to phonological and phonetic aspects from gesture studies for
recognition and synthesis purposes. This paper focuses on the interweaving of
motion and meaning for the analysis, synthesis and evaluation of sign language
gestures. We discuss the relevance and interest of a motor theory of perception
in sign language communication. According to this theory, we consider that
linguistic knowledge is mapped onto sensory-motor processes, and propose a
methodology based on the principle of a synthesis-by-analysis approach, guided
by an evaluation process that aims to validate some hypotheses and concepts of
this theory. Examples from existing studies illustrate the different concepts
and provide avenues for future work. Comment: 12 pages. Partially funded by the ANR SignCo project.
Highly automated method for facial expression synthesis
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The synthesis of realistic facial expressions has been an unexplored area for computer graphics scientists. Over the last three decades, several different construction methods have been formulated in order to obtain natural graphic results. Despite these advancements, though, current techniques still require costly resources, heavy user intervention and specific training, and outcomes are still not completely realistic. This thesis, therefore, aims to achieve an automated synthesis that will produce realistic facial expressions at a low cost.
This thesis proposes a highly automated approach for achieving a realistic facial expression synthesis, which allows for enhanced performance in speed (3 minutes processing time maximum) and quality with a minimum of user intervention. It will also demonstrate a highly technical and automated method of facial feature detection, allowing users to obtain their desired facial expression synthesis with minimal physical input. Moreover, it will describe a novel approach to the normalization of the illumination settings values between source and target images, thereby allowing the algorithm to work accurately, even in different lighting conditions.
Finally, we will present the results obtained from the proposed techniques, together with our conclusions, at the end of the paper.
Detail-Preserving Controllable Deformation from Sparse Examples