8 research outputs found

    A framework for automatic and perceptually valid facial expression generation

    Facial expressions are facial movements that reflect the internal emotional states of a character or respond to social communication. Realistic facial animation should consider at least two factors: believable visual effect and valid facial movements. However, most research tends to separate these two issues. In this paper, we present a framework for generating 3D facial expressions that considers both the visual and the dynamic effects. A facial expression mapping approach based on local geometry encoding is proposed, which encodes deformation in the 1-ring vector. This method is capable of mapping subtle facial movements without considering shape and topological constraints. Facial expression mapping is achieved through three steps: correspondence establishment, deviation transfer and movement mapping. Deviation is transferred to the conformal face space by minimizing an error function. This function is formed from the source neutral and deformed face models, related by the transformation matrices in the 1-ring neighborhood. The transformation matrix in the 1-ring neighborhood is independent of the face shape and the mesh topology. After the facial expression mapping, dynamic parameters are integrated with the facial expressions to generate valid facial expressions. The dynamic parameters were generated based on psychophysical methods. The efficiency and effectiveness of the proposed methods have been tested using various face models with different shapes and topological representations.

    THREE DIMENSIONAL MODELING AND ANIMATION OF FACIAL EXPRESSIONS

    Facial expression and animation are important aspects of the 3D environment featuring human characters. These animations are frequently used in many kinds of applications and there have been many efforts to increase the realism. Three aspects are still stimulating active research: the detailed subtle facial expressions, the process of rigging a face, and the transfer of an expression from one person to another. This dissertation focuses on the above three aspects. A system for freely designing and creating detailed, dynamic, and animated facial expressions is developed. The presented pattern functions produce detailed and animated facial expressions. The system produces realistic results with fast performance, and allows users to directly manipulate it and see immediate results. Two unique methods for generating real-time, vivid, and animated tears have been developed and implemented. One method is for generating a teardrop that continually changes its shape as the tear drips down the face. The other is for generating a shedding tear, which is a kind of tear that seamlessly connects with the skin as it flows along the surface of the face, but remains an individual object. The methods both broaden CG and increase the realism of facial expressions. A new method to automatically set the bones on facial/head models to speed up the rigging process of a human face is also developed. To accomplish this, vertices that describe the face/head as well as relationships between each part of the face/head are grouped. The average distance between pairs of vertices is used to place the head bones. To set the bones in the face with multi-density, the mean value of the vertices in a group is measured. The time saved with this method is significant. A novel method to produce realistic expressions and animations by transferring an existing expression to a new facial model is developed. The approach is to transform the source model into the target model, which then has the same topology as the source model. The displacement vectors are calculated. Each vertex in the source model is mapped to the target model. The spatial relationships of each mapped vertex are constrained
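The displacement-vector step described in this abstract can be illustrated with a minimal sketch. It assumes the source and target meshes already share a topology, so that vertex i in one corresponds to vertex i in the other. The function names, the `scale` parameter, and the toy vertex data are hypothetical, and the spatial-relationship constraints mentioned in the abstract are not modelled here.

```python
def displacement_vectors(neutral, deformed):
    """Per-vertex displacement from a neutral mesh to its deformed (expressive) version."""
    return [tuple(d - n for n, d in zip(vn, vd)) for vn, vd in zip(neutral, deformed)]


def apply_displacements(target_neutral, displacements, scale=1.0):
    """Add (optionally scaled) source displacements to the corresponding target vertices."""
    return [
        tuple(t + scale * d for t, d in zip(vt, vd))
        for vt, vd in zip(target_neutral, displacements)
    ]


# Hypothetical two-vertex meshes (x, y, z); real face models have thousands of vertices.
source_neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
source_smile = [(0.0, 0.5, 0.0), (1.0, 0.25, 0.0)]
target_neutral = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0)]

offsets = displacement_vectors(source_neutral, source_smile)
target_smile = apply_displacements(target_neutral, offsets)
```

Copying raw displacements like this ignores differences in face proportions, which is presumably why the dissertation adds constraints on the mapped vertices.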

    Analysis and Construction of Engaging Facial Forms and Expressions: Interdisciplinary Approaches from Art, Anatomy, Engineering, Cultural Studies, and Psychology

    The topic of this dissertation is the anatomical, psychological, and cultural examination of the human face in order to effectively construct an anatomy-driven 3D virtual face customization and action model. In order to gain a broad perspective on all aspects of a face, theories and methodology from the fields of art, engineering, anatomy, psychology, and cultural studies have been analyzed and implemented. The computer-generated facial customization and action model were designed based on the collected data. Using this customization system, a culturally specific attractive face in Korean popular culture, “kot-mi-nam (flower-like beautiful guy),” was modeled and analyzed as a case study. The “kot-mi-nam” phenomenon is overviewed in its textual, visual, and contextual aspects, which reveals the gender- and sexuality-fluidity of its masculinity. The analysis and the actual development of the model organically co-construct each other, requiring an interwoven process. Chapter 1 introduces anatomical studies of the human face, psychological theories of face recognition and an attractive face, and state-of-the-art face construction projects in the various fields. Chapters 2 and 3 present the Bezier curve-based 3D facial customization (BCFC) and the Multi-layered Facial Action Model (MFAM), based on the analysis of human anatomy, to achieve cost-effective yet realistic facial animation without using 3D scanned data. In the experiments, results for the facial customization for gender, race, fat, and age showed that BCFC achieved enhanced performance of 25.20% compared to the existing program Facegen, and 44.12% compared to Facial Studio. The experimental results also proved the realistic quality and effectiveness of MFAM compared with the blend shape technique by enhancing 2.87% and 0.03% of facial area for happiness and anger expressions per second, respectively. In Chapter 4, according to the analysis based on BCFC, the 3D face of an average kot-mi-nam is close to gender neutral (male: 50.38%, female: 49.62%), and Caucasian (66.42-66.40%). Culturally specific images can be misinterpreted in different cultures, due to their different languages, histories, and contexts. This research demonstrates that facial images can be affected by the cultural tastes of the makers and can also be interpreted differently by viewers in different cultures.
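As background for the Bezier curve-based customization mentioned in this abstract, the sketch below evaluates a Bezier curve with de Casteljau's algorithm. It is a generic illustration of the curve type only, not the BCFC system itself. The control points are hypothetical and stand in for a facial contour such as a jawline.

```python
def de_casteljau(control_points, t):
    """Evaluate a Bezier curve at parameter t in [0, 1] by repeated linear interpolation."""
    points = [tuple(p) for p in control_points]
    while len(points) > 1:
        # Each pass blends neighbouring points, reducing the point count by one.
        points = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(points, points[1:])
        ]
    return points[0]


# Hypothetical cubic contour: moving the two inner control points reshapes the
# curve while the endpoints stay fixed.
contour = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (3.0, 2.0, 0.0), (4.0, 0.0, 0.0)]
midpoint = de_casteljau(contour, 0.5)
```

A few control points describe a smooth contour, so a shape can be adjusted by editing a small number of parameters rather than individual mesh vertices.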

    Facial Modelling and animation trends in the new millennium : a survey

    M.Sc (Computer Science). Facial modelling and animation is considered one of the most challenging areas in the animation world. Since Parke and Waters’s (1996) comprehensive book, no major work encompassing the entire field of facial animation has been published. This thesis covers Parke and Waters’s work, while also providing a survey of the developments in the field since 1996. The thesis describes, analyses, and compares (where applicable) the existing techniques and practices used to produce facial animation. Where applicable, related techniques are grouped in the same chapter and described chronologically, outlining their differences, as well as their advantages and disadvantages. The thesis concludes with exploratory work towards a talking head for Northern Sotho. Facial animation and lip synchronisation of a fragment of Northern Sotho is done using software tools primarily designed for English.

    An affective interface for conveying student feedback

    In the present information age, decision-makers and modern society in general are challenged by the need to effectively handle large amounts of interrelated data obtained via electronic means. This thesis attempts to address the need for more effective data analysis and interpretation for decision-making. In particular, the study investigates whether virtual facial expressions (FEs) can be effectively applied as a non-verbal means to convey student feedback ‘at-a-glance’ and accurately with regard to affective content. This research has a threefold aim: (i) to handle the complex nature of multi-criteria type feedback data; (ii) to map the feedback data into appropriate FEs; and (iii) to represent the data using a non-verbal affective interface. In the approach adopted, the two-dimensional Kano model of satisfaction is established to evaluate feedback data in accord with multiple criteria; based on this, an aggregate score is generated that best represents the student feedback. Facial expressions of emotion are mapped to one-dimensional scales and the two-dimensional satisfaction space using psychophysical methods; these mappings are used to convert multi-criteria based student satisfaction ratings onto a pictorial representation in the form of cartoon facial expressions. A proof-of-concept prototype of an affective interface is developed and evaluated in terms of the accuracy of the proposed non-verbal feedback analysis approach. The main findings of this study are that multi-criteria evaluation that takes into account two-dimensional quality can produce measures of satisfaction significantly correlated with manual rating. Student feedback can be conveyed accurately using virtual FEs provided that the multi-criteria analysis has been successful. Use of FEs to convey student feedback is faster than conventional feedback display modes.

    New method for mathematical modelling of human visual speech

    Audio-visual speech recognition and visual speech synthesisers are used as interfaces between humans and machines. Such interactions specifically rely on the analysis and synthesis of both audio and visual information, which humans use for face-to-face communication. Currently, there is no global standard to describe these interactions nor is there a standard mathematical tool to describe lip movements. Furthermore, the visual lip movement for each phoneme is considered in isolation rather than as a continuation from one to another. Consequently, there is no globally accepted standard method for representing lip movement during articulation. This thesis addresses these issues by designing a transcribed group of words, by mathematical formulas, and so introducing the concept of a visual word, allocating signatures to visual words and finally building a visual speech vocabulary database. In addition, visual speech information has been analysed in a novel way by considering both lip movements and the phonemic structure of the English language. In order to extract the visual data, three visual features on the lip have been chosen; these are on the outer upper, lower and corner of the lip. The visual data extracted during articulation is called the visual speech sample set. The final visual data is obtained after processing the visual speech sample sets to correct experimental artefacts such as head tilting, which happened during articulation and visual data extraction. The ‘Barycentric Lagrange Interpolation’ (BLI) formulates the visual speech sample sets into visual speech signals. The visual word is defined in this work and consists of the variation of three visual features. Further processing on relating the visual speech signals to the uttered word leads to the allocation of signatures that represent the visual word. This work suggests the visual word signature can be used either as a ‘visual word barcode’, a ‘digital visual word’ or a ‘2D/3D representation’. The 2D version of the visual word provides a unique signature that allows the identification of the words being uttered. In addition, identification of visual words has also been performed using a technique called ‘volumetric representations of the visual words’. Furthermore, the effect of altering the amplitudes and sampling rate for BLI has been evaluated. In addition, the performance of BLI in reconstructing the visual speech sample sets has been considered. Finally, BLI has been compared to a signal reconstruction approach using RMSE and correlation coefficients. The results show that BLI is the more reliable method for the purpose of this work according to Section 7.7.
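Barycentric Lagrange interpolation, the technique this abstract uses to turn sample sets into continuous signals, can be sketched as follows. This is the standard textbook form of the interpolant, not the thesis's own implementation. The sample times, the lip-height values, and all function names are hypothetical, and the RMSE helper only illustrates the kind of error measure the abstract mentions.

```python
import math


def barycentric_weights(xs):
    """Weights w_j = 1 / prod_{k != j} (x_j - x_k) for distinct nodes xs."""
    weights = []
    for j, xj in enumerate(xs):
        prod = 1.0
        for k, xk in enumerate(xs):
            if k != j:
                prod *= xj - xk
        weights.append(1.0 / prod)
    return weights


def barycentric_interpolate(xs, ys, weights, x):
    """Evaluate the interpolating polynomial at x using the barycentric formula."""
    num = 0.0
    den = 0.0
    for xj, yj, wj in zip(xs, ys, weights):
        if x == xj:
            return yj  # exactly on a node: return the sample itself
        term = wj / (x - xj)
        num += term * yj
        den += term
    return num / den


def rmse(a, b):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))


# Hypothetical sample set: lip-opening height (arbitrary units) at four frame times.
times = [0.0, 1.0, 2.0, 3.0]
lip_height = [0.0, 1.0, 4.0, 9.0]

weights = barycentric_weights(times)
between_frames = barycentric_interpolate(times, lip_height, weights, 1.5)
reconstructed = [barycentric_interpolate(times, lip_height, weights, t) for t in times]
node_error = rmse(lip_height, reconstructed)
```

The weights depend only on the sample times, so they can be computed once and reused for every feature sampled on the same frames.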
