397 research outputs found

Artimate: an articulatory animation framework for audiovisual speech synthesis

Author: Ouni Slim
Steiner Ingmar
Publication date: 01/01/2012

We present a modular framework for articulatory animation synthesis using speech motion capture data obtained with electromagnetic articulography (EMA). Adapting a skeletal animation approach, the articulatory motion data is applied to a three-dimensional (3D) model of the vocal tract, creating a portable resource that can be integrated in an audiovisual (AV) speech synthesis platform to provide realistic animation of the tongue and teeth for a virtual character. The framework also provides an interface to articulatory animation synthesis, as well as an example application to illustrate its use with a 3D game engine. We rely on cross-platform, open-source software and open standards to provide a lightweight, accessible, and portable workflow. Comment: Workshop on Innovation and Applications in Speech Technology (2012)

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

Coarticulation and speech synchronization in MPEG-4 based facial animation

Author: Duarte RLP
El Rhalibi A
Merabti M
Publication venue: EMERALD GROUP PUBLISHING LIMITED

In this paper, we present a novel coarticulation and speech synchronization framework compliant with MPEG-4 facial animation. The system uses the MPEG-4 facial animation standard, together with other developments, to enable the creation, editing, and playback of high-resolution 3D models and MPEG-4 animation streams, and it is compatible with well-known related systems such as Greta and Xface. It supports text-to-speech for dynamic speech synchronization. The framework enables real-time model simplification using quadric-based surfaces. Our coarticulation approach provides realistic, high-performance lip-sync animation, based on Cohen-Massaro's model of coarticulation adapted to the MPEG-4 facial animation (FA) specification. Preliminary experiments show that the coarticulation technique we have developed gives good, promising results overall when compared to related techniques.
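The abstract above names Cohen-Massaro's coarticulation model without giving details. As a rough illustration only, the core idea of that model (each speech segment has a dominance function that decays away from the segment's center, and the animation parameter is the dominance-weighted average of the segment targets) might be sketched as follows. The function names, the symmetric decay, and the default constants are assumptions made for this sketch; they are not taken from the paper and are not its MPEG-4 adaptation.

```python
import math

def dominance(t, center, alpha=1.0, theta=10.0, c=1.0):
    """Negative-exponential dominance of a segment centered at `center`.

    alpha, theta, and c are illustrative defaults (assumptions). The original
    model allows different decay rates before and after the center; this
    sketch uses one symmetric rate for brevity.
    """
    return alpha * math.exp(-theta * abs(t - center) ** c)

def blend_targets(t, segments):
    """Dominance-weighted average of segment targets at time t.

    `segments` is a list of (center_time, target_value) pairs, for example
    target lip-opening values for successive visemes (hypothetical data).
    """
    weights = [dominance(t, center) for center, _ in segments]
    total = sum(weights)
    if total == 0.0:
        # All dominances underflowed: fall back to the nearest segment.
        return min(segments, key=lambda s: abs(t - s[0]))[1]
    return sum(w * target for w, (_, target) in zip(weights, segments)) / total

# Two segments with targets 0.0 and 1.0, centered at t = 0.0 s and t = 1.0 s.
segments = [(0.0, 0.0), (1.0, 1.0)]
midpoint_value = blend_targets(0.5, segments)
```

Because neighboring segments keep a nonzero dominance away from their centers, each target is pulled toward its neighbors, which is how the model expresses coarticulation.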

LJMU Research Online (Liverpool John Moores University)

Lip syncing method for realistic expressive 3D face model

Author: Ali IR
Alkawaz MH
Kolivand H
Publication venue: 'Springer Science and Business Media LLC'

Lip synchronization of 3D face models is now used in a multitude of important fields. It brings a more human, social, and dramatic reality to computer games, films, and interactive multimedia, and is growing in use and importance. A high level of realism can be used in demanding applications such as computer games and cinema. Authoring lip syncing with complex and subtle expressions is still difficult and fraught with problems in terms of realism. This research proposed a lip syncing method for a realistic, expressive 3D face model. Animating lips requires a 3D face model capable of representing the myriad shapes the human face experiences during speech and a method to produce the correct lip shape at the correct time. The paper presented a 3D face model designed to support lip syncing that aligns with an input audio file. It deforms using a Raised Cosine Deformation (RCD) function that is grafted onto the input facial geometry. The face model was based on the MPEG-4 Facial Animation (FA) standard. This paper proposed a method to animate the 3D face model over time to create animated lip syncing using a canonical set of visemes for all pairwise combinations of a reduced phoneme set called ProPhone. The proposed research integrated emotions by considering the Ekman model and Plutchik's wheel, with emotive eye movements implemented through the Emotional Eye Movements Markup Language (EEMML), to produce a realistic 3D face model. © 2017 Springer Science+Business Media New York
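The abstract above mentions a Raised Cosine Deformation (RCD) function but does not state its formula. The sketch below shows only a generic raised-cosine falloff applied to mesh vertices around a control point, which is one common way such a deformation is built; the function names, the radius parameter, and the sample data are assumptions for illustration and may differ from the paper's formulation.

```python
import math

def raised_cosine_weight(distance, radius):
    """Weight that falls smoothly from 1 at the center to 0 at `radius`."""
    if distance >= radius:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * distance / radius))

def deform(vertices, center, displacement, radius):
    """Move each vertex by `displacement`, scaled by its raised-cosine weight.

    `vertices` is a list of (x, y, z) tuples; `center` is the control point
    (for example, a lip feature point). All values here are hypothetical.
    """
    deformed = []
    for vertex in vertices:
        weight = raised_cosine_weight(math.dist(vertex, center), radius)
        deformed.append(
            tuple(v + weight * d for v, d in zip(vertex, displacement))
        )
    return deformed

# A vertex at the control point moves fully; one outside the radius stays put.
mesh = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
moved = deform(mesh, center=(0.0, 0.0, 0.0),
               displacement=(0.0, 1.0, 0.0), radius=2.0)
```

The smooth falloff avoids a visible crease at the edge of the influence region, which is the usual motivation for a cosine window over a hard cutoff.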

LJMU Research Online (Liverpool John Moores University)

Speech-driven Animation with Meaningful Behaviors

Author: Busso Carlos
Sadoughi Najmeh
Publication date: 04/08/2017

Conversational agents (CAs) play an important role in human-computer interaction. Creating believable movements for CAs is challenging, since the movements have to be meaningful and natural, reflecting the coupling between gestures and speech. Studies in the past have mainly relied on rule-based or data-driven approaches. Rule-based methods focus on creating meaningful behaviors conveying the underlying message, but the gestures cannot be easily synchronized with speech. Data-driven approaches, especially speech-driven models, can capture the relationship between speech and gestures. However, they create behaviors disregarding the meaning of the message. This study proposes to bridge the gap between these two approaches, overcoming their limitations. The approach builds a dynamic Bayesian network (DBN), where a discrete variable is added to constrain the behaviors on the underlying constraint. The study implements and evaluates the approach with two constraints: discourse functions and prototypical behaviors. By constraining on the discourse functions (e.g., questions), the model learns the characteristic behaviors associated with a given discourse class, learning the rules from the data. By constraining on prototypical behaviors (e.g., head nods), the approach can be embedded in a rule-based system as a behavior realizer, creating trajectories that are timely synchronized with speech. The study proposes a DBN structure and a training approach that (1) models the cause-effect relationship between the constraint and the gestures, (2) initializes the state configuration models, increasing the range of the generated behaviors, and (3) captures the differences in the behaviors across constraints by enforcing sparse transitions between shared and exclusive states per constraint. Objective and subjective evaluations demonstrate the benefits of the proposed approach over an unconstrained model. Comment: 13 pages, 12 figures, 5 tables

arXiv.org e-Print Archive

A FACIAL ANIMATION FRAMEWORK WITH EMOTIVE/EXPRESSIVE CAPABILITIES

Author: Cosi Piero
Leone Giuseppe Riccardo
Publication venue: IADIS

LUCIA is an MPEG-4 facial animation system developed at ISTC-CNR. It works on standard Facial Animation Parameters and speaks with the Italian version of FESTIVAL TTS. To achieve an emotive/expressive talking head, LUCIA was built from real human data physically extracted by the ELITE optotracking movement analyzer. LUCIA can copy a real human by reproducing the movements of passive markers positioned on his face and recorded by the ELITE device, or can be driven by an emotional XML-tagged input text, thus realizing a true audio/visual emotive/expressive synthesis. Synchronization between visual and audio data is very important in order to create the correct WAV and FAP files needed for the animation. LUCIA's voice is based on the ISTC Italian version of the FESTIVAL-MBROLA packages, modified by means of an appropriate APML/VSML tagged language. LUCIA is available in two different versions: an open source framework and the "work in progress" WebGL

PUblication MAnagement

LUCIA: An open source 3D expressive avatar for multimodal h.m.i.

Author: Cosi Piero
Leone Giuseppe Riccardo
Paci Giulio
Publication venue: ICST

LUCIA is an MPEG-4 facial animation system developed at ISTC-CNR. It works on standard Facial Animation Parameters and speaks with the Italian version of FESTIVAL TTS. To achieve an emotive/expressive talking head, LUCIA was built from real human data physically extracted by the ELITE optotracking movement analyzer. LUCIA can copy a real human by reproducing the movements of passive markers positioned on his face and recorded by the ELITE device, or can be driven by an emotional XML-tagged input text, thus realizing a true audio/visual emotive/expressive synthesis. Synchronization between visual and audio data is very important in order to create the correct WAV and FAP files needed for the animation. LUCIA's voice is based on the ISTC Italian version of the FESTIVAL-MBROLA packages, modified by means of an appropriate APML/VSML tagged language. LUCIA is available in two different versions: an open source framework and the "work in progress" WebGL

PUblication MAnagement

Expressive characters and a text chat interface

Author: Ballin D
Crabtree IB
Gillies M
Publication date: 01/01/2004

UCL Discovery

Multispace behavioral model for face-based affective social agents

Author: Arya Ali
DiPaola Steve
Publication date: 01/01/2007

This paper describes a behavioral model for affective social agents based on three independent but interacting parameter spaces: knowledge, personality, and mood. These spaces control a lower-level geometry space that provides parameters at the facial feature level. Personality and mood use findings in behavioral psychology to relate the perception of personality types and emotional states to the facial actions and expressions through two-dimensional models for personality and emotion. Knowledge encapsulates the tasks to be performed and the decision-making process using a specially designed XML-based language. While the geometry space provides an MPEG-4 compatible set of parameters for low-level control, the behavioral extensions available through the triple spaces provide flexible means of designing complicated personality types, facial expressions, and dynamic interactive scenarios.

Crossref

Carleton University's Institutional Repository

Springer - Publisher Connector

Directory of Open Access Journals

Simon Fraser University Institutional Repository

MPEG-4: Audio/Video and Synthetic Graphics/Audio for Real-Time

Author: Capin T
Doenges P
Lavagetto F
Ostermann J
Pandzic I.S
Petajan E
Publication date: 27/03/2007

Infoscience - École polytechnique fédérale de Lausanne

FacEMOTE: Qualitative Parametric Modifiers for Facial Animations

Author: Badler Norman I
Byun Meeran
Publication venue: ScholarlyCommons
Publication date: 02/07/2002

We propose a control mechanism for facial expressions by applying a few carefully chosen parametric modifications to preexisting expression data streams. This approach applies to any facial animation resource expressed in the general MPEG-4 form, whether taken from a library of preset facial expressions, captured from live performance, or entirely manually created. The MPEG-4 Facial Animation Parameters (FAPs) represent a facial expression as a set of parameterized muscle actions, given as intensity of individual muscle movements over time. Our system varies expressions by changing the intensities and scope of sets of MPEG-4 FAPs. It creates variations in "expressiveness" across the face model rather than simply scaling, interpolating, or blending facial mesh node positions. The parameters are adapted from the Effort parameters of Laban Movement Analysis (LMA); we developed a mapping from their values onto sets of FAPs. The FacEMOTE parameters thus perturb a base expression to create a wide range of expressions. Such an approach could allow real-time face animations to change underlying speech or facial expression shapes dynamically according to current agent affect or user interaction needs.
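To make the idea of "changing the intensities ... of sets of MPEG-4 FAPs" concrete, here is a minimal sketch that scales a chosen set of parameters in a frame-by-frame stream. It is not the FacEMOTE mapping from Effort parameters, which the abstract does not specify; the data layout (one dict per frame), the function name, and the numeric FAP ids are all placeholders assumed for illustration and are not tied to the standard's FAP table.

```python
def scale_faps(frames, fap_ids, factor):
    """Return a new stream with the selected FAP intensities scaled.

    `frames` is a list of dicts mapping a FAP id to its intensity in that
    frame (an assumed layout). FAPs not listed in `fap_ids` pass through
    unchanged, and the input stream is not modified.
    """
    return [
        {
            fap: (value * factor if fap in fap_ids else value)
            for fap, value in frame.items()
        }
        for frame in frames
    ]

# Placeholder ids 1 and 2 stand in for two FAPs in a base expression stream.
base_stream = [{1: 10.0, 2: 4.0}, {1: 20.0, 2: 6.0}]
exaggerated = scale_faps(base_stream, fap_ids={1}, factor=1.5)
```

Operating on parameter values rather than mesh node positions keeps the modification independent of any particular face model, which matches the motivation stated in the abstract.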

CiteSeerX

ScholarlyCommons@Penn