599 research outputs found

    A new visual speech modelling approach for visual speech recognition

    In this paper we propose a new learning-based representation, referred to as the Visual Speech Unit (VSU), for visual speech recognition (VSR). The VSU extends the standard viseme model currently applied in VSR by encoding not only the data associated with each viseme, but also the transitory information between consecutive visemes. The developed recognition system consists of several computational stages: (a) lip segmentation, (b) construction of Expectation-Maximization Principal Component Analysis (EM-PCA) manifolds from the input video sequence, (c) registration between the VSU models and the EM-PCA data constructed from the input image sequence, and (d) recognition of the VSUs using a standard Hidden Markov Model (HMM) classification scheme. We were particularly interested in evaluating the classification accuracy obtained with the new VSU models compared with that attained with standard (MPEG-4) viseme models. The experimental results indicate a 90% recognition rate when the system was applied to the identification of 60 classes of VSUs, whereas the recognition rate for the standard set of MPEG-4 visemes was only 52%.
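    As a rough illustration of stage (b), EM-PCA can be sketched with Roweis' EM algorithm for PCA. The NumPy snippet below is a minimal sketch under that assumption, with our own variable names; the paper's exact formulation may differ. The "manifold" of a sequence is then the trajectory of its frames in the learned subspace.

        import numpy as np

        def em_pca(Y, k, n_iters=50, seed=0):
            """EM algorithm for PCA (Roweis, 1998).
            Y: (d, n) mean-centred data matrix, one column of lip-region
            features per video frame. Returns a (d, k) basis C spanning
            the principal subspace."""
            d, n = Y.shape
            C = np.random.default_rng(seed).standard_normal((d, k))
            for _ in range(n_iters):
                # E-step: project the frames onto the current basis
                X = np.linalg.solve(C.T @ C, C.T @ Y)        # (k, n)
                # M-step: re-estimate the basis from those projections
                C = Y @ X.T @ np.linalg.inv(X @ X.T)         # (d, k)
            return C

        # Per-frame manifold coordinates: one k-D point per frame
        # coords = np.linalg.solve(C.T @ C, C.T @ Y)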

    Audio-Visual Biometrics and Forgery

    Coarticulation and speech synchronization in MPEG-4 based facial animation

    In this paper, we present a novel coarticulation and speech-synchronization framework compliant with MPEG-4 facial animation. The system we have developed builds on the MPEG-4 facial animation standard and related developments to enable the creation, editing and playback of high-resolution 3D models and MPEG-4 animation streams, and it is compatible with well-known related systems such as Greta and Xface. It supports text-to-speech for dynamic speech synchronization. The framework enables real-time model simplification using quadric-based simplification. Our coarticulation approach provides realistic, high-performance lip-sync animation based on Cohen-Massaro's model of coarticulation adapted to the MPEG-4 facial animation (FA) specification. Preliminary experiments show that the coarticulation technique we have developed gives good and promising results compared with related techniques.
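    The Cohen-Massaro model underlying this approach assigns each speech segment a negative-exponential "dominance" over the articulatory targets and blends the targets by their dominance at every instant. The Python sketch below illustrates that blending; the constants and the single-parameter targets are illustrative assumptions, not the paper's tuned MPEG-4 FAP values.

        import math

        def dominance(t, centre, alpha=1.0, theta=4.0, c=1.0):
            # Cohen-Massaro dominance: peaks at the segment centre and
            # decays exponentially; theta controls the rate of decay.
            return alpha * math.exp(-theta * abs(t - centre) ** c)

        def blend(t, segments):
            # segments: (centre_time, target) pairs, where each target
            # could be one MPEG-4 FAP value for the segment's viseme.
            num = sum(dominance(t, ctr) * target for ctr, target in segments)
            den = sum(dominance(t, ctr) for ctr, _ in segments)
            return num / den if den else 0.0

        # Lip opening for /m/ (closed, centred at 0.10 s) followed by
        # /a/ (open, centred at 0.25 s), sampled every 10 ms:
        curve = [blend(i / 100.0, [(0.10, 0.0), (0.25, 1.0)])
                 for i in range(40)]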

    Facial Expression Recognition Using 3D Facial Feature Distances

    Speech-driven facial animation with realistic dynamics

    Lip syncing method for realistic expressive 3D face model

    Lip synchronization of 3D face models is now used in a multitude of important fields. It brings a more human, social and dramatic reality to computer games, films and interactive multimedia, and it is growing in use and importance. A high level of realism is demanded in applications such as computer games and cinema, yet authoring lip syncing with complex and subtle expressions remains difficult and fraught with problems of realism. This research proposes a lip-syncing method for a realistic, expressive 3D face model. Animating lips requires a 3D face model capable of representing the myriad shapes the human face assumes during speech, together with a method to produce the correct lip shape at the correct time. The paper presents a 3D face model designed to support lip syncing aligned with an input audio file. The model deforms using a Raised Cosine Deformation (RCD) function grafted onto the input facial geometry, and is based on the MPEG-4 Facial Animation (FA) standard. The paper further proposes a method to animate the 3D face model over time, creating lip-synced animation from a canonical set of visemes covering all pairwise combinations of a reduced phoneme set called ProPhone. The proposed approach integrates emotion, drawing on the Ekman model and Plutchik's wheel, with emotive eye movements implemented through the Emotional Eye Movements Markup Language (EEMML), to produce a realistic 3D face model. © 2017 Springer Science+Business Media New York
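    The raised-cosine idea behind the RCD function can be pictured as a smooth, compactly supported falloff grafted onto the vertices around a control point. The following NumPy sketch is a hypothetical illustration of that weighting, not the paper's exact RCD formulation.

        import numpy as np

        def raised_cosine_deform(vertices, centre, displacement, radius):
            """Displace mesh vertices near `centre` with a raised-cosine
            falloff (weight 1 at the centre, 0 at `radius` and beyond).
            vertices:     (n, 3) face-model vertex positions
            centre:       (3,)  control point, e.g. an MPEG-4 FA feature
                          point on the lip contour (illustrative choice)
            displacement: (3,)  peak displacement applied at the centre"""
            d = np.linalg.norm(vertices - centre, axis=1)
            w = np.where(d < radius,
                         0.5 * (1.0 + np.cos(np.pi * d / radius)),
                         0.0)
            return vertices + w[:, None] * displacement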

    A MPEG-4 virtual human animation engine for interactive web based applications

    This paper presents a novel, MPEG-4 compliant animation engine (body player). It has been designed to synthesize virtual-human full-body animations in interactive multimedia applications for the web. We believe that a full-body player can provide a more expressive and engaging interface than animated faces alone (talking heads). This is one of the first implementations of an MPEG-4 animation engine with deformable models (it uses the MPEG-4 Body Definition Parameters and Deformation Tables). Several potential applications are outlined. This software tool was developed in the framework of the IST-INTERFACE European project.
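    In this setting a Deformation Table maps key values of a Body Animation Parameter (BAP) to per-vertex displacements, and the player interpolates between the bracketing key poses. The sketch below is a hypothetical in-memory illustration of that lookup; the actual MPEG-4 BDP/Deformation Table encoding is considerably more involved.

        import numpy as np

        def apply_deformation_table(base_vertices, table, bap_value):
            # table: list of (bap_key, (n, 3) displacement array) pairs,
            # sorted by bap_key -- a hypothetical in-memory form of an
            # MPEG-4 Deformation Table.
            keys = [k for k, _ in table]
            if bap_value <= keys[0]:
                return base_vertices + table[0][1]
            if bap_value >= keys[-1]:
                return base_vertices + table[-1][1]
            i = max(j for j, k in enumerate(keys) if k <= bap_value)
            k0, d0 = table[i]
            k1, d1 = table[i + 1]
            t = (bap_value - k0) / (k1 - k0)
            # Linearly blend the two bracketing key-pose displacements
            return base_vertices + (1.0 - t) * d0 + t * d1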