Linear Facial Expression Transfer With Active Appearance Models
The issue of transferring facial expressions from one person's face to another's has been an area of interest for the movie industry and the computer graphics community for quite some time. In recent years, with the proliferation of online image and video collections and web applications, such as Google Street View, the question of preserving privacy through face de-identification has gained interest in the computer vision community. In this paper, we focus on the problem of real-time dynamic facial expression transfer using an Active Appearance Model framework. We provide a theoretical foundation for a generalisation of two well-known expression transfer methods and demonstrate the improved visual quality of the proposed linear extrapolation transfer method on examples of face swapping and expression transfer using the AVOZES data corpus. Realistic talking faces can be generated in real time at low computational cost.
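A minimal sketch of the linear extrapolation idea in AAM parameter space (function and variable names are illustrative, and the paper's exact parameterisation may differ): the expression is encoded as a parameter displacement on the source face and added to the target's neutral parameters.

```python
import numpy as np

def transfer_expression(p_src_neutral, p_src_expr, p_tgt_neutral, gain=1.0):
    """Linear extrapolation transfer in AAM parameter space (illustrative).

    The expression is represented as a displacement from the source
    subject's neutral parameters and added to the target's neutral
    parameters, optionally scaled by a gain factor.
    """
    delta = p_src_expr - p_src_neutral      # expression displacement
    return p_tgt_neutral + gain * delta     # extrapolate onto the target

# Toy example with random 20-D combined shape/appearance parameter vectors.
rng = np.random.default_rng(0)
p_src_neutral = rng.normal(size=20)
p_src_smile = p_src_neutral + rng.normal(scale=0.3, size=20)
p_tgt_neutral = rng.normal(size=20)

p_tgt_smile = transfer_expression(p_src_neutral, p_src_smile, p_tgt_neutral)
```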
Learning weakly supervised multimodal phoneme embeddings
Recent works have explored deep architectures for learning multimodal speech representations (e.g. audio and images, articulation and audio) in a supervised way. Here we investigate the role of combining different speech modalities, i.e. audio and visual information representing the lip movements, in a weakly supervised way using Siamese networks and lexical same-different side information. In particular, we ask whether one modality can benefit from the other to provide a richer representation for phone recognition in a weakly supervised setting. We introduce mono-task and multi-task methods for merging speech and visual modalities for phone recognition. The mono-task learning consists of applying a Siamese network to the concatenation of the two modalities, while the multi-task learning receives several different combinations of modalities at train time. We show that multi-task learning enhances discriminability for visual and multimodal inputs while minimally impacting auditory inputs. Furthermore, we present a qualitative analysis of the obtained phone embeddings and show that cross-modal visual input can improve the discriminability of phonological features that are visually discernible (rounding, open/close, labial place of articulation), resulting in representations that are closer to abstract linguistic features than those based on audio only.
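A minimal sketch of the mono-task setup under this kind of weak same-different supervision, using PyTorch with a contrastive loss as a stand-in for the paper's exact training objective (all dimensions and names are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEncoder(nn.Module):
    """Shared encoder applied to both members of a pair (mono-task setup:
    audio and visual features are concatenated at the input)."""
    def __init__(self, audio_dim=40, visual_dim=20, embed_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(audio_dim + visual_dim, 128), nn.ReLU(),
            nn.Linear(128, embed_dim),
        )

    def forward(self, audio, visual):
        return self.net(torch.cat([audio, visual], dim=-1))

def contrastive_loss(e1, e2, same, margin=1.0):
    """Pull 'same' pairs together, push 'different' pairs at least
    `margin` apart (weak lexical same-different supervision)."""
    d = F.pairwise_distance(e1, e2)
    return (same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2)).mean()

# Toy batch: 8 pairs of (audio, visual) features with same/different labels.
enc = SiameseEncoder()
a1, v1 = torch.randn(8, 40), torch.randn(8, 20)
a2, v2 = torch.randn(8, 40), torch.randn(8, 20)
same = torch.randint(0, 2, (8,)).float()
loss = contrastive_loss(enc(a1, v1), enc(a2, v2), same)
loss.backward()
```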
Automatic emotional state detection using facial expression dynamic in videos
In this paper, an automatic emotion detection system is built that enables a computer or machine to detect the emotional state from facial expressions in human-computer communication. First, dynamic motion features are extracted from facial expression videos; advanced machine learning methods for classification and regression are then used to predict the emotional states. The system is evaluated on two publicly available datasets, GEMEP_FERA and AVEC2013, and satisfactory performance is achieved in comparison with the provided baseline results. With this emotional state detection capability, a machine can read its user's facial expressions automatically. This technique can be integrated into applications such as smart robots, interactive games and smart surveillance systems.
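As a rough illustration of such a pipeline, here is a hedged sketch that derives simple dynamic motion statistics from tracked landmarks and regresses a continuous affect label with scikit-learn; the paper's actual features and learners may well differ:

```python
import numpy as np
from sklearn.svm import SVR

def dynamic_motion_features(landmarks):
    """Summarise per-frame landmark motion over a video clip.

    `landmarks` has shape (frames, points, 2); the feature vector holds
    simple statistics of the frame-to-frame displacements (an assumption;
    the paper's exact descriptor may differ).
    """
    disp = np.diff(landmarks, axis=0)         # (frames-1, points, 2)
    mag = np.linalg.norm(disp, axis=-1)       # per-point motion magnitude
    return np.concatenate([mag.mean(axis=0), mag.std(axis=0), mag.max(axis=0)])

# Toy data: 50 clips, 30 frames, 68 landmarks; continuous affect labels.
rng = np.random.default_rng(0)
X = np.stack([dynamic_motion_features(rng.normal(size=(30, 68, 2)))
              for _ in range(50)])
y = rng.uniform(-1, 1, size=50)

reg = SVR(kernel="rbf").fit(X, y)
print(reg.predict(X[:3]))
```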
Learning based automatic face annotation for arbitrary poses and expressions from frontal images only
Statistical approaches for building non-rigid deformable models, such as the active appearance model (AAM), have enjoyed great popularity in recent years, but typically require tedious manual annotation of training images. In this paper, a learning-based approach for the automatic annotation of visually deformable objects from a single annotated frontal image is presented and demonstrated on the example of automatically annotating face images that can be used for building AAMs for fitting and tracking. This approach employs the idea of initially learning the correspondences between landmarks in a frontal image and a set of training images with a face in arbitrary poses. Using this learner, virtual images of unseen faces at any arbitrary pose for which the learner was trained can be reconstructed by predicting the new landmark locations and warping the texture from the frontal image. View-based AAMs are then built from the virtual images and used for automatically annotating unseen images, including images of different facial expressions, at any pose within the maximum range spanned by the virtually reconstructed images. The approach is experimentally validated by automatically annotating face images from three different databases.
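A hedged sketch of the core idea, assuming a simple ridge regressor for the frontal-to-posed landmark correspondence and scikit-image's piecewise-affine warp for the texture transfer (both are illustrative stand-ins, not the paper's exact learner):

```python
import numpy as np
from sklearn.linear_model import Ridge
from skimage.transform import PiecewiseAffineTransform, warp

# Toy setup: 68 landmarks in (x, y); training pairs of frontal landmarks
# and the corresponding landmarks of the same faces at a target pose.
rng = np.random.default_rng(0)
n_train, n_pts = 40, 68
frontal_train = rng.uniform(20, 100, size=(n_train, n_pts * 2))
posed_train = frontal_train + rng.normal(scale=2.0, size=frontal_train.shape)

# Learn the frontal -> posed landmark correspondence (illustrative learner).
reg = Ridge(alpha=1.0).fit(frontal_train, posed_train)

# For an unseen frontal face, predict its landmarks at the target pose
# and warp the frontal texture onto them (piecewise-affine warping).
frontal_lms = rng.uniform(20, 100, size=(n_pts, 2))
posed_lms = reg.predict(frontal_lms.reshape(1, -1)).reshape(n_pts, 2)

tform = PiecewiseAffineTransform()
tform.estimate(posed_lms, frontal_lms)      # warp() expects output -> input
frontal_image = rng.random((128, 128))
virtual_posed = warp(frontal_image, tform)  # virtual image at the new pose
```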
GAGAN: Geometry-Aware Generative Adversarial Networks
Deep generative models learned through adversarial training have become increasingly popular for their ability to generate naturalistic image textures. However, aside from their texture, the visual appearance of objects is significantly influenced by their shape geometry, information that is not taken into account by existing generative models. This paper introduces Geometry-Aware Generative Adversarial Networks (GAGAN) for incorporating geometric information into the image generation process. Specifically, in GAGAN the generator samples latent variables from the probability space of a statistical shape model. By mapping the output of the generator to a canonical coordinate frame through a differentiable geometric transformation, we enforce the geometry of the objects and add an implicit connection from the prior to the generated object. Experimental results on face generation indicate that GAGAN can generate realistic images of faces with arbitrary facial attributes, such as facial expression, pose and morphology, that are of better quality than those of current GAN-based methods. Our method can be used to augment any existing GAN architecture and improve the quality of the images generated.
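A minimal sketch of the latent sampling step, assuming a standard PCA point distribution model as the statistical shape model (names and dimensions are illustrative, not taken from the paper):

```python
import torch

class ShapeModelPrior:
    """Statistical (PCA) shape model used as the latent prior: a sampled
    parameter vector p yields landmarks s = mean + basis @ p, with p drawn
    from N(0, diag(eigenvalues)) as in standard point distribution models."""
    def __init__(self, mean, basis, eigvals):
        self.mean, self.basis, self.eigvals = mean, basis, eigvals

    def sample(self, n):
        p = torch.randn(n, self.eigvals.numel()) * self.eigvals.sqrt()
        shapes = self.mean + p @ self.basis.T        # (n, 2 * n_points)
        return p, shapes

# Toy model: 68 2-D landmarks, 16 shape components.
n_coords, n_comp = 68 * 2, 16
prior = ShapeModelPrior(mean=torch.zeros(n_coords),
                        basis=torch.randn(n_coords, n_comp),
                        eigvals=torch.linspace(1.0, 0.1, n_comp))

p, shapes = prior.sample(4)
z = torch.cat([p, torch.randn(4, 32)], dim=1)  # shape params + appearance noise
# z is fed to the generator; its output would then be mapped to a canonical
# frame by a differentiable (spatial-transformer-style) warp driven by `shapes`.
```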
Automated Analysis of Corpora Callosa
This report describes and evaluates the steps needed to perform modern model-based interpretation of the corpus callosum in MRI. The process is discussed from the initial landmark-free contours to full-fledged statistical models based on the Active Appearance Models framework. Topics treated include landmark placement, background modelling and multi-resolution analysis. Preliminary quantitative and qualitative validation in a cross-sectional study shows that fully automated analysis and segmentation of the corpus callosum are feasible.
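For context, a hedged sketch of the shape-modelling step common to such AAM pipelines: generalised Procrustes alignment of the landmark contours followed by PCA. This is a generic illustration, not the report's exact procedure:

```python
import numpy as np

def align_shapes(shapes, n_iter=5):
    """Generalised Procrustes alignment of landmark shapes, given as an
    array of shape (samples, points, 2): remove translation and scale,
    then iteratively rotate each shape onto the evolving mean."""
    aligned = shapes - shapes.mean(axis=1, keepdims=True)           # centre
    aligned /= np.linalg.norm(aligned, axis=(1, 2), keepdims=True)  # scale
    mean = aligned.mean(axis=0)
    for _ in range(n_iter):
        for i, s in enumerate(aligned):
            u, _, vt = np.linalg.svd(s.T @ mean)   # orthogonal Procrustes
            aligned[i] = s @ (u @ vt)              # optimal rotation
        mean = aligned.mean(axis=0)
        mean /= np.linalg.norm(mean)
    return aligned, mean

# Toy corpus: 30 contours of 64 landmarks each.
rng = np.random.default_rng(0)
shapes = rng.normal(size=(30, 64, 2))
aligned, mean = align_shapes(shapes)

# Statistical shape model: PCA on the aligned, vectorised shapes.
X = aligned.reshape(30, -1) - mean.ravel()
_, svals, components = np.linalg.svd(X, full_matrices=False)
eigvals = svals**2 / (len(X) - 1)
```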
3D facial geometric features for constrained local model
We propose a 3D Constrained Local Model (CLM) framework for deformable face alignment in depth images. Our framework exploits the intrinsic 3D geometric information in depth data by utilizing robust histogram-based 3D geometric features based on normal vectors. In addition, we demonstrate that fusing intensity data with the 3D features further improves facial landmark localization accuracy. The experiments are conducted on the publicly available FRGC database. The results show that our 3D-feature-based CLM clearly outperforms a CLM based on raw depth features in terms of fitting accuracy and robustness, and that the fusion of intensity and 3D depth features improves the performance further. Another benefit is that the proposed 3D features in our framework do not require any pre-processing of the data.
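A hedged sketch of what a histogram-based normal-vector feature might look like, estimating normals from depth gradients and binning their orientations (an illustrative stand-in for the paper's exact descriptor):

```python
import numpy as np

def normal_histogram_feature(depth, patch, bins=8):
    """Histogram of surface-normal orientations in a depth-image patch.

    Normals are estimated from depth gradients; the feature is a 2-D
    histogram of the normals' azimuth/elevation angles (an illustrative
    stand-in for the paper's histogram-based 3D geometric features).
    """
    r0, r1, c0, c1 = patch
    gy, gx = np.gradient(depth[r0:r1, c0:c1])
    n = np.dstack([-gx, -gy, np.ones_like(gx)])        # un-normalised normals
    n /= np.linalg.norm(n, axis=-1, keepdims=True)
    azim = np.arctan2(n[..., 1], n[..., 0])
    elev = np.arccos(np.clip(n[..., 2], -1, 1))
    hist, _, _ = np.histogram2d(azim.ravel(), elev.ravel(), bins=bins,
                                range=[[-np.pi, np.pi], [0, np.pi / 2]])
    return (hist / hist.sum()).ravel()

# Toy depth map and a patch around a candidate landmark.
depth = np.random.default_rng(0).normal(size=(120, 120)).cumsum(axis=0)
feat = normal_histogram_feature(depth, patch=(40, 72, 40, 72))
```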
- …