27 research outputs found

Hierarchical online appearance-based tracking for 3D head pose, eyebrows, lips, eyelids, and irises

Author: Gonzalez Garcia Jordi
Orozco Javier
Pantic Maja
Rudovic Ognjen
Publication venue: Elsevier
Publication date: 01/01/2013

In this paper, we propose an On-line Appearance-Based Tracker (OABT) for simultaneous tracking of 3D head pose, lips, eyebrows, eyelids and irises in monocular video sequences. In contrast to previously proposed tracking approaches, which deal with face and gaze tracking separately, our OABT can be used for eyelid and iris tracking as well as for tracking 3D head pose and the facial actions of the lips and eyebrows. Furthermore, our approach learns changes in the appearance of the tracked target on-line. Hence, the prior training of appearance models, which usually requires a large number of labeled facial images, is avoided. Moreover, the proposed method is built upon a hierarchical combination of three OABTs, which are optimized using a Levenberg–Marquardt Algorithm (LMA) enhanced with line-search procedures. This, in turn, makes the proposed method robust to changes in lighting conditions, occlusions and translucent textures, as evidenced by our experiments. Finally, the proposed method achieves head and facial action tracking in real time.

University of Twente Research Information

Hierarchical eyelid and face tracking

Author: Gonzàlez Jordi
Orozco Francisco J.
Rius Ignasi
Roca Francesc Xavier
Publication venue: Springer Verlag
Publication date: 01/01/2007

Most applications in Human-Computer Interaction (HCI) need to extract the movements of users' faces while avoiding high memory and time costs. Moreover, HCI systems usually use low-cost cameras, whereas current face tracking techniques depend strongly on image resolution. In this paper, we tackle the problem of eyelid tracking by using Appearance-Based Models, thus achieving accurate estimates of eyelid movements while avoiding cues that require high-resolution faces, such as edge detectors or colour information. Consequently, we can track the fast and spontaneous movements of the eyelids, a very hard task due to the low resolution of the eye regions. Subsequently, we combine the results of eyelid tracking with the estimates of other facial features, such as the eyebrows and the lips. As a result, a hierarchical tracking framework is obtained: we demonstrate that combining two appearance-based trackers yields accurate estimates for the eyelids, eyebrows, lips and also the 3D head pose, using low-cost video cameras and in real time. Therefore, our approach is shown to be suitable for further facial-expression analysis.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Correlated-Spaces Regression for Learning Continuous Emotion Dimensions

Author: Nicolaou M.
Pantic M.
Zafeiriou S.
Publication venue: ACM
Publication date: 01/01/2013

Adopting continuous dimensional annotations for affective analysis has attracted growing attention from researchers over the past years. Due to the idiosyncratic nature of this problem, many subproblems have been identified, ranging from the fusion of multiple continuous annotations to exploiting output-correlations amongst emotion dimensions. In this paper, we first empirically answer several important questions that have so far found only a partial answer, or none at all, in the related literature. In more detail, we study the correlation of each emotion dimension (i) with other emotion dimensions and (ii) with basic emotions (e.g., happiness, anger). As a measure for comparison, we use video and audio features. Interestingly enough, we find that (i) each emotion dimension is more correlated with other emotion dimensions than with face and audio features, and similarly (ii) that each basic emotion is more correlated with emotion dimensions than with audio and video features. A similar conclusion holds for discrete emotions, which are found to be more highly correlated with emotion dimensions than with audio and/or video features. Motivated by these findings, we present a novel regression algorithm (Correlated-Spaces Regression, CSR), inspired by Canonical Correlation Analysis (CCA), which learns output-correlations and performs supervised dimensionality reduction and multimodal fusion by (i) projecting features extracted from all modalities and labels onto a common space where their inter-correlation is maximised and (ii) learning mappings from the projected feature space onto the projected, uncorrelated label space.

University of Twente Research Information

Improved facial feature fitting for model based coding and animation

Author: Kuo Po Tsun Paul
Publication venue: The University of Edinburgh
Publication date: 01/01/2006

EThOS - Electronic Theses Online Service (United Kingdom)

Edinburgh Research Archive

OpenGrey Repository

Correlated-spaces regression for learning continuous emotion dimensions

Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2013

Crossref

Timing is everything: A spatio-temporal approach to the analysis of facial actions

Author: Valstar Michel Francois
Publication venue: Computing, Imperial College London
Publication date: 01/08/2008

This thesis presents a fully automatic facial expression analysis system based on the Facial Action Coding System (FACS). FACS is the best known and most commonly used system for describing facial activity in terms of facial muscle actions (i.e., action units, AUs). We present our research on the analysis of the morphological, spatio-temporal and behavioural aspects of facial expressions. In contrast with most other researchers in the field, who use appearance-based techniques, we use a geometric feature-based approach. We argue that this approach is more suitable for analysing the temporal dynamics of facial expressions. Our system is capable of explicitly exploring the temporal aspects of facial expressions from an input colour video in terms of their onset (start), apex (peak) and offset (end). The fully automatic system presented here detects 20 facial points in the first frame and tracks them throughout the video. From the tracked points we compute geometry-based features, which serve as the input to the remainder of our system. The AU activation detection system uses GentleBoost feature selection and a Support Vector Machine (SVM) classifier to find which AUs were present in an expression. Temporal dynamics of active AUs are recognised by a hybrid GentleBoost-SVM-Hidden Markov model classifier. The system is capable of analysing 23 out of 27 existing AUs with high accuracy. The main contributions of the work presented in this thesis are the following: we have created a method for fully automatic AU analysis with state-of-the-art recognition results. We have proposed for the first time a method for recognition of the four temporal phases of an AU. We have built the largest comprehensive database of facial expressions to date. We also present for the first time in the literature two studies on automatic distinction between posed and spontaneous expressions.

Spiral - Imperial College Digital Repository

Robust Canonical Correlation Analysis: Audio-visual Fusion for Learning Continuous Interest

Author: Nicolaou Mihalis
Panagakis Yannis
Pantic Maja
Zafeiriou Stefanos

Goldsmiths Research Online

A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild"

Author: Antonakos Epameinondas
Asthana Akshay
Chrysos Grigorios G.
Snape Patrick
Zafeiriou Stefanos
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/02/2017

Recently, technologies such as face detection, facial landmark localisation and face recognition and verification have matured enough to provide effective and efficient solutions for imagery captured under arbitrary conditions (referred to as "in-the-wild"). This is partially attributed to the fact that comprehensive "in-the-wild" benchmarks have been developed for face detection, landmark localisation and recognition/verification. A very important technology that has not yet been thoroughly evaluated is deformable face tracking "in-the-wild". Until now, performance has mainly been assessed qualitatively, by visually inspecting the result of a deformable face tracking technology on short videos. In this paper, we perform the first, to the best of our knowledge, thorough evaluation of state-of-the-art deformable face tracking pipelines using the recently introduced 300VW benchmark. We evaluate many different architectures, focusing mainly on the task of on-line deformable face tracking. In particular, we compare the following general strategies: (a) generic face detection plus generic facial landmark localisation, (b) generic model-free tracking plus generic facial landmark localisation, and (c) hybrid approaches using state-of-the-art face detection, model-free tracking and facial landmark localisation technologies. Our evaluation reveals future avenues for further research on the topic. Comment: E. Antonakos and P. Snape contributed equally and have joint second authorship.

arXiv.org e-Print Archive

Springer - Publisher Connector

Spiral - Imperial College Digital Repository

Organising a photograph collection based on human appearance

Author: Uscilowski Bartlomiej
Publication venue: Dublin City University. School of Electronic Engineering
Publication date: 01/11/2007

This thesis describes a complete framework for organising digital photographs in an unsupervised manner, based on the appearance of people captured in the photographs. Organising a collection of photographs manually, especially providing the identities of people captured in photographs, is a time-consuming task. Unsupervised grouping of images containing similar persons makes annotating names easier (as a group of images can be named at once) and enables quick search based on query by example. The full process of unsupervised clustering is discussed in this thesis. Methods for locating facial components are discussed, and a technique based on colour image segmentation is proposed and tested. Additionally, a method based on a Principal Component Analysis template is tested. These provide the eye locations required for acquiring a normalised facial image. This image is then preprocessed by histogram equalisation and feathering, and the features of the MPEG-7 face recognition descriptor are extracted. A distance measure proposed in the MPEG-7 standard is used as a similarity measure. Three approaches to grouping that use only face recognition features for clustering are analysed. These are modified k-means, single-link and a method based on a nearest neighbour classifier. The nearest neighbour-based technique is chosen for further experiments with fusing information from several sources. These sources are context-based, such as events (party, trip, holidays) and the ownership of photographs, and content-based, such as information about the colour and texture of the bodies of humans appearing in photographs. Two techniques are proposed for fusing event and ownership (user) information with the face recognition features: a Transferable Belief Model (TBM) and three-level clustering. The three-level clustering is carried out at “event” level, “user” level and “collection” level. The latter technique proves to be the most efficient.
For combining body information with the face recognition features, three probabilistic fusion methods are tested. These are the average sum, the generalised product and the maximum rule. Combinations are tested within events and within user collections. This work concludes with a brief discussion on extraction of key images for a representation of each cluster.
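
The three fusion rules named in the abstract above can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `fuse_scores`, the input format (one probability per candidate cluster from each source) and the final renormalisation are assumptions made for illustration, and a plain element-wise product stands in for the "generalised product", whose exact definition the abstract does not give.

```python
def fuse_scores(face, body, rule="sum"):
    """Fuse two lists of per-cluster probabilities (hypothetical helper).

    face, body: probabilities that a photo belongs to each candidate
    cluster, as estimated from face features and from body features.
    rule: "sum" (average sum), "product" (plain product, used here as a
    stand-in for the generalised product) or "max" (maximum rule).
    """
    if len(face) != len(body):
        raise ValueError("both sources must score the same clusters")

    if rule == "sum":
        fused = [(f + b) / 2 for f, b in zip(face, body)]
    elif rule == "product":
        fused = [f * b for f, b in zip(face, body)]
    elif rule == "max":
        fused = [max(f, b) for f, b in zip(face, body)]
    else:
        raise ValueError(f"unknown rule: {rule}")

    # Renormalise so the fused scores sum to 1 (an assumption, made so
    # that results of the three rules are comparable).
    total = sum(fused)
    return [s / total for s in fused] if total > 0 else fused


# Illustrative scores for three candidate clusters (made-up numbers).
face_scores = [0.6, 0.3, 0.1]
body_scores = [0.2, 0.5, 0.3]
for rule in ("sum", "product", "max"):
    print(rule, fuse_scores(face_scores, body_scores, rule))
```

With these made-up inputs the rules can disagree on the best cluster: the product rule favours a cluster that both sources rate reasonably well, while the maximum rule follows whichever source is most confident.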

Irish Universities

DCU Online Research Access Service