Predicting continuous conflict perception with Bayesian Gaussian processes

Author: Filippone Maurizio
Kim Samuel
Valente Fabio
Vinciarelli Alessandro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

Conflict is one of the most important phenomena of social life, but it is still largely neglected by the computing community. This work proposes an approach that detects common conversational social signals (loudness, overlapping speech, etc.) and predicts the conflict level perceived by human observers in continuous, non-categorical terms. The proposed regression approach is fully Bayesian and adopts Automatic Relevance Determination to identify the social signals that most influence the outcome of the prediction. The experiments are performed over the SSPNet Conflict Corpus, a publicly available collection of 1430 clips extracted from televised political debates (roughly 12 hours of material for 138 subjects in total). The results show that it is possible to achieve a correlation close to 0.8 between actual and predicted conflict perception.
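Note on Automatic Relevance Determination: the abstract above does not specify the covariance function used. As a minimal sketch only, assuming the common squared-exponential form k(x, x') = s * exp(-0.5 * sum_d ((x_d - x'_d) / l_d)^2), each input feature d gets its own length-scale l_d, and a feature with a very large length-scale has almost no effect on the covariance. The function names, feature names, and numeric values below are hypothetical illustrations and are not taken from the paper.

```python
import math


def ard_rbf_kernel(x, y, length_scales, signal_variance=1.0):
    """Squared-exponential covariance with one length-scale per feature (ARD)."""
    if not (len(x) == len(y) == len(length_scales)):
        raise ValueError("x, y and length_scales must have the same length")
    # Each squared difference is divided by that feature's own length-scale,
    # so features with large length-scales contribute very little.
    scaled_sq_dist = sum(
        ((a - b) / l) ** 2 for a, b, l in zip(x, y, length_scales)
    )
    return signal_variance * math.exp(-0.5 * scaled_sq_dist)


def relevance_ranking(feature_names, length_scales):
    """Order features from most to least relevant (shortest length-scale first)."""
    return sorted(zip(feature_names, length_scales), key=lambda pair: pair[1])


# Hypothetical feature names and length-scales, for illustration only.
ranking = relevance_ranking(["loudness", "overlap"], [2.0, 0.5])
```

In a fully Bayesian treatment the length-scales would be inferred from data along with the other hyperparameters; here they are fixed by hand purely to show how they weight each feature.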

Crossref

Enlighten

Fully Automatic Expression-Invariant Face Correspondence

Author: A Bronstein
A Elad
Augusto Salazar
B Allen
Chang Shu
D Jiang
D Liu
D Vlasic
E Learned-Miller
E Vezzetti
Flavio Prieto
G Passalis
H Li
H Li
H Rabiu
I Dryden
I Kakadiaris
J Guo
J Han
K Chang
L Xiaoguang
M Segundo
O Kaick van
P Nair
P Xi
S Berretti
S Wuhrer
Stefanie Wuhrer
T Cox
T Weise
Y Huang
Y Tong
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

We consider the problem of computing accurate point-to-point correspondences among a set of human face scans with varying expressions. Our fully automatic approach does not require any manually placed markers on the scan. Instead, the approach learns the locations of a set of landmarks present in a database and uses this knowledge to automatically predict the locations of these landmarks on a newly available scan. The predicted landmarks are then used to compute point-to-point correspondences between a template model and the newly available scan. To accurately fit the expression of the template to the expression of the scan, we use a blendshape model as the template. Our algorithm was tested on a database of human faces of different ethnic groups with strongly varying expressions. Experimental results show that the obtained point-to-point correspondence is both highly accurate and consistent for most of the tested 3D face models.

arXiv.org e-Print Archive

CiteSeerX

NRC Publications Archive

Crossref

Unobtrusive and pervasive video-based eye-gaze tracking

Author: Camilleri Kenneth P.
Cristina Stefania
Publication venue: 'Elsevier BV'
Publication date: 01/05/2018

Eye-gaze tracking has long been considered a desktop technology that finds its use inside the traditional office setting, where the operating conditions may be controlled. Nonetheless, recent advancements in mobile technology and a growing interest in capturing natural human behaviour have motivated an emerging interest in tracking eye movements within unconstrained real-life conditions, referred to as pervasive eye-gaze tracking. This critical review focuses on emerging passive and unobtrusive video-based eye-gaze tracking methods in recent literature, with the aim of identifying the different research avenues that are being followed in response to the challenges of pervasive eye-gaze tracking. Different eye-gaze tracking approaches are discussed in order to bring out their strengths and weaknesses, and to identify any limitations, within the context of pervasive eye-gaze tracking, that have yet to be considered by the computer vision community.

OAR@UM

A survey on mouth modeling and analysis for Sign Language recognition

Author: Antonakos E
Roussos A
Zafeiriou S
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/05/2015

Around 70 million Deaf people worldwide use Sign Languages (SLs) as their native languages. At the same time, they have limited reading/writing skills in the spoken language. This puts them at a severe disadvantage in many contexts, including education, work, and usage of computers and the Internet. Automatic Sign Language Recognition (ASLR) can support the Deaf in many ways, e.g. by enabling the development of systems for Human-Computer Interaction in SL and translation between sign and spoken language. Research in ASLR usually revolves around automatic understanding of manual signs. Recently, the ASLR research community has started to appreciate the importance of non-manuals, since they are related to the lexical meaning of a sign, the syntax and the prosody. Non-manuals include body and head pose, movement of the eyebrows and the eyes, as well as blinks and squints. Arguably, the mouth is one of the most involved parts of the face in non-manuals. Mouth actions related to ASLR can be either mouthings, i.e. visual syllables with the mouth while signing, or non-verbal mouth gestures. Both are very important in ASLR. In this paper, we present the first survey on mouth non-manuals in ASLR. We start by showing why mouth motion is important in SL and the relevant techniques that exist within ASLR. Since limited research has been conducted regarding automatic analysis of mouth motion in the context of ASLR, we proceed by surveying relevant techniques from the areas of automatic mouth expression and visual speech recognition which can be applied to the task. Finally, we conclude by presenting the challenges and potentials of automatic analysis of mouth motion in the context of ASLR.

CiteSeerX

Crossref

Spiral - Imperial College Digital Repository

Recognising facial expressions in video sequences

Author: A Gee
A Lanitis
B Fasel
B Raducanu
D DeCarlo
D DeCarlo
D Terzopoulos
Enrique Muñoz
FM Alkoot
G Hager
G Zhao
H Rowley
I Cohen
I Essa
I Matthews
J Ohya
JB Tenenbaum
JJ Lien
JM Buenaposada
JN Basili
José M. Buenaposada
K Mase
Luis Baumela
M Panti
M Pantic
M Rosenblum
M Turk
MF McTear
MJ Black
MJ Black
MJ Lyons
MKH Leung
N Oliver
P Ekman
P Ekman
P Rani
P Viola
PN Belhumeur
R Cowie
RO Duda
RW Picard
S Baker
S Roweis
T Cootes
T Ojala
Y Tian
Y Xu
Y Yacoob
Y Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

We introduce a system that processes a sequence of images of a front-facing human face and recognises a set of facial expressions. We use an efficient appearance-based face tracker to locate the face in the image sequence and estimate the deformation of its non-rigid components. The tracker works in real-time. It is robust to strong illumination changes and separates changes in appearance caused by illumination from changes due to face deformation. We adopt a model-based approach for facial expression recognition. In our model, an image of a face is represented by a point in a deformation space. The variability of the classes of images associated with facial expressions is represented by a set of samples which model a low-dimensional manifold in the space of deformations. We introduce a probabilistic procedure based on a nearest-neighbour approach to combine the information provided by the incoming image sequence with the prior information stored in the expression manifold in order to compute a posterior probability associated with a facial expression. In the experiments conducted we show that this system is able to work in an unconstrained environment with strong changes in illumination and face location. It achieves an 89% recognition rate in a set of 333 sequences from the Cohn-Kanade database.

CiteSeerX

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

3D face tracking and multi-scale, spatio-temporal analysis of linguistically significant facial expressions and head positions in ASL

Author: Liu Bo
Liu Jingjing
Metaxas Dimitris
Neidle Carol
Yu Xiang
Publication venue: EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA
Publication date: 01/01/2014

Essential grammatical information is conveyed in signed languages by clusters of events involving facial expressions and movements of the head and upper body. This poses a significant challenge for computer-based sign language recognition. Here, we present new methods for the recognition of nonmanual grammatical markers in American Sign Language (ASL) based on: (1) new 3D tracking methods for the estimation of 3D head pose and facial expressions to determine the relevant low-level features; (2) methods for higher-level analysis of component events (raised/lowered eyebrows, periodic head nods and head shakes) used in grammatical markings, with differentiation of temporal phases (onset, core, offset, where appropriate), analysis of their characteristic properties, and extraction of corresponding features; (3) a 2-level learning framework to combine low- and high-level features of differing spatio-temporal scales. This new approach achieves significantly better tracking and recognition results than our previous methods.

Boston University Institutional Repository (OpenBU)

Expressive Body Capture: 3D Hands, Face, and Body from a Single Image

Author: Black Michael J.
Bolkart Timo
Choutas Vasileios
Ghorbani Nima
Osman Ahmed A. A.
Pavlakos Georgios
Tzionas Dimitrios
Publication date: 01/01/2019

To facilitate the analysis of human actions, interactions and emotions, we compute a 3D model of human body pose, hand pose, and facial expression from a single monocular image. To achieve this, we use thousands of 3D scans to train a new, unified, 3D model of the human body, SMPL-X, that extends SMPL with fully articulated hands and an expressive face. Learning to regress the parameters of SMPL-X directly from images is challenging without paired images and 3D ground truth. Consequently, we follow the approach of SMPLify, which estimates 2D features and then optimizes model parameters to fit the features. We improve on SMPLify in several significant ways: (1) we detect 2D features corresponding to the face, hands, and feet and fit the full SMPL-X model to these; (2) we train a new neural network pose prior using a large MoCap dataset; (3) we define a new interpenetration penalty that is both fast and accurate; (4) we automatically detect gender and the appropriate body models (male, female, or neutral); (5) our PyTorch implementation achieves a speedup of more than 8x over Chumpy. We use the new method, SMPLify-X, to fit SMPL-X to both controlled images and images in the wild. We evaluate 3D accuracy on a new curated dataset comprising 100 images with pseudo ground-truth. This is a step towards automatic expressive human capture from monocular RGB data. The models, code, and data are available for research purposes at https://smpl-x.is.tue.mpg.de. Comment: To appear in CVPR 201

arXiv.org e-Print Archive

Crossref

MPG.PuRe