Every Smile is Unique: Landmark-Guided Diverse Smile Generation

Author: Alameda-Pineda Xavier
Fua Pascal
Ricci Elisa
Sebe Nicu
Wang Wei
Xu Dan
Publication date: 01/01/2018

Each smile is unique: one person surely smiles in different ways (e.g., closing/opening the eyes or mouth). Given one input image of a neutral face, can we generate multiple smile videos with distinctive characteristics? To tackle this one-to-many video generation problem, we propose a novel deep learning architecture named Conditional Multi-Mode Network (CMM-Net). To better encode the dynamics of facial expressions, CMM-Net explicitly exploits facial landmarks for generating smile sequences. Specifically, a variational auto-encoder is used to learn a facial landmark embedding. This single embedding is then exploited by a conditional recurrent network, which generates a landmark embedding sequence conditioned on a specific expression (e.g., spontaneous smile). Next, the generated landmark embeddings are fed into a multi-mode recurrent landmark generator, producing a set of landmark sequences that are still associated with the given smile class but clearly distinct from each other. Finally, these landmark sequences are translated into face videos. Our experimental results demonstrate the effectiveness of CMM-Net in generating realistic videos of multiple smile expressions.
Comment: Accepted as a poster in Conference on Computer Vision and Pattern Recognition (CVPR), 201

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Hal - Université Grenoble Alpes

Archivio della ricerca - Fondazione Bruno Kessler

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1

Automated drowsiness detection for improved driving safety

Author: Bartlett Marian
Çetin Müjdat
Erçil Aytül
Littlewort Gwen
Movellan Javier
Vural Esra
Publication venue: Ford Otosan
Publication date: 01/11/2008

Several approaches have been proposed for the detection and prediction of drowsiness. They can be categorized as estimating fitness for duty, modeling sleep-wake rhythms, measuring vehicle-based performance, and online operator monitoring. Computer-vision-based online operator monitoring has become prominent because of its ability to predict drowsiness. Previous studies with this approach detect driver drowsiness primarily by making prior assumptions about the relevant behavior, focusing on blink rate, eye closure, and yawning. Here we employ machine learning to data-mine actual human behavior during drowsiness episodes. Automatic classifiers for 30 facial actions from the Facial Action Coding System were developed using machine learning on a separate database of spontaneous expressions. These facial actions include blinking and yawn motions, as well as a number of other facial movements. In addition, head motion was collected through automatic eye tracking and an accelerometer. These measures were passed to learning-based classifiers such as Adaboost and multinomial ridge regression. The system was able to predict sleep and crash episodes during a driving computer game with 96% accuracy within subjects and above 90% accuracy across subjects. This is the highest prediction rate reported to date for detecting real drowsiness. Moreover, the analysis revealed new information about human behavior during drowsy drivin
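
The abstract reports accuracy both within subjects and across subjects. One common way to obtain an across-subject figure is leave-one-subject-out evaluation, in which every sample from one driver is held out for testing. The sketch below illustrates only that splitting step. The function name, the `(subject_id, features, label)` tuple layout, and the choice of leave-one-subject-out itself are assumptions for illustration; the listing does not state which protocol the study used.

```python
from collections import defaultdict


def leave_one_subject_out(samples):
    """Yield (held_out_subject, train, test) splits.

    Each sample is a (subject_id, features, label) tuple. This layout is a
    hypothetical choice for the sketch, not taken from the study.
    """
    by_subject = defaultdict(list)
    for sample in samples:
        by_subject[sample[0]].append(sample)

    for held_out in sorted(by_subject):
        # Test on one subject, train on everyone else.
        test = by_subject[held_out]
        train = [
            s
            for subject, items in by_subject.items()
            if subject != held_out
            for s in items
        ]
        yield held_out, train, test
```

Because no subject appears in both the training and the test set of a split, the resulting accuracy reflects generalization to unseen drivers rather than to unseen episodes of a known driver.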

Sabanci University Research Database

Reproducibility of the dynamics of facial expressions in unilateral facial palsy

Author: Alagha Mahmoud Amir
Ayoub Ashraf
Ju Xiangyang
Morley Stephen
Publication venue: Elsevier BV
Publication date: 01/02/2018

The aim of this study was to assess the reproducibility of non-verbal facial expressions in unilateral facial paralysis using dynamic four-dimensional (4D) imaging. The Di4D system was used to record five facial expressions of 20 adult patients. The system captured 60 three-dimensional (3D) images per second; each facial expression took 3–4 seconds and was recorded in real time. Thus a set of 180 3D facial images was generated for each expression. The procedure was repeated after 30 min to assess the reproducibility of the expressions. A mathematical facial mesh consisting of thousands of quasi-point ‘vertices’ was conformed to the face in order to determine the morphological characteristics in a comprehensive manner. The vertices were tracked throughout the sequence of the 180 images. Five key 3D facial frames from each sequence of images were analyzed. Comparisons were made between the first and second capture of each facial expression to assess the reproducibility of facial movements. Corresponding images were aligned using partial Procrustes analysis, and the root mean square distance between them was calculated and analyzed statistically (paired Student t-test, P < 0.05). Facial expressions of lip purse, cheek puff, and raising of eyebrows were reproducible. Facial expressions of maximum smile and forceful eye closure were not reproducible. The limited coordination of various groups of facial muscles contributed to the lack of reproducibility of these facial expressions. 4D imaging is a useful clinical tool for the assessment of facial expressions
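
The alignment step described in the abstract (partial Procrustes analysis followed by a root mean square distance) can be illustrated with a minimal sketch. Partial Procrustes removes translation and rotation but not scale. The code assumes NumPy and two meshes with the same vertices in the same order; the function name is hypothetical, and the rotation is found with the standard SVD (Kabsch) solution, which may differ from the software the study actually used.

```python
import numpy as np


def partial_procrustes_rmsd(reference, capture):
    """RMS vertex distance after aligning `capture` to `reference`.

    Both inputs are (n_vertices, 3) arrays with corresponding rows.
    Only translation and rotation are removed (no scaling).
    """
    ref = np.asarray(reference, dtype=float)
    cap = np.asarray(capture, dtype=float)
    if ref.shape != cap.shape or ref.ndim != 2 or ref.shape[1] != 3:
        raise ValueError("expected two (n_vertices, 3) arrays of equal shape")

    # Remove translation by centering both meshes on their centroids.
    ref_c = ref - ref.mean(axis=0)
    cap_c = cap - cap.mean(axis=0)

    # Optimal rotation via SVD of the cross-covariance matrix.
    u, _, vt = np.linalg.svd(cap_c.T @ ref_c)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(u @ vt))
    rotation = u @ np.diag([1.0, 1.0, d]) @ vt

    aligned = cap_c @ rotation
    return float(np.sqrt(np.mean(np.sum((aligned - ref_c) ** 2, axis=1))))
```

In a setting like the one described, this value would be computed for each pair of corresponding key frames from the first and second capture, and the resulting distances passed to the statistical test.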

Crossref

Enlighten

Controllable Image-to-Video Translation: A Case Study on Facial Expression Generation

Author: Fan Lijie
Gan Chuang
Gong Boqing
Huang Junzhou
Huang Wenbing
Publication date: 08/08/2018

Recent advances in deep learning have made it possible to generate photo-realistic images by using neural networks and even to extrapolate video frames from an input video clip. In this paper, for the sake of both furthering this exploration and our own interest in a realistic application, we study image-to-video translation and particularly focus on videos of facial expressions. Compared with image-to-image translation, this problem challenges deep neural networks with an additional temporal dimension. Moreover, its single input image defeats most existing video generation methods that rely on recurrent models. We propose a user-controllable approach to generate video clips of various lengths from a single face image. The lengths and types of the expressions are controlled by users. To this end, we design a novel neural network architecture that can incorporate the user input into its skip connections and propose several improvements to the adversarial training method for the neural network. Experiments and user studies verify the effectiveness of our approach. In particular, we highlight that even for face images in the wild (downloaded from the Web and the authors' own photos), our model can generate high-quality facial expression videos, of which about 50% are labeled as real by Amazon Mechanical Turk workers.
Comment: 10 page

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Affective Facial Expression Processing via Simulation: A Probabilistic Model

Author: Boccignone Giuseppe
Johnston Benjamin
Vitale Jonathan
Williams Mary-Anne
Publication venue: Elsevier BV
Publication date: 01/01/2014

Understanding the mental state of other people is an important skill for intelligent agents and robots to operate within social environments. However, the mental processes involved in 'mind-reading' are complex. One explanation of such processes is Simulation Theory, which is supported by a large body of neuropsychological research. Yet determining the best computational model or theory to use in simulation-style emotion detection is far from being understood. In this work, we use Simulation Theory and neuroscience findings on Mirror-Neuron Systems as the basis for a novel computational model, as a way to handle affective facial expressions. The model is based on a probabilistic mapping of observations from multiple identities onto a single fixed identity ('internal transcoding of external stimuli'), and then onto a latent space ('phenomenological response'). Together with the proposed architecture we present some promising preliminary results.
Comment: Annual International Conference on Biologically Inspired Cognitive Architectures - BICA 201

arXiv.org e-Print Archive

AIR Universita degli studi di Milano

OPUS - University of Technology Sydney

LOMo: Latent Ordinal Model for Facial Analysis in Videos

Author: Bartlett Marian
Sharma Gaurav
Sikka Karan
Publication date: 01/01/2016

We study the problem of facial analysis in videos. We propose a novel weakly supervised learning method that models the video event (expression, pain, etc.) as a sequence of automatically mined, discriminative sub-events (e.g., onset and offset phase for smile, brow lower and cheek raise for pain). The proposed model is inspired by recent work on Multiple Instance Learning and latent SVM/HCRF; it extends such frameworks to approximately model the ordinal or temporal aspect of the videos. We obtain consistent improvements over relevant competitive baselines on four challenging and publicly available video-based facial analysis datasets for prediction of expression, clinical pain and intent in dyadic conversations. In combination with complementary features, we report state-of-the-art results on these datasets.
Comment: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR
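
The idea of scoring a video as an ordered sequence of sub-events can be illustrated with a small dynamic program. This is a simplified sketch, not the authors' model: it assumes per-frame responses for each sub-event template are already available, and it enforces a strict temporal order, whereas the abstract says the ordering is modeled only approximately. The function name and the `scores[k][t]` layout are hypothetical.

```python
def best_ordered_score(scores):
    """Place K sub-events at strictly increasing frames to maximize total score.

    scores[k][t] is the response of sub-event template k at frame t.
    Returns (best_total, frames), where frames[k] is the frame for template k.
    """
    k_count = len(scores)
    t_count = len(scores[0])
    if t_count < k_count:
        raise ValueError("need at least as many frames as sub-events")

    neg_inf = float("-inf")
    # best[k][t]: best total for templates 0..k with template k at frame t.
    best = [[neg_inf] * t_count for _ in range(k_count)]
    back = [[-1] * t_count for _ in range(k_count)]
    best[0] = list(scores[0])

    for k in range(1, k_count):
        run_best, run_arg = neg_inf, -1  # max of best[k-1][0..t-1]
        for t in range(t_count):
            if run_best > neg_inf:
                best[k][t] = run_best + scores[k][t]
                back[k][t] = run_arg
            if best[k - 1][t] > run_best:
                run_best, run_arg = best[k - 1][t], t

    # Recover the chosen frames by following the back-pointers.
    t = max(range(t_count), key=lambda i: best[-1][i])
    total = best[-1][t]
    frames = [t]
    for k in range(k_count - 1, 0, -1):
        t = back[k][t]
        frames.append(t)
    frames.reverse()
    return total, frames
```

The running maximum keeps the search linear in the number of frames per template, so the whole placement costs on the order of K times T operations.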

arXiv.org e-Print Archive

MPG.PuRe