4DFAB: a large scale 4D facial expression database for biometric applications
The progress we are currently witnessing in many computer vision applications, including automatic face analysis, would not have been possible without tremendous efforts in collecting and annotating large-scale visual databases. To this end, we propose 4DFAB, a new large-scale database of dynamic high-resolution 3D faces (over 1,800,000 3D meshes). 4DFAB contains recordings of 180 subjects captured in four different sessions spanning a five-year period. It contains 4D videos of subjects displaying both spontaneous and posed facial behaviours. The database can be used for both face and facial expression recognition, as well as behavioural biometrics. It can also be used to learn very powerful blendshapes for parametrising facial behaviour. In this paper, we conduct several experiments and demonstrate the usefulness of the database for various applications. The database will be made publicly available for research purposes.
Objective Classes for Micro-Facial Expression Recognition
Micro-expressions are brief spontaneous facial expressions that appear on a
face when a person conceals an emotion, making them different to normal facial
expressions in subtlety and duration. Currently, emotion classes within the
CASME II dataset are based on Action Units and self-reports, creating conflicts
during machine learning training. We will show that classifying expressions
using Action Units, instead of predicted emotion, removes the potential bias of
human reporting. The proposed classes are tested using LBP-TOP, HOOF and HOG 3D
feature descriptors. The experiments are evaluated on two benchmark FACS coded
datasets: CASME II and SAMM. The best result achieves 86.35% accuracy when
classifying the proposed 5 classes on CASME II using HOG 3D, outperforming the
state-of-the-art 5-class emotion-based classification result on CASME
II. Results indicate that classification based on Action Units provides an
objective method to improve micro-expression recognition.
Comment: 11 pages, 4 figures and 5 tables. This paper will be submitted for
journal review
Capture, Learning, and Synthesis of 3D Speaking Styles
Audio-driven 3D facial animation has been widely explored, but achieving
realistic, human-like performance is still unsolved. This is due to the lack of
available 3D datasets, models, and standard evaluation metrics. To address
this, we introduce a unique 4D face dataset with about 29 minutes of 4D scans
captured at 60 fps and synchronized audio from 12 speakers. We then train a
neural network on our dataset that factors identity from facial motion. The
learned model, VOCA (Voice Operated Character Animation), takes any speech
signal as input, even speech in languages other than English, and
realistically animates a wide range of adult faces. Conditioning on subject
labels during training allows the model to learn a variety of realistic
speaking styles. VOCA also provides animator controls to alter speaking style,
identity-dependent facial shape, and pose (i.e. head, jaw, and eyeball
rotations) during animation. To our knowledge, VOCA is the only realistic 3D
facial animation model that is readily applicable to unseen subjects without
retargeting. This makes VOCA suitable for tasks like in-game video, virtual
reality avatars, or any scenario in which the speaker, speech, or language is
not known in advance. We make the dataset and model available for research
purposes at http://voca.is.tue.mpg.de.
Comment: To appear in CVPR 201