Automatic emotional state detection using facial expression dynamic in videos
In this paper, an automatic emotion detection system is built that lets a computer or machine detect a user's emotional state from facial expressions during human-computer communication. First, dynamic motion features are extracted from facial expression videos; advanced machine learning methods for classification and regression are then used to predict the emotional states.
The system is evaluated on two publicly available datasets, GEMEP_FERA and AVEC2013, and achieves satisfactory performance compared with the baseline results provided. With this emotional state detection capability, a machine can read the facial expression of its user automatically. This technique can be integrated into applications such as smart robots, interactive games and smart surveillance systems.
Discriminatively Trained Latent Ordinal Model for Video Classification
We study the problem of video classification for facial analysis and human
action recognition. We propose a novel weakly supervised learning method that
models the video as a sequence of automatically mined, discriminative
sub-events (e.g., onset and offset phases for "smile", running and jumping for
"highjump"). The proposed model is inspired by the recent works on Multiple
Instance Learning and latent SVM/HCRF; it extends such frameworks to
approximately model the ordinal aspect of the videos. We obtain consistent
improvements over relevant competitive baselines on four challenging and
publicly available video based facial analysis datasets for prediction of
expression, clinical pain and intent in dyadic conversations, and on three
challenging human action datasets. We also validate the method with qualitative
results and show that they largely support the intuitions behind the method.
Comment: Paper accepted in IEEE TPAMI. arXiv admin note: substantial text
overlap with arXiv:1604.0150
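The idea of scoring a video through an ordered sequence of latent sub-events can be illustrated with a small dynamic program. This is only a minimal sketch under stated assumptions, not the paper's model: the abstract does not give the scoring function, so `scores[k][t]` (the response of a hypothetical sub-event template `k` on frame `t`) and the strict ordering constraint are assumptions made here for illustration.

```python
from typing import List, Tuple


def best_ordered_assignment(scores: List[List[float]]) -> Tuple[float, List[int]]:
    """Pick one frame per sub-event so that frame indices strictly increase.

    scores[k][t] is an assumed per-frame score of sub-event template k on
    frame t. Returns the maximum total score and the chosen frame indices.
    """
    num_events = len(scores)
    num_frames = len(scores[0])
    if num_events > num_frames:
        raise ValueError("need at least as many frames as sub-events")

    neg_inf = float("-inf")
    # best[k][t]: best total for sub-events 0..k with sub-event k placed at frame t
    best = [[neg_inf] * num_frames for _ in range(num_events)]
    back = [[-1] * num_frames for _ in range(num_events)]
    for t in range(num_frames):
        best[0][t] = scores[0][t]

    for k in range(1, num_events):
        run_best, run_arg = neg_inf, -1  # max of best[k-1][0..t-1] and its frame
        for t in range(num_frames):
            if run_best > neg_inf:
                best[k][t] = run_best + scores[k][t]
                back[k][t] = run_arg
            if best[k - 1][t] > run_best:
                run_best, run_arg = best[k - 1][t], t

    # Trace back from the best final placement of the last sub-event.
    last = max(range(num_frames), key=lambda t: best[num_events - 1][t])
    total = best[num_events - 1][last]
    frames = [last]
    for k in range(num_events - 1, 0, -1):
        frames.append(back[k][frames[-1]])
    frames.reverse()
    return total, frames
```

With two sub-events, for example "onset" followed by "offset", the ordering constraint rules out assignments in which the second template fires before the first, even if that assignment would have a higher unordered score.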
Facial Expression Retargeting from Human to Avatar Made Easy
Facial expression retargeting from humans to virtual characters is a useful
technique in computer graphics and animation. Traditional methods use markers
or blendshapes to construct a mapping between the human and avatar faces.
However, these approaches require a tedious 3D modeling process, and the
performance relies on the modelers' experience. In this paper, we propose a
brand-new solution to this cross-domain expression transfer problem via
nonlinear expression embedding and expression domain translation. We first
build low-dimensional latent spaces for the human and avatar facial expressions
with variational autoencoders. Then we construct correspondences between the two
latent spaces guided by geometric and perceptual constraints. Specifically, we
design geometric correspondences to reflect geometric matching and utilize a
triplet data structure to express users' perceptual preference of avatar
expressions. A user-friendly method is proposed to automatically generate
triplets for a system allowing users to easily and efficiently annotate the
correspondences. Using both geometric and perceptual correspondences, we
train a network for expression domain translation from human to avatar.
Extensive experimental results and user studies demonstrate that even
nonprofessional users can apply our method to generate high-quality facial
expression retargeting results with less time and effort.
Comment: IEEE Transactions on Visualization and Computer Graphics (TVCG), to
appea
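The triplet-based perceptual constraint described in the abstract above can be illustrated with a standard triplet margin loss. This is a sketch under assumptions, not the authors' formulation: the abstract does not state the loss, the distance, or the margin, so the Euclidean distance, the `margin` parameter, and the function names below are hypothetical choices for illustration.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two latent codes of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(
    anchor: Sequence[float],
    preferred: Sequence[float],
    rejected: Sequence[float],
    margin: float = 1.0,  # assumed value; not given in the abstract
) -> float:
    """Zero when the preferred avatar code is closer to the anchor than the
    rejected one by at least `margin`; positive otherwise."""
    d_pos = euclidean(anchor, preferred)
    d_neg = euclidean(anchor, rejected)
    return max(0.0, d_pos - d_neg + margin)
```

Here `anchor` stands for a translated human expression code, while `preferred` and `rejected` stand for the two avatar expression codes a user ranked. A loss of this shape only encodes the relative preference, which is why annotators need to compare candidates rather than score them.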