First impressions: A survey on vision-based apparent personality trait analysis
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Personality analysis has been widely studied in psychology, neuropsychology, and signal processing, among other fields. In the past few years, it has also become an attractive research area in visual computing. From the computational point of view, speech and text have been by far the most considered cues for analyzing personality. However, the computer vision community has recently shown increasing interest in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use this information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and of the potential impact that such methods could have on society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting-edge works on the subject, discussing and comparing their distinctive features and limitations. Future avenues of research in the field are identified and discussed. Furthermore, aspects of the subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push research in the field, are reviewed.
Peer Reviewed. Postprint (author's final draft)
3D Face Synthesis Driven by Personality Impression
Synthesizing 3D faces that give certain personality impressions is commonly
needed in computer games, animations, and virtual world applications for
producing realistic virtual characters. In this paper, we propose a novel
approach to synthesize 3D faces based on personality impression for creating
virtual characters. Our approach consists of two major steps. In the first
step, we train classifiers using deep convolutional neural networks on a
dataset of images with personality impression annotations, which are capable of
predicting the personality impression of a face. In the second step, given a 3D
face and a desired personality impression type as user inputs, our approach
optimizes the facial details against the trained classifiers, so as to
synthesize a face which gives the desired personality impression. We
demonstrate our approach for synthesizing 3D faces giving desired personality
impressions on a variety of 3D face models. Perceptual studies show that the
perceived personality impressions of the synthesized faces agree with the
target personality impressions specified for synthesizing the faces. Please
refer to the supplementary materials for all results.
Comment: 8 pages; 6 figures
Relaxed Spatio-Temporal Deep Feature Aggregation for Real-Fake Expression Prediction
Frame-level visual features are generally aggregated in time with
techniques such as LSTM, Fisher Vectors, and NetVLAD to produce a robust
video-level representation. We here introduce a learnable aggregation technique
whose primary objective is to retain short-time temporal structure between
frame-level features and their spatial interdependencies in the representation.
Also, it can be easily adapted to cases where training samples are very
scarce. We evaluate the method on a real-fake expression prediction
dataset to demonstrate its superiority. Our method obtains a 65% score on the
test dataset in the official MAP evaluation, only one misclassified
decision short of the best reported result in the ChaLearn Challenge (i.e., 66.7%).
Lastly, we believe that this method can be extended to different problems such
as action/event recognition in the future.
Comment: Submitted to International Conference on Computer Vision Workshop
Deep Impression: Audiovisual Deep Residual Networks for Multimodal Apparent Personality Trait Recognition
Here, we develop an audiovisual deep residual network for multimodal apparent
personality trait recognition. The network is trained end-to-end for predicting
the Big Five personality traits of people from their videos. That is, the
network does not require any feature engineering or visual analysis such as
face detection, face landmark alignment or facial expression recognition.
Recently, the network won third place in the ChaLearn First Impressions
Challenge with a test accuracy of 0.9109.
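The abstract does not say how the test accuracy is computed. Assuming it is a mean accuracy of the kind often reported for trait regression (one minus the mean absolute error between predicted and annotated trait scores, both in [0, 1]), a minimal sketch might look like the following. The function name and the sample scores are hypothetical and are not taken from the paper.

```python
def mean_accuracy(predicted, annotated):
    """Return 1 minus the mean absolute error over all trait scores.

    Assumption: scores lie in [0, 1], so the result also lies in [0, 1].
    """
    if len(predicted) != len(annotated) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total_error = sum(abs(p - a) for p, a in zip(predicted, annotated))
    return 1.0 - total_error / len(predicted)


# Made-up scores for the Big Five traits of one video (illustration only).
example_predicted = [0.50, 0.60, 0.40, 0.70, 0.55]
example_annotated = [0.60, 0.50, 0.40, 0.80, 0.45]
example_score = mean_accuracy(example_predicted, example_annotated)
```

Under this assumed metric, a score close to 1 means the predicted trait values are, on average, close to the annotations.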
Dimensional Affect and Expression in Natural and Mediated Interaction
There is a perceived controversy as to whether the cognitive representation
of affect is better modelled using a dimensional or categorical theory. This
paper first suggests that these views are, in fact, compatible. The paper then
discusses this theme and related issues in reference to a commonly stated
application domain of research on human affect and expression: human computer
interaction (HCI). The novel suggestion here is that a more realistic framing
of studies of human affect in expression with reference to HCI and,
particularly, HCHI (Human-Computer-Human Interaction) entails some
re-formulation of the approach to the basic phenomena themselves. This theme is
illustrated with several examples from several recent research projects.
Comment: Invited article presented at the 23rd Annual Meeting of the
International Society for Psychophysics, Tokyo, Japan, 20-23 October, 2007,
Proceedings of Fechner Day vol. 23 (2007)
Emotion capture based on body postures and movements
In this paper we present a preliminary study for designing interactive
systems that are sensitive to human emotions based on body movements. To do
so, we first review the literature on the various approaches for defining and
characterizing human emotions. After justifying the adopted characterization
space for emotions, we then focus on the movement characteristics that must be
captured by the system to be able to recognize human emotions.
Comment: 22 pages
A Mimetic Strategy to Engage Voluntary Physical Activity In Interactive Entertainment
We describe the design and implementation of a vision-based interactive
entertainment system that makes use of both involuntary and voluntary control
paradigms. Unintentional input to the system from a potential viewer is used to
drive attention-getting output and encourage the transition to voluntary
interactive behaviour. The iMime system consists of a character animation
engine based on the interaction metaphor of a mime performer that simulates
non-verbal communication strategies, without spoken dialogue, to capture and
hold the attention of a viewer. The system was developed in the context of a
project studying care of dementia sufferers. Care for a dementia sufferer can
place unreasonable demands on the time and attentional resources of their
caregivers or family members. Our study contributes to the eventual development
of a system aimed at providing relief to dementia caregivers, while at the same
time serving as a source of pleasant interactive entertainment for viewers. The
work reported here is also aimed at a more general study of the design of
interactive entertainment systems involving a mixture of voluntary and
involuntary control.
Comment: 6 pages, 7 figures, ECAG08 workshop
Robust Modeling of Epistemic Mental States
This work identifies and advances some research challenges in the analysis of
facial features and their temporal dynamics with epistemic mental states in
dyadic conversations. Epistemic states are: Agreement, Concentration,
Thoughtful, Certain, and Interest. In this paper, we perform a number of
statistical analyses and simulations to identify the relationship between
facial features and epistemic states. Non-linear relations are found to be more
prevalent, while temporal features derived from original facial features have
demonstrated a strong correlation with intensity changes. Then, we propose a
novel prediction framework that takes facial features and their nonlinear
relation scores as input and predicts different epistemic states in videos. The
prediction of epistemic states is boosted when the classification of
emotion-changing regions such as rising, falling, or steady-state is incorporated with
the temporal features. The proposed predictive models can predict the epistemic
states with significantly improved accuracy: correlation coefficient (CoERR)
for Agreement is 0.827, for Concentration 0.901, for Thoughtful 0.794, for
Certain 0.854, and for Interest 0.913.
Comment: Accepted for Publication in Multimedia Tools and Applications, Special
Issue: Socio-Affective Technologies
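The abstract reports a correlation coefficient per epistemic state without defining it. Assuming it is a Pearson correlation between predicted and annotated state intensities (an assumption, not something the abstract confirms), a minimal sketch of that computation could be written as follows. The function name is hypothetical.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and centered squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 would indicate that predicted intensities rise and fall together with the annotated ones; a value near 0 would indicate no linear relationship.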