1 research outputs found
Introducing Representations of Facial Affect in Automated Multimodal Deception Detection
Automated deception detection systems can enhance health, justice, and
security in society by helping humans detect deceivers in high-stakes
situations across medical and legal domains, among others. This paper presents
a novel analysis of the discriminative power of dimensional representations of
facial affect for automated deception detection, along with interpretable
features from visual, vocal, and verbal modalities. We used a video dataset of
people communicating truthfully or deceptively in real-world, high-stakes
courtroom situations. We leveraged recent advances in automated emotion
recognition in-the-wild by implementing a state-of-the-art deep neural network
trained on the Aff-Wild database to extract continuous representations of
facial valence and facial arousal from speakers. We experimented with unimodal
Support Vector Machines (SVM) and SVM-based multimodal fusion methods to
identify effective features, modalities, and modeling approaches for detecting
deception. Unimodal models trained on facial affect achieved an AUC of 80%, and
facial affect contributed towards the highest-performing multimodal approach
(adaptive boosting) that achieved an AUC of 91% when tested on speakers who
were not part of training sets. This approach achieved a higher AUC than
existing automated machine learning approaches that used interpretable visual,
vocal, and verbal features to detect deception in this dataset, but did not use
facial affect. Across all videos, deceptive and truthful speakers exhibited
significant differences in facial valence and facial arousal, contributing
computational support to existing psychological theories on affect and
deception. The demonstrated importance of facial affect in our models informs
and motivates the future development of automated, affect-aware machine
learning approaches for modeling and detecting deception and other social
behaviors in-the-wild.Comment: 10 pages, Accepted at ACM International Conference on Multimodal
Interaction (ICMI), October 202