Explainable Human-centered Traits from Head Motion and Facial Expression Dynamics

Abstract

We explore the efficacy of multimodal behavioral cues for explainable prediction of personality and interview-specific traits. We utilize elementary head-motion units named kinemes, atomic facial movements termed action units, and speech features to estimate these human-centered traits. Empirical results confirm that kinemes and action units enable discovery of multiple trait-specific behaviors while also enabling explainability in support of the predictions. For fusing cues, we explore decision-level and feature-level fusion, and an additive attention-based fusion strategy which quantifies the relative importance of the three modalities for trait prediction. Examining various long short-term memory (LSTM) architectures for classification and regression on the MIT Interview and First Impressions Candidate Screening (FICS) datasets, we note that: (1) multimodal approaches outperform unimodal counterparts, achieving the highest Pearson correlation coefficient (PCC) of 0.98 for Excited-Friendly traits in MIT and 0.57 for Extraversion in FICS; (2) efficient trait predictions and plausible explanations are achieved with both unimodal and multimodal approaches; and (3) following the thin-slice approach, effective trait prediction is achieved even from two-second behavioral snippets. Our implementation code is available at: https://github.com/deepsurbhi8/Explainable_Human_Traits_Prediction.
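To make the additive attention-based fusion mentioned in the abstract more concrete, the following is a minimal sketch of a generic additive (Bahdanau-style) attention over three modality embeddings. It is not taken from the authors' repository: the function name `additive_attention_fusion`, the parameters `W`, `b`, `v`, the embedding sizes, and the random inputs are all assumptions chosen for illustration. The softmax weights it returns play the role of per-modality importance scores.

```python
import numpy as np

def additive_attention_fusion(modality_feats, W, b, v):
    """Fuse per-modality embeddings with additive attention (illustrative only).

    modality_feats: (M, d) array, one d-dimensional embedding per modality
                    (e.g. kinemes, action units, speech; hypothetical inputs).
    W: (d, k) projection, b: (k,) bias, v: (k,) scoring vector (all assumed).
    Returns the fused (d,) vector and the (M,) attention weights.
    """
    # Additive attention score per modality: v^T tanh(W^T h_m + b)
    scores = np.tanh(modality_feats @ W + b) @ v
    # Numerically stable softmax over the M modalities
    scores = scores - scores.max()
    weights = np.exp(scores) / np.exp(scores).sum()
    # Weighted sum of the modality embeddings
    fused = weights @ modality_feats
    return fused, weights

# Toy usage with random embeddings standing in for the three modalities
rng = np.random.default_rng(0)
d, k = 8, 4
feats = rng.normal(size=(3, d))
W = rng.normal(size=(d, k))
b = np.zeros(k)
v = rng.normal(size=k)
fused, weights = additive_attention_fusion(feats, W, b, v)
```

In a trained model, `W`, `b`, and `v` would be learned jointly with the predictor, and inspecting `weights` for a given sample is one way to read off which modality the model relied on most.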

University of Canberra Research Repository

Last time updated on 14/11/2024

This paper was published in University of Canberra Research Repository.

Licence: info:eu-repo/semantics/openAccess