Predicting Head Pose in Dyadic Conversation

C Busso; Chuang Ding; E Bevacqua; I Matthews; J Allwood; J Cassell; JC Gower; K Haag; KG Munhall; L-P Morency; M Mori; N Ward; R Nishimura; S Hochreiter; T Watanabe


Publication date: 26 August 2017
Publisher: Springer Science and Business Media LLC

Abstract

Natural movement plays a significant role in realistic speech animation. Numerous studies have demonstrated the contribution that visual cues make to the degree to which we, as human observers, find an animation acceptable. Rigid head motion is one visual mode that universally co-occurs with speech, so it is a reasonable strategy to seek features from the speech mode to predict head pose. Several previous authors have shown that prediction is possible, but experiments are typically confined to rigidly produced dialogue. Expressive, emotive and prosodic speech exhibits motion patterns that are far more difficult to predict, with considerable variation in expected head pose. People involved in dyadic conversation adapt their speech and head motion in response to the other’s speech and head motion. Using Deep Bi-Directional Long Short-Term Memory (BLSTM) neural networks, we demonstrate that it is possible to predict not just the head motion of the speaker, but also the head motion of the listener, from the speech signal.
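To make the bidirectional idea concrete, the following is a minimal sketch of a single-layer BLSTM forward pass that maps a sequence of per-frame speech features to one head-pose vector per frame. It is not the authors' model: the weights are random and untrained, the network has one layer rather than a deep stack, and the names `LSTMCell`, `BLSTMRegressor`, the 13-dimensional feature vectors and the three-value pose output are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_matrix(rows, cols, rng):
    # Small random weights; a trained model would learn these instead.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class LSTMCell:
    """One LSTM cell: input (i), forget (f), output (o) gates and candidate (g)."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One weight matrix per gate, applied to [input ; previous hidden state].
        self.w = {k: make_matrix(n_hidden, n_in + n_hidden, rng) for k in "ifog"}
        self.b = {k: [0.0] * n_hidden for k in "ifog"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.w[k], z), self.b[k])]
               for k in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def run_cell(cell, frames):
    """Run a cell over the frames in order; return the hidden state per frame."""
    h = [0.0] * cell.n_hidden
    c = [0.0] * cell.n_hidden
    states = []
    for x in frames:
        h, c = cell.step(x, h, c)
        states.append(h)
    return states


class BLSTMRegressor:
    """Hypothetical single-layer BLSTM with a linear read-out per frame."""

    def __init__(self, n_features, n_hidden, n_out=3, seed=0):
        rng = random.Random(seed)
        self.fwd = LSTMCell(n_features, n_hidden, rng)
        self.bwd = LSTMCell(n_features, n_hidden, rng)
        self.w_out = make_matrix(n_out, 2 * n_hidden, rng)

    def predict(self, frames):
        past = run_cell(self.fwd, frames)                # left-to-right context
        future = run_cell(self.bwd, frames[::-1])[::-1]  # right-to-left context
        # Each frame's output sees both past and future speech frames.
        return [matvec(self.w_out, p + q) for p, q in zip(past, future)]


# Usage with made-up data: 20 frames of 13 assumed speech features each.
rng = random.Random(1)
speech = [[rng.uniform(-1, 1) for _ in range(13)] for _ in range(20)]
model = BLSTMRegressor(n_features=13, n_hidden=8)
poses = model.predict(speech)
print(len(poses), len(poses[0]))
```

The backward pass is what distinguishes this from a plain LSTM: the prediction for an early frame also depends on speech that occurs later in the sequence, which suits offline animation where the whole utterance is available.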

Available Versions

Crossref: info:doi/10.1007%2F978-3-319-6... (last updated on 06/08/2021)
University of East Anglia digital repository: oai:ueaeprints.uea.ac.uk:64845 (last updated on 21/11/2017)