Spiking Neural Networks for Early Prediction in Human Robot Collaboration
This paper introduces the Turn-Taking Spiking Neural Network (TTSNet), a
cognitive model for early prediction of a human's or agent's turn-taking
intentions. The TTSNet framework relies on implicit and explicit
multimodal communication cues (physical, neurological and physiological) to be
able to predict when the turn-taking event will occur in a robust and
unambiguous fashion. To test the theories proposed, the TTSNet framework was
implemented on an assistant robotic nurse, which predicts surgeon's turn-taking
intentions and delivers surgical instruments accordingly. Experiments were
conducted to evaluate TTSNet's performance in early turn-taking prediction. It
was found to reach an F1 score of 0.683 given 10% of the completed action, an
F1 score of 0.852 at 50%, and 0.894 at 100% of the completed action. TTSNet
outperformed multiple state-of-the-art algorithms and surpassed human
performance when only limited partial observation was available (< 40%). Such early
turn-taking prediction capability would allow robots to perform collaborative
actions proactively, facilitating collaboration and increasing team
efficiency.
Comment: Under review for journal
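For reference on the metric reported above: F1 is the harmonic mean of precision and recall, so the scores quoted (0.683 at 10% observation, 0.852 at 50%, 0.894 at 100%) jointly reflect both false positives and false negatives in the turn-taking predictions. A minimal sketch of the computation (the example counts are illustrative, not taken from the paper):

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 as the harmonic mean of precision and recall,
    computed from true-positive, false-positive, and
    false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Illustrative counts: 60 TP, 20 FP, 20 FN
# precision = 0.75, recall = 0.75, so F1 = 0.75
print(f1_score(60, 20, 20))
```

Because the harmonic mean penalizes imbalance, a predictor that fires on every frame (high recall, low precision) cannot reach these F1 values.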
Prediction of Turn-Taking by Combining Prosodic and Eye-Gaze Information in Poster Conversations
We investigate turn-taking behaviors in conversations in poster sessions. While the poster presenter holds most of the turns during a session, the audience's utterances are more important and should not be missed. In this paper, therefore, we address prediction of turn-taking by the audience, dividing it into two sub-tasks: prediction of speaker change and prediction of the next speaker. We analyzed eye-gaze information and its relationship with turn-taking, introducing joint eye-gaze events by the presenter and the audience, and also parameterized backchannel patterns of the audience. Machine learning with these features shows that the combination of the presenter's prosodic features and the joint eye-gaze features is effective for predicting speaker change, while eye-gaze duration and backchannels preceding the speaker change are useful for predicting the next speaker among the audience. Index Terms: multi-party interaction, turn-taking, prosody, eye-gaze