9 research outputs found
Estimation of Confidence in the Dialogue based on Eye Gaze and Head Movement Information
In human-robot interaction, human mental states during dialogue have attracted attention for human-friendly robots that support educational use. Although mental states have been estimated from speech and visual information, estimating them precisely in educational scenes remains challenging. In this paper, we propose a method to estimate human mental states based on participants’ eye gaze and head movement information; as the target mental state, we estimate participants’ confidence in their answers to general-knowledge questions. Participants’ non-verbal information, such as eye gaze and head movements during dialogue with a robot, was collected in our experiment using an eye-tracking device. We then collected participants’ confidence levels and analyzed the relationship between mental state and non-verbal information. Furthermore, we applied a machine learning technique to estimate participants’ confidence levels from features extracted from the gaze and head movement information. As a result, the machine learning technique using gaze and head movement features achieved over 80% accuracy in estimating confidence levels. Our research provides insight into developing human-friendly robots that consider human mental states in dialogue.
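As an illustration only (not the authors' implementation), the sketch below shows how confidence levels might be classified from gaze and head-movement features with a generic off-the-shelf classifier; the feature names and the synthetic placeholder data are assumptions made for demonstration.

```python
# Minimal sketch: binary confidence classification from gaze/head features.
# The feature set and synthetic data are placeholders, not the paper's data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Hypothetical per-question features: fixation duration, saccade rate,
# gaze dispersion, head pitch variance, head yaw variance, nod count.
X = rng.normal(size=(200, 6))        # placeholder feature matrix
y = rng.integers(0, 2, size=200)     # placeholder labels: 0 = low, 1 = high confidence

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```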
Asian CHI symposium: HCI research from Asia and on Asian contexts and cultures
This symposium showcases the latest HCI work from Asia and work that incorporates Asian sociocultural factors in its design and implementation. In addition to circulating ideas and envisioning future research in human-computer interaction, this symposium aims to foster social networks among academics (researchers and students) and practitioners and to grow a research community from Asia.
Read the Room: Adapting a Robot's Voice to Ambient and Social Contexts
Adapting one's voice to different ambient environments and social interactions is required for human social interaction. In robotics, the ability to recognize speech in noisy and quiet environments has received significant attention, but considering ambient cues in the production of social speech features has been little explored. Our research aims to modify a robot's speech to maximize acceptability in various social and acoustic contexts, starting with a use case for service robots in varying restaurants. We created an original dataset collected over Zoom with participants conversing in scripted and unscripted tasks given 7 different ambient sounds and background images. Voice conversion methods, in addition to altered Text-to-Speech that matched ambient-specific data, were used for speech synthesis tasks. We conducted a subjective perception study that showed humans prefer synthetic speech that matches ambience and social context, ultimately preferring more human-like voices. This work provides three solutions to ambient and socially appropriate synthetic voices: (1) a novel protocol to collect real contextual audio voice data, (2) tools and directions to manipulate robot speech for appropriate social and ambient-specific interactions, and (3) insight into voice conversion's role in flexibly altering robot speech to match different ambient environments.
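As a loosely related illustration (not the authors' pipeline), the sketch below shows one simple way a robot's speech settings could be adapted to the measured loudness of an ambient recording; the file name, the dBFS thresholds, and the gain/rate mapping are hypothetical assumptions.

```python
# Illustrative sketch only: choose coarse speech-output settings from the
# loudness of an ambient-noise recording. Thresholds and mapping are assumed.
import wave
import numpy as np

def ambient_rms_dbfs(path: str) -> float:
    """Return the RMS level of a 16-bit PCM WAV file in dBFS."""
    with wave.open(path, "rb") as wf:
        frames = wf.readframes(wf.getnframes())
    samples = np.frombuffer(frames, dtype=np.int16).astype(np.float64)
    rms = np.sqrt(np.mean(samples ** 2)) + 1e-12
    return 20 * np.log10(rms / 32768.0)

def speech_profile(noise_dbfs: float) -> dict:
    """Map ambient loudness to speech settings (hypothetical thresholds)."""
    if noise_dbfs > -20:       # loud restaurant
        return {"gain_db": 6, "rate": 0.9}
    if noise_dbfs > -35:       # moderate background chatter
        return {"gain_db": 3, "rate": 1.0}
    return {"gain_db": 0, "rate": 1.0}   # quiet room

if __name__ == "__main__":
    level = ambient_rms_dbfs("ambient_sample.wav")   # hypothetical file name
    print(level, speech_profile(level))
```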