
    Estimation of Confidence in the Dialogue based on Eye Gaze and Head Movement Information

    In human-robot interaction, human mental states in dialogue have attracted attention for developing human-friendly robots that support educational use. Although mental states have been estimated from speech and visual information, estimating them precisely in educational settings remains challenging. In this paper, we propose a method to estimate a human mental state based on participants’ eye gaze and head movement information. We estimated participants’ confidence levels in their answers to miscellaneous knowledge questions as the target mental state. Participants’ non-verbal information, such as eye gaze and head movements during dialogue with a robot, was collected in our experiment using an eye-tracking device. We then collected participants’ confidence levels and analyzed the relationship between the mental state and the non-verbal information. Furthermore, we applied a machine learning technique to estimate participants’ confidence levels from features extracted from the gaze and head movement information. As a result, the machine learning technique using gaze and head movement information achieved over 80% accuracy in estimating confidence levels. Our research provides insight into developing human-friendly robots that consider human mental states in dialogue.
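    The abstract does not specify which classifier or which gaze and head-movement features were used, but the described pipeline (per-answer features fed to a supervised model that predicts confidence) could be sketched roughly as below. The feature names, classifier choice, and synthetic data are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of the kind of pipeline the abstract describes:
# per-response features from eye gaze and head movement are fed to a
# binary classifier that predicts high vs. low confidence. Feature
# names, classifier, and data are illustrative assumptions only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# One row per answer: e.g. [mean fixation duration, gaze dispersion,
# blink rate, head pitch variance, head yaw variance] (assumed features)
n_samples = 200
X = rng.normal(size=(n_samples, 5))
# Binary labels: 1 = confident answer, 0 = unconfident (synthetic here)
y = rng.integers(0, 2, size=n_samples)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"Cross-validated accuracy: {scores.mean():.2f}")
```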

    Asian CHI symposium: HCI research from Asia and on Asian contexts and cultures

    This symposium showcases the latest HCI work from Asia and work focused on incorporating Asian sociocultural factors into design and implementation. In addition to circulating ideas and envisioning future research in human-computer interaction, the symposium aims to foster social networks among academics (researchers and students) and practitioners and to grow a research community from Asia.

    Read the Room: Adapting a Robot's Voice to Ambient and Social Contexts

    Adapting one's voice to different ambient environments and social situations is required for human social interaction. In robotics, the ability to recognize speech in noisy and quiet environments has received significant attention, but considering ambient cues in the production of social speech features has been little explored. Our research aims to modify a robot's speech to maximize acceptability in various social and acoustic contexts, starting with a use case for service robots in varying restaurant environments. We created an original dataset collected over Zoom, with participants conversing in scripted and unscripted tasks given 7 different ambient sounds and background images. Voice conversion methods, in addition to altered Text-to-Speech matched to ambient-specific data, were used for speech synthesis tasks. We conducted a subjective perception study which showed that humans prefer synthetic speech that matches the ambience and social context, ultimately preferring more human-like voices. This work provides three contributions toward ambient- and socially appropriate synthetic voices: (1) a novel protocol for collecting real contextual audio voice data, (2) tools and directions for manipulating robot speech for socially and ambient-appropriate interactions, and (3) insight into the role of voice conversion in flexibly altering robot speech to match different ambient environments.
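    As a rough illustration of the ambient-aware direction described above (not the paper's voice conversion or TTS pipeline), the sketch below measures the loudness of an ambient recording and maps it to speech-synthesis settings; the thresholds, parameter names, and file path are assumptions.

```python
# Minimal, hypothetical sketch: estimate ambient noise level from a
# recording and map it to illustrative TTS settings (Lombard-style).
# The mapping and thresholds are assumptions; the paper instead uses
# voice conversion and ambient-matched TTS.
import wave
import numpy as np

def ambient_level_dbfs(wav_path: str) -> float:
    """Return the RMS level of a 16-bit PCM WAV file in dBFS."""
    with wave.open(wav_path, "rb") as wf:
        frames = wf.readframes(wf.getnframes())
    samples = np.frombuffer(frames, dtype=np.int16).astype(np.float64)
    rms = np.sqrt(np.mean(samples ** 2)) + 1e-12
    return 20 * np.log10(rms / 32768.0)

def speech_params_for(level_dbfs: float) -> dict:
    """Map ambient loudness to illustrative speech settings."""
    if level_dbfs > -20:      # loud restaurant
        return {"rate_wpm": 140, "gain_db": 6}
    elif level_dbfs > -40:    # moderate chatter
        return {"rate_wpm": 160, "gain_db": 3}
    else:                     # quiet room
        return {"rate_wpm": 175, "gain_db": 0}

# Example usage (file path is a placeholder):
# params = speech_params_for(ambient_level_dbfs("cafe_ambience.wav"))
# print(params)
```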