Reference and Gestures in Dialogue Generation: Three Studies with Embodied Conversational Agents
This paper reports on three studies into social presence cues that were carried out in the context of the NECA (Net-environment for Embodied Emotional Conversational Agents) project and the EPOCH network. The first study concerns the generation of referring expressions. We adapted an existing algorithm for generating referring expressions so that it could run according to either an egocentric or a neutral strategy. In an evaluation study, we found that the two strategies were correlated with the perceived friendliness of the speaker. In the second and third studies, we evaluated the gestures that were generated by the NECA system. In this paper, we briefly summarize the most salient results of these two studies, which concern the effect of gestures on perceived quality of speech and information retention.
On the simulation of interactive non-verbal behaviour in virtual humans
Development of virtual humans has focused mainly on two broad areas: conversational agents and computer game characters. Computer game characters have traditionally been action-oriented, focused on the game-play, while conversational agents have been focused on sensible, intelligent conversation. While virtual humans have incorporated some form of non-verbal behaviour, this has been quite limited and, more importantly, not connected, or connected only very loosely, with the behaviour of a real human interacting with the virtual human, due to a lack of sensor data and the absence of a system to respond to that data. The interactional aspect of non-verbal behaviour is highly important in human-human interactions, and previous research has demonstrated that people treat media (and therefore virtual humans) as real people, so interactive non-verbal behaviour is also important in the development of virtual humans. This paper presents the challenges in creating virtual humans that are non-verbally interactive. Drawing parallels with the development history of control systems in robotics, it presents some approaches to solving these challenges, specifically using behaviour-based systems. It shows how an order-of-magnitude improvement in the response time of virtual humans in conversation can be obtained, and that the development of rapidly responding non-verbal behaviours can start with just a few behaviours, with more added later in development without difficulty.
A Framework of Personality Cues for Conversational Agents
Conversational agents (CAs), software systems that emulate conversations with humans through natural language, are reshaping our communication environment. As CAs have been widely used for applications requiring human-like interactions, a key goal in information systems (IS) research and practice is to create CAs that exhibit a particular personality. However, existing research on CA personality is scattered across different fields, and researchers and practitioners have difficulty understanding the current state of the art in the design of CA personality. To address this gap, we systematically analyze existing studies and develop a framework for how to imbue CAs with personality cues and how to organize the underlying range of expressive variation with respect to the Big Five personality traits. Our framework contributes to IS research by providing an overview of CA personality cues in verbal and non-verbal language, and it supports practitioners in designing CAs with a particular personality.
Investigating How Speech And Animation Realism Influence The Perceived Personality Of Virtual Characters And Agents
The portrayed personality of virtual characters and agents is understood to influence how we perceive and engage with digital applications. Understanding how the features of speech and animation drive portrayed personality allows us to intentionally design characters to be more personalized and engaging. In this study, we use performance capture data of unscripted conversations from a variety of actors to explore the perceptual outcomes associated with the modalities of speech and motion. Specifically, we contrast fully performance-driven characters with those portrayed by generated gestures and synthesized speech, analysing how the features of each influence portrayed personality according to the Big Five personality traits. We find that processing speech and motion can have mixed effects on such traits, with our results highlighting motion as the dominant modality for portraying extraversion and speech as the dominant modality for communicating agreeableness and emotional stability. Our results can support the Extended Reality (XR) community in the development of virtual characters, social agents and 3D User Interface (3DUI) agents portraying a range of targeted personalities.
An End-to-End Conversational Style Matching Agent
We present an end-to-end voice-based conversational agent that is able to
engage in naturalistic multi-turn dialogue and align with the interlocutor's
conversational style. The system uses a series of deep neural network
components for speech recognition, dialogue generation, prosodic analysis and
speech synthesis to generate language and prosodic expression with qualities
that match those of the user. We conducted a user study (N=30) in which
participants talked with the agent for 15 to 20 minutes, resulting in over 8
hours of natural interaction data. Users with high-consideration conversational
styles reported the agent to be more trustworthy when it matched their
conversational style, whereas users with high-involvement conversational
styles were indifferent. Finally, we provide design guidelines for multi-turn
dialogue interactions using conversational style adaptation.
The Importance of Multimodal Emotion Conditioning and Affect Consistency for Embodied Conversational Agents
Previous studies regarding the perception of emotions for embodied virtual
agents have shown the effectiveness of using virtual characters in conveying
emotions through interactions with humans. However, creating an autonomous
embodied conversational agent with expressive behaviors presents two major
challenges. The first challenge is the difficulty of synthesizing the
conversational behaviors for each modality that are as expressive as real human
behaviors. The second challenge is that the affects are modeled independently,
which makes it difficult to generate multimodal responses with consistent
emotions across all modalities. In this work, we propose a conceptual
framework, ACTOR (Affect-Consistent mulTimodal behaviOR generation), that aims
to increase the perception of affects by generating multimodal behaviors
conditioned on a consistent driving affect. We have conducted a user study with
199 participants to assess how the average person judges the affects perceived
from multimodal behaviors that are consistent and inconsistent with respect to
a driving affect. The results show that among all model conditions, our
affect-consistent framework receives the highest Likert scores for the
perception of driving affects. Our statistical analysis suggests that making a
modality affect-inconsistent significantly decreases the perception of driving
affects. We also observe that multimodal behaviors conditioned on consistent
affects are more expressive compared to behaviors with inconsistent affects.
Therefore, we conclude that multimodal emotion conditioning and affect
consistency are vital to enhancing the perception of affects for embodied
conversational agents.
ECA gesture strategies for robust SLDSs
This paper explores the use of embodied conversational agents (ECAs) to improve interaction with spoken language dialogue systems (SLDSs). For this purpose, we have identified typical interaction problems with SLDSs and associated each of them with a particular ECA gesture or behaviour. User tests were carried out with the test users divided into two groups, each facing a different interaction metaphor (one with an ECA in the interface, and the other implemented only with voice). Our results suggest that user frustration is lower when an ECA is present in the interface and that the dialogue flows more smoothly, partly because users are better able to tell when they are expected to speak and whether the system has heard and understood. The users’ overall perceptions of the system were also affected, and interaction seems to be more enjoyable with an ECA than without it.