177 research outputs found

    Design Principles for Special Purpose, Embodied, Conversational Intelligence with Environmental Sensors (SPECIES) Agents

    As information systems increase their ability to gather and analyze data from the natural environment and as computational power increases, the next generation of human-computer interfaces will be able to facilitate more lifelike and natural interactions with humans. This can be accomplished by using sensors to non-invasively gather information from the user, using artificial intelligence to interpret this information and perceive users’ emotional and cognitive states, and using customized interfaces and responses based on embodied-conversational-agent (avatar) technology to respond to the user. We refer to this novel class of intelligent agents as Special Purpose Embodied Conversational Intelligence with Environmental Sensors (SPECIES) agents. In this paper, we build on interpersonal communication theory to specify four essential design principles of all SPECIES agents. We also share findings of initial research that demonstrates how SPECIES agents can be deployed to augment human tasks. The results organize future research efforts toward collectively studying and creating more robust, influential, and intelligent SPECIES agents.
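    The paper's premise is a sense-interpret-respond loop: sensors non-invasively capture signals from the user, an inference layer maps those signals to an emotional or cognitive state, and the embodied agent tailors its response accordingly. The sketch below illustrates that loop only; the class names, sensor fields, thresholds, and responses are all hypothetical, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class SensorReading:
    """One non-invasive measurement of the user (hypothetical schema)."""
    vocal_pitch_hz: float
    pupil_dilation_mm: float


def interpret_state(reading: SensorReading) -> str:
    """Map raw sensor data to a coarse affective label.

    The thresholds are placeholders for illustration, not values
    reported in the paper.
    """
    if reading.vocal_pitch_hz > 220 and reading.pupil_dilation_mm > 5.0:
        return "aroused"
    return "calm"


def select_response(state: str) -> str:
    """Choose an embodied-agent (avatar) response for the perceived state."""
    responses = {
        "aroused": "slow speech rate and adopt a calming expression",
        "calm": "proceed with the interaction at a normal pace",
    }
    return responses[state]


if __name__ == "__main__":
    reading = SensorReading(vocal_pitch_hz=240.0, pupil_dilation_mm=5.5)
    print(select_response(interpret_state(reading)))
```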

    Prominence Driven Character Animation

    This paper details the development of a fully automated system for character animation implemented in Autodesk Maya. The system uses prioritised speech events to algorithmically generate head, body, arm and leg movements alongside eyeblinks, eyebrow movements and lip-synching. In addition, gaze tracking is generated automatically relative to the definition of focus objects: contextually important objects in the character's worldview. The plugin uses an animation profile to store the relevant controllers and movements for a specific character, allowing any character to run with the system. Once a profile has been created, an audio file can be loaded and animated with a single button click. The average time to animate is 2-3 minutes per minute of speech, and the plugin can be used either as a first-pass system for high-quality work or as part of a batch animation workflow for larger amounts of content, as exemplified in television and online dissemination channels.
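    Independently of Maya, the core idea of prioritised speech events driving keyframe generation can be sketched in a few lines of Python. The event types, priorities, channel names, and timings below are invented for illustration and do not reflect the plugin's actual animation-profile format.

```python
# Each speech event carries a priority; when two events compete for the
# same animation channel at the same time, the higher-priority event wins.
SPEECH_EVENTS = [
    {"time": 0.4, "type": "stressed_syllable", "priority": 3},
    {"time": 0.9, "type": "pause", "priority": 1},
    {"time": 1.6, "type": "stressed_syllable", "priority": 3},
]

# Hypothetical mapping from event type to (channel, action) pairs.
ACTION_MAP = {
    "stressed_syllable": [("head", "nod"), ("eyebrows", "raise")],
    "pause": [("eyes", "blink")],
}


def generate_keyframes(events):
    """Produce (time, channel, action) keyframes from speech events.

    Events are applied in ascending priority order, so a later
    (higher-priority) event overwrites a conflicting one on the same
    channel at the same time.
    """
    keyframes = {}
    for event in sorted(events, key=lambda e: e["priority"]):
        for channel, action in ACTION_MAP[event["type"]]:
            keyframes[(event["time"], channel)] = action
    return sorted((t, ch, a) for (t, ch), a in keyframes.items())


for frame in generate_keyframes(SPEECH_EVENTS):
    print(frame)
```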

    Social Intelligence Design 2007. Proceedings Sixth Workshop on Social Intelligence Design


    Communicative humanoids: a computational model of psychosocial dialogue skills

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1996. Includes bibliographical references (p. [223]-238). Kristinn Rúnar Thórisson. Ph.D.

    Participant responses to virtual agents in immersive virtual environments.

    This thesis is concerned with interaction between people and virtual humans in the context of highly immersive virtual environments (VEs). Empirical studies have shown that virtual humans (agents) with even minimal behavioural capabilities can have a significant emotional impact on participants of immersive virtual environments (IVEs), to the extent that these have been used in studies of mental health issues such as social phobia and paranoia. This thesis focuses on understanding how the behaviour of virtual humans, rather than their visual appearance, shapes the responses of people. Three main research questions are addressed. First, the thesis considers which nonverbal behavioural cues are key to portraying a specific psychological state. Second, it determines the extent to which the underlying state of a virtual human is recognisable through the display of a key set of cues inferred from the behaviour of real humans. Finally, it considers the degree to which a perceived psychological state in a virtual human evokes responses from participants in immersive virtual environments that are similar to those observed in the physical world. These research questions were investigated through four experiments. The first experiment focused on the impact of visual fidelity and behavioural complexity on participant responses by implementing a model of gaze behaviour in virtual humans. The results of the study concluded that participants expected more life-like behaviours from more visually realistic virtual humans. The second experiment investigated the detrimental effects on participant responses when interacting with virtual humans with low behavioural complexity. The third experiment investigated the differences in responses of participants to virtual humans perceived to be in varying emotional states. The emotional states of the virtual humans were portrayed using postural and facial cues. Results indicated that posture does play an important role in the portrayal of affect; however, the behavioural model used in the study did not fully cover the qualities of body movement associated with the emotions studied. The final experiment focused on the portrayal of affect through the quality of body movement, such as the speed of gestures. The effectiveness of the virtual humans was gauged by exploring a variety of participant responses, including subjective responses and objective physiological and behavioural measures. The results show that participants are affected by and respond to virtual humans in a significant manner, provided that an appropriate behavioural model is used.
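    The abstract does not specify the gaze model from the first experiment; as a generic illustration of the kind of timed gaze behaviour such studies implement, the sketch below alternates mutual gaze with brief aversions. All durations, targets, and the alternation scheme are invented assumptions, not the model evaluated in the thesis.

```python
import random


def gaze_schedule(total_seconds: float, seed: int = 42):
    """Generate (start, duration, target) gaze segments, alternating
    between looking at the interlocutor and briefly averting gaze.
    Duration ranges are invented for illustration."""
    rng = random.Random(seed)
    t, segments, at_partner = 0.0, [], True
    while t < total_seconds:
        if at_partner:
            duration = rng.uniform(1.5, 4.0)  # longer stretches of mutual gaze
            target = "partner"
        else:
            duration = rng.uniform(0.5, 1.5)  # short aversion, e.g. while thinking
            target = rng.choice(["up-left", "down", "side"])
        segments.append((round(t, 2), round(duration, 2), target))
        t += duration
        at_partner = not at_partner
    return segments


for segment in gaze_schedule(10.0):
    print(segment)
```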

    Relational agents: effecting change through human-computer relationships

    Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2003. Includes bibliographical references (p. 205-219). What kinds of social relationships can people have with computers? Are there activities that computers can engage in that actively draw people into relationships with them? What are the potential benefits to the people who participate in these human-computer relationships? To address these questions this work introduces a theory of Relational Agents, which are computational artifacts designed to build and maintain long-term, social-emotional relationships with their users. These can be purely software humanoid animated agents--as developed in this work--but they can also be non-humanoid or embodied in various physical forms, from robots, to pets, to jewelry, clothing, hand-helds, and other interactive devices. Central to the notion of relationship is that it is a persistent construct, spanning multiple interactions; thus, Relational Agents are explicitly designed to remember past history and manage future expectations in their interactions with users. Finally, relationships are fundamentally social and emotional, and detailed knowledge of human social psychology--with a particular emphasis on the role of affect--must be incorporated into these agents if they are to effectively leverage the mechanisms of human social cognition in order to build relationships in the most natural manner possible. People build relationships primarily through the use of language, and primarily within the context of face-to-face conversation. Embodied Conversational Agents--anthropomorphic computer characters that emulate the experience of face-to-face conversation--thus provide the substrate for this work, and so the relational activities provided by the theory will primarily be specific types of verbal and nonverbal conversational behaviors used by people to negotiate and maintain relationships. This work also provides an analysis of the types of applications in which having a human-computer relationship is advantageous to the human participant. In addition to applications in which the relationship is an end in itself (e.g., in entertainment systems), human-computer relationships are important in tasks in which the human is attempting to undergo some change in behavior or cognitive or emotional state. One such application is explored here: a system for assisting the user through a month-long health behavior change program in the area of exercise adoption. This application involves the research, design and implementation of relational agents as well as empirical evaluation of their ability to build relationships and effect change over a series of interactions with users. By Timothy Wallace Bickmore. Ph.D.
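    The defining property of a Relational Agent is persistence across interactions: the agent remembers past history and manages future expectations. A minimal sketch of that idea follows; the file name, memory schema, and dialogue lines are hypothetical stand-ins, far simpler than the system built in the thesis.

```python
import json
from pathlib import Path

MEMORY_FILE = Path("relational_memory.json")  # hypothetical persistent store


def load_memory() -> dict:
    """Load the record of past interactions, or start a fresh relationship."""
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return {"sessions": 0, "last_topic": None}


def greet(memory: dict) -> str:
    """Condition the greeting on relationship history, so the agent
    addresses a returning user as an acquaintance, not a stranger."""
    if memory["sessions"] == 0:
        return "Hello, I'm your exercise advisor. Shall we get started?"
    return f"Welcome back! Last time we talked about {memory['last_topic']}."


memory = load_memory()
print(greet(memory))

# After the session, persist what happened for the next interaction.
memory["sessions"] += 1
memory["last_topic"] = "walking goals"
MEMORY_FILE.write_text(json.dumps(memory))
```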

    Evaluating humanoid embodied conversational agents in mobile guide applications

    Evolution in the area of mobile computing has been phenomenal in the last few years. The exploding increase in hardware power has enabled multimodal mobile interfaces to be developed. These interfaces differ from the traditional graphical user interface (GUI) in that they enable a more “natural” communication with mobile devices, through the use of multiple communication channels (e.g., multi-touch, speech recognition, etc.). As a result, a new generation of applications has emerged that provide human-like assistance in the user interface (e.g., the Siri conversational assistant (Siri Inc., visited 2010)). These conversational agents are currently designed to automate a number of tedious mobile tasks (e.g., to call a taxi), but the possible applications are endless. A domain of particular interest is that of Cultural Heritage, where conversational agents can act as personalized tour guides in, for example, archaeological attractions. The visitors to historical places have a diverse range of information needs. For example, casual visitors have different information needs from those with a deeper interest in an attraction (e.g., holiday learners versus students). A personalized conversational agent can access a cultural heritage database and effectively translate data into a natural language form that is adapted to the visitor’s personal needs and interests. The present research aims to investigate the information needs of a specific type of visitors, those for whom retention of cultural content is important (e.g., students of history, cultural experts, history hobbyists, educators, etc.). Embodying a conversational agent enables the agent to use additional modalities to communicate this content (e.g., through facial expressions, deictic gestures, etc.) to the user. Simulating the social norms that guide real-world human-to-human interaction (e.g., adapting the story based on the reactions of the users) should, at least theoretically, optimize the cognitive accessibility of the content. Although a number of projects have attempted to build embodied conversational agents (ECAs) for cultural heritage, little is known about their impact on the users’ perceived cognitive accessibility of the cultural heritage content, and the usability of the interfaces they support. In particular, there is general disagreement on the advantages of multimodal ECAs over non-anthropomorphised interfaces in terms of users’ task performance and satisfaction. Further, little is known about which features influence which aspects of the cognitive accessibility of the content and/or the usability of the interface. To address these questions I studied user experiences with ECA interfaces in six user studies across three countries (Greece, UK and USA). To support these studies, I introduced: a) a conceptual framework, based on well-established theoretical models of human cognition and previous frameworks from the literature, that offers a holistic view of the design space of ECA systems; and b) a research technique for evaluating the cognitive accessibility of ECA-based information presentation systems that combines data from eye tracking and facial expression recognition. In addition, I designed a toolkit, whose natural language processing component I partially developed, to facilitate rapid development of mobile guide applications using ECAs.
Results from these studies provide evidence that an ECA capable of displaying some of the communication strategies (e.g., non-verbal behaviours to accompany linguistic information, etc.) found in the real-world human guidance scenario is neither affecting nor effective in enhancing the user’s ability to retain cultural content. The findings from the first two studies suggest that an ECA has no negative/positive impact on users experiencing content that is similar (but not the same) across different locations (see experiment one, in Chapter 7), and content of variable difficulty (see experiment two, in Chapter 7). However, my results also suggest that improving the degree of content personalization and the quality of the modalities used by the ECA can result in both effective and affecting human-ECA interactions. Effectiveness is the degree to which an ECA facilitates a user in accomplishing the navigation and information tasks. Similarly, affecting is the degree to which the ECA changes the quality of the user’s experience while accomplishing the navigation and information tasks. By adhering to the above rules, I gradually improved my designs and built ECAs that are affecting. In particular, I found that an ECA can affect the quality of the user’s navigation experience (see experiment three, in Chapter 7), as well as how a user experiences narrations of cultural value (see experiment five, in Chapter 8). In terms of navigation, I found sound evidence that the strongest impact of the ECA’s nonverbal behaviours is on users’ ability to correctly disambiguate the navigation instructions provided by a tour guide system. However, my ECAs failed to become effective, i.e., to elicit enhanced navigation or retention performance. Given the positive impact of ECAs on the disambiguation of navigation instructions, the lack of ECA effectiveness in navigation could be attributed to the simulated mobile conditions. In a real outdoor environment, where users would have to actually walk around the castle, an ECA could have elicited better navigation performance than a system without one. With regard to retention performance, my results suggest that a designer should not solely consider the impact of an ECA, but also the style and effectiveness of the question-answering (Q&A) with the ECA, and the type of user interacting with the ECA (see experiments four and six, in Chapter 8). I found that there is a correlation between how many questions participants asked per location on a tour and how much information they retained after completing the tour. When participants were requested to ask the systems a specific number of questions per location, they retained more information than when they were allowed to ask questions freely. However, the constrained style of interaction decreased their overall satisfaction with the systems. Therefore, when enhanced retention performance is needed, a designer should consider strategies that direct users to ask a specific number of questions per location. On the other hand, when maintaining positive levels of user experience is the desired outcome of an interaction, users should be allowed to ask questions freely. The effectiveness of the Q&A session is then important to the success or failure of the user’s interaction with the ECA. In a natural-language question-answering system, the system often fails to understand the user’s question and, by default, asks the user to rephrase it.
A problem arises when the system repeatedly fails to understand a question. I found that a repetitive request to rephrase the same question annoys participants and degrades their retention performance. Therefore, in order to ensure effective human-ECA Q&A, repair messages should be built in a way that allows users to figure out how to phrase questions the system can answer, so as to avoid improper responses. I also found strong evidence that an ECA may be effective for some types of users while being ineffective for others. An ECA with an attention-grabbing mechanism (see experiment six, in Chapter 8) had opposite effects on the retention performance of participants of different genders: it enhanced the retention performance of the male participants, while it degraded the retention performance of the female participants. Finally, a series of tentative design recommendations for the design of both affecting and effective ECAs in mobile guide applications is derived from the work undertaken. These are aimed at ECA researchers and mobile guide designers.
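    One way to act on the finding about repetitive rephrase requests (a sketch under my own assumptions, not the thesis toolkit's code) is to escalate repair messages, so that each consecutive failure gives the user more guidance on how to phrase an answerable question:

```python
# Escalating repair prompts: each consecutive misunderstanding yields a
# more instructive message instead of repeating the same request.
# All wording and the toy intent check below are illustrative.
REPAIR_PROMPTS = [
    "Sorry, I didn't catch that. Could you rephrase?",
    "I still didn't understand. Try a shorter question, e.g. 'When was the castle built?'",
    "You can ask me about the castle's history, architecture, or residents.",
]


def understand(question: str) -> bool:
    return "castle" in question.lower()  # toy stand-in for intent parsing


def lookup_answer(question: str) -> str:
    return "The castle dates from the 13th century."  # placeholder content


def answer(question: str, failures: int) -> tuple[str, int]:
    """Return a reply and the updated consecutive-failure count."""
    if understand(question):
        return lookup_answer(question), 0  # success resets the counter
    prompt = REPAIR_PROMPTS[min(failures, len(REPAIR_PROMPTS) - 1)]
    return prompt, failures + 1


failures = 0
for q in ["what is that thing", "uh huh", "when was the castle built?"]:
    reply, failures = answer(q, failures)
    print(f"User:  {q}\nGuide: {reply}")
```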

    Avatar augmented online conversation

    Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2003. Includes bibliographical references (p. 167-175). One of the most important roles played by technology is connecting people and mediating their communication with one another. Building technology that mediates conversation presents a number of challenging research and design questions. Apart from the fundamental issue of what exactly gets mediated, two of the more crucial questions are how the person being mediated interacts with the mediating layer and how the receiving person experiences the mediation. This thesis is concerned with both of these questions and proposes a theoretical framework of mediated conversation by means of automated avatars. This new approach relies on a model of face-to-face conversation, and derives an architecture for implementing these features through automation. First the thesis describes the process of face-to-face conversation and what nonverbal behaviors contribute to its success. It then presents a theoretical framework that explains how a text message can be automatically analyzed in terms of its communicative function based on discourse context, and how behaviors, shown to support those same functions in face-to-face conversation, can then be automatically performed by a graphical avatar in synchrony with the message delivery. An architecture, Spark, built on this framework demonstrates the approach in an actual system design that introduces the concept of a message transformation pipeline, abstracting function from behavior, and the concept of an avatar agent, responsible for coordinated delivery and continuous maintenance of the communication channel. A derived application, MapChat, is an online collaboration system where users represented by avatars in a shared virtual environment can chat and manipulate an interactive map while their avatars generate face-to-face behaviors. A study evaluating the strength of the approach compares groups collaborating on a route-planning task using MapChat with and without the animated avatars. The results show that while task outcome was equally good for both groups, the group using these avatars felt that the task was significantly less difficult, and the feelings of efficiency and consensus were significantly stronger. An analysis of the conversation transcripts shows a significant improvement of the overall conversational process and significantly fewer messages spent on channel maintenance in the avatar groups. The avatars also significantly improved the users' perception of each other's effort. Finally, MapChat with avatars was found to be significantly more personal, enjoyable, and easier to use. The ramifications of these findings with respect to mediating conversation are discussed. By Hannes Högni Vilhjálmsson. Ph.D.
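    The message transformation pipeline, abstracting communicative function from nonverbal behaviour, is the part of Spark most amenable to a short sketch. The two stages below illustrate the idea only; the function labels, heuristics, and behaviour mappings are my own invented stand-ins for Spark's discourse analysis and behaviour generation.

```python
def annotate_functions(message: str) -> list[tuple[str, str]]:
    """Stage 1: tag a text message with coarse communicative functions,
    using toy heuristics in place of real discourse-context analysis."""
    tags = []
    if message.endswith("?"):
        tags.append(("request_info", message))
    if any(w in message.lower() for w in ("here", "this", "that")):
        tags.append(("reference_object", message))
    if not tags:
        tags.append(("state_info", message))
    return tags


def functions_to_behaviours(tags: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Stage 2: map each communicative function to nonverbal behaviours
    the avatar performs in synchrony with message delivery."""
    behaviour_map = {
        "request_info": "raise eyebrows and tilt head",
        "reference_object": "deictic gesture toward the referent",
        "state_info": "beat gesture on the stressed word",
    }
    return [(fn, behaviour_map[fn]) for fn, _ in tags]


message = "Should we route the tour through this gate?"
for fn, behaviour in functions_to_behaviours(annotate_functions(message)):
    print(f"{fn}: {behaviour}")
```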

    A Proactive Approach of Robotic Framework for Making Eye Contact with Humans

    Making eye contact is one of the most important prerequisites for initiating a conversation with others. However, it is not an easy task for a robot to make eye contact with a human if they are not facing each other initially or if the human is intensely engaged in his/her task. If the robot would like to start communication with a particular person, it should turn its gaze to that person and make eye contact with him/her. However, such a turning action alone is not enough to establish eye contact in all cases. Therefore, the robot should perform stronger actions in some situations so that it can attract the target person's attention before meeting his/her gaze. In this paper, we propose a conceptual model of eye contact for social robots consisting of two phases: capturing attention and ensuring that attention has been captured. Evaluation experiments with human participants reveal the effectiveness of the proposed model in four viewing situations, namely central field of view, near peripheral field of view, far peripheral field of view, and out of field of view.
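    The two-phase model suggests an action-selection policy whose strength depends on where the robot sits in the target person's field of view. The sketch below illustrates such a policy; the angle bands and the concrete actions are my assumptions, not parameters from the paper.

```python
def capture_attention(gaze_offset_deg: float) -> list[str]:
    """Phase 1 (capturing attention): pick actions by viewing situation.
    The angle bands are illustrative placeholders."""
    if gaze_offset_deg <= 15:      # central field of view
        return ["turn head toward person"]
    if gaze_offset_deg <= 45:      # near peripheral field of view
        return ["turn head toward person", "wave hand"]
    if gaze_offset_deg <= 90:      # far peripheral field of view
        return ["turn body toward person", "wave hand"]
    return ["move into view", "utter a greeting"]  # out of field of view


def ensure_capture(person_looked_back: bool) -> str:
    """Phase 2 (ensuring the attention capture): confirm mutual gaze
    before starting the conversation, otherwise escalate."""
    return "make eye contact and speak" if person_looked_back else "escalate to a stronger action"


print(capture_attention(60.0))
print(ensure_capture(True))
```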