
    Speech-driven Animation with Meaningful Behaviors

    Conversational agents (CAs) play an important role in human-computer interaction. Creating believable movements for CAs is challenging, since the movements have to be meaningful and natural, reflecting the coupling between gestures and speech. Studies in the past have mainly relied on rule-based or data-driven approaches. Rule-based methods focus on creating meaningful behaviors that convey the underlying message, but the gestures cannot be easily synchronized with speech. Data-driven approaches, especially speech-driven models, can capture the relationship between speech and gestures, but they create behaviors that disregard the meaning of the message. This study proposes to bridge the gap between these two approaches, overcoming their limitations. The approach builds a dynamic Bayesian network (DBN), where a discrete variable is added to condition the generated behaviors on an underlying constraint. The study implements and evaluates the approach with two constraints: discourse functions and prototypical behaviors. By constraining on discourse functions (e.g., questions), the model learns the characteristic behaviors associated with a given discourse class, learning the rules from the data. By constraining on prototypical behaviors (e.g., head nods), the approach can be embedded in a rule-based system as a behavior realizer, creating trajectories that are temporally synchronized with speech. The study proposes a DBN structure and a training approach that (1) model the cause-effect relationship between the constraint and the gestures, (2) initialize the state configurations, increasing the range of the generated behaviors, and (3) capture the differences in the behaviors across constraints by enforcing sparse transitions between shared and exclusive states per constraint. Objective and subjective evaluations demonstrate the benefits of the proposed approach over an unconstrained model. (Comment: 13 pages, 12 figures, 5 tables)
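    The abstract only sketches the model at a high level. As a rough illustration of the constrained-transition idea (not the authors' implementation; the state space, constraint set, and probabilities below are invented for illustration), the following minimal Python sketch shows a hidden gesture-state chain whose transition matrix is selected by a discrete constraint variable, with most probability mass on shared states and only sparse transitions into the states exclusive to the other constraint.

```python
# Minimal sketch of a constraint-conditioned gesture-state chain.
# All names, the tiny state space, and the probabilities are
# illustrative assumptions, not the paper's trained model.
import numpy as np

rng = np.random.default_rng(0)

STATES = ["shared_0", "shared_1", "nod_only", "tilt_only"]  # shared + exclusive states
CONSTRAINTS = {"statement": 0, "question": 1}

# One transition matrix per constraint value; each row sums to 1.
# Putting little mass on the other constraint's exclusive state mimics
# the "sparse transitions between shared and exclusive states" idea.
A = np.array([
    [[0.50, 0.30, 0.15, 0.05],   # constraint = statement
     [0.30, 0.50, 0.15, 0.05],
     [0.45, 0.45, 0.05, 0.05],
     [0.45, 0.45, 0.05, 0.05]],
    [[0.50, 0.30, 0.05, 0.15],   # constraint = question
     [0.30, 0.50, 0.05, 0.15],
     [0.45, 0.45, 0.05, 0.05],
     [0.45, 0.45, 0.05, 0.05]],
])

def sample_trajectory(constraint: str, length: int) -> list[str]:
    """Sample a gesture-state sequence conditioned on the constraint."""
    c = CONSTRAINTS[constraint]
    s = 0  # start in a shared state
    states = []
    for _ in range(length):
        s = rng.choice(len(STATES), p=A[c, s])
        states.append(STATES[s])
    return states

print(sample_trajectory("question", 10))
```

    In the full model described by the abstract, the hidden states would additionally emit speech-synchronized gesture trajectories (the speech-driven part); the sketch covers only how a discrete constraint can reshape the transition structure.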

    LANGUAGE USE AND PERCEPTIONS OF ENGLISH AS A FOREIGN LANGUAGE (EFL) LEARNERS IN A TASK-BASED CLASS IN "SECOND LIFE"

    Situated in cognitive interactionist theory and driven by task-based language teaching (TBLT), this study employed a multiple-methods design to address research questions about EFL learners' language use and their perceptions of their language practices during task-based interaction in Second Life (SL). Nine adult EFL learners from around the world were recruited to participate in this virtual course and used avatars to interact with peers via voice chat in simulated real-life tasks. Quantitative results revealed that confirmation checks, clarification requests, and comprehension checks were the three most frequently used strategies. Two strategies that had not been documented in previous SL research were also found: metacognitive strategy and "spell out the word." Two negotiation patterns were identified: single-layered and multi-layered trigger-resolution sequences. Additionally, an interrelationship among task types, negotiation, and strategies was established: the jigsaw task prompted the most instances of negotiation and strategy use, whereas the opinion-exchange task triggered the least. Results also indicated that EFL students showed a statistically significant improvement in syntactic complexity and variety, as well as in linguistic accuracy, across all measured levels. Overall, students perceived SL as a viable platform for language learning. Three core themes emerged from the qualitative data: 1) perceptions of factors that impact the virtual learning experience in SL, 2) attitudes toward learning English via avatars in SL, and 3) beliefs about the effects of task-based instruction on learning outcomes in SL. SL was endorsed as a promising learning environment owing to its conspicuous features: simulated immersion, augmented reality, tele/copresence, and masked identities via avatars. This study demonstrated that the implementation of task-based instruction can be maximized by the 3-D, simulated features of SL, as evidenced by the findings that 1) convergent tasks with single-outcome conditions stimulate more cognitive and linguistic processes; 2) 3-D multimodal resources in SL provide additional visual and linguistic support; 3) pre-task planning can optimize the quality of learners' linguistic performance; 4) real-life tasks that capitalize on SL features and accommodate learners' cultural/world knowledge can make a difference in their virtual learning experiences; and 5) avatar identities boost learners' sense of self-image and confidence.

    Exploring the Affective Loop

    Research in psychology and neurology shows that both body and mind are involved when experiencing emotions (Damasio 1994, Davidson et al. 2003). People are also very physical when they try to communicate their emotions. Somewhere in between being consciously and unconsciously aware of it ourselves, we produce both verbal and physical signs to make other people understand how we feel. Simultaneously, this production of signs involves us in a stronger personal experience of the emotions we express. Emotions are also communicated in the digital world, but the available digital media pay little attention to users' personal and physical experience of emotions. In order to explore whether and how we can expand existing media, we have designed, implemented and evaluated /eMoto/, a mobile service for sending affective messages to others. With eMoto, we explicitly aim to address both cognitive and physical experiences of human emotions. By combining affective gestures for input with affective expressions that make use of colors, shapes and animations for the background of messages, the interaction "pulls" the user into an /affective loop/. In this thesis we define what we mean by an affective loop and present a user-centered design approach expressed through four design principles, inspired by previous work within Human-Computer Interaction (HCI) but adjusted to our purposes: /embodiment/ (Dourish 2001) as a means to address how people communicate emotions in real life; /flow/ (Csikszentmihalyi 1990) to reach a state of involvement that goes beyond the current context; /ambiguity/ of the designed expressions (Gaver et al. 2003) to allow for open-ended interpretation by the end-users instead of simplistic one-emotion, one-expression pairs; and /natural but designed expressions/ to address people's natural couplings between cognitively and physically experienced emotions. We also present results from an end-user study of eMoto which indicate that subjects became both physically and emotionally involved in the interaction, and that the designed "openness" and ambiguity of the expressions were appreciated and understood by our subjects. Through the user study, we identified four potential design problems that have to be tackled in order to achieve an affective loop effect: the extent to which users /feel in control/ of the interaction, /harmony and coherence/ between cognitive and physical expressions, /timing/ of expressions and feedback in a communicational setting, and the effects of users' /personality/ on their emotional expressions and experiences of the interaction.

    Negotiation of meaning via virtual exchange in immersive virtual reality environments

    This study examines how English-as-a-lingua-franca (ELF) learners employ semiotic resources, including head movements, gestures, facial expressions, body posture, and spatial juxtaposition, to negotiate for meaning in an immersive virtual reality (VR) environment. Ten ELF learners participated in a Taiwan-Spain VR virtual exchange project and completed two VR tasks on an immersive VR platform. Multiple datasets, including recordings of the VR sessions, pre- and post-task questionnaires, observation notes, and stimulated recall interviews, were analyzed quantitatively and qualitatively with triangulation. Built upon multimodal interaction analysis (Norris, 2004) and Varonis and Gass' (1985a) negotiation of meaning model, the findings indicate that ELF learners utilized different embodied semiotic resources in constructing and negotiating meaning at all primes to achieve effective communication in an immersive VR space. The avatar-mediated representations and semiotic modalities were shown to facilitate indication, comprehension, and explanation in signaling and resolving instances of non-understanding. The findings show that, with space proxemics and object handling as two distinct features of VR-supported environments, VR platforms transform learners' social interaction from planar to three-dimensional communication, and from verbal to embodied communication, which promotes embodied learning. VR thus serves as a powerful immersive interactive environment for ELF learners in distant locations to engage in situated languacultural practices that go beyond physical space. Pedagogical implications are discussed.

    Virtual Assisted Self Interviewing (VASI): An Expansion of Survey Data Collection Methods to the Virtual Worlds by Means of VDCI

    Changes in communication technology have allowed for the expansion of data collection modes in survey research. The proliferation of the computer has enabled the creation of web-based and computer-assisted self-interview data collection modes. Virtual worlds are a new application of computer technology that once again expands the available data collection modes, through VASI (Virtual Assisted Self Interviewing). The Virtual Data Collection Interface (VDCI), developed at Indiana University in collaboration with the German Socio-Economic Panel Study (SOEP), gives survey researchers access to the population of virtual worlds through fully immersive, Heads-up Display (HUD)-based survey instruments. This expansion requires careful consideration of its applicability to the researcher's question, but it offers a high level of data integrity and expanded survey availability and automation. Current open questions for the VASI method concern an optimal sampling frame and sampling procedures within, e.g., a virtual world like Second Life (SL). Further multimodal studies are proposed to aid in evaluating the VDCI and placing it in the context of other data collection modes.

    Keywords: Interviewing Mode, PAPI, CAPI, CASI, VASI, VDCI, Second Life

    EFL learners’ strategy use during task-based interaction in Second Life

    Motivated by theoretical and pedagogical concerns that the link between second language (L2) learners' second language acquisition (SLA) and their language use in 3D multi-user virtual environments (MUVEs) has not yet been fully established in the SLA literature, this study examined the patterns of English as a foreign language (EFL) learners' use of communication strategies during task-based interaction in Second Life (SL). Nine adult EFL learners from around the world were recruited, and they used their avatars to negotiate meaning with peers in interactional tasks via voice chat in SL. Results reveal that confirmation checks, clarification requests, and comprehension checks were the most frequently used strategies. Other types of strategy use were also discovered, such as requests for help, self-correction, and topic shift, accompanied by a metacognitive strategy and spell-out-the-word, neither of which had previously been documented in task-based research in 3D MUVEs. This study demonstrates that SL can offer an optimal venue for EFL learners' language acquisition and prompt their cognitive processing during task-based interaction. Additionally, the 3D multimodal resources afforded by SL provide additional visual support for EFL students' input acquisition and output modifications. More research on voice-based task interaction in 3D MUVEs is also called for.

    Beyond ‘Interaction’: How to Understand Social Effects on Social Cognition

    In recent years, a number of philosophers and cognitive scientists have advocated for an ‘interactive turn’ in the methodology of social-cognition research: to become more ecologically valid, we must design experiments that are interactive, rather than merely observational. While the practical aim of improving ecological validity in the study of social cognition is laudable, we think that the notion of ‘interaction’ is not suitable for this task: as it is currently deployed in the social cognition literature, this notion leads to serious conceptual and methodological confusion. In this paper, we tackle this confusion on three fronts: 1) we revise the ‘interactionist’ definition of interaction; 2) we demonstrate a number of potential methodological confounds that arise in interactive experimental designs; and 3) we show that ersatz interactivity works just as well as the real thing. We conclude that the notion of ‘interaction’, as it is currently being deployed in this literature, obscures an accurate understanding of human social cognition.

    Building Embodied Conversational Agents: Observations on human nonverbal behaviour as a resource for the development of artificial characters

    "Wow this is so cool!" This is what I most probably yelled, back in the 90s, when my first computer program on our MSX computer turned out to do exactly what I wanted it to do. The program contained the following instruction: COLOR 10(1.1) After hitting enter, it would change the screen color from light blue to dark yellow. A few years after that experience, Microsoft Windows was introduced. Windows came with an intuitive graphical user interface that was designed to allow all people, so also those who would not consider themselves to be experienced computer addicts, to interact with the computer. This was a major step forward in human-computer interaction, as from that point forward no complex programming skills were required anymore to perform such actions as adapting the screen color. Changing the background was just a matter of pointing the mouse to the desired color on a color palette. "Wow this is so cool!". This is what I shouted, again, 20 years later. This time my new smartphone successfully skipped to the next song on Spotify because I literally told my smartphone, with my voice, to do so. Being able to operate your smartphone with natural language through voice-control can be extremely handy, for instance when listening to music while showering. Again, the option to handle a computer with voice instructions turned out to be a significant optimization in human-computer interaction. From now on, computers could be instructed without the use of a screen, mouse or keyboard, and instead could operate successfully simply by telling the machine what to do. In other words, I have personally witnessed how, within only a few decades, the way people interact with computers has changed drastically, starting as a rather technical and abstract enterprise to becoming something that was both natural and intuitive, and did not require any advanced computer background. Accordingly, while computers used to be machines that could only be operated by technically-oriented individuals, they had gradually changed into devices that are part of many people’s household, just as much as a television, a vacuum cleaner or a microwave oven. The introduction of voice control is a significant feature of the newer generation of interfaces in the sense that these have become more "antropomorphic" and try to mimic the way people interact in daily life, where indeed the voice is a universally used device that humans exploit in their exchanges with others. The question then arises whether it would be possible to go even one step further, where people, like in science-fiction movies, interact with avatars or humanoid robots, whereby users can have a proper conversation with a computer-simulated human that is indistinguishable from a real human. An interaction with a human-like representation of a computer that behaves, talks and reacts like a real person would imply that the computer is able to not only produce and understand messages transmitted auditorily through the voice, but also could rely on the perception and generation of different forms of body language, such as facial expressions, gestures or body posture. At the time of writing, developments of this next step in human-computer interaction are in full swing, but the type of such interactions is still rather constrained when compared to the way humans have their exchanges with other humans. It is interesting to reflect on how such future humanmachine interactions may look like. 
    When we consider other products that have been created throughout history, it is sometimes striking to see that some of them have been inspired by things that can be observed in our environment, yet at the same time do not have to be exact copies of those phenomena. For instance, an airplane has wings just as birds do, yet the wings of an airplane do not make the typical movements a bird would produce to fly. Moreover, an airplane has wheels, whereas a bird has legs. At the same time, the airplane has made it possible for humans to cover long distances in a fast and smooth manner that was unthinkable before it was invented. The example of the airplane shows how new technologies can have "unnatural" properties, but can nonetheless be very beneficial and impactful for human beings.

    This dissertation centers on the practical question of how virtual humans can be programmed to act more human-like. The four studies presented in this dissertation all share the same underlying question: how can parts of human behavior be captured such that computers can use them to become more human-like? The studies differ in method, perspective and specific questions, but they are all aimed at gaining insights and directions that will help further the development of human-like computer behavior and investigate (the simulation of) human conversational behavior. The rest of this introductory chapter gives a general overview of virtual humans (also known as embodied conversational agents), their potential uses and the engineering challenges involved, followed by an overview of the four studies.