Speech-driven Animation with Meaningful Behaviors
Conversational agents (CAs) play an important role in human computer
interaction. Creating believable movements for CAs is challenging, since the
movements have to be meaningful and natural, reflecting the coupling between
gestures and speech. Studies in the past have mainly relied on rule-based or
data-driven approaches. Rule-based methods focus on creating meaningful
behaviors conveying the underlying message, but the gestures cannot be easily
synchronized with speech. Data-driven approaches, especially speech-driven
models, can capture the relationship between speech and gestures. However, they
create behaviors disregarding the meaning of the message. This study proposes
to bridge the gap between these two approaches overcoming their limitations.
The approach builds a dynamic Bayesian network (DBN), where a discrete variable
is added to condition the behaviors on an underlying constraint. The study
implements and evaluates the approach with two constraints: discourse functions
and prototypical behaviors. By constraining on the discourse functions (e.g.,
questions), the model learns the characteristic behaviors associated with a
given discourse class, learning the rules from the data. By constraining on
prototypical behaviors (e.g., head nods), the approach can be embedded in a
rule-based system as a behavior realizer, creating trajectories that are
synchronized in time with speech. The study proposes a DBN structure and a training
approach that (1) models the cause-effect relationship between the constraint
and the gestures, (2) initializes the state configuration models, increasing the
range of the generated behaviors, and (3) captures the differences in the
behaviors across constraints by enforcing sparse transitions between shared and
exclusive states per constraint. Objective and subjective evaluations
demonstrate the benefits of the proposed approach over an unconstrained model.
Comment: 13 pages, 12 figures, 5 tables
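As a loose illustration of the constrained-DBN idea described above, the sketch below conditions a gesture-state Markov chain on a discrete constraint variable, with a few states shared across constraints and a few exclusive to each one, so transitions stay sparse per constraint. All names, state counts, and probabilities here are hypothetical, not taken from the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STATES = 6                  # 2 shared states + 2 exclusive states per constraint
SHARED = [0, 1]               # states available under every constraint
CONSTRAINTS = {               # hypothetical discrete constraint values
    "question": [2, 3],       # exclusive states for this discourse function
    "statement": [4, 5],
}

def make_transitions(exclusive):
    """Build a row-stochastic transition matrix that allows moves only
    into the shared states plus this constraint's exclusive states,
    enforcing sparsity across constraints."""
    allowed = SHARED + exclusive
    T = np.zeros((N_STATES, N_STATES))
    for i in range(N_STATES):
        T[i, allowed] = rng.random(len(allowed)) + 0.5
    return T / T.sum(axis=1, keepdims=True)

def sample_states(constraint, length=10):
    """Sample a gesture-state trajectory conditioned on the constraint."""
    T = make_transitions(CONSTRAINTS[constraint])
    s = int(rng.choice(SHARED))           # start in a shared state
    seq = [s]
    for _ in range(length - 1):
        s = int(rng.choice(N_STATES, p=T[s]))
        seq.append(s)
    return seq

seq = sample_states("question")
```

Under this toy setup, a trajectory sampled under the "question" constraint can only visit the shared states and that constraint's exclusive states, which is the sense in which the constraint shapes the generated behavior.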
LANGUAGE USE AND PERCEPTIONS OF ENGLISH AS A FOREIGN LANGUAGE (EFL) LEARNERS IN A TASK-BASED CLASS IN "SECOND LIFE"
Situated in cognitive interactionist theory and driven by task-based language teaching (TBLT), this study employed a multiple-methods design to address research questions regarding EFL learners' language use and their perceptions of their language practices during task-based interaction in Second Life (SL). Nine adult EFL learners from around the world were recruited to participate in this virtual course and used avatars to interact with peers via voice chat in simulated real-life tasks. Findings showed that students perceived SL as a viable platform for language learning.
Quantitative results revealed that confirmation checks, clarification requests, and comprehension checks were the three most frequently used strategies. Two strategies that had not been documented in previous SL research were also found: a metacognitive strategy and "spell out the word." Two negotiation patterns were identified: single-layered and multi-layered trigger-resolution sequences. Additionally, an interrelationship among task types, negotiation, and strategy use was established: the jigsaw task prompted the most instances of negotiation and strategy use, whereas the opinion-exchange task triggered the fewest. Results also indicated that EFL students showed statistically significant improvement in syntactic complexity and variety, as well as in linguistic accuracy, across all measured levels.
Three core themes emerged from the qualitative data: 1) perceptions of factors that impact the virtual learning experience in SL, 2) attitudes toward learning English via avatars in SL, and 3) beliefs about the effects of task-based instruction on learning outcomes in SL. SL was endorsed as a promising learning environment owing to its distinctive features: simulated immersion, augmented reality, tele/copresence, and masked identities via avatars.
This study demonstrated that the implementation of task-based instruction can be maximized by the 3-D, simulated features of SL, as evidenced by the findings that 1) convergent tasks with single-outcome conditions stimulate more cognitive and linguistic processes; 2) 3-D multimodal resources in SL provide additional visual and linguistic support; 3) pre-task planning can optimize the quality of learners' linguistic performance; 4) real-life tasks that capitalize on SL features and accommodate learners' cultural/world knowledge can make a difference in their virtual learning experiences; and 5) avatar identities boost learners' sense of self-image and confidence.
Exploring the Affective Loop
Research in psychology and neurology shows that both body and mind are
involved when experiencing emotions (Damasio 1994, Davidson et al.
2003). People are also very physical when they try to communicate their
emotions. Somewhere in between being consciously and unconsciously
aware of it ourselves, we produce both verbal and physical signs to make
other people understand how we feel. Simultaneously, this production of
signs involves us in a stronger personal experience of the emotions we
express.
Emotions are also communicated in the digital world, but there is little
focus on users' personal as well as physical experience of emotions in
the available digital media. In order to explore whether and how we can
expand existing media, we have designed, implemented and evaluated
/eMoto/, a mobile service for sending affective messages to others. With
eMoto, we explicitly aim to address both cognitive and physical
experiences of human emotions. Through combining affective gestures for
input with affective expressions that make use of colors, shapes and
animations for the background of messages, the interaction "pulls" the
user into an /affective loop/. In this thesis we define what we mean by
affective loop and present a user-centered design approach expressed
through four design principles inspired by previous work within Human
Computer Interaction (HCI) but adjusted to our purposes: /embodiment/
(Dourish 2001) as a means to address how people communicate emotions in
real life, /flow/ (Csikszentmihalyi 1990) to reach a state of
involvement that goes further than the current context, /ambiguity/ of
the designed expressions (Gaver et al. 2003) to allow for open-ended
interpretation by the end-users instead of simplistic, one-emotion
one-expression pairs and /natural but designed expressions/ to address
people's natural couplings between cognitively and physically
experienced emotions. We also present results from an end-user study of
eMoto which indicate that subjects got both physically and emotionally
involved in the interaction and that the designed "openness" and
ambiguity of the expressions were appreciated and understood by our
subjects. Through the user study, we identified four potential design
problems that have to be tackled in order to achieve an affective loop
effect: the extent to which users /feel in control/ of the interaction,
/harmony and coherence/ between cognitive and physical expressions,
/timing/ of expressions and feedback in a communicational setting, and
effects of users' /personality/ on their emotional expressions and
experiences of the interaction.
Negotiation of meaning via virtual exchange in immersive virtual reality environments
This study examines how English-as-lingua-franca (ELF) learners employ semiotic resources, including head movements, gestures, facial expressions, body posture, and spatial juxtaposition, to negotiate for meaning in an immersive virtual reality (VR) environment. Ten ELF learners participated in a Taiwan-Spain VR virtual exchange project and completed two VR tasks on an immersive VR platform. Multiple datasets, including recordings of the VR sessions, pre- and post-task questionnaires, observation notes, and stimulated recall interviews, were analyzed quantitatively and qualitatively with triangulation. Built upon multimodal interaction analysis (Norris, 2004) and Varonis and Gass's (1985a) negotiation of meaning model, the findings indicate that ELF learners utilized different embodied semiotic resources to construct and negotiate meaning at all primes to achieve effective communication in an immersive VR space. The avatar-mediated representations and semiotic modalities were shown to facilitate indication, comprehension, and explanation to signal and resolve instances of non-understanding. The findings show that, with space proxemics and object handling as the two distinct features of VR-supported environments, VR platforms transform learners' social interaction from planar to three-dimensional communication, and from verbal to embodied, which promotes embodied learning. VR thus serves as a powerful immersive interactive environment for ELF learners in distant locations to engage in situated languacultural practices that go beyond physical space. Pedagogical implications are discussed.
Virtual Assisted Self Interviewing (VASI): An Expansion of Survey Data Collection Methods to the Virtual Worlds by Means of VDCI
Changes in communication technology have allowed for the expansion of data collection modes in survey research. The proliferation of the computer has enabled the creation of web-based and computer-assisted self-interview data collection modes. Virtual worlds are a new application of computer technology that once again expands the available modes, via VASI (Virtual Assisted Self Interviewing). The Virtual Data Collection Interface (VDCI), developed at Indiana University in collaboration with the German Socio-Economic Panel Study (SOEP), gives survey researchers access to the population of virtual worlds through fully immersive Heads-up Display (HUD)-based survey instruments. This expansion needs careful consideration of its applicability to the researcher's question, but offers a high level of data integrity and expanded survey availability and automation. Open questions for the VASI method include an optimal sampling frame and sampling procedures within, e.g., a virtual world such as Second Life (SL). Further multimodal studies are proposed to aid in evaluating the VDCI and placing it in the context of other data collection modes.
Keywords: Interviewing Mode, PAPI, CAPI, CASI, VASI, VDCI, Second Life
EFL learners' strategy use during task-based interaction in Second Life
Motivated by theoretical and pedagogical concerns that the link between second language (L2) learners' second language acquisition (SLA) and language use in 3D multi-user virtual environments (MUVEs) has not yet been fully established in the SLA literature, this study examined the patterns of English as a foreign language (EFL) learners' use of communication strategies during task-based interaction in Second Life (SL). Nine adult EFL learners from around the world were recruited, and they used their avatars to negotiate meaning with peers in interactional tasks via voice chat in SL. Results reveal that confirmation checks, clarification requests, and comprehension checks were the most frequently used strategies. Other types of strategy use were also discovered, such as requests for help, self-correction, and topic shift, accompanied by a metacognitive strategy and "spell out the word," which had not been previously documented in task-based research in 3D MUVEs. This study demonstrated that SL can offer an optimal venue for EFL learners' language acquisition and can prompt their cognitive processing during task-based interaction. Additionally, the 3-D multimodal resources afforded by SL provide additional visual support for EFL students' input acquisition and output modifications. More research on voice-based task interaction in 3D MUVEs is also called for.
Beyond "Interaction": How to Understand Social Effects on Social Cognition
In recent years, a number of philosophers and cognitive scientists have advocated for an "interactive turn" in the methodology of social-cognition research: to become more ecologically valid, we must design experiments that are interactive, rather than merely observational. While the practical aim of improving ecological validity in the study of social cognition is laudable, we think that the notion of "interaction" is not suitable for this task: as it is currently deployed in the social-cognition literature, this notion leads to serious conceptual and methodological confusion. In this paper, we tackle this confusion on three fronts: 1) we revise the "interactionist" definition of interaction; 2) we demonstrate a number of potential methodological confounds that arise in interactive experimental designs; and 3) we show that ersatz interactivity works just as well as the real thing. We conclude that the notion of "interaction", as it is currently being deployed in this literature, obscures an accurate understanding of human social cognition.
Building Embodied Conversational Agents: Observations on human nonverbal behaviour as a resource for the development of artificial characters
"Wow this is so cool!" This is what I most probably yelled, back in the 90s, when my first computer program on our MSX computer turned out to do exactly what I wanted it to do. The program contained the following instruction: COLOR 10. After hitting enter, it would change the screen color from light blue to dark yellow. A few years after that experience, Microsoft Windows was introduced. Windows came with an intuitive graphical user interface that was designed to allow all people, so also those who would not consider themselves to be experienced computer users, to interact with the computer. This was a major step forward in human-computer interaction, as from that point forward no complex programming skills were required anymore to perform such actions as adapting the screen color. Changing the background was just a matter of pointing the mouse to the desired color on a color palette. "Wow this is so cool!" This is what I shouted, again, 20 years later. This time my new smartphone successfully skipped to the next song on Spotify because I literally told my smartphone, with my voice, to do so. Being able to operate your smartphone with natural language through voice control can be extremely handy, for instance when listening to music while showering. Again, the option to handle a computer with voice instructions turned out to be a significant optimization in human-computer interaction. From now on, computers could be instructed without the use of a screen, mouse or keyboard, and instead could operate successfully simply by being told what to do. In other words, I have personally witnessed how, within only a few decades, the way people interact with computers has changed drastically, starting as a rather technical and abstract enterprise and becoming something that is both natural and intuitive, and does not require any advanced computer background.
Accordingly, while computers used to be machines that could only be operated by technically oriented individuals, they have gradually changed into devices that are part of many people's household, just as much as a television, a vacuum cleaner or a microwave oven. The introduction of voice control is a significant feature of the newer generation of interfaces in the sense that these have become more "anthropomorphic" and try to mimic the way people interact in daily life, where indeed the voice is a universally used device that humans exploit in their exchanges with others. The question then arises whether it would be possible to go even one step further, where people, like in science-fiction movies, interact with avatars or humanoid robots, whereby users can have a proper conversation with a computer-simulated human that is indistinguishable from a real human. An interaction with a human-like representation of a computer that behaves, talks and reacts like a real person would imply that the computer is able not only to produce and understand messages transmitted auditorily through the voice, but also to rely on the perception and generation of different forms of body language, such as facial expressions, gestures or body posture. At the time of writing, developments of this next step in human-computer interaction are in full swing, but such interactions are still rather constrained when compared to the way humans have their exchanges with other humans. It is interesting to reflect on what such future human-machine interactions may look like. When we consider other products that have been created in history, it is sometimes striking to see that some of these have been inspired by things that can be observed in our environment, yet at the same time do not have to be exact copies of those phenomena. For instance, an airplane has wings just as birds do, yet the wings of an airplane do not make those typical movements a bird would produce to fly.
Moreover, an airplane has wheels, whereas a bird has legs. At the same time, an airplane has made it possible for humans to cover long distances in a fast and smooth manner in a way that was unthinkable before it was invented. The example of the airplane shows how new technologies can have "unnatural" properties, but can nonetheless be very beneficial and impactful for human beings. This dissertation centers on the practical question of how virtual humans can be programmed to act more human-like. The four studies presented in this dissertation all share the same underlying question of how parts of human behavior can be captured, such that computers can use them to become more human-like. Each study differs in method, perspective and specific questions, but all are aimed at gaining insights and directions that would help further push the development of human-like computer behavior and investigate (the simulation of) human conversational behavior. The rest of this introductory chapter gives a general overview of virtual humans (also known as embodied conversational agents), their potential uses and the engineering challenges, followed by an overview of the four studies.