
    Proceedings

    Proceedings of the 3rd Nordic Symposium on Multimodal Communication. Editors: Patrizia Paggio, Elisabeth Ahlsén, Jens Allwood, Kristiina Jokinen, Costanza Navarretta. NEALT Proceedings Series, Vol. 15 (2011), vi+87 pp. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/22532

    Feedback and gestural behaviour in a conversational corpus of Danish

    Proceedings of the 3rd Nordic Symposium on Multimodal Communication. Editors: Patrizia Paggio, Elisabeth Ahlsén, Jens Allwood, Kristiina Jokinen, Costanza Navarretta. NEALT Proceedings Series, Vol. 15 (2011), 33–39. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/22532

    Eyebrow movements as signals of communicative problems in human face-to-face interaction

    Repair is a core building block of human communication, allowing us to address problems of understanding in conversation. Past research has uncovered the basic mechanisms by which interactants signal and solve such problems. However, the focus has been on verbal interaction, neglecting the fact that human communication is inherently multimodal. Here, we focus on a visual signal particularly prevalent in signaling problems of understanding: eyebrow frowns and raises. We present a corpus study showing that verbal repair initiations with eyebrow furrows are more likely to be responded to with clarifications as repair solutions, that repair initiations preceded by eyebrow actions as preliminaries get repaired faster (by around 230 ms), and that eyebrow furrows alone can be sufficient to occasion clarification. We also present an experiment based on virtual reality technology, revealing that addressees’ eyebrow frowns have a striking effect on speakers’ speech, leading them to produce answers to questions that are several seconds longer than when not perceiving addressee eyebrow furrows. Together, the findings demonstrate that eyebrow movements play a communicative role in initiating repair in spoken language rather than being merely epiphenomenal. Thus, they should be considered core coordination devices in human conversational interaction.
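
As a rough illustration of the timing comparison such a corpus study involves, the sketch below (in Python) contrasts repair durations for initiations with and without a preceding eyebrow action; the record layout and values are invented and do not come from the study's data.

        # Hypothetical sketch: compare how quickly repairs are resolved when the repair
        # initiation was preceded by an eyebrow action versus when it was not.
        from statistics import mean

        # Each record: (initiation_time_s, solution_time_s, eyebrow_preliminary) -- invented values.
        corpus = [
            (12.40, 13.10, True),
            (48.02, 49.00, False),
            (75.33, 75.95, True),
            (90.10, 91.25, False),
        ]

        def repair_duration(record):
            """Seconds from repair initiation to repair solution."""
            initiation, solution, _ = record
            return solution - initiation

        with_brow = [repair_duration(r) for r in corpus if r[2]]
        without_brow = [repair_duration(r) for r in corpus if not r[2]]

        # With real data, the ~230 ms difference would be tested statistically, not eyeballed.
        print(f"with eyebrow preliminary:    {mean(with_brow):.3f} s")
        print(f"without eyebrow preliminary: {mean(without_brow):.3f} s")
        print(f"difference: {(mean(without_brow) - mean(with_brow)) * 1000:.0f} ms")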

    Building Embodied Conversational Agents: Observations on human nonverbal behaviour as a resource for the development of artificial characters

    "Wow this is so cool!" This is what I most probably yelled, back in the 90s, when my first computer program on our MSX computer turned out to do exactly what I wanted it to do. The program contained the following instruction: COLOR 10(1.1) After hitting enter, it would change the screen color from light blue to dark yellow. A few years after that experience, Microsoft Windows was introduced. Windows came with an intuitive graphical user interface that was designed to allow all people, so also those who would not consider themselves to be experienced computer addicts, to interact with the computer. This was a major step forward in human-computer interaction, as from that point forward no complex programming skills were required anymore to perform such actions as adapting the screen color. Changing the background was just a matter of pointing the mouse to the desired color on a color palette. "Wow this is so cool!". This is what I shouted, again, 20 years later. This time my new smartphone successfully skipped to the next song on Spotify because I literally told my smartphone, with my voice, to do so. Being able to operate your smartphone with natural language through voice-control can be extremely handy, for instance when listening to music while showering. Again, the option to handle a computer with voice instructions turned out to be a significant optimization in human-computer interaction. From now on, computers could be instructed without the use of a screen, mouse or keyboard, and instead could operate successfully simply by telling the machine what to do. In other words, I have personally witnessed how, within only a few decades, the way people interact with computers has changed drastically, starting as a rather technical and abstract enterprise to becoming something that was both natural and intuitive, and did not require any advanced computer background. Accordingly, while computers used to be machines that could only be operated by technically-oriented individuals, they had gradually changed into devices that are part of many people’s household, just as much as a television, a vacuum cleaner or a microwave oven. The introduction of voice control is a significant feature of the newer generation of interfaces in the sense that these have become more "antropomorphic" and try to mimic the way people interact in daily life, where indeed the voice is a universally used device that humans exploit in their exchanges with others. The question then arises whether it would be possible to go even one step further, where people, like in science-fiction movies, interact with avatars or humanoid robots, whereby users can have a proper conversation with a computer-simulated human that is indistinguishable from a real human. An interaction with a human-like representation of a computer that behaves, talks and reacts like a real person would imply that the computer is able to not only produce and understand messages transmitted auditorily through the voice, but also could rely on the perception and generation of different forms of body language, such as facial expressions, gestures or body posture. At the time of writing, developments of this next step in human-computer interaction are in full swing, but the type of such interactions is still rather constrained when compared to the way humans have their exchanges with other humans. It is interesting to reflect on how such future humanmachine interactions may look like. 
When we consider other products that have been created throughout history, it is sometimes striking to see that some of them have been inspired by things that can be observed in our environment, yet at the same time do not have to be exact copies of those phenomena. For instance, an airplane has wings just as birds do, yet the wings of an airplane do not make the typical movements a bird would produce to fly. Moreover, an airplane has wheels, whereas a bird has legs. At the same time, the airplane has made it possible for humans to cover long distances in a fast and smooth manner in a way that was unthinkable before it was invented. The example of the airplane shows how new technologies can have "unnatural" properties, but can nonetheless be very beneficial and impactful for human beings. This dissertation centers on this practical question of how virtual humans can be programmed to act more human-like. The four studies presented in this dissertation all share the same underlying question of how parts of human behavior can be captured, such that computers can use them to become more human-like. Each study differs in method, perspective and specific questions, but they all aim to gain insights and directions that help further the development of human-like behavior in computers and to investigate (the simulation of) human conversational behavior. The rest of this introductory chapter gives a general overview of virtual humans (also known as embodied conversational agents), their potential uses and the engineering challenges involved, followed by an overview of the four studies.

    Creating Comparable Multimodal Corpora for Nordic Languages

    Proceedings of the 18th Nordic Conference of Computational Linguistics NODALIDA 2011. Editors: Bolette Sandford Pedersen, Gunta Nešpore and Inguna Skadiņa. NEALT Proceedings Series, Vol. 11 (2011), 153-160. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/16955

    A multi-modal corpus approach to the analysis of backchanneling behaviour

    Current methodologies in corpus linguistics have revolutionised the way we look at language. They allow us to make objective observations about written and spoken language in use. However, most corpora are limited in scope because they are unable to capture language and communication beyond the word. This is problematic given that interaction is in fact multi-modal, as meaning is constructed through the interplay of text, gesture and prosody; a combination of verbal and non-verbal characteristics. This thesis outlines, then utilises, a multi-modal approach to corpus linguistics, and examines how it can be used to facilitate our explorations of backchanneling phenomena in conversation, such as gestural and verbal signals of active listenership. While backchannels have been seen as highly conventionalised, they differ considerably in form, function, interlocutor and location (in context and co-text); their relevance at any given time in a given conversation is therefore highly conditional. The thesis provides an in-depth investigation of the use of, and the relationship between, spoken and non-verbal forms of this behaviour, focusing on a particular sub-set of gestural forms: head nods. This investigation is undertaken by analysing the patterned use of specific forms and functions of backchannels within and across sentence boundaries, as evidenced in a five-hour sub-corpus of dyadic multi-modal conversational episodes taken from the Nottingham Multi-Modal Corpus (NMMC). The results from this investigation reveal 22 key findings regarding the collaborative and cooperative nature of backchannels, which both support and extend what is already known about such behaviours. Using these findings, the thesis presents an adapted pragmatic-functional linguistic coding matrix for the classification and examination of backchanneling phenomena. This fuses the different, dynamic properties of spoken and non-verbal forms of this behaviour into a single, integrated conceptual model, in order to provide the foundations, a theoretical point of entry, for future research of this nature.
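
The sketch below illustrates the basic alignment step that a multi-modal corpus approach of this kind depends on: checking which verbal backchannels temporally overlap a head-nod annotation. The tiers, timestamps and overlap rule are invented assumptions, not the NMMC coding scheme.

        # Hypothetical sketch: which verbal backchannels overlap a head-nod annotation?
        from dataclasses import dataclass

        @dataclass
        class Annotation:
            start: float  # seconds
            end: float    # seconds
            label: str

        # Invented tiers; real data would come from time-aligned corpus annotations.
        verbal_backchannels = [
            Annotation(3.2, 3.5, "mm"),
            Annotation(9.8, 10.1, "yeah"),
            Annotation(15.0, 15.2, "right"),
        ]
        head_nods = [
            Annotation(3.1, 3.9, "nod"),
            Annotation(14.9, 15.6, "nod"),
        ]

        def overlaps(a, b):
            """True if the two annotations share any stretch of time."""
            return a.start < b.end and b.start < a.end

        multimodal = [bc for bc in verbal_backchannels
                      if any(overlaps(bc, nod) for nod in head_nods)]
        verbal_only = [bc for bc in verbal_backchannels if bc not in multimodal]

        print("verbal + nod:", [bc.label for bc in multimodal])
        print("verbal only: ", [bc.label for bc in verbal_only])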

    Computer-aided investigation of interaction mediated by an AR-enabled wearable interface

    Dierker A. Computer-aided investigation of interaction mediated by an AR-enabled wearable interface. Bielefeld: Universitätsbibliothek Bielefeld; 2012. This thesis provides an approach to facilitating the analysis of nonverbal behaviour during human-human interaction, thereby alleviating much of the work that researchers do, from experiment control and data acquisition through tagging to the final analysis of the data. For this, software and hardware techniques are used, such as sensor technology, machine learning, object tracking, data processing, visualisation and Augmented Reality. These are combined into an Augmented-Reality-enabled Interception Interface (ARbInI), a modular wearable interface for two users. The interface mediates the users’ interaction, thereby intercepting and influencing it. The ARbInI interface consists of two identical setups of sensors and displays, which are mutually coupled. Combining cameras and microphones with sensors, the system offers an efficient way to record rich multimodal interaction cues. The recorded data can be analysed online and offline for interaction features (e.g. head gestures in head movements, objects in joint attention, speech times) using integrated machine-learning approaches, and the classified features can be tagged in the data. For a detailed analysis, the recorded multimodal data is transferred automatically into file bundles loadable in a standard annotation tool, where the data can be further tagged by hand. For statistical analyses of the complete multimodal corpus, a toolbox for use in a standard statistics program allows the corpus to be imported directly and the analysis of multimodal and complex relationships between arbitrary data types to be automated. When the optional multimodal Augmented Reality techniques integrated into ARbInI are used, the camera records exactly what the participant can see, nothing more and nothing less. This offers additional advantages during the experiment: (a) the experiment can be controlled using the auditory or visual displays, thereby ensuring controlled experimental conditions; (b) the experiment can be disturbed, making it possible to investigate how problems in interaction are discovered and solved; and (c) the experiment can be enhanced by interactively incorporating the behaviour of the user, making it possible to investigate how users cope with novel interaction channels. This thesis introduces criteria for the design of scenarios in which interaction analysis can benefit from the experimentation interface and presents a set of such scenarios. These scenarios are applied in several empirical studies, thereby collecting multimodal corpora that particularly include head gestures. The capabilities of computer-aided interaction analysis for the investigation of speech, visual attention and head movements are illustrated on these empirical data. The effects of the head-mounted display (HMD) are evaluated thoroughly in two studies. The results show that HMD users need more head movements to achieve the same shift of gaze direction and perform fewer head gestures, with slower velocity and fewer repetitions, compared to non-HMD users. From this, a reduced willingness to perform head movements unless necessary can be concluded. Moreover, compensation strategies are established, such as leaning backwards to enlarge the field of view, and increasing the number of utterances or changing the reference to objects to compensate for the absence of mutual eye contact.
Two studies investigate the interaction while actively inducing misunderstandings. Here, the participants use compensation strategies such as multiple verification questions and arbitrary gaze movements. Additionally, an enhancement method that highlights the visual attention of the interaction partner is evaluated in a search task. The results show a significantly shorter reaction time and fewer errors.
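
As a hedged illustration of the head-gesture feature extraction described above, the sketch below flags nod candidates in a head-pitch time series using a simple threshold heuristic; ARbInI itself relies on integrated machine-learning classifiers, and the signal, sampling rate and thresholds here are invented.

        # Hypothetical, much-simplified nod detector over a head-pitch time series.
        # The real system uses trained classifiers; the signal and thresholds are invented.
        import math

        SAMPLE_RATE_HZ = 50
        NOD_THRESHOLD_DEG = 8.0  # assumed minimum downward pitch excursion for a nod

        def detect_nods(pitch_deg, threshold=NOD_THRESHOLD_DEG):
            """Return (start, end) sample indices where pitch dips below -threshold and recovers."""
            nods, start = [], None
            for i, p in enumerate(pitch_deg):
                if start is None and p < -threshold:
                    start = i                    # head has dipped: possible nod onset
                elif start is not None and p > -threshold / 2:
                    nods.append((start, i))      # head came back up: close the span
                    start = None
            return nods

        # Synthetic signal: neutral pose with two brief downward dips of the head.
        pitch = [0.0] * 200
        for centre in (60, 140):
            for i in range(centre - 10, centre + 10):
                pitch[i] = -12.0 * math.cos((i - centre) / 10 * math.pi / 2)

        for s, e in detect_nods(pitch):
            print(f"nod candidate from {s / SAMPLE_RATE_HZ:.2f}s to {e / SAMPLE_RATE_HZ:.2f}s")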

    Moving together: the organisation of non-verbal cues during multiparty conversation

    Conversation is a collaborative activity. In face-to-face interactions interlocutors have mutual access to a shared space. This thesis aims to explore the shared space as a resource for coordinating conversation. As is well demonstrated in studies of two-person conversations, interlocutors can coordinate their speech and non-verbal behaviour in ways that manage the unfolding conversation. However, when scaling up from two people to three people interacting, the coordination challenges that the interlocutors face increase. In particular, speakers must manage multiple listeners. This thesis examines how interlocutors use their bodies in shared space to coordinate their multiparty dialogue. The approach exploits corpora of motion-captured triadic interactions. The thesis first explores how interlocutors coordinate their speech and non-verbal behaviour. Inter-person relationships are examined and compared with artificially created triples who did not interact. Results demonstrate that interlocutors avoid speaking and gesturing over each other, but tend to nod together. Evidence is presented that the two recipients of an utterance have different patterns of head and hand movement, and that some of the regularities of movement are correlated with the task structure. The empirical section concludes by uncovering a class of coordination events, termed simultaneous engagement events, that are unique to multiparty dialogue. They are constructed using combinations of speaker head orientation and gesture orientation. These events coordinate multiple recipients of the dialogue and potentially arise as a result of the greater coordination challenges that interlocutors face. They are notable in requiring a mutually accessible shared space in order to function as an effective interactional cue. The thesis provides quantitative evidence that interlocutors’ head and hand movements are organised by their dialogue state and the task responsibilities that they bear. It is argued that a shared interaction space becomes a more important interactional resource when conversations scale up to three people.
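
The sketch below illustrates, with invented positions and an assumed angular tolerance, the geometric intuition behind a simultaneous engagement event: the speaker's head is oriented towards one recipient while the gesture is oriented towards the other. It is not the thesis's operational definition.

        # Hypothetical sketch of the geometry behind a simultaneous engagement event:
        # head oriented towards one recipient, gesture oriented towards the other.
        import math

        TOLERANCE_DEG = 30.0  # assumed angular tolerance for "oriented towards"

        def direction(frm, to):
            return (to[0] - frm[0], to[1] - frm[1])

        def angle_between(v1, v2):
            """Angle in degrees between two 2D direction vectors."""
            dot = v1[0] * v2[0] + v1[1] * v2[1]
            norm = math.hypot(*v1) * math.hypot(*v2)
            return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

        # Invented floor positions (metres) for a motion-captured triad.
        speaker, recipient_a, recipient_b = (0.0, 0.0), (1.5, 1.0), (1.5, -1.0)
        head_dir = (1.5, 1.0)      # head turned roughly towards recipient A
        gesture_dir = (1.4, -1.1)  # gesture oriented roughly towards recipient B

        head_on_a = angle_between(head_dir, direction(speaker, recipient_a)) < TOLERANCE_DEG
        gesture_on_b = angle_between(gesture_dir, direction(speaker, recipient_b)) < TOLERANCE_DEG

        if head_on_a and gesture_on_b:
            print("simultaneous engagement: head addresses A while the gesture addresses B")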

    English as a Lingua Franca and Intercultural Communication: Communication Strategies and Meaning Negotiation in ELF Transcultural Contexts

    In ELF contexts, neither linguistic nor cultural practices can be taken for granted; they need to be jointly negotiated by interactants to create a shared frame of reference. Therefore, in this dissertation I suggest the expression ‘ELF Transcultural Communication’ to highlight the necessary link between ELF research and Intercultural Communication studies, and I propose ‘ELF Transcultural Competence’ as a new model of reference for the skills that are necessary to effectively and appropriately achieve the speaker’s communicative goal(s) in ELF transcultural contexts. From this perspective, the ability to negotiate mutual understanding and to strategically manage the interaction is fundamental. Hence, the study aims at investigating how communication strategies are used in ELF Transcultural Communication in the meaning-making process and in the negotiation of cultural concepts, and at exploring how their use can be included in an ELF-aware pedagogy. First, an overview of research on ELF and on Intercultural Communication is provided, discussing the concepts of language and culture as complex systems that emerge in interaction. Subsequently, traditional conceptualisations of Communicative Competence and Intercultural Communicative Competence are called into question, noting their unsuitability for ELF transcultural contexts. In turn, the framework of ELF Transcultural Competence, based on the concepts of ELF Competence and Intercultural Awareness, is discussed as a more appropriate model for these contexts. The use of communication strategies in ELF Transcultural Communication is then outlined, discussing how meaning and understanding are negotiated and co-constructed in interaction and the relevance of communication strategies in these processes. The communication strategies analysed in the data have been selected from the ELF literature on the topic: backchannels, lexical anticipations, lexical suggestions and corrections, multilingual resources, reformulations, repetitions, and spellings. The data set of the dissertation is based on two ELF corpora, the VOICE-Leisure sub-corpus and the ViMELF corpus, and has been analysed through a mixed-method approach that combines Conversation Analysis and descriptive statistics. The findings confirm what has been observed in ELF studies on the topic and show that communication strategies are productive tools to actively co-construct mutual understanding and to negotiate meaning in interaction, playing a fundamental role in ELF Transcultural Communication. In addition, the strategic moves examined frequently co-occur, with several functions performed at once, showcasing how meaning can be negotiated in different ways and how strategic communication is a fundamental aspect to consider when investigating ELF interactions. Finally, the pedagogical implications are discussed: the inclusion of communication strategies aimed at strategically managing interaction in an ELF-aware pedagogy is introduced and illustrated through some practical activities.
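
As a small illustration of the descriptive-statistics side of the mixed-method analysis, the sketch below counts how often each communication strategy is tagged and how often strategies co-occur within a turn; the annotations are invented and not drawn from the dissertation's corpora.

        # Hypothetical sketch: strategy frequencies and within-turn co-occurrences.
        from collections import Counter
        from itertools import combinations

        # Invented turn-level annotations; not drawn from VOICE-Leisure or ViMELF.
        turns = [
            {"repetition", "backchannel"},
            {"reformulation"},
            {"repetition", "lexical suggestion"},
            {"backchannel"},
            {"repetition", "backchannel", "reformulation"},
        ]

        frequency = Counter(tag for turn in turns for tag in turn)
        co_occurrence = Counter(pair for turn in turns for pair in combinations(sorted(turn), 2))

        print("strategy frequencies:", dict(frequency))
        print("co-occurring pairs:  ", dict(co_occurrence))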