
    Spotting Agreement and Disagreement: A Survey of Nonverbal Audiovisual Cues and Tools

    While detecting and interpreting temporal patterns of non-verbal behavioural cues in a given context is a natural and often unconscious process for humans, it remains a rather difficult task for computer systems. Nevertheless, it is an important one to achieve if the goal is to realise naturalistic communication between humans and machines. Machines that are able to sense social attitudes like agreement and disagreement, and respond to them in a meaningful way, are likely to be welcomed by users due to the more natural, efficient and human-centered interaction they are bound to experience. This paper surveys the nonverbal cues that could be present during displays of agreement and disagreement, and lists a number of tools that could be useful in detecting them, as well as a few publicly available databases that could be used to train these tools for the analysis of spontaneous, audiovisual instances of agreement and disagreement.

    French Face-to-Face Interaction: Repetition as a Multimodal Resource

    In this chapter, after presenting the corpus as well as some of the annotations developed in the OTIM project, we then focus on the specific phenomenon of repetition. After briefly discussing this notion, we show that different degrees of convergence can be achieved by speakers depending on the multimodal complexity of the repetition and on the timing between the repeated element and the model. Although we focus more specifically on the gestural level, we present a multimodal analysis of gestural repetitions in which we met several issues linked to multimodal annotations of any type. This gives an overview of crucial issues in cross-level linguistic annotation, such as the definition of a phenomenon including formal and/or functional categorization.

    Prosody and Kinesics Based Co-analysis Towards Continuous Gesture Recognition

    The aim of this study is to develop a multimodal co-analysis framework for continuous gesture recognition by exploiting the prosodic and kinesic manifestations of natural communication. Using this framework, a co-analysis pattern between correlating components is obtained. The co-analysis pattern is clustered using K-means clustering to determine how well the pattern distinguishes the gestures. Features that differentiate the proposed approach from other models are its lower susceptibility to idiosyncrasies, its scalability, and its simplicity. The experiment was performed on the Multimodal Annotated Gesture Corpus (MAGEC), which we created for the research community studying non-verbal communication, particularly gestures.
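
    The clustering step lends itself to a compact illustration. Below is a minimal Python sketch, assuming each gesture instance has already been summarized as a fixed-length vector of prosodic/kinesic co-analysis features; the feature layout, cluster count, and random data are assumptions for illustration, not details from the paper.

        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        # Hypothetical stand-in for MAGEC-derived co-analysis features,
        # e.g., pitch slope, energy peak, hand velocity, stroke duration.
        co_analysis_patterns = rng.normal(size=(200, 4))

        # Cluster the patterns; k would be chosen to match the gesture inventory.
        kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
        labels = kmeans.fit_predict(co_analysis_patterns)

        # Cluster sizes; how well the clusters separate gestures would then be
        # checked against the corpus annotations (not shown here).
        print(np.bincount(labels))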

    Factors of Emotion and Affect in Designing Interactive Virtual Characters

    The Arts: 1st Place (The Ohio State University Edward F. Hayes Graduate Research Forum). This paper presents a review of literature concerning factors of affective interactive virtual character design. Affect and its related concepts are defined, followed by a detailed account of work being conducted in relevant areas such as design, animation, and robotics. The intent of this review was to inform the author on overlapping concepts in fields related to affective design, in order to apply these concepts to interactive character development.

    Multimodal agents for cooperative interaction

    Embodied virtual agents offer the potential to interact with a computer in a more natural manner, similar to how we interact with other people. Reaching this potential requires multimodal interaction, including both speech and gesture. This project builds on earlier work at Colorado State University and Brandeis University on just such a multimodal system, referred to as Diana. I designed and developed a new software architecture to directly address some of the difficulties of the earlier system, particularly with regard to asynchronous communication, e.g., interrupting the agent after it has begun to act. Various other enhancements were made to the agent systems, including the model itself, as well as speech recognition, speech synthesis, motor control, and gaze control. Further refactoring and new code were developed to achieve software engineering goals that are not outwardly visible, but no less important: decoupling, testability, improved networking, and independence from a particular agent model. This work, combined with the effort of others in the lab, has produced a "version 2" Diana system that is well positioned to serve the lab's research needs in the future. In addition, in order to pursue new research opportunities related to developmental and intervention science, a "Faelyn Fox" agent was developed. This is a different model, with a simplified cognitive architecture, and a system for defining an experimental protocol (for example, a toy-sorting task) based on Unity's visual state machine editor. This version too lays a solid foundation for future research.
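
    The asynchronous-communication difficulty the author highlights, interrupting the agent after it has begun to act, maps naturally onto cooperative task cancellation. A minimal sketch of that pattern in Python's asyncio follows; the action name and timings are hypothetical, and the actual Diana architecture is not described at this level of detail.

        import asyncio

        async def perform_action(name: str) -> None:
            """Hypothetical long-running agent action (e.g., a pointing gesture)."""
            try:
                print(f"agent: starting {name}")
                await asyncio.sleep(5.0)  # stand-in for motor control taking time
                print(f"agent: finished {name}")
            except asyncio.CancelledError:
                print(f"agent: {name} interrupted, returning to idle")
                raise

        async def main() -> None:
            action = asyncio.create_task(perform_action("point-at-block"))
            await asyncio.sleep(1.0)  # the user speaks while the agent is acting
            action.cancel()           # the interruption arrives asynchronously
            try:
                await action
            except asyncio.CancelledError:
                pass

        asyncio.run(main())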

    Prosody-Based Adaptive Metaphoric Head and Arm Gestures Synthesis in Human Robot Interaction

    In human-human interaction, communication can be established through three modalities: verbal, non-verbal (i.e., gestures), and/or para-verbal (i.e., prosody). The linguistic literature shows that para-verbal and non-verbal cues are naturally aligned and synchronized; however, the natural mechanism of this synchronization is still unexplored. The difficulty encountered during the coordination between prosody and metaphoric head-arm gestures concerns the conveyed meaning, the way of performing gestures with respect to prosodic characteristics, their relative temporal arrangement, and their coordinated organization in the phrasal structure of the utterance. In this research, we focus on the mechanism of mapping between head-arm gestures and speech prosodic characteristics in order to generate robot behavior that adapts to the interacting human's emotional state. Prosody patterns and the motion curves of head-arm gestures are aligned separately into parallel Hidden Markov Models (HMM). The mapping between speech and head-arm gestures is based on Coupled Hidden Markov Models (CHMM), which can be seen as a multi-stream collection of HMMs characterizing the segmented prosody and head-arm gesture data. An audio-video database based on emotional states has been created for the validation of this study. The obtained results show the effectiveness of the proposed methodology.
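
    As a rough illustration of the parallel-alignment stage (not the full CHMM, whose coupling between the chains is the paper's contribution), one could fit a separate Gaussian HMM to each stream with hmmlearn; the frame-level feature layouts and state counts below are assumptions for illustration.

        import numpy as np
        from hmmlearn.hmm import GaussianHMM

        rng = np.random.default_rng(1)
        # Hypothetical frame-synchronous features: prosody (e.g., f0, energy)
        # and head-arm motion curves (e.g., joint velocities).
        prosody = rng.normal(size=(500, 2))
        gesture = rng.normal(size=(500, 3))

        # One HMM per stream, as in the separate-alignment step.
        prosody_hmm = GaussianHMM(n_components=4, covariance_type="diag", n_iter=50)
        gesture_hmm = GaussianHMM(n_components=4, covariance_type="diag", n_iter=50)
        prosody_hmm.fit(prosody)
        gesture_hmm.fit(gesture)

        # Frame-by-frame state pairs; a CHMM would couple the two chains so
        # that gesture states depend on prosody states (and vice versa).
        paired_states = np.stack(
            [prosody_hmm.predict(prosody), gesture_hmm.predict(gesture)], axis=1
        )
        print(paired_states[:10])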

    Social network extraction and analysis based on multimodal dyadic interaction

    Social interactions are a very important component in people's lives. Social network analysis has become a common technique used to model and quantify the properties of social interactions. In this paper, we propose an integrated framework to explore the characteristics of a social network extracted from multimodal dyadic interactions. For our study, we used a set of videos belonging to the New York Times' "Blogging Heads" opinion blog. The social network is represented as an oriented graph, whose directed links are determined by the Influence Model. The links' weights are a measure of the "influence" one person has over another. The states of the Influence Model encode audio/visual features automatically extracted from our videos using state-of-the-art algorithms. Our results are reported in terms of the accuracy of audio/visual data fusion for speaker segmentation and the centrality measures used to characterize the extracted social network.
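
    The graph construction and centrality analysis could be expressed with networkx as in this short sketch; the node names and influence weights are invented for illustration, and the Influence Model estimation itself is not reproduced here.

        import networkx as nx

        # Oriented graph whose directed, weighted links stand in for the
        # Influence Model's pairwise "influence" estimates (values made up).
        g = nx.DiGraph()
        g.add_weighted_edges_from([
            ("host", "guest", 0.7),   # host strongly influences guest
            ("guest", "host", 0.3),
            ("guest", "caller", 0.5),
        ])

        # Centrality measures of the kind used to characterize the network.
        print(nx.in_degree_centrality(g))
        print(nx.pagerank(g, weight="weight"))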