
    Spotting Agreement and Disagreement: A Survey of Nonverbal Audiovisual Cues and Tools

    While detecting and interpreting temporal patterns of non-verbal behavioural cues in a given context is a natural and often unconscious process for humans, it remains a rather difficult task for computer systems. Nevertheless, it is an important one to achieve if the goal is to realise naturalistic communication between humans and machines. Machines that are able to sense social attitudes like agreement and disagreement, and respond to them in a meaningful way, are likely to be welcomed by users due to the more natural, efficient and human-centered interaction they are bound to experience. This paper surveys the nonverbal cues that could be present during displays of agreement and disagreement, and lists a number of tools that could be useful in detecting them, as well as a few publicly available databases that could be used to train these tools for the analysis of spontaneous, audiovisual instances of agreement and disagreement.

    French Face-to-Face Interaction: Repetition as a Multimodal Resource

    In this chapter, after presenting the corpus as well as some of the annotations developed in the OTIM project, we then focus on the specific phenomenon of repetition. After briefly discussing this notion, we show that different degrees of convergence can be achieved by speakers depending on the multimodal complexity of the repetition and on the timing between the repeated element and the model. Although we focus more specifically on the gestural level, we present a multimodal analysis of gestural repetitions in which we met several issues linked to multimodal annotations of any type. This gives an overview of crucial issues in cross-level linguistic annotation, such as the definition of a phenomenon including formal and/or functional categorization.

    Prosody and Kinesics Based Co-analysis Towards Continuous Gesture Recognition

    The aim of this study is to develop a multimodal co-analysis framework for continuous gesture recognition by exploiting the prosodic and kinesic manifestations of natural communication. Using this framework, a co-analysis pattern between correlating components is obtained. The co-analysis pattern is clustered using K-means clustering to determine how well the pattern distinguishes the gestures. Features that differentiate the proposed approach from other models are its lower susceptibility to idiosyncrasies, its scalability, and its simplicity. The experiment was performed on the Multimodal Annotated Gesture Corpus (MAGEC), which we created for the research community studying non-verbal communication, particularly gestures.
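
    The clustering step lends itself to a compact illustration. Below is a minimal Python sketch, assuming each gesture instance has already been summarized as a fixed-length vector of prosodic/kinesic co-analysis features; the feature layout, cluster count, and random data are assumptions for illustration, not details from the paper.

        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        # Hypothetical stand-in for MAGEC-derived co-analysis features,
        # e.g., pitch slope, energy peak, hand velocity, stroke duration.
        co_analysis_patterns = rng.normal(size=(200, 4))

        # Cluster the patterns; k would be chosen to match the gesture inventory.
        kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
        labels = kmeans.fit_predict(co_analysis_patterns)

        # Cluster sizes; how well the clusters separate gestures would then be
        # checked against the corpus annotations (not shown here).
        print(np.bincount(labels))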

    Factors of Emotion and Affect in Designing Interactive Virtual Characters

    The Arts: 1st Place (The Ohio State University Edward F. Hayes Graduate Research Forum). This paper presents a review of literature concerning factors of affective interactive virtual character design. Affect and its related concepts are defined, followed by a detailed account of work being conducted in relevant areas such as design, animation, and robotics. The intent of this review was to inform the author on overlapping concepts in fields related to affective design, in order to apply these concepts to interactive character development.

    Multimodal agents for cooperative interaction

    Embodied virtual agents offer the potential to interact with a computer in a more natural manner, similar to how we interact with other people. Reaching this potential requires multimodal interaction, including both speech and gesture. This project builds on earlier work at Colorado State University and Brandeis University on just such a multimodal system, referred to as Diana. I designed and developed a new software architecture to directly address some of the difficulties of the earlier system, particularly with regard to asynchronous communication, e.g., interrupting the agent after it has begun to act. Various other enhancements were made to the agent systems, including the model itself, as well as speech recognition, speech synthesis, motor control, and gaze control. Further refactoring and new code were developed to achieve software engineering goals that are not outwardly visible, but no less important: decoupling, testability, improved networking, and independence from a particular agent model. This work, combined with the effort of others in the lab, has produced a "version 2" Diana system that is well positioned to serve the lab's research needs in the future. In addition, in order to pursue new research opportunities related to developmental and intervention science, a "Faelyn Fox" agent was developed. This is a different model, with a simplified cognitive architecture, and a system for defining an experimental protocol (for example, a toy-sorting task) based on Unity's visual state machine editor. This version too lays a solid foundation for future research.
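
    The asynchronous-communication difficulty the author highlights, interrupting the agent after it has begun to act, maps naturally onto cooperative task cancellation. A minimal sketch of that pattern in Python's asyncio follows; the action name and timings are hypothetical, and the actual Diana architecture is not described at this level of detail.

        import asyncio

        async def perform_action(name: str) -> None:
            """Hypothetical long-running agent action (e.g., a pointing gesture)."""
            try:
                print(f"agent: starting {name}")
                await asyncio.sleep(5.0)  # stand-in for motor control taking time
                print(f"agent: finished {name}")
            except asyncio.CancelledError:
                print(f"agent: {name} interrupted, returning to idle")
                raise

        async def main() -> None:
            action = asyncio.create_task(perform_action("point-at-block"))
            await asyncio.sleep(1.0)  # the user speaks while the agent is acting
            action.cancel()           # the interruption arrives asynchronously
            try:
                await action
            except asyncio.CancelledError:
                pass

        asyncio.run(main())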

    Prosody-Based Adaptive Metaphoric Head and Arm Gestures Synthesis in Human Robot Interaction

    In human-human interaction, communication can be established through three modalities: verbal, non-verbal (i.e., gestures), and/or para-verbal (i.e., prosody). The linguistic literature shows that para-verbal and non-verbal cues are naturally aligned and synchronized; however, the natural mechanism of this synchronization is still unexplored. The difficulty encountered during the coordination between prosody and metaphoric head-arm gestures concerns the conveyed meaning, the way of performing gestures with respect to prosodic characteristics, their relative temporal arrangement, and their coordinated organization in the phrasal structure of the utterance. In this research, we focus on the mechanism of mapping between head-arm gestures and speech prosodic characteristics in order to generate robot behavior that adapts to the interacting human's emotional state. Prosody patterns and the motion curves of head-arm gestures are aligned separately into parallel Hidden Markov Models (HMM). The mapping between speech and head-arm gestures is based on Coupled Hidden Markov Models (CHMM), which can be seen as a multi-stream collection of HMMs characterizing the segmented prosody and head-arm gesture data. An audio-video database based on emotional states has been created for the validation of this study. The obtained results show the effectiveness of the proposed methodology.
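
    As a rough illustration of the parallel-alignment stage (not the full CHMM, whose coupling between the chains is the paper's contribution), one could fit a separate Gaussian HMM to each stream with hmmlearn; the frame-level feature layouts and state counts below are assumptions for illustration.

        import numpy as np
        from hmmlearn.hmm import GaussianHMM

        rng = np.random.default_rng(1)
        # Hypothetical frame-synchronous features: prosody (e.g., f0, energy)
        # and head-arm motion curves (e.g., joint velocities).
        prosody = rng.normal(size=(500, 2))
        gesture = rng.normal(size=(500, 3))

        # One HMM per stream, as in the separate-alignment step.
        prosody_hmm = GaussianHMM(n_components=4, covariance_type="diag", n_iter=50)
        gesture_hmm = GaussianHMM(n_components=4, covariance_type="diag", n_iter=50)
        prosody_hmm.fit(prosody)
        gesture_hmm.fit(gesture)

        # Frame-by-frame state pairs; a CHMM would couple the two chains so
        # that gesture states depend on prosody states (and vice versa).
        paired_states = np.stack(
            [prosody_hmm.predict(prosody), gesture_hmm.predict(gesture)], axis=1
        )
        print(paired_states[:10])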

    Social network extraction and analysis based on multimodal dyadic interaction

    Social interactions are a very important component in people's lives. Social network analysis has become a common technique used to model and quantify the properties of social interactions. In this paper, we propose an integrated framework to explore the characteristics of a social network extracted from multimodal dyadic interactions. For our study, we used a set of videos belonging to the New York Times' "Blogging Heads" opinion blog. The social network is represented as an oriented graph, whose directed links are determined by the Influence Model. The links' weights are a measure of the "influence" one person has over another. The states of the Influence Model encode audio/visual features automatically extracted from our videos using state-of-the-art algorithms. Our results are reported in terms of the accuracy of audio/visual data fusion for speaker segmentation and the centrality measures used to characterize the extracted social network.
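
    The graph construction and centrality analysis could be expressed with networkx as in this short sketch; the node names and influence weights are invented for illustration, and the Influence Model estimation itself is not reproduced here.

        import networkx as nx

        # Oriented graph whose directed, weighted links stand in for the
        # Influence Model's pairwise "influence" estimates (values made up).
        g = nx.DiGraph()
        g.add_weighted_edges_from([
            ("host", "guest", 0.7),   # host strongly influences guest
            ("guest", "host", 0.3),
            ("guest", "caller", 0.5),
        ])

        # Centrality measures of the kind used to characterize the network.
        print(nx.in_degree_centrality(g))
        print(nx.pagerank(g, weight="weight"))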