
Design and Experimental Evaluation of a Context-aware Social Gaze Control System for a Humanlike Robot

Author: ZARAKI ABOLFAZL
Publication venue: 'Pisa University Press'
Publication date: 03/03/2014

Nowadays, social robots are increasingly being developed for a variety of human-centered scenarios in which they interact with people. For this reason, they should possess the ability to perceive and interpret human non-verbal/verbal communicative cues in a humanlike way. In addition, they should be able to autonomously identify the most important interactional target at the proper time by exploring the perceptual information, and exhibit believable behavior accordingly. Employing a social robot with such capabilities has several positive outcomes for human society. This thesis presents a multilayer context-aware gaze control system that has been implemented as a part of a humanlike social robot. Using this system, the robot is able to mimic human perception, attention, and gaze behavior in a dynamic multiparty social interaction. The system enables the robot to direct its gaze appropriately, at the right time, to the environmental targets and humans who are interacting with each other and with the robot. For this reason, the attention mechanism of the gaze control system is based on features that have been proven to guide human attention: the verbal and non-verbal cues, proxemics, the effective field of view, the habituation effect, and the low-level visual features. The gaze control system uses skeleton tracking, speech recognition, facial expression recognition, and salience detection to implement the same features. As part of a pilot evaluation, the gaze behavior of 11 participants was collected with a professional eye-tracking device while they were watching a video of two-person interactions. By analyzing the average gaze behavior of participants, the importance of human-relevant features in triggering human attention was determined. Based on this finding, the parameters of the gaze control system were tuned in order to imitate human behavior in selecting features of the environment.
The comparison between the human gaze behavior and the gaze behavior of the developed system running on the same videos shows that the proposed approach is promising, as it replicated human gaze behavior 89% of the time.

Electronic Thesis and Dissertation Archive - Università di Pisa

Explorations in engagement for humans and robots

Author: Kidd Cory
Lee Christopher
Lesh Neal
Rich Charles
Sidner Candace L.
Publication date: 01/01/2005

This paper explores the concept of engagement, the process by which individuals in an interaction start, maintain and end their perceived connection to one another. The paper reports on one aspect of engagement among human interactors: the effect of tracking faces during an interaction. It also describes the architecture of a robot that can participate in conversational, collaborative interactions with engagement gestures. Finally, the paper reports on findings of experiments with human participants who interacted with a robot when it either performed or did not perform engagement gestures. Results of the human-robot studies indicate that people become engaged with robots: they direct their attention to the robot more often in interactions where engagement gestures are present, and they find interactions more appropriate when engagement gestures are present than when they are not. Comment: 31 pages, 5 figures, 3 table

arXiv.org e-Print Archive

CiteSeerX

Elsevier - Publisher Connector

Tracking Gaze and Visual Focus of Attention of People Involved in Social Interaction

Author: Ba Silèye
Horaud Radu
Massé Benoît
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/11/2017

The visual focus of attention (VFOA) has been recognized as a prominent conversational cue. We are interested in estimating and tracking the VFOAs associated with multi-party social interactions. We note that in this type of situation the participants either look at each other or at an object of interest; therefore their eyes are not always visible. Consequently, neither gaze nor VFOA estimation can be based on eye detection and tracking. We propose a method that exploits the correlation between eye gaze and head movements. Both VFOA and gaze are modeled as latent variables in a Bayesian switching state-space model. The proposed formulation leads to a tractable learning procedure and to an efficient algorithm that simultaneously tracks gaze and visual focus. The method is tested and benchmarked using two publicly available datasets that contain typical multi-party human-robot and human-human interactions. Comment: 15 pages, 8 figures, 6 table

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

A Review of Verbal and Non-Verbal Human-Robot Interactive Communication

Author: Mavridis Nikolaos
Publication date: 20/01/2014

In this paper, an overview of human-robot interactive communication is presented, covering verbal as well as non-verbal aspects of human-robot interaction. Following a historical introduction and a motivation towards fluid human-robot communication, ten desiderata are proposed, which provide an organizational axis for both recent and future research on human-robot communication. Then, the ten desiderata are examined in detail, culminating in a unifying discussion and a forward-looking conclusion.

arXiv.org e-Print Archive

Elsevier - Publisher Connector

Entity Recognition at First Sight: Improving NER with Eye Movement Information

Author: Hollenstein Nora
Zhang Ce
Publication date: 01/01/2019

Previous research shows that eye-tracking data contains information about the lexical and syntactic properties of text, which can be used to improve natural language processing models. In this work, we leverage eye movement features from three corpora with recorded gaze information to augment a state-of-the-art neural model for named entity recognition (NER) with gaze embeddings. These corpora were manually annotated with named entity labels. Moreover, we show how gaze features, generalized on word type level, eliminate the need for recorded eye-tracking data at test time. The gaze-augmented models for NER using token-level and type-level features outperform the baselines. We present the benefits of eye-tracking features by evaluating the NER models on both individual datasets as well as in cross-domain settings. Comment: Accepted at NAACL-HLT 201

arXiv.org e-Print Archive

Repository for Publications and Research Data

Pointing as an Instrumental Gesture: Gaze Representation Through Indication

Author: Cappuccio Massimiliano L.
Chu Mingyuan
Kita Sotaro
Publication date: 01/01/2013

The research of the first author was supported by a Fulbright Visiting Scholar Fellowship and developed in 2012 during a research visit at the University of Memphis. Peer reviewed. Publisher PD

Aberdeen University Research

UNSWorks

MPG.PuRe

Looking Beyond a Clever Narrative: Visual Context and Attention are Primary Drivers of Affect in Video Advertisements

Author: Kankanhalli Mohan
Katti Harish
Shukla Abhinav
Subramanian Ramanathan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 14/08/2018

Emotion evoked by an advertisement plays a key role in influencing brand recall and eventual consumer choices. Automatic ad affect recognition has several useful applications. However, the use of content-based feature representations does not give insights into how affect is modulated by aspects such as the ad scene setting, salient object attributes and their interactions. Neither do such approaches inform us on how humans prioritize visual information for ad understanding. Our work addresses these lacunae by decomposing video content into detected objects, coarse scene structure, object statistics and actively attended objects identified via eye-gaze. We measure the importance of each of these information channels by systematically incorporating related information into ad affect prediction models. Contrary to the popular notion that ad affect hinges on the narrative and the clever use of linguistic and social cues, we find that actively attended objects and the coarse scene structure better encode affective information as compared to individual scene objects or conspicuous background elements. Comment: Accepted for publication in the Proceedings of 20th ACM International Conference on Multimodal Interaction, Boulder, CO, US

arXiv.org e-Print Archive

University of Canberra Research Repository

Open Access Repository of IISc Research Publications

A comparison of addressee detection methods for multiparty conversations

Author: Akker Rieks op den
Traum David
Publication venue: KTH Stockholm
Publication date: 01/01/2009

Several algorithms have recently been proposed for recognizing addressees in a group conversational setting. These algorithms can rely on a variety of factors including previous conversational roles, gaze and type of dialogue act. Both statistical supervised machine learning algorithms and rule-based methods have been developed. In this paper, we compare several algorithms developed for several different genres of multiparty dialogue, and propose a new synthesis algorithm that matches the performance of machine learning algorithms while maintaining the transparency of semantically meaningful rule-based algorithms.

CiteSeerX

University of Twente Research Information

Speech-Gesture Mapping and Engagement Evaluation in Human Robot Interaction

Author: Dhall Abhinav
Ghosh Bishal
Singla Ekta
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 09/12/2018

A robot needs contextual awareness, effective speech production and complementary non-verbal gestures for successful communication in society. In this paper, we present our end-to-end system that tries to enhance the effectiveness of non-verbal gestures. To achieve this, we identified prominently used gestures in performances by TED speakers, mapped them to their corresponding speech context, and modulated speech based upon the attention of the listener. The proposed method utilized Convolutional Pose Machine [4] to detect the human gesture. Dominant gestures of TED speakers were used for learning the gesture-to-speech mapping. Their speeches were used for training the model. We also evaluated the engagement of the robot with people by conducting a social survey. The effectiveness of the performance was monitored by the robot, and it self-improvised its speech pattern on the basis of the attention level of the audience, which was calculated using visual feedback from the camera. The effectiveness of interaction as well as the decisions made during improvisation were further evaluated based on the head-pose detection and interaction survey. Comment: 8 pages, 9 figures, Under review in IRC 201

arXiv.org e-Print Archive

Crossref