    Design and Experimental Evaluation of a Context-aware Social Gaze Control System for a Humanlike Robot

    Nowadays, social robots are increasingly being developed for a variety of human-centered scenarios in which they interact with people. For this reason, they should possess the ability to perceive and interpret human verbal and non-verbal communicative cues in a humanlike way. In addition, they should be able to autonomously identify the most important interactional target at the proper time by exploring the perceptual information, and exhibit believable behavior accordingly. Employing a social robot with such capabilities has several positive outcomes for human society. This thesis presents a multilayer context-aware gaze control system that has been implemented as part of a humanlike social robot. Using this system, the robot is able to mimic human perception, attention, and gaze behavior in a dynamic multiparty social interaction. The system enables the robot to direct its gaze appropriately and at the right time to the environmental targets and humans who are interacting with each other and with the robot. For this reason, the attention mechanism of the gaze control system is based on features that have been proven to guide human attention: verbal and non-verbal cues, proxemics, the effective field of view, the habituation effect, and low-level visual features. The gaze control system uses skeleton tracking, speech recognition, facial expression recognition, and salience detection to implement the same features. As part of a pilot evaluation, the gaze behavior of 11 participants was collected with a professional eye-tracking device while they were watching a video of two-person interactions. Analysis of the participants' average gaze behavior determined the importance of human-relevant features in triggering human attention. Based on this finding, the parameters of the gaze control system were tuned to imitate human behavior in selecting features of the environment. The comparison between the human gaze behavior and the gaze behavior of the developed system running on the same videos shows that the proposed approach is promising, as it replicated human gaze behavior 89% of the time.

    Explorations in engagement for humans and robots

    This paper explores the concept of engagement, the process by which individuals in an interaction start, maintain and end their perceived connection to one another. The paper reports on one aspect of engagement among human interactors: the effect of tracking faces during an interaction. It also describes the architecture of a robot that can participate in conversational, collaborative interactions with engagement gestures. Finally, the paper reports on findings of experiments with human participants who interacted with a robot when it either performed or did not perform engagement gestures. Results of the human-robot studies indicate that people become engaged with robots: they direct their attention to the robot more often in interactions where engagement gestures are present, and they find interactions more appropriate when engagement gestures are present than when they are not. Comment: 31 pages, 5 figures, 3 tables

    Tracking Gaze and Visual Focus of Attention of People Involved in Social Interaction

    The visual focus of attention (VFOA) has been recognized as a prominent conversational cue. We are interested in estimating and tracking the VFOAs associated with multi-party social interactions. We note that in this type of situation the participants look either at each other or at an object of interest; therefore their eyes are not always visible. Consequently, neither gaze nor VFOA estimation can be based on eye detection and tracking. We propose a method that exploits the correlation between eye gaze and head movements. Both VFOA and gaze are modeled as latent variables in a Bayesian switching state-space model. The proposed formulation leads to a tractable learning procedure and to an efficient algorithm that simultaneously tracks gaze and visual focus. The method is tested and benchmarked using two publicly available datasets that contain typical multi-party human-robot and human-human interactions. Comment: 15 pages, 8 figures, 6 tables

    A Review of Verbal and Non-Verbal Human-Robot Interactive Communication

    In this paper, an overview of human-robot interactive communication is presented, covering verbal as well as non-verbal aspects of human-robot interaction. Following a historical introduction and a motivation towards fluid human-robot communication, ten desiderata are proposed, which provide an organizational axis for both recent and future research on human-robot communication. The ten desiderata are then examined in detail, culminating in a unifying discussion and a forward-looking conclusion.

    Entity Recognition at First Sight: Improving NER with Eye Movement Information

    Previous research shows that eye-tracking data contains information about the lexical and syntactic properties of text, which can be used to improve natural language processing models. In this work, we leverage eye movement features from three corpora with recorded gaze information to augment a state-of-the-art neural model for named entity recognition (NER) with gaze embeddings. These corpora were manually annotated with named entity labels. Moreover, we show how gaze features, generalized to the word-type level, eliminate the need for recorded eye-tracking data at test time. The gaze-augmented models for NER using token-level and type-level features outperform the baselines. We present the benefits of eye-tracking features by evaluating the NER models on individual datasets as well as in cross-domain settings. Comment: Accepted at NAACL-HLT 201

    Pointing as an Instrumental Gesture : Gaze Representation Through Indication

    The research of the first author was supported by a Fulbright Visiting Scholar Fellowship and developed in 2012 during a research visit at the University of Memphis. Peer reviewed. Publisher PD

    Looking Beyond a Clever Narrative: Visual Context and Attention are Primary Drivers of Affect in Video Advertisements

    Emotion evoked by an advertisement plays a key role in influencing brand recall and eventual consumer choices. Automatic ad affect recognition has several useful applications. However, the use of content-based feature representations does not give insights into how affect is modulated by aspects such as the ad scene setting, salient object attributes and their interactions. Nor do such approaches inform us about how humans prioritize visual information for ad understanding. Our work addresses these lacunae by decomposing video content into detected objects, coarse scene structure, object statistics and actively attended objects identified via eye-gaze. We measure the importance of each of these information channels by systematically incorporating related information into ad affect prediction models. Contrary to the popular notion that ad affect hinges on the narrative and the clever use of linguistic and social cues, we find that actively attended objects and the coarse scene structure encode affective information better than individual scene objects or conspicuous background elements. Comment: Accepted for publication in the Proceedings of 20th ACM International Conference on Multimodal Interaction, Boulder, CO, US

    A comparison of addressee detection methods for multiparty conversations

    Several algorithms have recently been proposed for recognizing addressees in a group conversational setting. These algorithms can rely on a variety of factors, including previous conversational roles, gaze and type of dialogue act. Both statistical supervised machine learning algorithms and rule-based methods have been developed. In this paper, we compare several algorithms developed for several different genres of multiparty dialogue, and propose a new synthesis algorithm that matches the performance of machine learning algorithms while maintaining the transparency of semantically meaningful rule-based algorithms.

    Speech-Gesture Mapping and Engagement Evaluation in Human Robot Interaction

    A robot needs contextual awareness, effective speech production and complementary non-verbal gestures for successful communication in society. In this paper, we present our end-to-end system that tries to enhance the effectiveness of non-verbal gestures. To achieve this, we identified prominently used gestures in performances by TED speakers, mapped them to their corresponding speech context, and modulated speech based upon the attention of the listener. The proposed method used Convolutional Pose Machine [4] to detect the human gesture. Dominant gestures of TED speakers were used for learning the gesture-to-speech mapping. Their speeches were used for training the model. We also evaluated the engagement of the robot with people by conducting a social survey. The effectiveness of the performance was monitored by the robot, which improvised its speech pattern on the basis of the attention level of the audience, calculated using visual feedback from the camera. The effectiveness of interaction as well as the decisions made during improvisation were further evaluated based on head-pose detection and an interaction survey. Comment: 8 pages, 9 figures, Under review in IRC 201