
    Visual focus of attention estimation using eye center localization

    Estimating people's visual focus of attention (VFOA) plays a crucial role in practical systems such as human-robot interaction. Extracting the cue of a person's VFOA is challenging because gaze directionality is difficult to recognize. In this paper, we propose an improved integrodifferential approach that represents gaze by efficiently and accurately localizing the eye center in lower-resolution images. The proposed method exploits both the drastic intensity change between the iris and the sclera and the grayscale appearance of the eye center. An optimized number of kernels is convolved with the original eye-region image, and the eye center is located by searching for the maximum ratio derivative of neighboring curve magnitudes in the convolution image. Experimental results confirm that the algorithm outperforms state-of-the-art methods in computational cost, accuracy, and robustness to illumination changes.
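    Below is a minimal Python/NumPy sketch of the ring-based search the abstract outlines: pick the pixel where the mean intensity of neighboring circles jumps most sharply, which is where the dark iris meets the brighter sclera. The function names, radii, and exact ratio criterion are illustrative assumptions, not the authors' formulation, and the convolution with an optimized set of kernels is omitted for brevity.

```python
import numpy as np

def ring_mean(img, cy, cx, r, n=64):
    """Mean intensity along a circle of radius r centered at (cy, cx)."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    ys = np.clip((cy + r * np.sin(t)).astype(int), 0, img.shape[0] - 1)
    xs = np.clip((cx + r * np.cos(t)).astype(int), 0, img.shape[1] - 1)
    return img[ys, xs].mean()

def locate_eye_center(img, radii=range(3, 12)):
    """Return the pixel whose neighboring ring intensities change most
    sharply, i.e. where a dark iris meets the brighter sclera."""
    h, w = img.shape
    best, best_score = (h // 2, w // 2), -np.inf
    for cy in range(h // 4, 3 * h // 4):
        for cx in range(w // 4, 3 * w // 4):
            means = np.array([ring_mean(img, cy, cx, r) for r in radii])
            # Ratio-style derivative between adjacent rings; the iris/sclera
            # boundary yields a large jump. Epsilon avoids division by zero.
            score = np.max((means[1:] - means[:-1]) / (means[:-1] + 1e-6))
            if score > best_score:
                best_score, best = score, (cy, cx)
    return best
```

    The exhaustive scan over candidate centers is deliberately naive; it only illustrates the maximum-ratio criterion on a small grayscale eye-region crop.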

    Addressee detection for dialog systems using temporal and spectral dimensions of speaking style

    As dialog systems evolve to handle unconstrained input and to operate in open environments, addressee detection (detecting speech directed to the system versus to other people) becomes an increasingly important challenge. We study a corpus in which speakers talk both to a system and to each other, and we model two dimensions of speaking style that talkers modify when changing addressee: speech rhythm and vocal effort. For each dimension we design features that require no speech recognition output, session normalization, speaker normalization, or dialog context. Detection experiments show that rhythm and effort features are complementary, outperform lexical models based on recognized words, and reduce error rates even when word recognition is error-free. Simulated online-processing experiments show that all features need only the first couple of seconds of speech. Finally, we find that temporal and spectral stylistic models can be trained on outside corpora, such as ATIS and the ICSI meetings, with reasonable generalization to the target task, thus showing promise for domain-independent computer-versus-human addressee detectors.
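    As a hedged illustration of the two stylistic dimensions named here, the sketch below computes a crude rhythm cue (variability of the log-energy envelope) and a crude vocal-effort cue (spectral tilt) from a raw waveform. Neither is taken from the paper's actual feature set; all names, frame sizes, and frequency bands are assumptions.

```python
import numpy as np

def frame_energies(signal, sr, frame_ms=25, hop_ms=10):
    """Short-time energy over 25 ms frames with a 10 ms hop."""
    frame, hop = int(sr * frame_ms / 1000), int(sr * hop_ms / 1000)
    n = 1 + max(0, (len(signal) - frame) // hop)
    return np.array([np.sum(signal[i * hop:i * hop + frame] ** 2)
                     for i in range(n)])

def rhythm_feature(signal, sr):
    """Variability of the log-energy envelope: a crude speech-rhythm cue."""
    e = np.log(frame_energies(signal, sr) + 1e-10)
    return float(np.std(np.diff(e)))

def effort_feature(signal, sr):
    """Spectral tilt: more vocal effort flattens the spectrum, raising
    high-band energy relative to the low band."""
    spec = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), 1.0 / sr)
    low = spec[(freqs > 50) & (freqs < 1000)].mean()
    high = spec[(freqs >= 1000) & (freqs < 4000)].mean()
    return float(np.log((high + 1e-10) / (low + 1e-10)))
```

    Both functions operate on a NumPy waveform and need no recognizer output or dialog context, consistent with the design constraint the abstract states.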

    A corpus for studying addressing behaviour in multi-party dialogues

    This paper describes a multi-modal corpus of hand-annotated meeting dialogues that was designed for studying addressing behaviour in face-to-face conversations. The corpus contains annotated dialogue acts, addressees, adjacency pairs, and gaze direction. First, we describe the corpus design, presenting the meeting collection, the annotation scheme, and the annotation tools. Then, we present the analysis of the reproducibility and stability of the annotation scheme.

    Analyzing Group Interactions in Conversations: a Review

    Multiparty face-to-face conversations in professional and social settings represent an emerging research domain for which automatic activity-based analysis is relevant for scientific and practical reasons. The activity patterns emerging from groups engaged in conversations are intrinsically multimodal and thus constitute interesting target problems for multistream and multisensor fusion techniques. In this paper, a summarized review of the literature on automatic analysis of group activities in face-to-face conversational settings is presented. A basic categorization of group activities is proposed based on their typical temporal scale, and existing works are then discussed for various types of activities and trends, including addressing, turn taking, interest, and dominance.

    Saliency-based identification and recognition of pointed-at objects

    When persons interact, non-verbal cues are used to direct the attention of others toward objects of interest. Achieving joint attention this way is an important aspect of natural communication. Most importantly, it allows verbal descriptions to be coupled with the visual appearance of objects when the referred-to object is indicated non-verbally. In this contribution, we present a system that uses bottom-up saliency and pointing gestures to efficiently identify pointed-at objects. Furthermore, the system focuses visual attention by steering a pan-tilt-zoom camera toward the object of interest and thus provides a suitable model view for SIFT-based recognition and learning. We demonstrate the practical applicability of the proposed system through experimental evaluation in different environments with multiple pointers and objects.
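    A minimal sketch of the core idea, under assumptions not taken from the paper: given a bottom-up saliency map and a 2D pointing ray (origin and direction in image coordinates), walk along the ray and return the most salient location near it as the pointed-at object. The function name, step size, and corridor width are all illustrative.

```python
import numpy as np

def pointed_at_peak(saliency, origin, direction, step=2.0, max_steps=500):
    """Walk along the pointing ray and return the most salient (y, x)
    within a small corridor around the ray."""
    d = np.asarray(direction, float)
    d = d / np.linalg.norm(d)
    p = np.asarray(origin, float)
    h, w = saliency.shape
    best, best_sal = None, -np.inf
    for _ in range(max_steps):
        p = p + step * d
        y, x = int(round(p[0])), int(round(p[1]))
        if not (0 <= y < h and 0 <= x < w):
            break  # the ray has left the image
        # Inspect a small window so slightly-off rays still hit the object.
        win = saliency[max(0, y - 3):y + 4, max(0, x - 3):x + 4]
        if win.max() > best_sal:
            best_sal, best = float(win.max()), (y, x)
    return best  # location to steer the pan-tilt-zoom camera toward
```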

    A Review of Verbal and Non-Verbal Human-Robot Interactive Communication

    In this paper, an overview of human-robot interactive communication is presented, covering verbal as well as non-verbal aspects of human-robot interaction. Following a historical introduction and a motivation for fluid human-robot communication, ten desiderata are proposed that provide an organizational axis for both recent and future research on human-robot communication. The ten desiderata are then examined in detail, culminating in a unifying discussion and a forward-looking conclusion.

    Tracking the visual focus of attention for a varying number of wandering people

    In this article, we define and address the problem of finding the visual focus of attention for a varying number of wandering people (VFOA-W): determining where a person is looking when their movement is unconstrained. VFOA-W estimation is a new and important problem with implications for behavior understanding and cognitive science, as well as real-world applications. One such application, presented in this article, monitors the attention that passers-by pay to an outdoor advertisement using a single video camera. In our approach to the VFOA-W problem, we propose a multi-person tracking solution based on a dynamic Bayesian network that simultaneously infers the number of people in a scene, their body locations, their head locations, and their head pose. For efficient inference in the resulting variable-dimensional state space, we propose a reversible-jump Markov chain Monte Carlo (RJMCMC) sampling scheme, as well as a novel global observation model that determines the number of people in the scene and their locations. To determine whether a person is looking at the advertisement, we propose a Gaussian mixture model (GMM) and hidden Markov model (HMM) based VFOA-W model that uses head pose and location information. Our models are evaluated for tracking performance and for the ability to recognize people looking at an outdoor advertisement, with results indicating good performance on sequences where up to three people pass in front of the advertisement.
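    The sketch below illustrates the HMM side of such a VFOA-W model: two hidden states, looking versus not looking, with Gaussian emissions over head pan angle, decoded by Viterbi. All parameters and the single-Gaussian emissions are illustrative simplifications, not the authors' trained GMM/HMM, which also uses location information.

```python
import numpy as np

def gauss_logpdf(x, mu, sigma):
    """Log density of a univariate Gaussian."""
    return -0.5 * (np.log(2.0 * np.pi * sigma ** 2) + ((x - mu) / sigma) ** 2)

def viterbi_vfoa(pan_angles, mu=(0.0, 60.0), sigma=(15.0, 40.0), stay=0.9):
    """Decode looking(0) / not-looking(1) from head pan angles in degrees,
    where 0 degrees means facing the advertisement."""
    pan = np.abs(np.asarray(pan_angles, float))  # fold left/right together
    log_a = np.log(np.array([[stay, 1 - stay], [1 - stay, stay]]))
    T = len(pan)
    delta = np.zeros((T, 2))            # best log-probability per state
    psi = np.zeros((T, 2), dtype=int)   # backpointers
    for s in range(2):
        delta[0, s] = np.log(0.5) + gauss_logpdf(pan[0], mu[s], sigma[s])
    for t in range(1, T):
        for s in range(2):
            cand = delta[t - 1] + log_a[:, s]
            psi[t, s] = int(np.argmax(cand))
            delta[t, s] = cand[psi[t, s]] + gauss_logpdf(pan[t], mu[s], sigma[s])
    path = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]  # 0 wherever the person is judged to be looking
```

    The sticky transition matrix (stay = 0.9) smooths frame-level pose estimates, so brief glances away do not fragment a single looking interval.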