7,666 research outputs found
Tracking Gaze and Visual Focus of Attention of People Involved in Social Interaction
The visual focus of attention (VFOA) has been recognized as a prominent
conversational cue. We are interested in estimating and tracking the VFOAs
associated with multi-party social interactions. We note that in this type of
situations the participants either look at each other or at an object of
interest; therefore their eyes are not always visible. Consequently both gaze
and VFOA estimation cannot be based on eye detection and tracking. We propose a
method that exploits the correlation between eye gaze and head movements. Both
VFOA and gaze are modeled as latent variables in a Bayesian switching
state-space model. The proposed formulation leads to a tractable learning
procedure and to an efficient algorithm that simultaneously tracks gaze and
visual focus. The method is tested and benchmarked using two publicly available
datasets that contain typical multi-party human-robot and human-human
interactions.Comment: 15 pages, 8 figures, 6 table
Recommended from our members
Information acquisition using eye-gaze tracking for person-following with mobile robots
In the effort of developing natural means for human-robot interaction (HRI), signifcant amount of research has been focusing on Person-Following (PF) for mobile robots. PF, which generally consists of detecting, recognizing and following people, is believed to be one of the required functionalities for most future robots that share their environments with their human companions. Research in this field is mostly directed towards fully automating this functionality, which makes the challenge even more tedious. Focusing on this challenge leads research to divert from other challenges that coexist in any PF system. A natural PF functionality consists of a number of tasks that are required to be implemented in the system. However, in more realistic life scenarios, not all the tasks required for PF need to be automated. Instead, some of these tasks can be operated by human operators and therefore require natural means of interaction and information acquisition. In order to highlight all the tasks that are believed to exist in any PF system, this paper introduces a novel taxonomy for PF. Also, in order to provide a natural means for HRI, TeleGaze is used for information acquisition in the implementation of the taxonomy. TeleGaze was previously developed by the authors as a means of natural HRI for teleoperation through eye-gaze tracking. Using TeleGaze in the aid of developing PF systems is believed to show the feasibility of achieving a realistic information acquisition in a natural way
Towards Active Event Recognition
Directing robot attention to recognise activities and to anticipate events like goal-directed actions is a crucial skill for human-robot interaction. Unfortunately, issues like intrinsic time constraints, the spatially distributed nature of the entailed information sources, and the existence of a multitude of unobservable states affecting the system, like latent intentions, have long rendered achievement of such skills a rather elusive goal. The problem tests the limits of current attention control systems. It requires an integrated solution for tracking, exploration and recognition, which traditionally have been seen as separate problems in active vision.We propose a probabilistic generative framework based on a mixture of Kalman filters and information gain maximisation that uses predictions in both recognition and attention-control. This framework can efficiently use the observations of one element in a dynamic environment to provide information on other elements, and consequently enables guided exploration.Interestingly, the sensors-control policy, directly derived from first principles, represents the intuitive trade-off between finding the most discriminative clues and maintaining overall awareness.Experiments on a simulated humanoid robot observing a human executing goal-oriented actions demonstrated improvement on recognition time and precision over baseline systems
Gaze-based teleprosthetic enables intuitive continuous control of complex robot arm use: Writing & drawing
Eye tracking is a powerful mean for assistive technologies for people with movement disorders, paralysis and amputees. We present a highly intuitive eye tracking-controlled robot arm operating in 3-dimensional space based on the user's gaze target point that enables tele-writing and drawing. The usability and intuitive usage was assessed by a “tele” writing experiment with 8 subjects that learned to operate the system within minutes of first time use. These subjects were naive to the system and the task and had to write three letters on a white board with a white board pen attached to the robot arm's endpoint. The instructions are to imagine they were writing text with the pen and look where the pen would be going, they had to write the letters as fast and as accurate as possible, given a letter size template. Subjects were able to perform the task with facility and accuracy, and movements of the arm did not interfere with subjects ability to control their visual attention so as to enable smooth writing. On the basis of five consecutive trials there was a significant decrease in the total time used and the total number of commands sent to move the robot arm from the first to the second trial but no further improvement thereafter, suggesting that within writing 6 letters subjects had mastered the ability to control the system. Our work demonstrates that eye tracking is a powerful means to control robot arms in closed-loop and real-time, outperforming other invasive and non-invasive approaches to Brain-Machine-Interfaces in terms of calibration time (<;2 minutes), training time (<;10 minutes), interface technology costs. We suggests that gaze-based decoding of action intention may well become one of the most efficient ways to interface with robotic actuators - i.e. Brain-Robot-Interfaces - and become useful beyond paralysed and amputee users also for the general teleoperation of robotic and exoskeleton in human augmentation
GazeDrone: Mobile Eye-Based Interaction in Public Space Without Augmenting the User
Gaze interaction holds a lot of promise for seamless human-computer interaction. At the same time, current wearable mobile eye trackers require user augmentation that negatively impacts natural user behavior while remote trackers require users to position themselves within a confined tracking range. We present GazeDrone, the first system that combines a camera-equipped aerial drone with a computational method to detect sidelong glances for spontaneous (calibration-free) gaze-based interaction with surrounding pervasive systems (e.g., public displays). GazeDrone does not require augmenting each user with on-body sensors and allows interaction from arbitrary positions, even while moving. We demonstrate that drone-supported gaze interaction is feasible and accurate for certain movement types. It is well-perceived by users, in particular while interacting from a fixed position as well as while moving orthogonally or diagonally to a display. We present design implications and discuss opportunities and challenges for drone-supported gaze interaction in public
Explorations in engagement for humans and robots
This paper explores the concept of engagement, the process by which
individuals in an interaction start, maintain and end their perceived
connection to one another. The paper reports on one aspect of engagement among
human interactors--the effect of tracking faces during an interaction. It also
describes the architecture of a robot that can participate in conversational,
collaborative interactions with engagement gestures. Finally, the paper reports
on findings of experiments with human participants who interacted with a robot
when it either performed or did not perform engagement gestures. Results of the
human-robot studies indicate that people become engaged with robots: they
direct their attention to the robot more often in interactions where engagement
gestures are present, and they find interactions more appropriate when
engagement gestures are present than when they are not.Comment: 31 pages, 5 figures, 3 table
Neural Network Based Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction
This paper introduces a novel neural network-based reinforcement learning
approach for robot gaze control. Our approach enables a robot to learn and to
adapt its gaze control strategy for human-robot interaction neither with the
use of external sensors nor with human supervision. The robot learns to focus
its attention onto groups of people from its own audio-visual experiences,
independently of the number of people, of their positions and of their physical
appearances. In particular, we use a recurrent neural network architecture in
combination with Q-learning to find an optimal action-selection policy; we
pre-train the network using a simulated environment that mimics realistic
scenarios that involve speaking/silent participants, thus avoiding the need of
tedious sessions of a robot interacting with people. Our experimental
evaluation suggests that the proposed method is robust against parameter
estimation, i.e. the parameter values yielded by the method do not have a
decisive impact on the performance. The best results are obtained when both
audio and visual information is jointly used. Experiments with the Nao robot
indicate that our framework is a step forward towards the autonomous learning
of socially acceptable gaze behavior.Comment: Paper submitted to Pattern Recognition Letter
- …