Digging Deeper into Egocentric Gaze Prediction
This paper digs deeper into the factors that influence egocentric gaze. Instead
of training deep models for this purpose blindly, we propose to inspect the
factors that contribute to gaze guidance during daily tasks. Bottom-up saliency
and optical flow are assessed against strong spatial prior baselines.
Task-specific cues such as the vanishing point, manipulation point, and hand
regions are analyzed as representatives of top-down information. We also look
into the contribution of these factors by investigating a simple recurrent
neural model for egocentric gaze prediction. First, deep features are
extracted for all input video frames. Then, a gated recurrent unit is employed
to integrate information over time and to predict the next fixation. We also
propose an integrated model that combines the recurrent model with several
top-down and bottom-up cues. Extensive experiments over multiple datasets
reveal that (1) spatial biases are strong in egocentric videos, (2) bottom-up
saliency models perform poorly in predicting gaze and underperform spatial
biases, (3) deep features perform better compared to traditional features, (4)
as opposed to hand regions, the manipulation point is a strongly influential cue
for gaze prediction, (5) combining the proposed recurrent model with bottom-up
cues, vanishing points, and, in particular, the manipulation point yields the
best gaze prediction accuracy over egocentric videos, (6) knowledge transfer
works best when the tasks or sequences are similar, and (7)
task and activity recognition can benefit from gaze prediction. Our findings
suggest that (1) there should be more emphasis on hand-object interaction and
(2) the egocentric vision community should consider larger datasets including
diverse stimuli and more subjects.
Comment: Presented at WACV 2019
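The recurrent model sketched in this abstract (per-frame deep features fed to a gated recurrent unit that predicts the next fixation) maps to a compact sequence model. Below is a minimal PyTorch sketch of that idea; the class name GazeGRU, the ResNet-18 feature trunk, the hidden size, and the choice to regress normalized (x, y) fixation coordinates are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: per-frame deep features -> GRU -> next fixation.
# Illustrative only; not the authors' code or architecture.
import torch
import torch.nn as nn
import torchvision.models as models


class GazeGRU(nn.Module):
    def __init__(self, hidden_dim=256):
        super().__init__()
        # Stand-in "deep feature" extractor (use pretrained weights in practice).
        backbone = models.resnet18(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # -> (B*T, 512, 1, 1)
        # GRU integrates per-frame features over time.
        self.gru = nn.GRU(input_size=512, hidden_size=hidden_dim, batch_first=True)
        # Head regresses the next fixation as normalized (x, y) in [0, 1].
        self.head = nn.Sequential(nn.Linear(hidden_dim, 2), nn.Sigmoid())

    def forward(self, frames):                                 # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.features(frames.flatten(0, 1)).flatten(1)  # (B*T, 512)
        feats = feats.view(b, t, -1)                             # (B, T, 512)
        out, _ = self.gru(feats)                                  # (B, T, hidden_dim)
        return self.head(out[:, -1])                              # (B, 2) next fixation


if __name__ == "__main__":
    model = GazeGRU()
    clip = torch.randn(2, 8, 3, 224, 224)   # two clips of 8 frames each
    print(model(clip).shape)                  # torch.Size([2, 2])
```

In the paper's integrated variant, the output of such a recurrent branch would be combined with bottom-up and top-down cue maps (saliency, vanishing point, manipulation point) rather than used alone.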
Mutual Context Network for Jointly Estimating Egocentric Gaze and Actions
In this work, we address two coupled tasks of gaze prediction and action
recognition in egocentric videos by exploring their mutual context. Our
assumption is that, while performing a manipulation task, what a person is
doing determines where the person is looking, and the gaze point reveals gaze
and non-gaze regions that contain important and complementary information
about the ongoing action. We propose a novel mutual context
network (MCN) that jointly learns action-dependent gaze prediction and
gaze-guided action recognition in an end-to-end manner. Experiments on public
egocentric video datasets demonstrate that our MCN achieves state-of-the-art
performance on both gaze prediction and action recognition.
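The abstract describes a joint, end-to-end formulation in which gaze prediction and action recognition inform each other. The sketch below illustrates one side of that coupling, using a predicted gaze map as spatial attention for gaze-guided action classification. It is not the authors' MCN; the layer shapes, the pooling scheme, and the omission of the action-to-gaze conditioning are simplifying assumptions.

```python
# Minimal sketch of gaze-guided action recognition trained jointly with a gaze
# head. Illustrative only; not the authors' MCN architecture.
import torch
import torch.nn as nn


class JointGazeAction(nn.Module):
    def __init__(self, in_ch=512, num_actions=20):
        super().__init__()
        # Gaze head: per-location logits over the feature map -> softmaxed gaze map.
        self.gaze_head = nn.Conv2d(in_ch, 1, kernel_size=3, padding=1)
        # Action head: classify from gaze-weighted plus globally averaged features.
        self.action_head = nn.Linear(2 * in_ch, num_actions)

    def forward(self, feat):                                   # feat: (B, C, H, W)
        b, c, h, w = feat.shape
        gaze_logits = self.gaze_head(feat)                      # (B, 1, H, W)
        gaze = torch.softmax(gaze_logits.flatten(2), dim=-1)    # spatial distribution
        gaze = gaze.view(b, 1, h, w)
        gaze_pooled = (feat * gaze).sum(dim=(2, 3))             # gaze-guided pooling
        global_pooled = feat.mean(dim=(2, 3))                   # non-gaze context
        logits = self.action_head(torch.cat([gaze_pooled, global_pooled], dim=1))
        return gaze, logits                                     # joint outputs


if __name__ == "__main__":
    feat = torch.randn(2, 512, 14, 14)            # e.g., backbone features of a frame
    gaze_map, action_logits = JointGazeAction()(feat)
    print(gaze_map.shape, action_logits.shape)    # (2, 1, 14, 14) (2, 20)
```

Training both heads with a shared backbone and a combined loss is what makes the two tasks mutually informative in this style of model.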