Visual attention is highly fragmented during mobile interactions, but the
erratic nature of attention shifts currently limits attentive user interfaces
to adapting after the fact, i.e. after shifts have already happened. We instead
study attention forecasting -- the challenging task of predicting users' gaze
behaviour (overt visual attention) in the near future. We present a novel
long-term dataset of everyday mobile phone interactions, continuously recorded
from 20 participants engaged in common activities on a university campus over
4.5 hours each (more than 90 hours in total). We propose a proof-of-concept
method that uses device-integrated sensors and body-worn cameras to encode rich
information on device usage and users' visual scene. We demonstrate that our
method can forecast bidirectional attention shifts and predict whether the
primary attentional focus is on the handheld mobile device. We study the impact
of different feature sets on performance and discuss the significant potential
but also remaining challenges of forecasting user attention during mobile
interactions.

Comment: 13 pages, 9 figures