Are all the frames equally important?
In this work, we address the problem of measuring and predicting temporal
video saliency - a metric which defines the importance of a video frame for
human attention. Unlike the conventional spatial saliency which defines the
location of the salient regions within a frame (as it is done for still
images), temporal saliency considers importance of a frame as a whole and may
not exist apart from context. The proposed interface is an interactive
cursor-based algorithm for collecting experimental data about temporal
saliency. We collect the first human responses and analyze them. As a
result, we show that, qualitatively, the produced scores clearly reflect
semantic changes in a frame, while quantitatively they are highly
correlated across all observers. In addition, we show that the
proposed tool can simultaneously collect fixations similar to the ones produced
by an eye-tracker in a more affordable way. Further, this approach may be used
to create the first temporal saliency datasets, which will allow training
computational predictive algorithms. The proposed interface does not rely on
any special equipment, which allows it to be run remotely and to reach a wide
audience.
Comment: CHI'20 Late Breaking Work
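One way to quantify agreement of the kind the abstract reports (scores "highly correlated" between observers) is the mean pairwise Pearson correlation of per-frame scores. The sketch below is only an illustration: the abstract does not say which correlation measure the authors used, and the observer names and score values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


def mean_pairwise_correlation(scores_by_observer):
    """Average Pearson correlation over all distinct observer pairs."""
    series = list(scores_by_observer.values())
    pairs = [(i, j) for i in range(len(series)) for j in range(i + 1, len(series))]
    if not pairs:
        raise ValueError("need at least two observers")
    return sum(pearson(series[i], series[j]) for i, j in pairs) / len(pairs)


# Hypothetical temporal saliency scores, one value per video frame.
scores = {
    "obs_a": [0.1, 0.2, 0.9, 0.8, 0.2],
    "obs_b": [0.0, 0.3, 1.0, 0.7, 0.1],
    "obs_c": [0.2, 0.2, 0.8, 0.9, 0.3],
}
agreement = mean_pairwise_correlation(scores)
print(f"mean pairwise correlation: {agreement:.3f}")
```

A value close to 1 means the observers rank frames similarly. Rank-based measures such as Spearman correlation would be a reasonable alternative if the score scales differ between observers.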
Deep learning investigation for chess player attention prediction using eye-tracking and game data
This article reports on an investigation of the use of convolutional neural
networks to predict the visual attention of chess players. The visual attention
model described in this article has been created to generate saliency maps that
capture hierarchical and spatial features of the chessboard, in order to predict
the fixation probability for individual pixels. Using a skip-layer architecture
of an autoencoder, with a unified decoder, we are able to use multiscale
features to predict saliency of part of the board at different scales, showing
multiple relations between pieces. We have used scan-path and fixation data
from players engaged in solving chess problems to compute 6600 saliency maps
associated with the corresponding chess piece configurations. This corpus is
supplemented with synthetically generated data from actual games gathered from an
online chess platform. Experiments conducted using both scan-paths from chess
players and the CAT2000 saliency dataset of natural images highlight several
results. Deep features, pretrained on natural images, were found to be helpful
in training visual attention prediction for chess. The proposed neural network
architecture is able to generate meaningful saliency maps on unseen chess
configurations with good scores on standard metrics. This work provides a
baseline for future work on visual attention prediction in similar contexts.
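A common way to turn recorded fixations into a ground-truth saliency map is to place a Gaussian at each fixation point and normalize the result so that it reads as a per-cell fixation probability. The sketch below assumes that approach; the abstract does not describe how its 6600 maps were computed, and the 8x8 grid, the `sigma` value and the fixation coordinates are hypothetical.

```python
from math import exp


def fixation_density(fixations, width, height, sigma=1.0):
    """Sum an isotropic Gaussian at each (x, y) fixation, then normalize to sum to 1."""
    if not fixations:
        raise ValueError("need at least one fixation")
    grid = [[0.0] * width for _ in range(height)]
    for fx, fy in fixations:
        for y in range(height):
            for x in range(width):
                d2 = (x - fx) ** 2 + (y - fy) ** 2
                grid[y][x] += exp(-d2 / (2.0 * sigma ** 2))
    total = sum(sum(row) for row in grid)
    return [[v / total for v in row] for row in grid]


# Hypothetical fixations on an 8x8 grid (one cell per board square);
# the repeated point models two fixations on the same square.
fixations = [(3, 4), (3, 4), (6, 1)]
density = fixation_density(fixations, width=8, height=8, sigma=1.0)

# Locate the most salient cell as (x, y).
peak = max(
    ((x, y) for y in range(8) for x in range(8)),
    key=lambda p: density[p[1]][p[0]],
)
print("most salient cell:", peak)
```

The normalized grid can then serve as a target for a model that predicts fixation probability per location. A real pipeline would work at image resolution and might weight fixations by duration.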
Multi-scale Discriminant Saliency with Wavelet-based Hidden Markov Tree Modelling
Bottom-up saliency, an early stage of human visual attention, can be
considered as a binary classification problem between centre and surround
classes. The discriminant power of features for this classification is measured
as the mutual information between the distributions of image features and the
corresponding classes. As the estimated discrepancy depends strongly on the considered scale
level, multi-scale structure and discriminant power are integrated by employing
discrete wavelet features and a Hidden Markov Tree (HMT). With the wavelet
coefficients and Hidden Markov Tree parameters, quad-tree-like label structures
are constructed and used in maximum a posteriori (MAP) estimation of the hidden
class variables at the corresponding dyadic sub-squares. Then, a saliency value for
each square block at each scale level is computed using the discriminant power
principle. Finally, the final saliency map is integrated across multiple scales
by an information maximization rule. Both standard quantitative tools such as
NSS, LCC, AUC and qualitative assessments are used for evaluating the proposed
multi-scale discriminant saliency (MDIS) method against the well-known
information-based approach AIM on its released image collection with
eye-tracking data. Simulation results are presented and analysed to verify the
validity of MDIS and to point out its limitations for further research.
Comment: arXiv admin note: substantial text overlap with arXiv:1301.396
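The discriminant-power idea in this abstract (mutual information between features and the centre/surround class) can be illustrated with a plug-in estimate over discrete values. This is a minimal sketch under stated assumptions, not the MDIS method: it uses pre-quantized feature values instead of wavelet coefficients with an HMT, and the sample data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(features, labels):
    """Plug-in estimate of I(feature; class) in bits from paired discrete samples."""
    n = len(features)
    if n == 0 or n != len(labels):
        raise ValueError("need two non-empty sequences of equal length")
    joint = Counter(zip(features, labels))
    px = Counter(features)
    py = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x, y) / (p(x) * p(y)) simplifies to count * n / (count_x * count_y).
        mi += p_xy * log2(count * n / (px[x] * py[y]))
    return mi


# Hypothetical quantized feature responses and their class labels.
labels = ["centre", "centre", "surround", "surround"]
discriminating = [0, 0, 1, 1]   # feature value fully determines the class
uninformative = [0, 1, 0, 1]    # feature value is independent of the class

print(mutual_information(discriminating, labels))
print(mutual_information(uninformative, labels))
```

A feature that separates centre from surround perfectly carries one bit about a balanced binary class, while an independent feature carries none. Under the discriminant view, locations with higher mutual information are more salient.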