Modeling Human Visual Search Performance on Realistic Webpages Using Analytical and Deep Learning Methods
Modeling visual search not only offers an opportunity to predict the
usability of an interface before actually testing it on real users, but also
advances scientific understanding about human behavior. In this work, we first
conduct a set of analyses on a large-scale dataset of visual search tasks on
realistic webpages. We then present a deep neural network that learns to
predict the scannability of webpage content, i.e., how easy it is for a user to
find a specific target. Our model leverages both heuristic-based features such
as target size and unstructured features such as raw image pixels. This
approach allows us to model complex interactions that might be involved in a
realistic visual search task, which cannot be easily achieved by traditional
analytical models. We analyze the model behavior to offer our insights into how
the salience map learned by the model aligns with human intuition and how the
learned semantic representation of each target type relates to its visual
search performance.
Comment: the 2020 CHI Conference on Human Factors in Computing Systems
A Reverse Hierarchy Model for Predicting Eye Fixations
A body of psychological and physiological evidence suggests that early
visual attention works in a coarse-to-fine way, which lays a basis for the
reverse hierarchy theory (RHT). This theory states that attention propagates
from the top level of the visual hierarchy that processes gist and abstract
information of input, to the bottom level that processes local details.
Inspired by the theory, we develop a computational model for saliency detection
in images. First, the original image is downsampled to different scales to
constitute a pyramid. Then, saliency on each layer is defined as its
unpredictability under coarse-to-fine image super-resolution reconstruction
from the layer above. Finally, saliency on
each layer of the pyramid is fused into stochastic fixations through a
probabilistic model, where attention initiates from the top layer and
propagates downward through the pyramid. Extensive experiments on two standard
eye-tracking datasets show that the proposed method can achieve competitive
results with state-of-the-art models.
Comment: CVPR 2014, 27th IEEE Conference on Computer Vision and Pattern
Recognition (CVPR). CVPR 201
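
The coarse-to-fine pipeline described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes two loudly simplifying assumptions: nearest-neighbour upsampling stands in for the super-resolution reconstruction, and a plain average of normalised maps stands in for the probabilistic fixation model. All function names (`downsample`, `upsample`, `build_pyramid`, `layer_saliency`, `saliency_map`) are hypothetical.

```python
import numpy as np

def downsample(img, factor=2):
    """Average-pool a 2D image by an integer factor (crops to a multiple of it)."""
    h = img.shape[0] // factor * factor
    w = img.shape[1] // factor * factor
    img = img[:h, :w]
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample(img, shape):
    """Nearest-neighbour upsampling to `shape`.
    Assumption: a crude stand-in for super-resolution reconstruction."""
    rows = np.arange(shape[0]) * img.shape[0] // shape[0]
    cols = np.arange(shape[1]) * img.shape[1] // shape[1]
    return img[np.ix_(rows, cols)]

def build_pyramid(img, levels=3):
    """Return [finest, ..., coarsest]; the coarsest layer is the 'top'."""
    pyramid = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid

def layer_saliency(pyramid):
    """Saliency of each layer = what the coarser layer above fails to predict."""
    maps = []
    for fine, coarse in zip(pyramid[:-1], pyramid[1:]):
        reconstruction = upsample(coarse, fine.shape)
        maps.append(np.abs(fine - reconstruction))
    return maps

def saliency_map(img, levels=3):
    """Fuse per-layer saliency by averaging normalised maps.
    Assumption: a deterministic replacement for the probabilistic fusion."""
    pyramid = build_pyramid(img, levels)
    shape = pyramid[0].shape
    fused = np.zeros(shape)
    maps = layer_saliency(pyramid)
    for m in maps:
        m_up = upsample(m, shape)
        peak = m_up.max()
        if peak > 0:  # skip normalisation for an all-zero map
            m_up = m_up / peak
        fused += m_up
    return fused / len(maps)
```

For example, calling `saliency_map` on a uniform image containing a single bright pixel yields a map that peaks at that pixel, since the coarser layers cannot predict it.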