Gaze Distribution Analysis and Saliency Prediction Across Age Groups
Knowledge of the human visual system helps to develop better computational
models of visual attention. State-of-the-art models have been developed to
mimic the visual attention system of young adults, but they largely ignore
the variations that occur with age. In this paper, we investigate how visual
scene processing changes with age and propose an age-adapted framework that
helps to develop a computational model that can predict saliency across
different age groups. Our analysis uncovers how the explorativeness of an
observer varies with age, how well saliency maps of an age group agree with
fixation points of observers from the same or different age groups, and how age
influences the center bias. We analyzed the eye movement behavior of 82
observers belonging to four age groups while they explored visual scenes.
Explorativeness was quantified in terms of the entropy of a saliency map, and
the area under the curve (AUC) metric was used to quantify both the agreement
and the center bias. These results were used to develop age-adapted saliency
models. Our results suggest that the proposed age-adapted saliency model
outperforms existing saliency models in predicting the regions of interest
across age groups.
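As a rough, hypothetical illustration of the two measures named above (not the authors' code; the function names and toy data are ours), the Python sketch below computes the entropy of a saliency map treated as a probability distribution and an AUC-style agreement score between a saliency map and a set of fixation points.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def saliency_entropy(sal_map, eps=1e-12):
    """Shannon entropy of a saliency map treated as a probability distribution."""
    p = sal_map.astype(np.float64).ravel()
    p = p / (p.sum() + eps)
    return float(-np.sum(p * np.log2(p + eps)))

def fixation_auc(sal_map, fixations):
    """AUC: how well the saliency map separates fixated pixels from all other pixels."""
    labels = np.zeros(sal_map.shape, dtype=int)
    for y, x in fixations:                       # fixations as (row, col) pixel coordinates
        labels[y, x] = 1
    return roc_auc_score(labels.ravel(), sal_map.ravel())

# toy example: a centered Gaussian saliency map scored against two fixations
h, w = 64, 64
yy, xx = np.mgrid[0:h, 0:w]
sal = np.exp(-((yy - h / 2) ** 2 + (xx - w / 2) ** 2) / (2 * 10.0 ** 2))
print(saliency_entropy(sal), fixation_auc(sal, [(32, 32), (40, 30)]))
```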
How is Gaze Influenced by Image Transformations? Dataset and Model
Data size is the bottleneck for developing deep saliency models, because
collecting eye-movement data is very time-consuming and expensive. Most
current studies on human attention and saliency modeling have used high-quality,
stereotyped stimuli. In the real world, however, captured images undergo various
types of transformations. Can we use these transformations to augment existing
saliency datasets? Here, we first create a novel saliency dataset including
fixations of 10 observers over 1900 images degraded by 19 types of
transformations. Second, by analyzing eye movements, we find that observers
look at different locations over transformed versus original images. Third, we
utilize the new data over transformed images, which we call data augmentation
transformations (DATs), to train deep saliency models. We find that
label-preserving DATs with negligible impact on human gaze boost saliency prediction,
whereas DATs that severely alter human gaze degrade performance.
These valid, label-preserving augmentation transformations provide
a solution to enlarge existing saliency datasets. Finally, we introduce a novel
saliency model based on a generative adversarial network (dubbed GazeGAN). A
modified UNet is proposed as the generator of GazeGAN; it combines
classic skip connections with a novel center-surround connection (CSC) in
order to leverage multi-level features. We also propose a histogram loss based
on the Alternative Chi-Square Distance (ACS HistLoss) to refine the saliency map in
terms of luminance distribution. Extensive experiments and comparisons over 3
datasets indicate that GazeGAN achieves the best performance in terms of
popular saliency evaluation metrics, and is more robust to various
perturbations. Our code and data are available at:
https://github.com/CZHQuality/Sal-CFS-GAN
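The abstract does not define the ACS HistLoss in detail; a plain chi-square-style distance between the luminance histograms of a predicted and a ground-truth saliency map, sketched below under our own assumptions, conveys the general idea (the exact formulation is in the linked repository).

```python
import numpy as np

def chi_square_hist_distance(pred, target, bins=256, eps=1e-8):
    """Chi-square-style distance between the luminance histograms of two maps.

    This is only an approximation of a histogram loss; the exact ACS HistLoss
    used by GazeGAN is defined in the authors' repository.
    """
    hp, _ = np.histogram(pred, bins=bins, range=(0.0, 1.0))
    ht, _ = np.histogram(target, bins=bins, range=(0.0, 1.0))
    hp = hp / max(hp.sum(), 1)                   # normalize to probability histograms
    ht = ht / max(ht.sum(), 1)
    return float(np.sum((hp - ht) ** 2 / (hp + ht + eps)))

# toy usage: compare a predicted map against ground truth, both scaled to [0, 1]
pred = np.random.rand(64, 64)
target = np.clip(pred + 0.05 * np.random.randn(64, 64), 0.0, 1.0)
print(chi_square_hist_distance(pred, target))
```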
Saliency Prediction for Mobile User Interfaces
We introduce models for saliency prediction for mobile user interfaces. A
mobile interface may include elements such as buttons and text, in addition to
natural images, which together enable a variety of tasks. Saliency in natural
images is a well-studied area. However, given the differences in what
constitutes a mobile interface and in the usage context of these devices, we
postulate that saliency prediction for mobile interface images requires a fresh
approach. Mobile interface design involves operating on elements, the building
blocks of the interface. We first collected eye-gaze data from mobile devices
for a free-viewing task. Using this data, we develop a novel autoencoder-based
multi-scale deep learning model that provides saliency prediction at the mobile
interface element level. Compared to saliency prediction approaches developed
for natural images, we show that our approach performs significantly better on
a range of established metrics.
Comment: Paper accepted at WACV 201
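The abstract does not describe how pixel-level saliency is mapped to interface elements; one simple, hypothetical way to pool a pixel-level saliency map into element-level scores over bounding boxes is sketched below (our own illustration, not the paper's model).

```python
import numpy as np

def element_saliency(sal_map, elements):
    """Mean saliency inside each element's bounding box, given as (x, y, w, h) in pixels."""
    scores = {}
    for name, (x, y, w, h) in elements.items():
        scores[name] = float(sal_map[y:y + h, x:x + w].mean())
    return scores

# toy usage with two hypothetical UI elements on a 720x1280 screenshot
sal = np.random.rand(1280, 720)                  # pixel-level saliency map (rows, cols)
ui = {"login_button": (300, 900, 120, 60), "header_text": (40, 80, 640, 100)}
print(element_saliency(sal, ui))
```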
Are all the frames equally important?
In this work, we address the problem of measuring and predicting temporal
video saliency - a metric which defines the importance of a video frame for
human attention. Unlike conventional spatial saliency, which defines the
location of salient regions within a frame (as is done for still
images), temporal saliency considers the importance of a frame as a whole and may
not exist apart from context. We propose an interactive
cursor-based algorithm for collecting experimental data about temporal
saliency. We collect the first human responses and analyze them.
Qualitatively, we show that the produced scores clearly reflect semantic
changes in a frame, while quantitatively they are highly correlated across
observers. In addition, we show that the proposed tool can simultaneously
collect fixations similar to those produced by an eye tracker, in a more
affordable way. Further, this approach may be used to create the first
temporal saliency datasets, which will allow training computational
predictive algorithms. The proposed interface does not rely on any special
equipment, which allows it to be run remotely and to cover a wide audience.
Comment: CHI'20 Late Breaking Work
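As a hypothetical sketch of what such data might look like once collected (not the authors' pipeline; the array shapes and names are ours), per-frame importance scores can be averaged over observers and inter-observer agreement summarized with a mean pairwise correlation:

```python
import numpy as np

def temporal_saliency(responses):
    """Average per-frame importance over observers; responses has shape (observers, frames)."""
    return responses.mean(axis=0)

def mean_pairwise_correlation(responses):
    """Mean Pearson correlation between every pair of observers' score curves."""
    corr = np.corrcoef(responses)                # (observers, observers) correlation matrix
    iu = np.triu_indices(corr.shape[0], k=1)     # indices above the diagonal (distinct pairs)
    return float(corr[iu].mean())

# toy usage: 5 observers rating 100 frames on a 0..1 importance scale
rng = np.random.default_rng(0)
base = np.sin(np.linspace(0, 3 * np.pi, 100)) * 0.5 + 0.5   # shared "semantic" signal
responses = np.clip(base + 0.1 * rng.standard_normal((5, 100)), 0, 1)
print(temporal_saliency(responses).shape, mean_pairwise_correlation(responses))
```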