Personalization of Saliency Estimation
Most existing saliency models use low-level features or task descriptions
when generating attention predictions. However, the link between observer
characteristics and gaze patterns is rarely investigated. We present a novel
saliency prediction technique which takes viewers' identities and personal
traits into consideration when modeling human attention. Instead of only
computing image salience for average observers, we consider the interpersonal
variation in the viewing behaviors of observers with different personal traits
and backgrounds. We present an enriched derivative of the generative adversarial network (GAN), which is
able to generate personalized saliency predictions when fed with image stimuli
and specific information about the observer. Our model contains a generator
which generates grayscale saliency heat maps based on the image and an observer
label. The generator is paired with an adversarial discriminator which learns
to distinguish generated salience from ground truth salience. The discriminator
also has the observer label as an input, which contributes to the
personalization ability of our approach. We evaluate the performance of our
personalized salience model by comparison with a benchmark model along with
other un-personalized predictions, and illustrate improvements in prediction
accuracy for all tested observer groups.
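The abstract above does not say how the observer label is fed to the generator and discriminator. A minimal sketch of one common conditional-GAN convention, assumed here rather than taken from the paper, is to broadcast a one-hot observer label into constant planes and append them to the image channels. The names `one_hot` and `condition_on_observer` are hypothetical.

```python
def one_hot(index, num_classes):
    """Return a one-hot list for an observer index."""
    if not 0 <= index < num_classes:
        raise ValueError("observer index out of range")
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def condition_on_observer(image, observer_index, num_observers):
    """Append one constant plane per observer class to the image channels.

    image: list of channels, each an H x W list of lists.
    Assumption: channel concatenation is only one of several ways a
    network could receive the observer label.
    """
    height = len(image[0])
    width = len(image[0][0])
    label = one_hot(observer_index, num_observers)
    label_planes = [[[value] * width for _ in range(height)] for value in label]
    return image + label_planes


# A 1-channel 2x2 image viewed by observer 1 out of 3.
gray = [[[0.2, 0.4], [0.6, 0.8]]]
conditioned = condition_on_observer(gray, observer_index=1, num_observers=3)
```

Under this assumption the same conditioned tensor layout could be given to both networks, so the discriminator judges a saliency map with respect to a specific observer.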
Few-shot Personalized Saliency Prediction Based on Inter-personnel Gaze Patterns
This paper presents few-shot personalized saliency prediction based on
inter-personnel gaze patterns. In contrast to a general saliency map, a
personalized saliency map (PSM) has great potential because it
indicates person-specific visual attention, which is useful for inferring
individual visual preferences from the heterogeneity of gazed areas. PSM
prediction is needed to obtain the PSM for an unseen image, but it
remains a challenging task due to the complexity of individual gaze
patterns. Although eye-tracking data from each person are necessary to
construct PSMs that model individual gaze patterns across various images,
it is difficult to acquire massive amounts of such data. One solution
for efficient PSM prediction from a limited amount of data is the
effective use of eye-tracking data obtained from other persons. In this paper,
to make effective use of the PSMs of other persons, we focus on the effective
selection of images for which to acquire eye-tracking data and on the preservation of
the structural information of the PSMs of other persons. The experimental results
confirm that these two points of focus are effective for PSM prediction with
a limited amount of eye-tracking data. Comment: 5 pages, 3 figures
A scalable saliency-based Feature selection method with instance level information
Classic feature selection techniques remove those features that are either
irrelevant or redundant, achieving a subset of relevant features that help to
provide a better knowledge extraction. This allows the creation of compact
models that are easier to interpret. Most of these techniques work over the
whole dataset, but they cannot provide the user with useful
information when only instance-level information is needed. In short, given any
example, classic feature selection algorithms give no information about
which features are most relevant for that sample. This work aims
to overcome this handicap by developing a novel feature selection method,
called Saliency-based Feature Selection (SFS), based on deep-learning saliency
techniques. Our experimental results show that this algorithm can be
successfully used not only with neural networks, but also with any
architecture trained using gradient descent techniques.
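To illustrate what instance-level feature relevance can look like, here is a minimal sketch of plain gradient saliency on a tiny hand-written model. It is not the SFS algorithm itself, whose details the abstract does not give. The model form, the weights, and the function names `input_gradient` and `top_k_features` are all assumptions made for illustration.

```python
import math


def input_gradient(W, v, x):
    """Gradient of f(x) = sum_j v[j] * tanh(W[j] . x) with respect to x."""
    grad = [0.0] * len(x)
    for w_row, v_j in zip(W, v):
        h = sum(w * xi for w, xi in zip(w_row, x))
        slope = v_j * (1.0 - math.tanh(h) ** 2)  # derivative of v_j * tanh(h)
        for i, w in enumerate(w_row):
            grad[i] += slope * w
    return grad


def top_k_features(W, v, x, k):
    """Rank features for ONE instance by absolute input gradient."""
    saliency = [abs(g) for g in input_gradient(W, v, x)]
    order = sorted(range(len(x)), key=lambda i: saliency[i], reverse=True)
    return order[:k]


# Hypothetical two-feature model: the ranking depends on the instance.
W = [[3.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]
best_for_a = top_k_features(W, v, [0.0, 3.0], k=1)
best_for_b = top_k_features(W, v, [3.0, 0.0], k=1)
```

The two instances select different features because each hidden unit saturates for one of them, which is the kind of per-sample information a dataset-wide selector cannot report.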
A novel user-centered design for personalized video summarization
In the past, several automatic video summarization systems have been proposed to generate video summaries. However, a generic video summary that is generated based only on audio, visual and textual saliencies will not satisfy every user. This paper proposes a novel system for generating semantically meaningful personalized video summaries, which are tailored to the individual user's preferences over video semantics. Each video shot is represented using a semantic multinomial, which is a vector of posterior semantic concept probabilities. The proposed system stitches together a video summary based on the summary time span and the top-ranked shots that are semantically relevant to the user's preferences. The proposed summarization system is evaluated using both quantitative and subjective evaluation metrics. The experimental results on the performance of the proposed video summarization system are encouraging.
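The shot-ranking and stitching step described above can be sketched as follows. The abstract does not specify the relevance score or the selection rule, so the dot-product score, the greedy fill of the time budget, and the name `summarize` are assumptions.

```python
def summarize(shots, preference, time_budget):
    """Pick top-ranked shots that fit the summary time span.

    shots: list of (start_time, duration, concept_probs), where
    concept_probs plays the role of the semantic multinomial.
    preference: user weights over the same semantic concepts.
    Returns the chosen shot indices in chronological order.
    """
    scored = []
    for index, (_start, _duration, probs) in enumerate(shots):
        score = sum(p * w for p, w in zip(probs, preference))
        scored.append((score, index))
    # Highest score first; ties broken by earlier shot index.
    scored.sort(key=lambda item: (-item[0], item[1]))

    chosen, used = [], 0.0
    for _score, index in scored:
        duration = shots[index][1]
        if used + duration <= time_budget:
            chosen.append(index)
            used += duration
    return sorted(chosen, key=lambda i: shots[i][0])


# Hypothetical shots over three concepts; the user mostly cares about concept 1.
shots = [
    (0.0, 5.0, [0.7, 0.2, 0.1]),
    (5.0, 4.0, [0.1, 0.8, 0.1]),
    (9.0, 6.0, [0.2, 0.1, 0.7]),
    (15.0, 3.0, [0.1, 0.6, 0.3]),
]
summary = summarize(shots, preference=[0.0, 1.0, 0.5], time_budget=8.0)
```

Greedy selection is simple but not optimal in general; a knapsack-style search would be another option under the same time constraint.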
Deep learning investigation for chess player attention prediction using eye-tracking and game data
This article reports on an investigation of the use of convolutional neural
networks to predict the visual attention of chess players. The visual attention
model described in this article has been created to generate saliency maps that
capture hierarchical and spatial features of the chessboard, in order to predict
the fixation probability for individual pixels. Using a skip-layer architecture
of an autoencoder, with a unified decoder, we are able to use multiscale
features to predict the saliency of parts of the board at different scales, showing
multiple relations between pieces. We have used scan path and fixation data
from players engaged in solving chess problems to compute 6600 saliency maps
associated with the corresponding chess piece configurations. This corpus is
supplemented with synthetically generated data from actual games gathered from an
online chess platform. Experiments conducted using both scan paths from chess
players and the CAT2000 saliency dataset of natural images highlight several
results. Deep features, pretrained on natural images, were found to be helpful
in training visual attention prediction for chess. The proposed neural network
architecture is able to generate meaningful saliency maps on unseen chess
configurations with good scores on standard metrics. This work provides a
baseline for future work on visual attention prediction in similar contexts.
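The abstract states that saliency maps were computed from fixation data but not how. A minimal sketch of one widely used convention, assumed here, places a Gaussian at each fixation and normalizes the result into a probability map. The grid size, `sigma`, and the function name `fixation_saliency_map` are illustrative choices only.

```python
import math


def fixation_saliency_map(fixations, height, width, sigma=1.0):
    """Sum a Gaussian per fixation and normalize so the map sums to 1.

    fixations: list of (row, col) positions on the grid.
    """
    grid = [[0.0] * width for _ in range(height)]
    for fy, fx in fixations:
        for y in range(height):
            for x in range(width):
                d2 = (y - fy) ** 2 + (x - fx) ** 2
                grid[y][x] += math.exp(-d2 / (2.0 * sigma ** 2))
    total = sum(sum(row) for row in grid)
    if total == 0.0:
        return grid  # no fixations: leave the map empty
    return [[value / total for value in row] for row in grid]


# Hypothetical fixations on an 8x8 board: two on one square, one on another.
saliency = fixation_saliency_map([(3, 4), (3, 4), (6, 1)], height=8, width=8)
cells = [(y, x) for y in range(8) for x in range(8)]
peak = max(cells, key=lambda cell: saliency[cell[0]][cell[1]])
```

Here each grid cell stands for a board square; a pixel-level map would use the same procedure on a finer grid.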
Enabling Privacy and Trust in Edge AI Systems
Recent advances in mobile computing and the Internet of Things (IoT) enable the global integration of heterogeneous smart devices via wireless networks. A common characteristic across these modern-day systems is their ability to collect and communicate streaming data, making machine learning (ML) appealing for processing, reasoning about, and making predictions about the environment. More recently, low network latency requirements have made offloading intelligence to the cloud undesirable. These novel requirements have led to the emergence of edge computing, an approach that brings computation closer to the device with low latency, high throughput, and enhanced reliability. Together, they enable ML-powered information processing and control pipelines spanning end devices, edge computing, and cloud environments. However, continuous collaboration between cloud, edge and device is susceptible to information leakage and loss, leading to insecure and unreliable operation. This raises an important question: how can we design, develop, and evaluate high-performing ML systems that are trustworthy and privacy-preserving in resource-constrained edge environments? In this thesis, I address this question by designing and implementing privacy-preserving and trustworthy ML systems for distributed applications. I first introduce a system that establishes trust in the explanations generated from a popular visualization technique, saliency maps, using counterfactual reasoning. Through the proposed evaluation system, I assess the degree to which hypothesized explanations correspond to the semantics of edge-based reinforcement learning environments. Second, I examine the privacy implications of personalized models in distributed mobile services by proposing time-series based model inversion attacks.
To thwart such attacks, I present a distributed framework, Pelican, that learns and deploys transfer learning-based personalized ML models in a privacy-preserving manner on resource-constrained mobile devices. Third, I investigate ML models that are deployed on local devices for inference and highlight the ease with which proprietary information embedded in these models can be exposed. To mitigate such attacks, I present a secure on-device application framework, SODA, which is supported by real-time adversarial detection. Finally, I present an end-to-end privacy-aware system for a real-world application to model group interaction behavior via mobility sensing. The proposed system, W4-Groups, distributes computation across device, edge, and cloud resources to strengthen its privacy and trustworthiness guarantees.