
    Personalization of Saliency Estimation

    Most existing saliency models use low-level features or task descriptions when generating attention predictions. However, the link between observer characteristics and gaze patterns is rarely investigated. We present a novel saliency prediction technique that takes viewers' identities and personal traits into consideration when modeling human attention. Instead of only computing image salience for average observers, we consider the interpersonal variation in the viewing behaviors of observers with different personal traits and backgrounds. We present an enriched derivative of the GAN, which is able to generate personalized saliency predictions when fed with image stimuli and specific information about the observer. Our model contains a generator that produces grayscale saliency heat maps based on the image and an observer label. The generator is paired with an adversarial discriminator that learns to distinguish generated salience from ground-truth salience. The discriminator also takes the observer label as an input, which contributes to the personalization ability of our approach. We evaluate the performance of our personalized salience model by comparison with a benchmark model and with other un-personalized predictions, and illustrate improvements in prediction accuracy for all tested observer groups.
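
The abstract says that both the generator and the discriminator receive an observer label, but not how the label is injected. A minimal sketch of one common conditioning scheme, assuming a one-hot label tiled into constant planes and appended to the image channels; the function names and the plain nested-list image layout are hypothetical and chosen only for illustration:

```python
def one_hot(index, num_classes):
    """Return a one-hot list encoding the observer label."""
    if not 0 <= index < num_classes:
        raise ValueError("observer index out of range")
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def condition_on_observer(image, observer_index, num_observers):
    """Append one constant plane per observer class to an image.

    image: list of channels, each a list of rows (H x W floats).
    Returns a new list with len(image) + num_observers channels.
    Assumption: label tiling is one possible conditioning scheme,
    not necessarily the one used in the paper.
    """
    height, width = len(image[0]), len(image[0][0])
    label = one_hot(observer_index, num_observers)
    # Each label entry becomes an H x W plane filled with that entry.
    planes = [[[value] * width for _ in range(height)] for value in label]
    copied = [[row[:] for row in channel] for channel in image]
    return copied + planes


# A 1-channel 2x3 "image" conditioned on observer 1 of 3.
gray = [[[0.2, 0.4, 0.6], [0.1, 0.3, 0.5]]]
conditioned = condition_on_observer(gray, observer_index=1, num_observers=3)
```

The same conditioned tensor shape could be fed to a discriminator together with a candidate saliency map, which is how the label would reach both networks under this assumption.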

    Few-shot Personalized Saliency Prediction Based on Inter-personnel Gaze Patterns

    This paper presents few-shot personalized saliency prediction based on inter-personnel gaze patterns. In contrast to a general saliency map, a personalized saliency map (PSM) has great potential, since it indicates person-specific visual attention that is useful for obtaining individual visual preferences from the heterogeneity of gazed areas. PSM prediction is needed to acquire the PSM for an unseen image, but it remains a challenging task due to the complexity of individual gaze patterns. Although eye-tracking data obtained from each person is necessary to construct PSMs that model individual gaze patterns across various images, it is difficult to acquire massive amounts of such data. One solution for efficient PSM prediction from a limited amount of data is the effective use of eye-tracking data obtained from other persons. In this paper, to make effective use of the PSMs of other persons, we focus on the selection of images for acquiring eye-tracking data and on the preservation of the structural information of other persons' PSMs. The experimental results confirm that these two focuses are effective for PSM prediction with a limited amount of eye-tracking data. Comment: 5 pages, 3 figures

    A scalable saliency-based Feature selection method with instance level information

    Classic feature selection techniques remove features that are either irrelevant or redundant, yielding a subset of relevant features that supports better knowledge extraction. This allows the creation of compact models that are easier to interpret. Most of these techniques work over the whole dataset, but they are unable to provide the user with useful information when only instance-level information is needed. In short, given any example, classic feature selection algorithms do not indicate which features are most relevant for that particular sample. This work aims to overcome this handicap by developing a novel feature selection method, called Saliency-based Feature Selection (SFS), based on deep-learning saliency techniques. Our experimental results show that this algorithm can be used successfully not only with neural networks, but also with any architecture trained using gradient descent techniques.
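
The abstract does not spell out how SFS computes its scores. As a rough illustration of instance-level saliency in general, the sketch below ranks the input features of a tiny two-layer ReLU network by the absolute gradient of the output with respect to each feature. The network, its weights, and the function names are hypothetical; this is plain gradient saliency, not necessarily the SFS algorithm itself:

```python
def forward(x, w1, b1, w2, b2):
    """Two-layer ReLU network with scalar output.

    Returns (output, hidden pre-activations).
    """
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(w1, b1)]
    hidden = [max(0.0, p) for p in pre]
    out = sum(w * h for w, h in zip(w2, hidden)) + b2
    return out, pre


def input_saliency(x, w1, b1, w2, b2):
    """Absolute gradient of the output with respect to each input feature."""
    _, pre = forward(x, w1, b1, w2, b2)
    grads = [0.0] * len(x)
    for row, p, w_out in zip(w1, pre, w2):
        if p > 0:  # ReLU passes gradient only through active units
            for j, w_in in enumerate(row):
                grads[j] += w_out * w_in
    return [abs(g) for g in grads]


def top_features(x, params, k):
    """Indices of the k most salient features for this one instance."""
    sal = input_saliency(x, *params)
    return sorted(range(len(x)), key=lambda j: sal[j], reverse=True)[:k]


# Hypothetical weights: hidden unit 0 reads feature 0, unit 1 reads -feature 1.
params = ([[1.0, 0.0, 0.0], [0.0, -1.0, 0.0]], [0.0, 0.0], [1.0, 2.0], 0.0)
x_a = [2.0, 1.0, 5.0]    # only hidden unit 0 is active
x_b = [-1.0, -3.0, 5.0]  # only hidden unit 1 is active
```

With these weights, `top_features(x_a, params, 1)` selects feature 0 while `top_features(x_b, params, 1)` selects feature 1, which shows how the ranking can differ per instance even though the model is fixed.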

    A novel user-centered design for personalized video summarization

    In the past, several automatic video summarization systems have been proposed to generate video summaries. However, a generic video summary that is generated based only on audio, visual and textual saliencies will not satisfy every user. This paper proposes a novel system for generating semantically meaningful personalized video summaries, which are tailored to the individual user's preferences over video semantics. Each video shot is represented using a semantic multinomial, which is a vector of posterior semantic concept probabilities. The proposed system stitches together a video summary based on the summary time span and the top-ranked shots that are semantically relevant to the user's preferences. The proposed summarization system is evaluated using both quantitative and subjective evaluation metrics. The experimental results on the performance of the proposed video summarization system are encouraging.
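
The shot selection described in this abstract can be sketched as follows, assuming that relevance is the dot product between a shot's semantic multinomial and a user preference vector, and that shots are picked greedily until the summary time span is used up. Both choices, and all names and numbers below, are illustrative assumptions rather than the paper's actual scoring or stitching rule:

```python
def relevance(shot_multinomial, user_preference):
    """Dot product of concept probabilities and user preference weights."""
    return sum(p * q for p, q in zip(shot_multinomial, user_preference))


def summarize(shots, user_preference, time_span):
    """Pick top-ranked shots that fit in time_span seconds.

    shots: list of (start_time, duration, semantic_multinomial).
    Returns the chosen shots in chronological order.
    """
    ranked = sorted(
        shots, key=lambda s: relevance(s[2], user_preference), reverse=True
    )
    chosen, used = [], 0.0
    for shot in ranked:
        if used + shot[1] <= time_span:  # skip shots that would overflow
            chosen.append(shot)
            used += shot[1]
    return sorted(chosen, key=lambda s: s[0])


# Hypothetical concepts: (sports, news, music).
shots = [
    (0.0, 10.0, [0.7, 0.2, 0.1]),
    (10.0, 20.0, [0.1, 0.8, 0.1]),
    (30.0, 10.0, [0.5, 0.2, 0.3]),
    (40.0, 5.0, [0.2, 0.2, 0.6]),
]
preference = [1.0, 0.0, 0.5]  # likes sports, some interest in music
summary = summarize(shots, preference, time_span=25.0)
```

Here the news-heavy shot starting at 10.0 ranks last and does not fit the remaining budget, so the summary keeps the shots starting at 0.0, 30.0 and 40.0.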

    Deep learning investigation for chess player attention prediction using eye-tracking and game data

    This article reports on an investigation of the use of convolutional neural networks to predict the visual attention of chess players. The visual attention model described in this article has been created to generate saliency maps that capture hierarchical and spatial features of the chessboard, in order to predict the fixation probability for individual pixels. Using a skip-layer autoencoder architecture with a unified decoder, we are able to use multiscale features to predict the saliency of parts of the board at different scales, showing multiple relations between pieces. We have used scan path and fixation data from players engaged in solving chess problems to compute 6600 saliency maps associated with the corresponding chess piece configurations. This corpus is supplemented with synthetically generated data from actual games gathered from an online chess platform. Experiments conducted using both scan paths from chess players and the CAT2000 saliency dataset of natural images highlight several results. Deep features, pretrained on natural images, were found to be helpful in training visual attention prediction for chess. The proposed neural network architecture is able to generate meaningful saliency maps on unseen chess configurations with good scores on standard metrics. This work provides a baseline for future work on visual attention prediction in similar contexts.
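
The abstract does not say how the ground-truth saliency maps were computed from fixation data. A common approach, used here purely as an assumption, is to place a Gaussian at each fixation and normalize the result so that it sums to 1 and can be read as a per-cell fixation probability. The 8x8 grid, the `sigma` value, and the function name are hypothetical:

```python
import math


def fixation_map(fixations, height, width, sigma=1.0):
    """Build a normalized saliency map from (row, col) fixation points.

    Each fixation contributes an isotropic Gaussian; the map is divided
    by its total so that all cells sum to 1.
    """
    if not fixations:
        raise ValueError("at least one fixation is required")
    grid = [[0.0] * width for _ in range(height)]
    for fy, fx in fixations:
        for y in range(height):
            for x in range(width):
                dist_sq = (y - fy) ** 2 + (x - fx) ** 2
                grid[y][x] += math.exp(-dist_sq / (2.0 * sigma ** 2))
    total = sum(sum(row) for row in grid)
    return [[value / total for value in row] for row in grid]


# Two fixations on one cell and one on another, on a coarse 8x8 grid.
saliency = fixation_map([(2, 2), (2, 2), (5, 6)], height=8, width=8)
peak = max(
    ((y, x) for y in range(8) for x in range(8)),
    key=lambda cell: saliency[cell[0]][cell[1]],
)
```

The cell that was fixated twice ends up as the peak of the map, and a wider `sigma` would spread the probability mass over neighboring cells.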