462 research outputs found

How to Make an Image More Memorable? A Deep Style Transfer Approach

Authors: Alameda-Pineda Xavier; Majtanovic Cveta; Ricci Elisa; Sebe Nicu; Siarohin Aliaksandr; Zen Gloria
Publication date: 06/04/2017

Recent works have shown that it is possible to automatically predict intrinsic image properties like memorability. In this paper, we take a step forward by addressing the question: "Can we make an image more memorable?". Methods for automatically increasing image memorability would have an impact in many application fields like education, gaming or advertising. Our work is inspired by the popular editing-by-applying-filters paradigm adopted in photo editing applications, like Instagram and Prisma. In this context, the problem of increasing image memorability maps to that of retrieving "memorabilizing" filters or style "seeds". Still, users generally have to go through most of the available filters before finding the desired solution, thus turning the editing process into a resource- and time-consuming task. In this work, we show that it is possible to automatically retrieve the best style seeds for a given image, thus remarkably reducing the number of human attempts needed to find a good match. Our approach leverages recent advances in the field of image synthesis and adopts a deep architecture for generating a memorable picture from a given input image and a style seed. Importantly, to automatically select the best style, a novel learning-based solution, also relying on deep models, is proposed. Our experimental evaluation, conducted on publicly available benchmarks, demonstrates the effectiveness of the proposed approach for generating memorable images through automatic style seed selection.

Comment: Accepted at ACM ICMR 201

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Understanding and Predicting Image Memorability at a Large Scale

Authors: Khosla Aditya; Oliva Aude; Raju Akhil G.; Torralba Antonio
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/12/2015

Progress in estimating visual memorability has been limited by the small scale and lack of variety of benchmark data. Here, we introduce a novel experimental procedure to objectively measure human memory, allowing us to build LaMem, the largest annotated image memorability dataset to date (containing 60,000 images from diverse sources). Using Convolutional Neural Networks (CNNs), we show that fine-tuned deep features outperform all other features by a large margin, reaching a rank correlation of 0.64, near human consistency (0.68). Analysis of the responses of the high-level CNN layers shows which objects and regions are positively, and negatively, correlated with memorability, allowing us to create memorability maps for each image and provide a concrete method to perform image memorability manipulation. This work demonstrates that one can now robustly estimate the memorability of images from many different classes, positioning memorability and deep memorability features as prime candidates to estimate the utility of information for cognitive systems. Our model and data are available at: http://memorability.csail.mit.edu.

National Science Foundation (U.S.) (Grant 1532591); McGovern Institute for Brain Research at MIT. Neurotechnology (MINT) Program; Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory. MIT Big Data Initiative; Google (Firm); Xerox Corporation
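The abstract above reports performance as a rank correlation between predicted and human memorability scores. As a rough illustration of that metric only, here is a minimal sketch of Spearman's rank correlation in plain Python. The function names and the sample scores are hypothetical and are not taken from the paper or the LaMem dataset; the abstract does not say which implementation the authors used.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    std_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (std_x * std_y)


# Hypothetical scores for four images (illustrative values only).
predicted = [0.9, 0.4, 0.7, 0.2]
human = [0.8, 0.5, 0.6, 0.65]
rho = spearman(predicted, human)
```

Because only the ordering of scores matters, a predictor can reach a high rank correlation even when its raw scores are on a different scale from the human measurements.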

DSpace@MIT

Crossref

Viraliency: Pooling Local Virality

Authors: Alameda-Pineda Xavier; Pilzer Andrea; Ricci Elisa; Sebe Nicu; Xu Dan
Publication date: 15/03/2017

In our overly connected world, the automatic recognition of virality - the quality of an image or video to be rapidly and widely spread in social networks - is of crucial importance, and has recently awakened the interest of the computer vision community. Concurrently, recent progress in deep learning architectures showed that global pooling strategies allow the extraction of activation maps, which highlight the parts of the image most likely to contain instances of a certain class. We extend this concept by introducing a pooling layer that learns the size of the support area to be averaged: the learned top-N average (LENA) pooling. We hypothesize that the latent concepts (feature maps) describing virality may require such a rich pooling strategy. We assess the effectiveness of the LENA layer by appending it on top of a convolutional siamese architecture and evaluate its performance on the task of predicting and localizing virality. We report experiments on two publicly available datasets annotated for virality and show that our method outperforms state-of-the-art approaches.

Comment: Accepted at IEEE CVPR 201
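To make the idea of a top-N average concrete, the sketch below pools a single feature map by averaging its N largest activations. This is only an illustration under stated assumptions: N is a fixed argument here, whereas the abstract says the LENA layer learns the size of the support area, and the function name and toy feature map are hypothetical rather than the authors' implementation.

```python
def top_n_average_pool(feature_map, n):
    """Average the n largest activations of a 2D feature map (list of rows).

    Assumption: n is fixed by the caller; the learned choice of n described
    in the abstract is not modelled here.
    """
    activations = sorted((v for row in feature_map for v in row), reverse=True)
    if not 1 <= n <= len(activations):
        raise ValueError("n must be between 1 and the number of activations")
    return sum(activations[:n]) / n


# Toy 2x2 feature map (illustrative values only).
toy_map = [[1.0, 2.0],
           [3.0, 4.0]]

pooled_max = top_n_average_pool(toy_map, 1)   # same as global max pooling
pooled_top2 = top_n_average_pool(toy_map, 2)  # mean of the two largest values
pooled_avg = top_n_average_pool(toy_map, 4)   # same as global average pooling
```

With n = 1 the function reduces to global max pooling, and with n equal to the number of activations it reduces to global average pooling, so a top-N average sits between these two common strategies.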

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Evaluating Content-centric vs User-centric Ad Affect Recognition

Authors: Gullapuram Shruti Shriya; Kankanhalli Mohan; Katti Harish; Shukla Abhinav; Subramanian Ramanathan; Yadati Karthik
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2017

Despite the fact that advertisements (ads) often include strongly emotional content, very little work has been devoted to affect recognition (AR) from ads. This work explicitly compares content-centric and user-centric ad AR methodologies, and evaluates the impact of enhanced AR on computational advertising via a user study. Specifically, we (1) compile an affective ad dataset capable of evoking coherent emotions across users; (2) explore the efficacy of content-centric convolutional neural network (CNN) features for encoding emotions, and show that CNN features outperform low-level emotion descriptors; (3) examine user-centered ad AR by analyzing electroencephalogram (EEG) responses acquired from eleven viewers, and find that EEG signals encode emotional information better than content descriptors; and (4) investigate the relationship between objective AR and subjective viewer experience while watching an ad-embedded online video stream, based on a study involving 12 users. To our knowledge, this is the first work to (a) expressly compare user vs content-centered AR for ads, and (b) study the relationship between modeling of ad emotions and its impact on a real-life advertising application.

Comment: Accepted at the ACM International Conference on Multimodal Interaction (ICMI) 201

arXiv.org e-Print Archive

Crossref

University of Canberra Research Repository

Enlighten

Is Image Memorability Prediction Solved?

Authors: Perera Shay; Tal Ayellet; Zelnik-Manor Lihi
Publication date: 31/01/2019

This paper deals with the prediction of the memorability of a given image. We start by proposing an algorithm that reaches human-level performance on the LaMem dataset - the only large-scale benchmark for memorability prediction. The suggested algorithm is based on three observations we make regarding convolutional neural networks (CNNs) that affect memorability prediction. Having reached human-level performance, we were humbled, and asked ourselves whether indeed we have resolved memorability prediction - and answered this question in the negative. We studied a few factors and made some recommendations that should be taken into account when designing the next benchmark.

arXiv.org e-Print Archive

Crossref