Could you guess an interesting movie from the posters?: An evaluation of vision-based features on movie poster database
In this paper, we aim to estimate the winner of a worldwide film festival from the exhibited movie poster. The task is extremely challenging because the estimation must be made using only the exhibited movie poster, without any film ratings or box-office takings. To tackle this problem, we have created a new database consisting of all movie posters included in the four biggest film festivals. The movie poster database (MPDB) contains over 80 years of historic movies nominated for a movie award each year. We apply several feature types, namely hand-crafted, mid-level, and deep features, to extract various kinds of information from a movie poster. Our experiments yielded suggestive findings; for example, Academy Award estimation achieves better rates with a color feature, and a facial emotion feature generally performs well on the MPDB. The paper suggests the possibility of modeling human taste for movie recommendation.

Comment: 4 pages, 4 figures
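The color features mentioned above can be illustrated with a quantized RGB histogram, a common hand-crafted color descriptor. This is a minimal plain-Python sketch; the paper's actual feature pipeline is not specified in this abstract, so the 8-bins-per-channel quantization and the pixel-list input format are assumptions:

```python
def color_histogram(pixels, bins_per_channel=8):
    """Quantize each RGB channel into `bins_per_channel` bins and
    return a normalized joint color histogram (a schematic hand-crafted
    color feature; the exact feature used in the paper may differ)."""
    hist = [0.0] * (bins_per_channel ** 3)
    step = 256 // bins_per_channel
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1
    total = sum(hist) or 1.0
    return [h / total for h in hist]

# Toy "poster": four pure-red pixels and four pure-blue pixels
poster = [(255, 0, 0)] * 4 + [(0, 0, 255)] * 4
feat = color_histogram(poster)
print(len(feat))   # 512 bins (8**3)
print(max(feat))   # 0.5 -- half the pixels fall into a single bin
```

The resulting fixed-length vector can be fed to any standard classifier, which is one way such color cues could be compared against mid-level and deep features.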
Application for the automatic prediction of the acquisition date of an image
This report presents a project whose main goal is to create an application that, given an input image, returns the decade in which the image was acquired. A further aim is to improve the results obtained in a previous study on classifying images by their acquisition date. The report describes the earlier study on this subject, the method chosen to improve its results, how the application was built, and the steps the application follows during execution.
Seeing Behind the Camera: Identifying the Authorship of a Photograph
We introduce the novel problem of identifying the photographer behind a
photograph. To explore the feasibility of current computer vision techniques to
address this problem, we created a new dataset of over 180,000 images taken by
41 well-known photographers. Using this dataset, we examined the effectiveness
of a variety of features (low and high-level, including CNN features) at
identifying the photographer. We also trained a new deep convolutional neural
network for this task. Our results show that high-level features greatly
outperform low-level features. We provide qualitative results using these
learned models that give insight into our method's ability to distinguish
between photographers, and allow us to draw interesting conclusions about what
specific photographers shoot. We also demonstrate two applications of our
method.

Comment: Dataset downloadable at http://www.cs.pitt.edu/~chris/photographer. To appear in CVPR 201
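The comparison of feature types for photographer identification can be illustrated with a nearest-centroid classifier over feature vectors. This is a schematic sketch only; the abstract does not specify the paper's classifiers, and the photographer names and feature vectors below are hypothetical:

```python
import math

def nearest_centroid(train, query):
    """Assign `query` to the photographer whose mean feature vector
    (centroid) is closest in Euclidean distance. `train` maps each
    photographer name to a list of feature vectors."""
    def centroid(vecs):
        dim = len(vecs[0])
        return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    centroids = {name: centroid(vecs) for name, vecs in train.items()}
    return min(centroids, key=lambda name: dist(centroids[name], query))

train = {
    "photographer_a": [[0.9, 0.1], [1.1, 0.0]],  # hypothetical feature vectors
    "photographer_b": [[0.0, 1.0], [0.1, 0.9]],
}
print(nearest_centroid(train, [0.8, 0.2]))  # photographer_a
```

Swapping in low-level versus high-level (e.g. CNN) feature vectors for the same classifier is one simple way the relative effectiveness of feature types could be measured.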
Visual7W: Grounded Question Answering in Images
We have seen great progress in basic perceptual tasks such as object
recognition and detection. However, AI models still fail to match humans in
high-level vision tasks due to the lack of capacities for deeper reasoning.
Recently the new task of visual question answering (QA) has been proposed to
evaluate a model's capacity for deep image understanding. Previous works have
established a loose, global association between QA sentences and images.
However, many questions and answers, in practice, relate to local regions in
the images. We establish a semantic link between textual descriptions and image
regions by object-level grounding. It enables a new type of QA with visual
answers, in addition to textual answers used in previous work. We study the
visual QA tasks in a grounded setting with a large collection of 7W
multiple-choice QA pairs. Furthermore, we evaluate human performance and
several baseline models on the QA tasks. Finally, we propose a novel LSTM model
with spatial attention to tackle the 7W QA tasks.

Comment: CVPR 201
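The core idea behind spatial attention is a softmax-weighted pooling over image-region features, so that regions relevant to the question dominate the pooled representation. The following is a schematic plain-Python sketch of that pooling step, not the paper's exact LSTM formulation; the region features and scores are illustrative:

```python
import math

def spatial_attention_pool(region_feats, scores):
    """Pool a list of per-region feature vectors into one vector,
    weighting each region by a softmax over its relevance score
    (a schematic sketch of spatial attention)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]        # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(region_feats[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, region_feats))
              for d in range(dim)]
    return pooled, weights

# Two toy regions; the first scores higher, so it dominates the pooled vector
feats = [[1.0, 0.0], [0.0, 1.0]]
pooled, weights = spatial_attention_pool(feats, [2.0, 0.0])
print(weights)  # first region receives most of the attention mass
```

In a full model the scores would themselves be predicted from the question and image (e.g. by an LSTM), and the attended regions double as the grounded visual answers described above.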
Location recognition over large time lags
Would it be possible to automatically associate ancient pictures with modern ones and create fancy cultural heritage city maps? We introduce here the task of recognizing the location depicted in an old photo given modern annotated images collected from the Internet. We present an extensive analysis of different features, looking for those that are most discriminative and most robust to the image variability induced by large time lags. Moreover, we show that the described task benefits from domain adaptation.
“When Was This Picture Taken?” – Image Date Estimation in the Wild
The problem of automatically estimating the creation date of photos has rarely been addressed in the past. In this paper, we introduce a novel dataset, Date Estimation in the Wild, for the task of predicting the acquisition year of images captured in the period from 1930 to 1999. In contrast to previous work, the dataset is neither restricted to color photography nor to specific visual concepts. The dataset consists of more than one million images crawled from Flickr and contains a large number of different motives. In addition, we propose two baseline approaches for regression and classification, respectively, relying on state-of-the-art deep convolutional neural networks. Experimental results demonstrate that these baselines are already superior to annotations of untrained humans.
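The two baseline formulations above differ mainly in their targets: classification predicts a discrete period class, while regression predicts the year directly and is typically scored by mean absolute error. A minimal sketch of that target encoding and metric, assuming decade-level classes for the 1930-1999 range (the actual class granularity used by the paper is an assumption here):

```python
def year_to_decade_class(year, start=1930, end=1999):
    """Map an acquisition year in [start, end] to a decade class index
    (0 = 1930s, ..., 6 = 1990s), mirroring a classification-style target
    for the date-estimation task."""
    if not start <= year <= end:
        raise ValueError(f"year {year} outside {start}-{end}")
    return (year - start) // 10

def mean_absolute_error(true_years, predicted_years):
    """Mean absolute error in years, a natural metric for the
    regression-style baseline."""
    return sum(abs(t - p) for t, p in zip(true_years, predicted_years)) / len(true_years)

print(year_to_decade_class(1987))                        # 5 (the 1980s)
print(mean_absolute_error([1930, 1999], [1940, 1989]))   # 10.0
```

Either target could then be attached as the output head of a standard convolutional network trained on the images.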