Ensemble Network for Ranking Images Based on Visual Appeal
We propose a computational framework for ranking images (group photos in
particular) taken at the same event within a short time span. The ranking is
expected to correspond with human perception of overall appeal of the images.
We hypothesize, and provide evidence through subjective analysis, that the
factors that make an image appealing to humans are its emotional content,
aesthetics and image quality. We propose a network which is an ensemble of three information
channels, each predicting a score corresponding to one of the three visual
appeal factors. For group emotion estimation, we propose a convolutional neural
network (CNN) based architecture that predicts group emotion from images. This
new architecture forces the network to emphasize the important regions
in the images, and achieves results comparable to the state of the art. Next,
we develop a network for the image ranking task that combines group emotion,
aesthetics and image quality scores. Owing to the unavailability of suitable
databases, we created a new database of manually annotated group photos taken
during various social events. We present experimental results on this database
and on other benchmark databases where available. Overall, our experiments show
that the proposed framework can reliably predict the overall appeal of images,
with results closely corresponding to human rankings.
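
As a rough illustration of the ensemble idea described above, the sketch below combines three per-image scores (group emotion, aesthetics, image quality) into one appeal score and sorts the images by it. It is not the authors' network: the abstract does not say how the three channels are fused, so the fixed weighted sum, the equal default weights, the names `ChannelScores`, `appeal_score` and `rank_images`, and all score values are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ChannelScores:
    """Hypothetical per-image outputs of the three information channels."""
    name: str
    emotion: float     # group emotion score (assumed to lie in [0, 1])
    aesthetics: float  # aesthetics score (assumed to lie in [0, 1])
    quality: float     # image quality score (assumed to lie in [0, 1])


def appeal_score(s: ChannelScores,
                 weights: Sequence[float] = (1 / 3, 1 / 3, 1 / 3)) -> float:
    """Fuse the three channel scores with a weighted sum.

    Assumption: a fixed linear combination stands in for the learned
    ranking network, whose fusion method the abstract does not specify.
    """
    w_emotion, w_aesthetics, w_quality = weights
    return (w_emotion * s.emotion
            + w_aesthetics * s.aesthetics
            + w_quality * s.quality)


def rank_images(images: List[ChannelScores],
                weights: Sequence[float] = (1 / 3, 1 / 3, 1 / 3)
                ) -> List[ChannelScores]:
    """Return the images sorted from most to least appealing."""
    return sorted(images, key=lambda s: appeal_score(s, weights), reverse=True)


# Made-up scores for three photos taken at the same event.
photos = [
    ChannelScores("a.jpg", emotion=0.9, aesthetics=0.6, quality=0.6),
    ChannelScores("b.jpg", emotion=0.3, aesthetics=0.3, quality=0.3),
    ChannelScores("c.jpg", emotion=0.5, aesthetics=0.5, quality=0.5),
]

ranked = rank_images(photos)
print([s.name for s in ranked])
```

In a learned setting, the weights (or a small fusion network in their place) would be fitted to human rankings rather than fixed by hand.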