SAVOIAS: A Diverse, Multi-Category Visual Complexity Dataset
Visual complexity identifies the level of intricacy and details in an image
or the level of difficulty to describe the image. It is an important concept in
a variety of areas such as cognitive psychology, computer vision and
visualization, and advertisement. Yet, efforts to create large, downloadable
image datasets with diverse content and unbiased groundtruthing are lacking. In
this work, we introduce Savoias, a visual complexity dataset that comprises
more than 1,400 images from seven image categories relevant to the above
research areas, namely Scenes, Advertisements, Visualization and infographics,
Objects, Interior design, Art, and Suprematism. The images in each category
portray diverse characteristics including various low-level and high-level
features, objects, backgrounds, textures and patterns, text, and graphics. The
ground truth for Savoias is obtained by crowdsourcing more than 37,000 pairwise
comparisons of images using the forced-choice methodology and with more than
1,600 contributors. The resulting relative scores are then converted to
absolute visual complexity scores using the Bradley-Terry method and matrix
completion. When applying five state-of-the-art algorithms to analyze the
visual complexity of the images in the Savoias dataset, we found that the
scores obtained from these baseline tools only correlate well with crowdsourced
labels for abstract patterns in the Suprematism category (Pearson correlation
r=0.84). For the other categories, in particular Objects and Advertisements, the
correlations were low (r=0.3 and r=0.56, respectively). These findings suggest
that (1) state-of-the-art approaches are
mostly insufficient and (2) Savoias enables category-specific method
development, which is likely to improve the impact of visual complexity
analysis on specific application areas, including computer vision.
Comment: 10 pages, 4 figures, 4 tables
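The pairwise-comparison-to-absolute-score conversion described in the abstract can be sketched with a minimal Bradley-Terry fit. This is a standard minorization-maximization iteration; the function name, win-matrix layout, and convergence settings are illustrative rather than taken from the paper, and the paper's additional matrix-completion step is omitted:

```python
import numpy as np

def bradley_terry_scores(wins, n_iters=200, tol=1e-8):
    """Estimate Bradley-Terry strengths from a pairwise win-count matrix.

    wins[i, j] = number of times item i was preferred over item j.
    Returns a strength vector normalized to sum to 1, fitted with the
    standard minorization-maximization (MM) update.
    """
    n = wins.shape[0]
    p = np.ones(n)
    for _ in range(n_iters):
        p_new = np.empty(n)
        for i in range(n):
            num = wins[i].sum()  # total wins of item i
            den = 0.0
            for j in range(n):
                if j != i:
                    n_ij = wins[i, j] + wins[j, i]  # comparisons of i vs j
                    if n_ij:
                        den += n_ij / (p[i] + p[j])
            p_new[i] = num / den if den > 0 else p[i]
        p_new /= p_new.sum()
        if np.abs(p_new - p).max() < tol:
            return p_new
        p = p_new
    return p
```

Items that win their comparisons more often receive higher strengths, which can then serve as absolute complexity scores on a common scale.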
A Universal Image Attractiveness Ranking Framework
We propose a new framework to rank image attractiveness using a novel
pairwise deep network trained with a large set of side-by-side multi-labeled
image pairs from a web image index. The judges only provide relative ranking
between two images without the need to directly assign an absolute score, or
rate any predefined image attribute, thus making the rating more intuitive and
accurate. We investigate a deep attractiveness rank net (DARN), a combination
of deep convolutional neural network and rank net, to directly learn an
attractiveness score mean and variance for each image and the underlying
criteria the judges use to label each pair. The extension of this model
(DARN-V2) is able to adapt to an individual judge's personal preference. We also
show that the attractiveness of search results is significantly improved by using
this attractiveness information in a real commercial search engine. We evaluate
our model against other state-of-the-art models on our side-by-side web test
data and another public aesthetic data set. With far fewer judgments (1M vs.
50M), our model outperforms the alternatives on side-by-side labeled data and is
comparable on data labeled by absolute score.
Comment: Accepted by the 2019 Winter Conference on Applications of Computer
Vision (WACV)
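The rank-net component described above learns from relative judgments only. The core idea can be sketched as a logistic loss on the difference of two predicted scores, as in the classic RankNet formulation; the function name and tie convention here are illustrative, and DARN's per-image variance and judge modeling are not shown:

```python
import math

def ranknet_pair_loss(s_i, s_j, label):
    """RankNet-style pairwise loss on two predicted attractiveness scores.

    label = 1.0 if image i was judged better, 0.0 if image j was,
    0.5 for a tie. The model's probability that i beats j is the
    logistic of the score difference; the loss is the cross-entropy
    between that probability and the label.
    """
    p_ij = 1.0 / (1.0 + math.exp(-(s_i - s_j)))  # P(i preferred over j)
    eps = 1e-12  # numerical guard against log(0)
    return -(label * math.log(p_ij + eps)
             + (1.0 - label) * math.log(1.0 - p_ij + eps))
```

Because the loss depends only on score differences, judges never need to assign absolute scores, which matches the side-by-side labeling setup in the abstract.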
Ranking News-Quality Multimedia
News editors need to find the photos that best illustrate a news piece and
fulfill news-media quality standards, while being pressed to also find the most
recent photos of live events. Recently, it became common to use social-media
content in the context of news media for its unique value in terms of immediacy
and quality. Consequently, the number of images to be considered and filtered
through is now too large to be handled by a person. To aid the news editor in
this process, we propose a framework designed to deliver high-quality,
news-press type photos to the user. The framework, composed of two parts, is
based on a ranking algorithm tuned to rank professional media highly and a
visual SPAM detection module designed to filter out low-quality media. The core
ranking algorithm is leveraged by aesthetic, social and deep-learning semantic
features. Evaluation showed that the proposed framework is effective at finding
high-quality photos (true positives), achieving a retrieval MAP of 64.5% and a
classification precision of 70%.
Comment: To appear in ICMR'1
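The retrieval MAP figure reported above is the mean of per-query average precisions over ranked result lists; a minimal sketch with binary relevance labels (function names are illustrative):

```python
def average_precision(ranked_relevance):
    """Average precision for one ranked list of binary relevance labels.

    ranked_relevance: sequence of 0/1 labels in ranked order, where 1
    marks a relevant (e.g. high-quality, news-press type) photo.
    """
    hits, precisions = 0, []
    for k, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / k)  # precision at each relevant hit
    return sum(precisions) / hits if hits else 0.0

def mean_average_precision(runs):
    """MAP: mean of average precision over several queries."""
    return sum(average_precision(r) for r in runs) / len(runs)
```

A ranker that pushes professional, high-quality media toward the top of each list yields hits at small ranks and therefore a higher MAP.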
Aesthetic-Driven Image Enhancement by Adversarial Learning
We introduce EnhanceGAN, an adversarial learning based model that performs
automatic image enhancement. Traditional image enhancement frameworks typically
involve training models in a fully-supervised manner, which require expensive
annotations in the form of aligned image pairs. In contrast to these
approaches, our proposed EnhanceGAN only requires weak supervision (binary
labels on image aesthetic quality) and is able to learn enhancement operators
for the task of aesthetic-based image enhancement. In particular, we show the
effectiveness of a piecewise color enhancement module trained with weak
supervision, and extend the proposed EnhanceGAN framework to learning a deep
filtering-based aesthetic enhancer. The full differentiability of our image
enhancement operators enables the training of EnhanceGAN in an end-to-end
manner. We further demonstrate the capability of EnhanceGAN in learning
aesthetic-based image cropping without any groundtruth cropping pairs. Our
weakly-supervised EnhanceGAN reports competitive quantitative results on
aesthetic-based color enhancement as well as automatic image cropping, and a
user study confirms that our image enhancement results are on par with, or even
preferred over, professional enhancements.
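The piecewise color enhancement mentioned above can be loosely illustrated by a monotone piecewise-linear tone curve applied per channel, whose control-point parameters a network could learn end-to-end. This is a hypothetical stand-in written with NumPy; the parameterization is an assumption for illustration, not EnhanceGAN's actual operator:

```python
import numpy as np

def piecewise_color_curve(image, knots):
    """Apply a monotone piecewise-linear tone curve to each RGB channel.

    image: float array in [0, 1] with shape (H, W, 3).
    knots: non-negative array of shape (3, K); cumulative sums give the
           curve heights at K control points spaced evenly over [0, 1]
           (a stand-in for the learnable parameters of the operator).
    """
    heights = np.cumsum(np.maximum(knots, 0.0), axis=1)
    heights = heights / heights[:, -1:]  # curve ends at 1
    heights = np.concatenate([np.zeros((3, 1)), heights], axis=1)  # starts at 0
    xs = np.linspace(0.0, 1.0, heights.shape[1])
    out = np.empty_like(image)
    for c in range(3):  # interpolate each channel through its curve
        out[..., c] = np.interp(image[..., c], xs, heights[c])
    return out
```

Equal knot increments reduce the curve to the identity; skewing mass toward the first knots brightens shadows, toward the last knots deepens them, which is the kind of per-channel adjustment a weakly supervised enhancer could learn.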