Fast-AT: Fast Automatic Thumbnail Generation using Deep Neural Networks
Fast-AT is an automatic thumbnail generation system based on deep neural
networks. It is a fully-convolutional deep neural network, which learns
specific filters for thumbnails of different sizes and aspect ratios. During
inference, the appropriate filter is selected depending on the dimensions of
the target thumbnail. Unlike most previous work, Fast-AT does not utilize
saliency but addresses the problem directly. In addition, it eliminates the
need to conduct region search on the saliency map. The model generalizes to
thumbnails of different sizes including those with extreme aspect ratios and
can generate thumbnails in real time. A data set of more than 70,000 thumbnail
annotations was collected to train Fast-AT. We show competitive results in
comparison to existing techniques.
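The size-dependent filter selection described in the abstract can be sketched as follows. This is a minimal illustration, not Fast-AT's actual implementation: the bucket list, function name, and nearest-ratio rule are assumptions.

```python
# Hypothetical sketch of Fast-AT's inference-time filter selection: the
# network keeps one learned filter bank per supported thumbnail geometry,
# and inference picks the bank whose aspect ratio is closest to the target.
# The supported-ratio list below is an illustrative assumption.

SUPPORTED_RATIOS = [1.0, 4 / 3, 3 / 2, 16 / 9, 9 / 16]

def select_filter_bank(target_w: int, target_h: int) -> float:
    """Return the supported aspect ratio closest to the requested thumbnail."""
    target_ratio = target_w / target_h
    return min(SUPPORTED_RATIOS, key=lambda r: abs(r - target_ratio))

wide = select_filter_bank(320, 180)    # 16:9 request -> 16/9 bank
square = select_filter_bank(100, 100)  # square request -> 1:1 bank
```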
User Constrained Thumbnail Generation using Adaptive Convolutions
Thumbnails are widely used all over the world as a preview for digital
images. In this work we propose a deep neural framework to generate thumbnails
of any size and aspect ratio, even for unseen values during training, with high
accuracy and precision. We use Global Context Aggregation (GCA) and a modified
Region Proposal Network (RPN) with adaptive convolutions to generate thumbnails
in real time. GCA is used to selectively attend and aggregate the global
context information from the entire image while the RPN is used to predict
candidate bounding boxes for the thumbnail image. Adaptive convolution
eliminates the problem of generating thumbnails of various aspect ratios by
using filter weights dynamically generated from the aspect ratio information.
The experimental results indicate the superior performance of the proposed
model over existing state-of-the-art techniques.
Comment: International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 201
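The adaptive-convolution idea above, generating filter weights from the aspect ratio, can be sketched in miniature. The linear generator below is a stand-in assumption for illustration only, not the paper's architecture.

```python
import random

# Illustrative sketch of adaptive convolution: instead of storing a fixed
# kernel, a small generator maps the target aspect ratio to the kernel
# weights, so one network covers arbitrary (even unseen) ratios.
# The randomly initialised linear generator is an assumption, not the
# paper's actual weight-generation network.

random.seed(0)
KERNEL_SIZE = 3
# one (slope, bias) pair per kernel entry, mapping aspect ratio -> weight
GEN = [(random.uniform(-1, 1), random.uniform(-1, 1))
       for _ in range(KERNEL_SIZE * KERNEL_SIZE)]

def generate_kernel(aspect_ratio: float) -> list[list[float]]:
    """Dynamically produce a 3x3 conv kernel from the aspect ratio."""
    flat = [w * aspect_ratio + b for w, b in GEN]
    return [flat[i * KERNEL_SIZE:(i + 1) * KERNEL_SIZE]
            for i in range(KERNEL_SIZE)]

k_square = generate_kernel(1.0)
k_wide = generate_kernel(16 / 9)
# Different target ratios yield different kernels from the same generator.
```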
A2-RL: Aesthetics Aware Reinforcement Learning for Image Cropping
Image cropping aims at improving the aesthetic quality of images by adjusting
their composition. Most weakly supervised cropping methods (without bounding
box supervision) rely on the sliding window mechanism. The sliding window
mechanism requires fixed aspect ratios and limits the cropping region with
arbitrary size. Moreover, the sliding window method usually produces tens of
thousands of windows on the input image which is very time-consuming. Motivated
by these challenges, we firstly formulate the aesthetic image cropping as a
sequential decision-making process and propose a weakly supervised Aesthetics
Aware Reinforcement Learning (A2-RL) framework to address this problem.
Particularly, the proposed method develops an aesthetics aware reward function
which especially benefits image cropping. Similar to human's decision making,
we use a comprehensive state representation including both the current
observation and the historical experience. We train the agent using the
actor-critic architecture in an end-to-end manner. The agent is evaluated on
several popular unseen cropping datasets. Experiment results show that our
method achieves the state-of-the-art performance with much fewer candidate
windows and much less time compared with previous weakly supervised methods.
Comment: Accepted by CVPR 201
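The sequential decision-making formulation above can be sketched as a toy loop. The greedy policy, discrete action set, and hand-written scorer below are assumptions standing in for A2-RL's learned actor-critic agent and aesthetic reward.

```python
# A minimal sketch of cropping as a sequential decision process, in the
# spirit of A2-RL: the agent starts from the full image window and applies
# discrete adjustment actions until no action improves the reward.
# The toy "aesthetic" scorer and the two-action set are illustrative
# assumptions, not the paper's reward function or action space.

def aesthetic_score(x, y, w, h):
    # stand-in reward: prefers a 60x40 window centred on (50, 50)
    return (-abs(x + w / 2 - 50) - abs(y + h / 2 - 50)
            - abs(w - 60) - abs(h - 40))

ACTIONS = {
    "shrink_w": lambda x, y, w, h: (x + 2, y, w - 4, h),
    "shrink_h": lambda x, y, w, h: (x, y + 2, w, h - 4),
}

def greedy_crop(x=0, y=0, w=100, h=100, max_steps=50):
    """Greedy stand-in for the learned actor-critic policy."""
    for _ in range(max_steps):
        best = max(ACTIONS.values(),
                   key=lambda a: aesthetic_score(*a(x, y, w, h)))
        nxt = best(x, y, w, h)
        if aesthetic_score(*nxt) <= aesthetic_score(x, y, w, h):
            break  # terminate when no action improves the reward
        x, y, w, h = nxt
    return x, y, w, h
```

Under this toy scorer the loop converges to the preferred 60x40 window after 25 steps, far fewer evaluations than scoring every sliding-window candidate.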
Image Cropping with Composition and Saliency Aware Aesthetic Score Map
Aesthetic image cropping is a practical but challenging task which aims at
finding the best crops with the highest aesthetic quality in an image.
Recently, many deep learning methods have been proposed to address this
problem, but they did not reveal the intrinsic mechanism of aesthetic
evaluation. In this paper, we propose an interpretable image cropping model to
unveil the mystery. For each image, we use a fully convolutional network to
produce an aesthetic score map, which is shared among all candidate crops
during crop-level aesthetic evaluation. Then, we require the aesthetic score
map to be both composition-aware and saliency-aware. In particular, the same
region is assigned with different aesthetic scores based on its relative
positions in different crops. Moreover, a visually salient region is supposed
to have more sensitive aesthetic scores so that our network can learn to place
salient objects at more proper positions. Such an aesthetic score map can be
used to localize aesthetically important regions in an image, which sheds light
on the composition rules learned by our model. We show the competitive
performance of our model in the image cropping task on several benchmark
datasets, and also demonstrate its generality in real-world applications.
Comment: Accepted by AAAI 2
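The shared score-map mechanism described above can be illustrated with a toy example: one dense map is computed per image, and every candidate crop is scored by pooling the map over its region, so per-crop cost stays low. The 4x4 map and crop coordinates are made-up illustration data.

```python
# Sketch of crop-level evaluation over a shared aesthetic score map.
# A real system would produce SCORE_MAP with a fully convolutional
# network; the values below are illustrative assumptions.

SCORE_MAP = [
    [0.1, 0.2, 0.2, 0.1],
    [0.2, 0.9, 0.8, 0.2],
    [0.2, 0.8, 0.9, 0.2],
    [0.1, 0.2, 0.2, 0.1],
]

def crop_score(x0, y0, x1, y1):
    """Mean of the score map over the crop rectangle (exclusive upper bounds)."""
    cells = [SCORE_MAP[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(cells) / len(cells)

center = crop_score(1, 1, 3, 3)  # covers the salient centre region
corner = crop_score(0, 0, 2, 2)  # mostly low-scoring border
# The centre crop scores higher, so it would be preferred.
```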
Aesthetic-Driven Image Enhancement by Adversarial Learning
We introduce EnhanceGAN, an adversarial learning based model that performs
automatic image enhancement. Traditional image enhancement frameworks typically
involve training models in a fully-supervised manner, which require expensive
annotations in the form of aligned image pairs. In contrast to these
approaches, our proposed EnhanceGAN only requires weak supervision (binary
labels on image aesthetic quality) and is able to learn enhancement operators
for the task of aesthetic-based image enhancement. In particular, we show the
effectiveness of a piecewise color enhancement module trained with weak
supervision, and extend the proposed EnhanceGAN framework to learning a deep
filtering-based aesthetic enhancer. The full differentiability of our image
enhancement operators enables the training of EnhanceGAN in an end-to-end
manner. We further demonstrate the capability of EnhanceGAN in learning
aesthetic-based image cropping without any groundtruth cropping pairs. Our
weakly-supervised EnhanceGAN reports competitive quantitative results on
aesthetic-based color enhancement as well as automatic image cropping, and a
user study confirms that our image enhancement results are on par with or even
preferred over professional enhancement.
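The piecewise colour-enhancement operator mentioned above can be sketched as a piecewise-linear tone curve: because the output is a sum of simple linear segment contributions, it is differentiable in its parameters, which is what permits end-to-end adversarial training. The segment slopes below are illustrative values, not learned EnhanceGAN parameters.

```python
# Sketch of a piecewise-linear tone curve as a colour-enhancement operator.
# One slope per quarter of the [0, 1] intensity range; the slopes here are
# illustrative assumptions, not trained values.

SLOPES = [1.6, 1.2, 0.8, 0.4]

def tone_curve(v: float) -> float:
    """Apply the piecewise-linear curve to one intensity value in [0, 1]."""
    out, lo = 0.0, 0.0
    for i, k in enumerate(SLOPES):
        hi = (i + 1) / len(SLOPES)
        out += k * max(0.0, min(v, hi) - lo)  # this segment's contribution
        lo = hi
    return out

# Brightens shadows (slope > 1 near 0) and compresses highlights
# (slope < 1 near 1), while mapping 0 -> 0 and 1 -> 1.
```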
Image Cropping under Design Constraints
Image cropping is essential in image editing for obtaining a compositionally
enhanced image. In display media, image cropping is a prospective technique for
automatically creating media content. However, image cropping for media
contents is often required to satisfy various constraints, such as an aspect
ratio and blank regions for placing texts or objects. We call this problem
image cropping under design constraints. To achieve image cropping under design
constraints, we propose a score-function-based approach, which scores cropped
results on whether they are aesthetically plausible and satisfy the design
constraints. We explore two derived approaches, a proposal-based approach and
a heatmap-based approach, and we construct a dataset for evaluating the
performance of the proposed approaches on image cropping under design
constraints. In experiments, we demonstrate that the proposed approaches
outperform a baseline, and we observe that the proposal-based approach is
better than the heatmap-based approach under the same computation cost, but the
heatmap-based approach achieves better scores as computation cost increases.
The experimental results indicate that balancing aesthetically plausible
regions against design-constraint satisfaction is not trivial and requires a
careful trade-off, and both proposed approaches are reasonable alternatives.
Comment: ACMMM Asia accepted
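The score-function idea above can be sketched as an aesthetic term minus penalties for violated design constraints. The penalty weights, the blank-region flag, and the scoring rule are illustrative assumptions, not the paper's actual score function.

```python
# Sketch of design-constrained crop scoring: each candidate crop gets an
# aesthetic score minus penalties for violating the design constraints
# (target aspect ratio, a required blank region for text placement).
# All weights and inputs below are illustrative assumptions.

def constrained_score(crop, aesthetic, target_ratio, blank_ok,
                      ratio_weight=5.0, blank_weight=10.0):
    """Higher is better; constraint violations subtract from the score."""
    x, y, w, h = crop
    ratio_penalty = ratio_weight * abs(w / h - target_ratio)
    blank_penalty = 0.0 if blank_ok else blank_weight
    return aesthetic - ratio_penalty - blank_penalty

good = constrained_score((0, 0, 160, 90), aesthetic=0.8,
                         target_ratio=16 / 9, blank_ok=True)
bad = constrained_score((0, 0, 100, 100), aesthetic=0.9,
                        target_ratio=16 / 9, blank_ok=False)
# A constraint-satisfying crop beats a slightly prettier violating one.
```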