Fully Convolutional Network with Multi-Step Reinforcement Learning for Image Processing
This paper tackles a new problem setting: reinforcement learning with
pixel-wise rewards (pixelRL) for image processing. After the introduction of
the deep Q-network, deep RL has been achieving great success. However, the
applications of deep RL for image processing are still limited. Therefore, we
extend deep RL to pixelRL for various image processing applications. In
pixelRL, each pixel has an agent, and the agent changes the pixel value by
taking an action. We also propose an effective learning method for pixelRL that
significantly improves the performance by considering not only the future
states of each agent's own pixel but also those of its neighboring pixels. The
proposed method can be applied to image processing tasks that require
pixel-wise manipulations, to which deep RL has not previously been applied. We
apply the proposed
method to three image processing tasks: image denoising, image restoration, and
local color enhancement. Our experimental results demonstrate that the proposed
method achieves comparable or better performance, compared with the
state-of-the-art methods based on supervised learning.
Comment: Accepted to AAAI 201
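The pixelRL setting above (one agent per pixel, each editing its own value) can be sketched as follows. This is a minimal illustration, not the paper's method: the action set here is a toy set of value edits, whereas the paper uses task-specific filters chosen by a fully convolutional network.

```python
import numpy as np

# Hypothetical per-pixel action set (illustrative only; the paper's
# actions are task-specific image filters).
ACTIONS = [
    lambda v: v,                        # 0: keep the pixel value
    lambda v: np.clip(v + 1, 0, 255),   # 1: raise the value by 1
    lambda v: np.clip(v - 1, 0, 255),   # 2: lower the value by 1
]

def pixelrl_step(image, action_map):
    """One pixelRL step: every pixel acts as an agent and changes its
    own value according to the action selected for it."""
    out = np.empty_like(image)
    for a, fn in enumerate(ACTIONS):
        mask = action_map == a          # pixels whose agents chose action a
        out[mask] = fn(image[mask])
    return out

img = np.full((4, 4), 128, dtype=np.uint8)
actions = np.zeros((4, 4), dtype=np.int64)
actions[0, 0] = 1   # this pixel's agent raises its value
actions[1, 1] = 2   # this pixel's agent lowers its value
result = pixelrl_step(img, actions)
```

In the paper, the action map itself comes from a learned policy, and training accounts for how each action affects the future states of neighboring pixels as well.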
Self-Play Reinforcement Learning for Fast Image Retargeting
In this study, we address image retargeting, which is a task that adjusts
input images to arbitrary sizes. In MULTIOP, one of the best-performing
methods, multiple retargeting operators are combined, and a retargeted image
is generated at each stage to find the optimal sequence of operators that
minimizes the distance between the original and retargeted images. The
limitation of this method is its tremendous processing time, which severely
limits its practical use. Therefore, the purpose of this study is to find the optimal
combination of operators within a reasonable processing time; we propose a
method of predicting the optimal operator for each step using a reinforcement
learning agent. The technical contributions of this study are as follows.
Firstly, we propose a reward based on self-play, which will be insensitive to
the large variance in the content-dependent distance measured in MULTIOP.
Secondly, we propose to dynamically change the loss weight for each action to
prevent the algorithm from falling into a local optimum and from choosing only
the most frequently used operator during training. Our experiments showed
that we achieved multi-operator image retargeting with processing time
reduced by three orders of magnitude and the same quality as the original
multi-operator-based method, which was the best-performing algorithm for
retargeting tasks.
Comment: Accepted to ACM Multimedia 202
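The dynamic loss-weighting idea above (preventing the agent from collapsing onto the most frequently used operator) can be sketched as follows. This is an assumed formulation, not the paper's exact formula: it simply makes each action's loss weight inversely proportional to how often that action has been chosen.

```python
import numpy as np

def dynamic_action_weights(action_counts, eps=1e-6):
    """Hypothetical dynamic loss weighting: actions chosen often get a
    small weight, rarely chosen actions get a large one, so training is
    discouraged from always picking the same operator."""
    counts = np.asarray(action_counts, dtype=np.float64)
    freq = counts / max(counts.sum(), 1.0)  # selection frequency per action
    w = 1.0 / (freq + eps)                  # inverse-frequency weighting
    return w / w.sum()                      # normalize to sum to 1

# Example: one operator dominates the agent's choices so far.
counts = np.array([90, 8, 2])
weights = dynamic_action_weights(counts)
# The rarely chosen operators now receive the largest loss weights.
```

A per-action weight like this would multiply each action's term in the training loss, so the gradient signal for underused operators is amplified rather than drowned out.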