Fully Convolutional Network with Multi-Step Reinforcement Learning for Image Processing
This paper tackles a new problem setting: reinforcement learning with
pixel-wise rewards (pixelRL) for image processing. After the introduction of
the deep Q-network, deep RL has been achieving great success. However, the
applications of deep RL for image processing are still limited. Therefore, we
extend deep RL to pixelRL for various image processing applications. In
pixelRL, each pixel has an agent, and the agent changes the pixel value by
taking an action. We also propose an effective learning method for pixelRL that
significantly improves performance by considering the future states not only of
each agent's own pixel but also of its neighboring pixels. The proposed method
can be applied to image processing tasks that require pixel-wise manipulations,
to which deep RL has never been applied. We apply the proposed
method to three image processing tasks: image denoising, image restoration, and
local color enhancement. Our experimental results demonstrate that the proposed
method achieves performance comparable to or better than that of
state-of-the-art supervised learning methods.
Comment: Accepted to AAAI 201
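The core pixelRL idea, one agent per pixel with a pixel-wise reward, can be illustrated with a minimal sketch. The action set, the reward definition (reduction in squared error against a clean target), and all names below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

# Assumed toy action set for each pixel agent: decrement, keep, increment.
ACTIONS = np.array([-1.0, 0.0, +1.0])

def pixelwise_step(state, target, action_idx):
    """Apply one action per pixel; return the new image and a pixel-wise reward.
    Reward here is the per-pixel reduction in squared error (an assumption)."""
    new_state = state + ACTIONS[action_idx]  # each agent edits only its own pixel
    reward = (target - state) ** 2 - (target - new_state) ** 2
    return new_state, reward

rng = np.random.default_rng(0)
target = rng.random((4, 4))                      # toy "clean" image
state = target + rng.normal(0, 0.1, (4, 4))      # noisy observation
actions = rng.integers(0, len(ACTIONS), (4, 4))  # one action per pixel agent
state, reward = pixelwise_step(state, target, actions)
print(reward.shape)  # → (4, 4): one reward per pixel, not a single scalar
```

The point of the sketch is the reward's shape: unlike standard deep RL, where one scalar reward drives the whole policy, every pixel receives its own learning signal.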
Solving Continual Combinatorial Selection via Deep Reinforcement Learning
We consider the Markov Decision Process (MDP) of selecting a subset of items
at each step, termed the Select-MDP (S-MDP). The large state and action spaces
of S-MDPs make them intractable to solve with typical reinforcement learning
(RL) algorithms especially when the number of items is huge. In this paper, we
present a deep RL algorithm to solve this issue by adopting the following key
ideas. First, we convert the original S-MDP into an Iterative Select-MDP
(IS-MDP), which is equivalent to the S-MDP in terms of optimal actions. IS-MDP
decomposes the joint action of selecting K items simultaneously into K
iterative single-item selections, shrinking the action space at the expense of
an exponential increase in the state space. Second, we overcome this
state-space explosion by exploiting a special symmetry in IS-MDPs with novel
weight-shared Q-networks, which provably maintain sufficient expressive power. Various
experiments demonstrate that our approach works well even when the item space
is large and that it scales to environments with item spaces different from
those used in training.
Comment: Accepted to IJCAI 2019, 14 pages, 8 figures
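The S-MDP-to-IS-MDP decomposition can be sketched greedily: instead of scoring all size-K subsets at once, the same scoring function is reused K times, picking one item per step. The `toy_q` stand-in below and all names are hypothetical; the paper learns weight-shared Q-networks rather than using a fixed heuristic:

```python
import numpy as np

def iterative_select(q_fn, items, k):
    """Decompose a joint K-item selection into K one-item selections.
    The SAME q_fn is reused at every step, mirroring the weight-sharing idea:
    one set of parameters scores items regardless of which step we are on."""
    selected = []
    remaining = list(items)
    for _ in range(k):
        scores = [q_fn(selected, item) for item in remaining]
        best = int(np.argmax(scores))            # greedy one-item choice
        selected.append(remaining.pop(best))     # state grows by the chosen item
    return selected

# Hypothetical stand-in Q: prefer large items, penalize closeness to prior picks.
def toy_q(selected, item):
    return item - sum(1.0 for s in selected if abs(s - item) < 2)

print(iterative_select(toy_q, [1, 3, 5, 8, 9], 2))  # → [9, 8]
```

Note the trade-off from the abstract: each step's action space is just the remaining items (small), but the state now also encodes the partial selection, which is what blows up the state space and motivates the symmetry-exploiting networks.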