Representation Learning for Continuous Action Spaces is Beneficial for Efficient Policy Learning
Deep reinforcement learning (DRL) breaks through the bottlenecks of
traditional reinforcement learning (RL) with the help of the perception
capability of deep learning and has been widely applied to real-world
problems. Model-free RL, a class of efficient DRL methods, learns state
representations simultaneously with the policy in an end-to-end manner when
facing large-scale continuous state and action spaces. However, training such
a large policy model requires many trajectory samples and long training
times. Moreover, the learned policy often fails to generalize over
large-scale action spaces, especially continuous ones. To address these
issues, in this paper we propose an efficient policy learning method that
operates in latent state and action spaces. More specifically, we extend the
idea of state representations to action representations for better policy
generalization. Meanwhile, we divide the whole learning task into two parts:
the large-scale representation models are trained in an unsupervised manner,
while the small-scale policy model is trained with RL. The small policy model
facilitates policy learning, while the large representation models preserve
generalization and expressiveness. Finally, the effectiveness of the proposed
method is demonstrated in MountainCar, CarRacing, and Cheetah experiments.
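The split described above, in which large unsupervised representation models surround a small RL-trained policy, can be sketched as follows. This is a minimal illustration, not the paper's architecture: the dimensions are assumed, and random linear maps stand in for the trained encoder, decoder, and policy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions: a high-dimensional state/action space is compressed
# into small latent spaces where the policy operates.
STATE_DIM, ACTION_DIM = 64, 16   # original spaces (illustrative sizes)
LATENT_S, LATENT_A = 4, 2        # latent spaces

# Large representation models (trained unsupervised in the paper; random
# projections are placeholders here).
W_enc = rng.normal(size=(LATENT_S, STATE_DIM)) / np.sqrt(STATE_DIM)
W_dec = rng.normal(size=(ACTION_DIM, LATENT_A)) / np.sqrt(LATENT_A)

# Small policy model: only LATENT_A * LATENT_S parameters to train with RL,
# instead of a policy over the full 64-dim -> 16-dim mapping.
W_pi = rng.normal(size=(LATENT_A, LATENT_S))

def act(state):
    """Encode state -> latent, apply the small policy, decode the action."""
    z_s = np.tanh(W_enc @ state)   # latent state representation
    z_a = np.tanh(W_pi @ z_s)      # latent action from the small policy
    return W_dec @ z_a             # lifted back to the full action space

a = act(rng.normal(size=STATE_DIM))
print(a.shape)  # (16,)
```

The point of the sketch is the parameter count: only `W_pi` is trained with trajectory samples, while the expensive maps `W_enc` and `W_dec` are learned from unlabeled data.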
Low-Dimensional State and Action Representation Learning with MDP Homomorphism Metrics
Deep Reinforcement Learning has shown its ability to solve complicated problems directly from high-dimensional observations. However, in end-to-end settings, Reinforcement Learning algorithms are not sample-efficient and require long training times and large quantities of data. In this work, we propose a framework for sample-efficient Reinforcement Learning that takes advantage of state and action representations to transform a high-dimensional problem into a low-dimensional one. Moreover, we seek the optimal policy mapping latent states to latent actions. Since the policy is learned on abstract representations, we enforce, using auxiliary loss functions, that such a policy can be lifted to the original problem domain. Results show that the novel framework can efficiently learn low-dimensional and interpretable state and action representations and the optimal latent policy.
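An auxiliary loss in the spirit of an MDP homomorphism can be sketched as a latent transition-consistency term: the encoded next state should match a latent transition model applied to the encoded state and action. Everything below is illustrative, assuming linear encoders and a linear latent model, not the paper's actual networks or metric.

```python
import numpy as np

rng = np.random.default_rng(1)

def homomorphism_loss(s, a, s_next, W_s, W_a, T):
    """Squared error between the encoded next state and the latent
    transition model's prediction (a stand-in consistency loss)."""
    z, z_next = W_s @ s, W_s @ s_next   # encoded states
    u = W_a @ a                         # encoded action
    pred = T @ np.concatenate([z, u])   # latent transition prediction
    return float(np.sum((z_next - pred) ** 2))

# Assumed sizes: 32-dim states and 8-dim actions, 3-/2-dim latents.
S, A, ZS, ZA = 32, 8, 3, 2
W_s = rng.normal(size=(ZS, S)) * 0.1
W_a = rng.normal(size=(ZA, A)) * 0.1
T = rng.normal(size=(ZS, ZS + ZA)) * 0.1

loss = homomorphism_loss(rng.normal(size=S), rng.normal(size=A),
                         rng.normal(size=S), W_s, W_a, T)
print(loss >= 0.0)  # True
```

Minimizing such a term during representation learning pushes the latent spaces toward ones where latent dynamics mirror the original dynamics, which is what makes the latent policy liftable.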
Language Understanding for Text-based Games Using Deep Reinforcement Learning
In this paper, we consider the task of learning control policies for
text-based games. In these games, all interactions in the virtual world are
through text and the underlying state is not observed. The resulting language
barrier makes such environments challenging for automatic game players. We
employ a deep reinforcement learning framework to jointly learn state
representations and action policies using game rewards as feedback. This
framework enables us to map text descriptions into vector representations that
capture the semantics of the game states. We evaluate our approach on two game
worlds, comparing against baselines using bag-of-words and bag-of-bigrams for
state representations. Our algorithm outperforms the baselines on both worlds,
demonstrating the importance of learning expressive representations.
Comment: 11 pages, Appearing at EMNLP, 201
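The bag-of-words baseline the paper compares against can be shown in a few lines: each textual game description becomes a count vector over a fixed vocabulary, discarding word order. The vocabulary and example text below are illustrative, not taken from the paper's game worlds.

```python
import re
from collections import Counter

# Illustrative vocabulary; a real agent would build this from the game's text.
vocab = ["door", "key", "open", "locked", "north", "room"]

def bow(text):
    """Bag-of-words state representation: counts per vocabulary word."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return [counts.get(w, 0) for w in vocab]

state = bow("The door is locked. A key lies north of the room")
print(state)  # [1, 1, 0, 1, 1, 1]
```

Such a vector ignores semantics and word order ("key opens door" and "door opens key" map to the same state), which is the weakness the learned representations in the paper address.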