Language Understanding for Text-based Games Using Deep Reinforcement Learning
In this paper, we consider the task of learning control policies for
text-based games. In these games, all interactions in the virtual world are
through text and the underlying state is not observed. The resulting language
barrier makes such environments challenging for automatic game players. We
employ a deep reinforcement learning framework to jointly learn state
representations and action policies using game rewards as feedback. This
framework enables us to map text descriptions into vector representations that
capture the semantics of the game states. We evaluate our approach on two game
worlds, comparing against baselines using bag-of-words and bag-of-bigrams for
state representations. Our algorithm outperforms the baselines on both worlds,
demonstrating the importance of learning expressive representations.
Comment: 11 pages, Appearing at EMNLP, 2015
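To make the described architecture concrete, here is a minimal sketch (PyTorch; the layer names and sizes are illustrative assumptions, not the authors' code) of jointly encoding a textual state description into a vector representation and scoring actions with Q-values:

```python
# Hedged sketch: encode a textual game state with an LSTM, then score
# candidate actions with a Q-head, in the spirit of the joint
# representation/policy learning the abstract describes.
import torch
import torch.nn as nn

class TextDQN(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, num_actions=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.q_head = nn.Linear(hidden_dim, num_actions)  # one Q-value per action

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer-encoded state description
        embedded = self.embed(token_ids)
        _, (h_n, _) = self.encoder(embedded)
        state_repr = h_n[-1]             # learned vector representation of the state
        return self.q_head(state_repr)   # Q(s, a) for each action

# Usage: pick the greedy action for a batch of encoded descriptions.
model = TextDQN(vocab_size=1000)
q_values = model(torch.randint(0, 1000, (2, 20)))
action = q_values.argmax(dim=-1)
```

Training such a model with game rewards (e.g., via standard DQN updates) is what couples the text encoder to the control task, so the state vectors come to reflect game semantics rather than surface wording.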
Grounding Language for Transfer in Deep Reinforcement Learning
In this paper, we explore the utilization of natural language to drive
transfer for reinforcement learning (RL). Despite the widespread application
of deep RL techniques, learning generalized policy representations that work
across domains remains a challenging problem. We demonstrate that textual
descriptions of environments provide a compact intermediate channel to
facilitate effective policy transfer. Specifically, by learning to ground the
meaning of text to the dynamics of the environment such as transitions and
rewards, an autonomous agent can effectively bootstrap policy learning on a new
domain given its description. We employ a model-based RL approach consisting of
a differentiable planning module, a model-free component and a factorized state
representation to effectively use entity descriptions. Our model outperforms
prior work on both transfer and multi-task scenarios in a variety of different
environments. For instance, we achieve up to 14% and 11.5% absolute improvement
over previously existing models in terms of average and initial rewards,
respectively.
Comment: JAIR 2018
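As a rough illustration of a factorized state representation grounded in entity descriptions, the following sketch (PyTorch; all names and dimensions are assumptions for illustration, not the paper's code) fuses an entity's identity embedding with an encoding of its textual description, which is what allows transfer to new entities whose text is known:

```python
# Hedged sketch: factorize the state over entities, conditioning each
# entity's representation on an encoding of its textual description.
import torch
import torch.nn as nn

class FactorizedEntityState(nn.Module):
    def __init__(self, vocab_size, embed_dim=32):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, embed_dim)
        self.mix = nn.Linear(2 * embed_dim, embed_dim)

    def forward(self, entity_ids, description_ids):
        # entity_ids: (num_entities,) one id per entity in the state
        # description_ids: (num_entities, desc_len) tokens of each entity's text
        entity_vec = self.word_embed(entity_ids)
        desc_vec = self.word_embed(description_ids).mean(dim=1)  # bag-of-words text encoding
        # Fusing identity with description lets unseen entities with
        # familiar descriptions bootstrap sensible dynamics estimates.
        return torch.tanh(self.mix(torch.cat([entity_vec, desc_vec], dim=-1)))

state_model = FactorizedEntityState(vocab_size=500)
reprs = state_model(torch.tensor([3, 7]), torch.randint(0, 500, (2, 6)))
```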
Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning
Most successful information extraction systems operate with access to a large
collection of documents. In this work, we explore the task of acquiring and
incorporating external evidence to improve extraction accuracy in domains where
the amount of training data is scarce. This process entails issuing search
queries, extraction from new sources and reconciliation of extracted values,
which are repeated until sufficient evidence is collected. We approach the
problem using a reinforcement learning framework where our model learns to
select optimal actions based on contextual information. We employ a deep
Q-network, trained to optimize a reward function that reflects extraction
accuracy while penalizing extra effort. Our experiments on two databases -- of
shooting incidents, and food adulteration cases -- demonstrate that our system
significantly outperforms traditional extractors and a competitive
meta-classifier baseline.
Comment: Appearing in EMNLP 2016 (12 pages incl. supplementary material)
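The trade-off the abstract describes, rewarding extraction accuracy while penalizing extra effort, can be illustrated with a small sketch (plain Python; the step penalty and slot-level accuracy measure are assumptions for illustration, not the paper's exact reward):

```python
# Hedged sketch of a per-step reward: accuracy gain on the extracted
# slot values minus a fixed cost for each extra search/reconcile action.

def step_reward(prev_values, new_values, gold_values, step_penalty=0.1):
    """Reward = change in per-slot accuracy minus a fixed cost per action."""
    def accuracy(values):
        correct = sum(v == g for v, g in zip(values, gold_values))
        return correct / len(gold_values)
    return accuracy(new_values) - accuracy(prev_values) - step_penalty

# Example: reconciling one of three slots correctly gains 1/3 accuracy
# at a cost of 0.1, giving a positive reward for the query.
print(step_reward(["?", "5", "NY"], ["rifle", "5", "NY"], ["rifle", "5", "NY"]))
```

A deep Q-network trained against a reward of this shape learns to stop issuing queries once the expected accuracy gain no longer covers the per-step cost.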
Evolving Graphical Planner: Contextual Global Planning for Vision-and-Language Navigation
The ability to perform effective planning is crucial for building an
instruction-following agent. When navigating through a new environment, an
agent is challenged with (1) connecting the natural language instructions with
its progressively growing knowledge of the world; and (2) performing long-range
planning and decision making in the form of effective exploration and error
correction. Current methods are still limited on both fronts despite extensive
efforts. In this paper, we introduce the Evolving Graphical Planner (EGP), a
model that performs global planning for navigation based on raw sensory input.
The model dynamically constructs a graphical representation, generalizes the
action space to allow for more flexible decision making, and performs efficient
planning on a proxy graph representation. We evaluate our model on a
challenging Vision-and-Language Navigation (VLN) task with photorealistic
images and achieve superior performance compared to previous navigation
architectures. For instance, we achieve a 53% success rate on the test split of
the Room-to-Room navigation task through pure imitation learning, outperforming
previous navigation architectures by up to 5%.
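For intuition about a dynamically constructed navigation graph, here is a hedged pure-Python sketch (not the EGP model itself): the graph grows as the agent observes new neighbors, and a breadth-first search stands in for the learned planner operating over the graph built so far:

```python
# Hedged sketch: a navigation graph that evolves with observation.
from collections import deque

class EvolvingGraph:
    def __init__(self):
        self.edges = {}  # node -> set of neighbors observed so far

    def expand(self, node, observed_neighbors):
        # Growing the graph generalizes the action space: the agent can
        # target any reachable node, not just an adjacent one.
        self.edges.setdefault(node, set()).update(observed_neighbors)
        for n in observed_neighbors:
            self.edges.setdefault(n, set()).add(node)

    def plan(self, start, goal):
        # Breadth-first search as a stand-in for the learned planner.
        frontier, parents = deque([start]), {start: None}
        while frontier:
            node = frontier.popleft()
            if node == goal:
                path = []
                while node is not None:
                    path.append(node)
                    node = parents[node]
                return path[::-1]
            for nxt in self.edges.get(node, ()):
                if nxt not in parents:
                    parents[nxt] = node
                    frontier.append(nxt)
        return None  # goal not yet reachable in the observed graph

g = EvolvingGraph()
g.expand("v0", ["v1", "v2"])   # each observation expands the graph
g.expand("v1", ["v3"])
print(g.plan("v0", "v3"))      # ['v0', 'v1', 'v3']
```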