Interactive Language Learning by Question Answering
Humans observe and interact with the world to acquire knowledge. However,
most existing machine reading comprehension (MRC) tasks miss the interactive,
information-seeking component of comprehension. Such tasks present models with
static documents that contain all necessary information, usually concentrated
in a single short substring. Thus, models can achieve strong performance
through simple word- and phrase-based pattern matching. We address this problem
by formulating a novel text-based question answering task: Question Answering
with Interactive Text (QAit). In QAit, an agent must interact with a partially
observable text-based environment to gather information required to answer
questions. QAit poses questions about the existence, location, and attributes
of objects found in the environment. The data is built using a text-based game
generator that defines the underlying dynamics of interaction with the
environment. We propose and evaluate a set of baseline models for the QAit task
that includes deep reinforcement learning agents. Experiments show that the
task presents a major challenge for machine reading systems, while humans solve
it with relative ease. Comment: EMNLP 201
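The information-seeking gap that QAit targets can be caricatured in a few lines: the answer is only observable after the agent acts, so a static-document reader has nothing to match against. The environment, commands, and matching rule below are toy stand-ins, not the actual QAit game generator.

```python
# Toy sketch of the QAit setting: the answer to "Where is the apple?"
# is hidden behind an interaction, so the agent must act before it can
# answer. All names and rules here are illustrative, not from QAit.

class ToyTextEnv:
    """Minimal partially observable text environment."""

    def __init__(self):
        self.fridge_open = False

    def observe(self):
        # The apple is only mentioned after the fridge has been opened.
        if self.fridge_open:
            return "The fridge is open. You see an apple inside the fridge."
        return "You are in a kitchen. There is a closed fridge."

    def step(self, command):
        if command == "open fridge":
            self.fridge_open = True
        return self.observe()

def answer_location(env, question, max_steps=3):
    """Interact, then answer by naive string matching on observations."""
    obj = question.split()[-1].rstrip("?")          # e.g. "apple"
    for _ in range(max_steps):
        obs = env.observe()
        if obj in obs:
            # crude pattern: "... <obj> inside the <container>."
            return obs.split("inside the ")[-1].rstrip(".")
        env.step("open fridge")                     # information-seeking action
    return "unknown"

print(answer_location(ToyTextEnv(), "Where is the apple?"))   # -> fridge
```

A static-document model given only the initial observation would never see the word "apple"; the point of QAit is that gathering the evidence is part of the task.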
Semantics Altering Modifications for Evaluating Comprehension in Machine Reading
Advances in NLP have yielded impressive results for the task of machine
reading comprehension (MRC), with approaches having been reported to achieve
performance comparable to that of humans. In this paper, we investigate whether
state-of-the-art MRC models are able to correctly process Semantics Altering
Modifications (SAM): linguistically-motivated phenomena that alter the
semantics of a sentence while preserving most of its lexical surface form. We
present a method to automatically generate and align challenge sets featuring
original and altered examples. We further propose a novel evaluation
methodology to correctly assess the capability of MRC systems to process these
examples independent of the data they were optimised on, by discounting for
effects introduced by domain shift. In a large-scale empirical study, we apply
the methodology in order to evaluate extractive MRC models with regard to their
capability to correctly process SAM-enriched data. We comprehensively cover 12
different state-of-the-art neural architecture configurations and four training
datasets and find that -- despite their well-known remarkable performance --
optimised models consistently struggle to correctly process semantically
altered data. Comment: AAAI 2021, final version; 7 pages content + 2 pages references.
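The core SAM idea can be illustrated in a few lines: a minimal lexical edit flips the meaning of a passage while a purely word-overlap "reader" cannot tell the two versions apart. The negation rule and scorer below are simplified stand-ins, not the paper's generation or evaluation method.

```python
# Toy semantics-altering modification (SAM): inserting a negation changes
# the meaning while leaving the lexical surface form almost intact, so a
# pattern-matching system scores both versions identically.

def negate(sentence, aux="was"):
    """Minimal lexical edit that alters semantics: 'was' -> 'was not'."""
    return sentence.replace(aux, aux + " not", 1)

def lexical_overlap_score(passage, question):
    """Caricature of word-overlap 'comprehension': shared word count."""
    p = set(passage.lower().rstrip(".?").split())
    q = set(question.lower().rstrip(".?").split())
    return len(p & q)

original = "The vaccine was approved by the agency in 2021."
altered = negate(original)      # "... was not approved ..."
question = "Was the vaccine approved?"

# Identical overlap with the question despite opposite meanings.
print(lexical_overlap_score(original, question))   # 4
print(lexical_overlap_score(altered, question))    # 4
```

Any model whose decisions reduce to this kind of overlap will give the same answer to both passages, which is exactly the failure mode the SAM challenge sets probe.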
Interactive Fiction Games: A Colossal Adventure
A hallmark of human intelligence is the ability to understand and communicate
with language. Interactive Fiction games are fully text-based simulation
environments where a player issues text commands to effect change in the
environment and progress through the story. We argue that IF games are an
excellent testbed for studying language-based autonomous agents. In particular,
IF games combine challenges of combinatorial action spaces, language
understanding, and commonsense reasoning. To facilitate rapid development of
language-based agents, we introduce Jericho, a learning environment for
man-made IF games and conduct a comprehensive study of text-agents across a
rich set of games, highlighting directions in which agents can improve.
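The combinatorial action space mentioned above comes from verb templates crossed with in-game objects. A back-of-the-envelope sketch (the templates and vocabulary below are illustrative, not Jericho's actual action grammar):

```python
# Actions in IF games are built from verb templates filled with objects,
# so the candidate set grows multiplicatively with vocabulary size.

from itertools import product

templates = ["take {}", "open {}", "put {} in {}", "unlock {} with {}"]
objects = ["lamp", "key", "chest", "door", "sword"]

def instantiate(templates, objects):
    """Enumerate every template filled with every object combination."""
    actions = []
    for t in templates:
        slots = t.count("{}")
        for combo in product(objects, repeat=slots):
            actions.append(t.format(*combo))
    return actions

actions = instantiate(templates, objects)
# 2 one-slot templates * 5 objects + 2 two-slot templates * 25 pairs = 60
print(len(actions))   # -> 60
```

Even this tiny vocabulary yields 60 candidate actions per step; real IF games have hundreds of objects and dozens of templates, which is why naive enumeration quickly becomes intractable.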
Autotelic Agents with Intrinsically Motivated Goal-Conditioned Reinforcement Learning: a Short Survey
Building autonomous machines that can explore open-ended environments,
discover possible interactions and build repertoires of skills is a general
objective of artificial intelligence. Developmental approaches argue that this
can only be achieved by autotelic agents: intrinsically motivated learning
agents that can learn to represent, generate, select and solve their own
problems. In recent years, the convergence of developmental approaches with
deep reinforcement learning (RL) methods has been leading to the emergence of a
new field: developmental reinforcement learning. Developmental RL is
concerned with the use of deep RL algorithms to tackle a developmental problem
-- the intrinsically motivated skills acquisition
problem. The self-generation of goals requires the learning
of compact goal encodings as well as their associated goal-achievement
functions. This raises new challenges compared to standard RL algorithms
originally designed to tackle pre-defined sets of goals using external reward
signals. The present paper introduces developmental RL and proposes a
computational framework based on goal-conditioned RL to tackle the
intrinsically motivated skills acquisition problem. It proceeds to present a
typology of the various goal representations used in the literature, before
reviewing existing methods to learn to represent and prioritize goals in
autonomous systems. We finally close the paper by discussing some open
challenges in the quest for intrinsically motivated skills acquisition.
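The central object of the framework above, a value function conditioned on a self-chosen goal with an internally defined goal-achievement reward, can be shown in miniature. The sketch below uses exact Bellman sweeps on a tiny chain for determinism, rather than the sample-based deep RL methods the survey covers; all parameters are illustrative.

```python
# Goal-conditioned value function on a 1-D chain of 5 states: one table
# Q[(state, goal, action)] answers "how do I reach goal g from state s?"
# for every goal, with reward generated internally when the goal is hit.

N = 5                       # states 0..4 on a chain
ACTIONS = [-1, +1]          # move left / right
Q = {}                      # Q[(state, goal, action)]

def q(s, g, a):
    return Q.get((s, g, a), 0.0)

for _ in range(50):         # enough sweeps to converge (horizon <= 4)
    for s in range(N):
        for g in range(N):
            for a in ACTIONS:
                s2 = min(max(s + a, 0), N - 1)
                r = 1.0 if s2 == g else 0.0     # goal-achievement function
                bootstrap = 0.0 if r else 0.9 * max(q(s2, g, x) for x in ACTIONS)
                Q[(s, g, a)] = r + bootstrap

# The same table serves every goal: the greedy action flips with the goal.
print(max(ACTIONS, key=lambda a: q(1, 4, a)))   # 1: move right toward goal 4
print(max(ACTIONS, key=lambda a: q(3, 0, a)))   # -1: move left toward goal 0
```

The contrast with standard RL is visible in the reward line: nothing external defines the task; the agent's own goal encoding (here just an integer) induces the reward.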
Intrinsically Motivated Goal-Conditioned Reinforcement Learning: a Short Survey
Building autonomous machines that can explore open-ended environments, discover possible interactions and autonomously build repertoires of skills is a general objective of artificial intelligence. Developmental approaches argue that this can only be achieved by autonomous and intrinsically motivated learning agents that can generate, select and learn to solve their own problems. In recent years, we have seen a convergence of developmental approaches, and developmental robotics in particular, with deep reinforcement learning (RL) methods, forming the new domain of developmental machine learning. Within this new domain, we review here a set of methods where deep RL algorithms are trained to tackle the developmental robotics problem of the autonomous acquisition of open-ended repertoires of skills. Intrinsically motivated goal-conditioned RL algorithms train agents to learn to represent, generate and pursue their own goals. The self-generation of goals requires the learning of compact goal encodings as well as their associated goal-achievement functions, which results in new challenges compared to traditional RL algorithms designed to tackle pre-defined sets of goals using external reward signals. This paper proposes a typology of these methods at the intersection of deep RL and developmental approaches, surveys recent approaches and discusses future avenues.
Generalization on Text-based Games using Structured Belief Representations
Text-based games are complex, interactive simulations where a player is asked to process the text describing the underlying state of the world to issue textual commands for advancing in a game. Playing these games can be formulated as acting in a partially observable Markov decision process (POMDP), as the player needs to issue actions to reach the goal, by optimizing rewards, given textual observations that may not fully describe the underlying state.
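The POMDP point can be made concrete: two different underlying world states can emit the identical textual observation, so the text alone does not determine the state. The emission function and state fields below are illustrative stand-ins.

```python
# Why text-game play is a POMDP: the observation function O(s) describes
# only what is visible, so distinct states are aliased under observation.

def observe(state):
    """Emission function O(s): describe only what is visible in the room."""
    desc = f"You are in the {state['room']}."
    if state["chest_open"]:
        desc += f" The chest contains {state['chest_contents']}."
    else:
        desc += " There is a closed chest."
    return desc

s1 = {"room": "cellar", "chest_open": False, "chest_contents": "a key"}
s2 = {"room": "cellar", "chest_open": False, "chest_contents": "nothing"}

# Identical observations for different states: the player cannot tell
# them apart without acting (e.g. opening the chest).
print(observe(s1) == observe(s2))              # -> True
print(observe(dict(s1, chest_open=True)))      # reveals the hidden contents
```

An agent that conditions only on the current observation cannot distinguish `s1` from `s2`; it needs memory of past interactions, which motivates the belief representations discussed below.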
Prior work has focused on developing agents that achieve high rewards or converge faster to the optimal policy on single games. However, with recent advances in reinforcement learning and representation learning for language, we argue it is imperative to start looking for agents that can play a set of games drawn from a distribution of games rather than single games at a time.
In this work, we will be looking at TextWorld as a testbed for developing generalizable policies and benchmarking them against previous work. TextWorld is a sandbox environment for training and evaluating reinforcement learning agents on text-based games.
TextWorld is suitable for checking the generalizability of agents, as it enables us to generate hundreds of unique games with varying levels of difficulty. Difficulty in text-based games is determined by a variety of factors, such as the number of locations in the environment and the length of the optimal walkthrough. Playing text-based games requires skills in sequential decision making and language processing. In this thesis, we evaluate the learnt control policies by training them on a set of games and then observing their scores on games unseen during training. We check the quality of the policies learnt, their ability to generalize over a distribution of games, and their ability to transfer to games from different distributions. We define game distributions based on the difficulty level, parameterized by the number of locations in the game, the number of objects, etc.
We propose generalizable and transferable policies by extracting structured information from the raw textual observations describing the state. Additionally, our agents learn these policies in a purely data-driven fashion, without using any handcrafted components -- a common practice in prior work. Specifically, we learn dynamic knowledge graphs from raw text to represent our agents' beliefs. The dynamic belief graphs a) allow agents to extract relevant information from text observations and b) act as memory for acting optimally in the POMDP. Experiments on 500+ different games from the TextWorld suite show that our best agent outperforms previous baselines by an average of 24.2%.
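The belief-graph interface described above, update on each observation, revise stale facts, query when acting, can be sketched in a few lines. The regex-based triple extractor below is a toy stand-in for the learned extraction in the thesis; only the update-and-query shape is the point.

```python
# Hedged sketch of a dynamic belief graph built from text observations:
# triples (subject, relation, object) accumulate across steps and act as
# memory, with newer facts superseding stale beliefs.

import re

class BeliefGraph:
    def __init__(self):
        self.triples = set()    # (subject, relation, object)

    def update(self, observation):
        # toy extraction rule: "<obj> is in the <place>"
        for obj, place in re.findall(r"(\w+) is in the (\w+)", observation):
            # a newly observed location supersedes any stale belief
            self.triples = {t for t in self.triples
                            if not (t[0] == obj and t[1] == "in")}
            self.triples.add((obj, "in", place))

    def query(self, obj):
        return [place for s, r, place in self.triples
                if s == obj and r == "in"]

g = BeliefGraph()
g.update("The key is in the kitchen. A lamp is in the cellar.")
g.update("The key is in the hallway.")     # belief revised across steps
print(g.query("key"))    # -> ['hallway']
```

Because the graph persists across observations, it supplies exactly the memory that the POMDP formulation requires: facts seen once remain queryable even when later observations no longer mention them.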