Towards Socially Intelligent Agents with Mental State Transition and Human Utility
Building a socially intelligent agent involves many challenges, one of which
is to track the agent's mental state transition and teach the agent to make
rational decisions guided by its utility like a human. Towards this end, we
propose to incorporate a mental state parser and utility model into dialogue
agents. The hybrid mental state parser extracts information from both the
dialogue and event observations and maintains a graphical representation of the
agent's mind. Meanwhile, the utility model is a ranking model that learns human
preferences from a crowd-sourced social commonsense dataset, Social IQA.
Empirical results show that the proposed model attains state-of-the-art
performance on the dialogue/action/emotion prediction task in the fantasy
text-adventure game dataset, LIGHT. We also show example cases to demonstrate:
(i) how the proposed mental state parser can assist the agent's decisions
by grounding them in context such as locations and objects, and (ii) how
the utility model can help the agent make reasonable decisions in a dilemma. To
the best of our knowledge, this is the first work to build a socially
intelligent agent by incorporating a hybrid mental state parser for both
discrete events and continuous dialogue, together with human-like utility
modeling.
How to Motivate Your Dragon: Teaching Goal-Driven Agents to Speak and Act in Fantasy Worlds
We seek to create agents that both act and communicate with other agents in
pursuit of a goal. Towards this end, we extend LIGHT (Urbanek et al. 2019), a
large-scale crowd-sourced fantasy text-game, with a dataset of quests. These
contain natural language motivations paired with in-game goals and human
demonstrations; completing a quest might require dialogue or actions (or both).
We introduce a reinforcement learning system that (1) incorporates large-scale
language modeling-based and commonsense reasoning-based pre-training to imbue
the agent with relevant priors; and (2) leverages a factorized action space of
action commands and dialogue, balancing between the two. We conduct zero-shot
evaluations using held-out human expert demonstrations, showing that our agents
are able to act consistently and talk naturally with respect to their
motivations.
Interactive Fiction Games: A Colossal Adventure
A hallmark of human intelligence is the ability to understand and communicate
with language. Interactive Fiction games are fully text-based simulation
environments where a player issues text commands to effect change in the
environment and progress through the story. We argue that IF games are an
excellent testbed for studying language-based autonomous agents. In particular,
IF games combine challenges of combinatorial action spaces, language
understanding, and commonsense reasoning. To facilitate rapid development of
language-based agents, we introduce Jericho, a learning environment for
man-made IF games and conduct a comprehensive study of text-agents across a
rich set of games, highlighting directions in which agents can improve.
Learning to Follow Instructions in Text-Based Games
Text-based games present a unique class of sequential decision-making problems
in which agents interact with a partially observable, simulated environment via
actions and observations conveyed through natural language. Such observations
typically include instructions that, in a reinforcement learning (RL) setting,
can directly or indirectly guide a player towards completing reward-worthy
tasks. In this work, we study the ability of RL agents to follow such
instructions. We conduct experiments that show that the performance of
state-of-the-art text-based game agents is largely unaffected by the presence
or absence of such instructions, and that these agents are typically unable to
execute tasks to completion. To further study and address the task of
instruction following, we equip RL agents with an internal structured
representation of natural language instructions in the form of Linear Temporal
Logic (LTL), a formal language that is increasingly used for temporally
extended reward specification in RL. Our framework supports, and highlights
the benefit of, understanding the temporal semantics of instructions and
measuring progress towards achieving such temporally extended behaviour.
Experiments with 500+ games in TextWorld demonstrate the superior performance
of our approach.
Comment: NeurIPS 202
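
The idea of measuring progress towards a temporally extended instruction, mentioned in the abstract of "Learning to Follow Instructions in Text-Based Games", can be illustrated with standard LTL progression. The sketch below is not the authors' implementation: the tuple encoding of formulas, the function names, and the propositions `has_carrot` and `carrot_cooked` are all assumptions made for illustration. After each game step, the formula is rewritten into what still has to hold in the future; reaching `True` means the instruction has been completed, and `False` means it has been violated.

```python
# Minimal LTL progression sketch (illustrative only; names are hypothetical).
# A formula is a bool, a proposition name (str), or a tuple:
#   ("not", f), ("and", f, g), ("or", f, g),
#   ("next", f), ("eventually", f), ("always", f), ("until", f, g)

def _not(a):
    return (not a) if isinstance(a, bool) else ("not", a)

def _and(a, b):
    if a is False or b is False:
        return False
    if a is True:
        return b
    if b is True:
        return a
    return ("and", a, b)

def _or(a, b):
    if a is True or b is True:
        return True
    if a is False:
        return b
    if b is False:
        return a
    return ("or", a, b)

def progress(formula, true_props):
    """Return what must still hold after observing the set `true_props`."""
    if isinstance(formula, bool):
        return formula
    if isinstance(formula, str):  # atomic proposition
        return formula in true_props
    op = formula[0]
    if op == "not":
        return _not(progress(formula[1], true_props))
    if op == "and":
        return _and(progress(formula[1], true_props),
                    progress(formula[2], true_props))
    if op == "or":
        return _or(progress(formula[1], true_props),
                   progress(formula[2], true_props))
    if op == "next":
        return formula[1]
    if op == "eventually":  # F f  ==  f or X(F f)
        return _or(progress(formula[1], true_props), formula)
    if op == "always":      # G f  ==  f and X(G f)
        return _and(progress(formula[1], true_props), formula)
    if op == "until":       # f U g  ==  g or (f and X(f U g))
        return _or(progress(formula[2], true_props),
                   _and(progress(formula[1], true_props), formula))
    raise ValueError(f"unknown operator: {op!r}")

# Hypothetical instruction: "take the carrot, then cook it".
task = ("eventually", ("and", "has_carrot", ("eventually", "carrot_cooked")))

remaining = task
for observed in [set(), {"has_carrot"}, {"carrot_cooked"}]:
    remaining = progress(remaining, observed)
```

In this example `remaining` stays equal to `task` while nothing relevant has happened, shrinks once the carrot is taken, and becomes `True` once the carrot is cooked. An RL agent could, under these assumptions, receive a reward when the progressed formula changes or reaches `True`.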