Improving Grounded Natural Language Understanding through Human-Robot Dialog
Natural language understanding for robotics can require substantial domain-
and platform-specific engineering. For example, for mobile robots to
pick-and-place objects in an environment to satisfy human commands, we can
specify the language humans use to issue such commands, and connect concept
words like "red can" to physical object properties. One way to alleviate this
engineering for a new domain is to enable robots in human environments to adapt
dynamically---continually learning new language constructions and perceptual
concepts. In this work, we present an end-to-end pipeline for translating
natural language commands to discrete robot actions, and use clarification
dialogs to jointly improve language parsing and concept grounding. We train and
evaluate this agent in a virtual setting on Amazon Mechanical Turk, and we
transfer the learned agent to a physical robot platform to demonstrate it in
the real world.
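To make the dialog mechanism concrete, below is a minimal Python sketch of the kind of clarification loop the abstract describes. The Parse structure, the parse_command stub, and the confidence threshold are illustrative assumptions, not the authors' actual pipeline; the point is that each confirmed (utterance, parse) pair becomes new training data for both parsing and grounding.

```python
# Minimal sketch of a clarification-dialog loop in the spirit of the
# abstract above. All names (Parse, parse_command, the threshold) are
# illustrative assumptions, not the authors' actual API.
from dataclasses import dataclass

@dataclass
class Parse:
    action: str        # e.g. "bring"
    patient: str       # e.g. "red can"
    recipient: str     # e.g. "alice"
    confidence: float  # parser's confidence in this hypothesis

def parse_command(utterance: str) -> Parse:
    """Stand-in semantic parser; a real system would rank many hypotheses."""
    return Parse("bring", "red can", "alice", 0.4)

def clarification_dialog(utterance: str, threshold: float = 0.8):
    """Ask yes/no confirmation questions until the parse is trusted, then
    keep the (utterance, parse) pair as a new training example."""
    parse = parse_command(utterance)
    while parse.confidence < threshold:
        answer = input(f"You want me to {parse.action} the {parse.patient} "
                       f"to {parse.recipient}? (y/n) ")
        if answer.strip().lower().startswith("y"):
            parse.confidence = 1.0  # user confirmed the hypothesis
        else:
            # Ask for a paraphrase; each paraphrase is more training data.
            utterance = input("Sorry, what should I do? ")
            parse = parse_command(utterance)
    return utterance, parse  # confirmed pair -> retrain parser and grounder
```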
Flexibly Instructable Agents
This paper presents an approach to learning from situated, interactive
tutorial instruction within an ongoing agent. Tutorial instruction is a
flexible (and thus powerful) paradigm for teaching tasks because it allows an
instructor to communicate whatever types of knowledge an agent might need in
whatever situations might arise. To support this flexibility, however, the
agent must be able to learn multiple kinds of knowledge from a broad range of
instructional interactions. Our approach, called situated explanation, achieves
such learning through a combination of analytic and inductive techniques. It
combines a form of explanation-based learning that is situated for each
instruction with a full suite of contextually guided responses to incomplete
explanations. The approach is implemented in an agent called Instructo-Soar
that learns hierarchies of new tasks and other domain knowledge from
interactive natural language instructions. Instructo-Soar meets three key
requirements of flexible instructability that distinguish it from previous
systems: (1) it can take known or unknown commands at any instruction point;
(2) it can handle instructions that apply to either its current situation or to
a hypothetical situation specified in language (as in, for instance,
conditional instructions); and (3) it can learn, from instructions, each class
of knowledge it uses to perform tasks.
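As a rough illustration only (Instructo-Soar is implemented in Soar, not sketched here), the Python fragment below shows how the three requirements above might be wired together: an unknown command triggers a request for instruction, hypothetical situations are handled on a copy of the state, and the explained decomposition is cached as a new task. All names are invented.

```python
# Illustrative sketch, not the actual Soar implementation, of the three
# flexibility requirements: (1) unknown commands at any instruction point,
# (2) hypothetical situations, (3) learning new tasks from instruction.
known_tasks = {"open the door": lambda state: state.add("door-open")}

def ask_instructor(command):
    """Stand-in for tutorial dialog: the instructor decomposes the unknown
    command into steps the agent already knows (or will learn in turn)."""
    print(f"How do I '{command}'?")
    return input("steps (';'-separated): ").split(";")

def execute(command, state, hypothetical=False):
    # (2) Hypothetical instructions run on a copy of the state, so the
    # agent can learn a conditional rule without acting in the world.
    world = set(state) if hypothetical else state
    cmd = command.strip()
    if cmd not in known_tasks:
        # (1) Unknown command at any point: request instruction, then
        # (3) cache the explained decomposition as a new known task.
        steps = ask_instructor(cmd)
        known_tasks[cmd] = lambda s, steps=steps: [execute(t, s) for t in steps]
    known_tasks[cmd](world)
    return world
```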
Improv Theater and Artificial Intelligence
Improvisational theater is an art form in which unscripted theater is performed: dialogue, characters, and actions are created on the spot. Errors made within an improvisational theater scene are encouraged, and can form an input to how the scene evolves. Ultimately, this project focuses on the evolution and creation of artificial intelligence bots interacting with the world of improv theater.

Chatbots Versus Improv Bots

A chatbot is a software application used to conduct an online chat conversation via text or text-to-speech, in lieu of direct contact with a live human agent. There are many different types of chatbots, ranging from regular-expression chatbots like Eliza, which was designed to imitate a therapist, to slot-response chatbots such as Amazon's Alexa, which responds to and acts on commands, to neural nets like GPT-2, BERT, and XLNet, all of which are used for various natural language processing and text classification tasks. The Artificial Improvisor is a form of artificial conversational agent, or chatbot, focused on open-domain dialogue and collaborative narrative generation. Using state-of-the-art machine learning techniques spanning natural language processing, speech recognition, reinforcement learning, and deep learning, these improv bots offer a new kind of conversational agent, distinct from the other types of chatbots above.

[Figure: an example of each type of chatbot, listed in order from left to right.]
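To make the taxonomy concrete, here is a minimal sketch of the first category, a regular-expression chatbot in the spirit of Eliza; the patterns and responses are invented for illustration.

```python
# Minimal regular-expression chatbot in the spirit of ELIZA.
# The rule table and canned responses are illustrative only.
import re

RULES = [
    (re.compile(r"\bi feel (.+)", re.I), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.I),   "How long have you been {0}?"),
    (re.compile(r"\bmy (\w+)", re.I),    "Tell me more about your {0}."),
]

def respond(utterance: str) -> str:
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(*match.groups())
    return "Please, go on."  # default reflection when nothing matches

print(respond("I feel stuck in this scene"))  # -> Why do you feel stuck ...?
```

Slot-response and neural chatbots replace this pattern table with intent/slot classifiers or learned language models, respectively.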
Continually improving grounded natural language understanding through human-robot dialog
As robots become ubiquitous in homes and workplaces such as hospitals and factories, they must be able to communicate with humans. Several kinds of knowledge are required to understand and respond to a human's natural language commands and questions. If a person asks an assistant robot to "take me to Alice's office," the robot must know that Alice is a person who owns some unique office, and that "take me" means it should navigate there. Similarly, if a person requests "bring me the heavy, green mug," the robot must have accurate mental models of the physical concepts heavy, green, and mug. To avoid forcing humans to use key phrases or words robots already know, this thesis focuses on helping robots understand new language constructs through interactions with humans and with the world around them.

To understand a command in natural language, a robot must first convert that command into an internal representation that it can reason with. Semantic parsing is a method for performing this conversion, and the target representation is often a semantic form expressed in predicate logic with lambda calculus. Traditional semantic parsing relies on hand-crafted resources from a human expert: an ontology of concepts, a lexicon connecting language to those concepts, and training examples pairing language with abstract meanings. One thrust of this thesis is to perform semantic parsing with sparse initial data. We use the conversations between a robot and human users to induce pairs of natural language utterances and the target semantic forms the robot discovers through its questions, reducing the annotation effort of creating training examples for parsing. We use this data to build more dialog-capable robots in new domains with much less expert human effort (Thomason et al., 2015; Padmakumar et al., 2017).

The meanings of many language concepts are bound to the physical world. Understanding object properties and categories such as heavy, green, and mug requires interacting with and perceiving the physical world. Embodied robots can use manipulation capabilities, such as pushing, picking up, and dropping objects, to gather sensory data about them. This data can be used to understand non-visual concepts like heavy and empty (e.g., get the empty carton of milk from the fridge), and to assist with concepts that have both visual and non-visual expression (e.g., tall things look big and also exert force sooner than short things when pressed down on). A second thrust of this thesis focuses on strategies for learning these concepts from multi-modal sensory information. We use human-in-the-loop learning to obtain labels connecting concept words to actual objects in the environment (Thomason et al., 2016, 2017). We also explore ways to tease out polysemy and synonymy in concept words (Thomason and Mooney, 2017), such as light, which can refer to a weight or a color, the latter sense being synonymous with pale. Additionally, pushing, picking up, and dropping objects to gather sensory information is prohibitively time-consuming, so we investigate strategies for using linguistic information and human input to expedite exploration when learning a new concept (Thomason et al., 2018).

Finally, we build an integrated agent with both parsing and perception capabilities that learns from conversations with users to improve both components over time. We demonstrate that parser learning from conversations (Thomason et al., 2015) can be combined with multi-modal perception (Thomason et al., 2016), using predicate-object labels gathered through opportunistic active learning (Thomason et al., 2017) during those conversations, to improve performance at understanding natural language commands from humans. Human users also qualitatively rate this integrated learning agent as more usable after it has improved through conversation-based learning.
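As a toy illustration of the target representation the thesis describes (a semantic form in predicate logic with lambda calculus, grounded in perceptual concept models), the Python sketch below maps "bring me the heavy, green mug" to a lambda-calculus-style denotation over candidate objects. The lexicon and object features are invented stand-ins for learned perceptual classifiers.

```python
# Toy illustration of mapping a command to a lambda-calculus-style
# semantic form of the kind described above; the predicates, lexicon,
# and object features are invented for illustration.
from typing import Callable

# Lexicon: concept words -> perceptual predicates over candidate objects.
lexicon: dict[str, Callable[[dict], bool]] = {
    "heavy": lambda obj: obj["weight_kg"] > 0.5,
    "green": lambda obj: obj["color"] == "green",
    "mug":   lambda obj: obj["category"] == "mug",
}

def parse(command: str) -> Callable[[dict], bool]:
    """Stand-in parser for 'bring me the heavy, green mug':
    lambda x. heavy(x) AND green(x) AND mug(x)."""
    words = [w.strip(",") for w in command.lower().split()
             if w.strip(",") in lexicon]
    return lambda obj: all(lexicon[w](obj) for w in words)

denotation = parse("bring me the heavy, green mug")
candidates = [
    {"weight_kg": 0.7, "color": "green", "category": "mug"},
    {"weight_kg": 0.2, "color": "green", "category": "mug"},
]
print([denotation(obj) for obj in candidates])  # -> [True, False]
```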
Text-based Adventures of the Golovin AI Agent
The domain of text-based adventure games has recently been established as a
new challenge: creating an agent that can both understand natural language and
act intelligently in text-described environments.
In this paper, we present our approach to this problem. Our agent, named
Golovin, takes advantage of the limited game domain. We use genre-related
corpora (including fantasy books and decompiled games) to create language
models suited to this domain. Moreover, we embed mechanisms that allow us to
specify, and separately handle, important tasks such as fighting opponents,
managing inventory, and navigating the game map.
We validated the usefulness of these mechanisms by measuring the agent's
performance on a set of 50 interactive fiction games. Finally, we show that our
agent plays at a level comparable to the winner of last year's Text-Based
Adventure AI Competition.
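One concrete way to use genre-related corpora as the abstract describes is sketched below under strong simplifications: build an n-gram language model from genre text and use it to rank candidate commands. The corpus, the add-one smoothing, and the candidate list are illustrative, not Golovin's actual models.

```python
# Hedged sketch: score candidate game commands with a bigram language
# model built from genre text. Corpus and candidates are illustrative.
from collections import Counter
import math

corpus = "take the sword . open the door . kill the troll with the sword ."
tokens = corpus.split()
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))

def score(command: str) -> float:
    """Log-probability of a command under an add-one-smoothed bigram model."""
    words = command.split()
    logp = 0.0
    for prev, word in zip(words, words[1:]):
        logp += math.log((bigrams[(prev, word)] + 1) /
                         (unigrams[prev] + len(unigrams)))
    return logp

candidates = ["take the sword", "sword the take"]
print(max(candidates, key=score))  # -> "take the sword"
```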
Language Understanding for Text-based Games Using Deep Reinforcement Learning
In this paper, we consider the task of learning control policies for
text-based games. In these games, all interactions in the virtual world are
through text and the underlying state is not observed. The resulting language
barrier makes such environments challenging for automatic game players. We
employ a deep reinforcement learning framework to jointly learn state
representations and action policies using game rewards as feedback. This
framework enables us to map text descriptions into vector representations that
capture the semantics of the game states. We evaluate our approach on two game
worlds, comparing against baselines using bag-of-words and bag-of-bigrams for
state representations. Our algorithm outperforms the baselines on both worlds,
demonstrating the importance of learning expressive representations.
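A minimal sketch of the core architectural idea, assuming PyTorch: an LSTM encodes the textual state description into a vector representation, and a linear Q-head scores actions from that vector, so representation and policy are learned jointly from reward. The sizes here are invented and the training loop is omitted; the paper's own model additionally factors Q-values over action and object words.

```python
# Sketch of jointly learned state representations and action values:
# an LSTM text encoder feeding a Q-value head (training loop omitted).
import torch
import torch.nn as nn

class TextQNetwork(nn.Module):
    def __init__(self, vocab_size, n_actions, embed_dim=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.q_head = nn.Linear(hidden, n_actions)  # Q(s, a) per action

    def forward(self, token_ids):                  # (batch, seq_len)
        embedded = self.embed(token_ids)
        _, (state_repr, _) = self.lstm(embedded)   # final hidden = state vector
        return self.q_head(state_repr.squeeze(0))  # (batch, n_actions)

net = TextQNetwork(vocab_size=1000, n_actions=10)
q_values = net(torch.randint(0, 1000, (1, 12)))    # one 12-token description
action = q_values.argmax(dim=-1)                   # greedy action choice
```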