Generating Effective Instructions: Knowing When to Stop
One aspect of Natural Language Generation is describing entities so that they are distinguished from all other entities. Entities include objects, events, actions, and states. Much attention has been paid to objects and the generation of their referring expressions (descriptions meant to pick out or refer to an entity). However, a growing area of research is the automated generation of instruction manuals, and an important part of generating instructions is distinguishing the actions to be carried out from other possible actions. One distinguishing feature is an action's termination, or when the performance of the action is to stop. My dissertation work focuses on generating action descriptions from action information using the SPUD generation algorithm developed here at Penn by Matthew Stone. In my work, I concentrate on the generation of expressions of termination information as part of action descriptions. The problems I address include how termination information is represented in action information and expressed in Natural Language, how to determine when an action description allows the reader to understand how to perform the action correctly, and how to generate the appropriate description of action information.
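The abstract's core idea, adding distinguishing properties (such as termination) until a target action is set apart from its alternatives, can be illustrated with a small sketch. This is not the SPUD algorithm; it is a toy incremental-selection routine over hypothetical attribute dictionaries, with all action data and attribute names invented for illustration.

```python
# Toy sketch: distinguish a target action from distractor actions by
# incrementally adding properties (termination included) until no
# distractor matches. NOT the SPUD algorithm; data is illustrative.

def distinguish(target, distractors, attribute_order):
    """Add attribute values of `target` until every distractor is ruled out."""
    description = {}
    remaining = list(distractors)
    for attr in attribute_order:
        if not remaining:
            break
        value = target.get(attr)
        if value is None:
            continue
        survivors = [d for d in remaining if d.get(attr) == value]
        if len(survivors) < len(remaining):  # attribute discriminates
            description[attr] = value
            remaining = survivors
    return description if not remaining else None

# Hypothetical representations of "stir the sauce until thickened"
# and two competing actions.
target = {"verb": "stir", "object": "sauce", "termination": "until thickened"}
distractors = [
    {"verb": "stir", "object": "sauce", "termination": "for 10 seconds"},
    {"verb": "pour", "object": "sauce", "termination": "until empty"},
]
desc = distinguish(target, distractors, ["verb", "object", "termination"])
# Only the discriminating attributes (verb, termination) are kept;
# "object" rules nothing out and is omitted.
```

Here the termination expression is what finally separates the target from the remaining same-verb distractor, mirroring the abstract's point that termination is a distinguishing feature of actions.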
ERRA: An Embodied Representation and Reasoning Architecture for Long-horizon Language-conditioned Manipulation Tasks
This letter introduces ERRA, an embodied learning architecture that enables
robots to jointly obtain three fundamental capabilities (reasoning, planning,
and interaction) for solving long-horizon language-conditioned manipulation
tasks. ERRA is based on tightly-coupled probabilistic inferences at two
granularity levels. Coarse-resolution inference is formulated as sequence
generation through a large language model, which infers action language from
natural language instruction and environment state. The robot then shifts to
fine-resolution inference to perform the concrete action corresponding to
the action language. Fine-resolution inference is constructed as a Markov
decision process, which takes action language and environmental sensing as
observations and outputs the action. The results of action execution in
environments provide feedback for subsequent coarse-resolution reasoning. Such
coarse-to-fine inference allows the robot to decompose and achieve long-horizon
tasks interactively. In extensive experiments, we show that ERRA can complete
various long-horizon manipulation tasks specified by abstract language
instructions. We also demonstrate successful generalization to novel but
similar natural language instructions.
Comment: Accepted to IEEE Robotics and Automation Letters (RA-L).
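The coarse-to-fine loop the ERRA abstract describes can be sketched as a simple control cycle: coarse inference proposes the next action-language step, fine inference turns it into a concrete action, and execution feedback updates the state for the next coarse step. The planner and policy below are stand-in stubs (the real system uses a large language model and a learned MDP policy); the task and steps are invented for illustration.

```python
# Sketch of a coarse-to-fine control loop in the style ERRA describes.
# coarse_planner and fine_policy are illustrative stubs, not the paper's
# models: coarse = LLM-style step proposal, fine = MDP-style policy.

def coarse_planner(instruction, state):
    """Stub coarse inference: (instruction, state) -> next action-language
    step, or None when the task is complete."""
    plan = {"tidy the table": ["pick up cup", "place cup in bin"]}
    steps = plan.get(instruction, [])
    done = state.get("steps_done", 0)
    return steps[done] if done < len(steps) else None

def fine_policy(action_language, observation):
    """Stub fine inference: action language + sensing -> concrete action."""
    return {"command": action_language, "obs": observation}

def run(instruction, env_state):
    executed = []
    while True:
        step = coarse_planner(instruction, env_state)
        if step is None:              # coarse level declares task done
            break
        action = fine_policy(step, dict(env_state))
        executed.append(action["command"])
        # Execution feedback closes the loop for the next coarse inference.
        env_state["steps_done"] = env_state.get("steps_done", 0) + 1
    return executed

result = run("tidy the table", {})
```

The point of the structure is that the long-horizon task is never planned end-to-end at the fine level; each coarse step is re-inferred after the environment changes.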
Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning
Exploration in sparse-reward reinforcement learning is difficult due to the
requirement of long, coordinated sequences of actions in order to achieve any
reward. Moreover, in continuous action spaces there are an infinite number of
possible actions, which only increases the difficulty of exploration. One class
of methods designed to address these issues forms temporally extended actions,
often called skills, from interaction data collected in the same domain, and
optimizes a policy on top of this new action space. Typically such methods
require a lengthy pretraining phase, especially in continuous action spaces, in
order to form the skills before reinforcement learning can begin. Given prior
evidence that the full range of the continuous action space is not required in
such tasks, we propose a novel approach to skill-generation with two
components. First we discretize the action space through clustering, and second
we leverage a tokenization technique borrowed from natural language processing
to generate temporally extended actions. Such a method outperforms baselines
for skill-generation in several challenging sparse-reward domains, and requires
orders of magnitude less computation in skill-generation and online rollouts.
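The two components the abstract names can be sketched in miniature. The clustering step is elided here (trajectories arrive already discretized into cluster IDs), and the tokenization step is a byte-pair-style merge: the most frequent adjacent pair of action tokens becomes a new "skill" token. This is a toy illustration, not the paper's implementation, and the trajectory data is invented.

```python
from collections import Counter

# Toy byte-pair-style skill extraction over discretized action
# trajectories. Real systems would first cluster continuous actions;
# here trajectories are already sequences of cluster IDs.

def most_frequent_pair(trajectories):
    counts = Counter()
    for traj in trajectories:
        for a, b in zip(traj, traj[1:]):
            counts[(a, b)] += 1
    return counts.most_common(1)[0][0] if counts else None

def merge_pair(traj, pair, new_token):
    out, i = [], 0
    while i < len(traj):
        if i + 1 < len(traj) and (traj[i], traj[i + 1]) == pair:
            out.append(new_token)   # replace the pair with a skill token
            i += 2
        else:
            out.append(traj[i])
            i += 1
    return out

def learn_skills(trajectories, num_merges):
    skills = {}
    for k in range(num_merges):
        pair = most_frequent_pair(trajectories)
        if pair is None:
            break
        name = f"skill_{k}"
        skills[name] = pair
        trajectories = [merge_pair(t, pair, name) for t in trajectories]
    return skills, trajectories

trajs = [[0, 1, 2, 0, 1], [0, 1, 0, 1, 2]]
skills, rewritten = learn_skills(trajs, num_merges=1)
```

After one merge, the frequent pair (0, 1) becomes a single temporally extended token, which is exactly the shrinking of the effective action space that makes sparse-reward exploration easier.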
Towards Neural Numeric-To-Text Generation From Temporal Personal Health Data
With an increased interest in the production of personal health technologies
designed to track user data (e.g., nutrient intake, step counts), there is now
more opportunity than ever to surface meaningful behavioral insights to
everyday users in the form of natural language. This knowledge can increase
their behavioral awareness and allow them to take action to meet their health
goals. It can also bridge the gap between the vast collection of personal
health data and the summary generation required to describe an individual's
behavioral tendencies. Previous work has focused on rule-based time-series data
summarization methods designed to generate natural language summaries of
interesting patterns found within temporal personal health data. We examine
recurrent, convolutional, and Transformer-based encoder-decoder models to
automatically generate natural language summaries from numeric temporal
personal health data. We showcase the effectiveness of our models on real user
health data logged in MyFitnessPal and show that we can automatically generate
high-quality natural language summaries. Our work serves as a first step
towards the ambitious goal of automatically generating novel and meaningful
temporal summaries from personal health data.
Comment: 5 pages, 2 figures, 1 table.
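The rule-based baselines this abstract contrasts with can be illustrated with a tiny sketch: a protoform-style template filled from simple statistics over a week of logged values. The nutrient, thresholds, quantifier wording, and data below are all illustrative, not taken from the paper or from MyFitnessPal.

```python
from statistics import mean

# Toy protoform-style weekly summary, in the spirit of the rule-based
# time-series summarization baselines the abstract mentions.
# Thresholds, wording, and data are illustrative.

def weekly_summary(nutrient, values, goal):
    avg = mean(values)
    days_over = sum(v > goal for v in values)
    if days_over >= 5:
        quantifier = "most"
    elif days_over >= 3:
        quantifier = "about half of the"
    else:
        quantifier = "few"
    return (f"On {quantifier} days this week, your {nutrient} intake "
            f"exceeded your goal (avg {avg:.0f} vs goal {goal}).")

sodium_mg = [2600, 2400, 2100, 2700, 2550, 1900, 2800]
text = weekly_summary("sodium", sodium_mg, goal=2300)
```

A neural encoder-decoder, as examined in the paper, replaces the hand-written quantifier rules and templates with a model that maps the numeric sequence directly to the summary text.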