
    Generating Effective Instructions: Knowing When to Stop

    One aspect of Natural Language generation is describing entities so that they are distinguished from all other entities. Entities include objects, events, actions, and states. Much attention has been paid to objects and the generation of their referring expressions (descriptions meant to pick out or refer to an entity). However, a growing area of research is the automated generation of instruction manuals, and an important part of generating instructions is distinguishing the actions that are to be carried out from other possible actions. One distinguishing feature is an action's termination, or when the performance of the action is to stop. My dissertation work focuses on generating action descriptions from action information using the SPUD generation algorithm developed here at Penn by Matthew Stone. In my work, I concentrate on the generation of expressions of termination information as part of action descriptions. The problems I address include how termination information is represented in action information and expressed in Natural Language, how to determine when an action description allows the reader to understand how to perform the action correctly, and how to generate the appropriate description of action information.
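    The core idea of a distinguishing description, which this abstract extends from objects to actions, can be sketched with a greedy attribute-selection loop in the style of classic referring-expression work. This is an illustrative toy, not the SPUD algorithm itself; the attribute names and example actions are invented.

    ```python
    def distinguishing_description(target, distractors, attributes):
        """Greedily add attributes until the target action is distinguished
        from all distractor actions (a toy incremental-algorithm sketch)."""
        description = {}
        remaining = list(distractors)
        for attr in attributes:
            value = target[attr]
            # Keep an attribute only if it rules out at least one distractor.
            if any(d.get(attr) != value for d in remaining):
                description[attr] = value
                remaining = [d for d in remaining if d.get(attr) == value]
            if not remaining:
                break
        # None signals that the available attributes cannot distinguish the target.
        return description if not remaining else None

    # "Termination" here plays the distinguishing role the abstract highlights:
    # two stirring actions differ only in when they stop.
    target = {"type": "stir", "termination": "until thick"}
    others = [{"type": "stir", "termination": "for 30s"},
              {"type": "pour", "termination": "for 30s"}]
    desc = distinguishing_description(target, others, ["type", "termination"])
    ```

    In this toy case the action type alone does not suffice, so the termination condition must be expressed, which mirrors the abstract's point that termination is a distinguishing feature of actions.
    
    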

    ERRA: An Embodied Representation and Reasoning Architecture for Long-horizon Language-conditioned Manipulation Tasks

    This letter introduces ERRA, an embodied learning architecture that enables robots to jointly obtain three fundamental capabilities (reasoning, planning, and interaction) for solving long-horizon language-conditioned manipulation tasks. ERRA is based on tightly-coupled probabilistic inferences at two granularity levels. Coarse-resolution inference is formulated as sequence generation through a large language model, which infers action language from the natural language instruction and environment state. The robot then zooms to the fine-resolution inference part to perform the concrete action corresponding to the action language. Fine-resolution inference is constructed as a Markov decision process, which takes action language and environmental sensing as observations and outputs the action. The results of action execution in the environment provide feedback for subsequent coarse-resolution reasoning. Such coarse-to-fine inference allows the robot to decompose and achieve long-horizon tasks interactively. In extensive experiments, we show that ERRA can complete various long-horizon manipulation tasks specified by abstract language instructions. We also demonstrate successful generalization to novel but similar natural language instructions. Comment: Accepted to IEEE Robotics and Automation Letters (RA-L).
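    The coarse-to-fine loop the abstract describes can be sketched as follows. The two inference levels are reduced to toy stand-in functions (`llm_infer` for the LLM and `mdp_policy` for the MDP policy); the function names, action strings, and environment dictionary are all illustrative, not from the paper.

    ```python
    def llm_infer(instruction, env_state):
        # Coarse level: a large language model would map the instruction plus
        # environment state to a short "action language" command.
        if "cup" in env_state["visible"]:
            return "pick up the cup"
        return "search for the cup"

    def mdp_policy(action_language, sensing):
        # Fine level: an MDP policy would turn action language plus raw
        # sensing into a concrete motor action; a lookup stands in here.
        return {"pick up the cup": "grasp",
                "search for the cup": "scan"}[action_language]

    def run_task(instruction, env_state, max_steps=5):
        trace = []
        for _ in range(max_steps):
            act_lang = llm_infer(instruction, env_state)   # coarse inference
            action = mdp_policy(act_lang, env_state)       # fine inference
            trace.append((act_lang, action))
            if action == "grasp":                          # task achieved
                return trace
            # Execution results feed back into the next coarse-level step.
            env_state["visible"].append("cup")
        return trace

    trace = run_task("bring me the cup", {"visible": []})
    ```

    The key structural point is the feedback edge: each concrete action's outcome updates the state that the next coarse-resolution inference conditions on, which is how the long-horizon task gets decomposed interactively.
    
    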

    Subwords as Skills: Tokenization for Sparse-Reward Reinforcement Learning

    Exploration in sparse-reward reinforcement learning is difficult due to the requirement of long, coordinated sequences of actions in order to achieve any reward. Moreover, in continuous action spaces there are an infinite number of possible actions, which only increases the difficulty of exploration. One class of methods designed to address these issues forms temporally extended actions, often called skills, from interaction data collected in the same domain, and optimizes a policy on top of this new action space. Typically such methods require a lengthy pretraining phase, especially in continuous action spaces, in order to form the skills before reinforcement learning can begin. Given prior evidence that the full range of the continuous action space is not required in such tasks, we propose a novel approach to skill-generation with two components. First we discretize the action space through clustering, and second we leverage a tokenization technique borrowed from natural language processing to generate temporally extended actions. Such a method outperforms baselines for skill-generation in several challenging sparse-reward domains, and requires orders-of-magnitude less computation in skill-generation and online rollouts.
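    The two-stage recipe in the abstract, discretize continuous actions by clustering, then apply an NLP tokenization technique to form temporally extended actions, can be sketched with nearest-center assignment and byte-pair-encoding-style merges. The cluster centers, 1-D actions, and merge count are illustrative assumptions; the paper's actual clustering and tokenizer may differ.

    ```python
    from collections import Counter

    def discretize(actions, centers):
        """Stage 1: assign each continuous (1-D, toy) action to its
        nearest cluster center, yielding a discrete token per step."""
        return [min(range(len(centers)), key=lambda i: abs(a - centers[i]))
                for a in actions]

    def bpe_merges(tokens, num_merges):
        """Stage 2: BPE-style merging. Repeatedly fuse the most frequent
        adjacent token pair; each merged pair is a temporally extended
        action (a 'skill')."""
        tokens = [(t,) for t in tokens]
        skills = []
        for _ in range(num_merges):
            pairs = Counter(zip(tokens, tokens[1:]))
            if not pairs:
                break
            (a, b), _ = pairs.most_common(1)[0]
            skills.append(a + b)
            merged, i = [], 0
            while i < len(tokens):
                if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
                    merged.append(a + b)   # replace the pair with the skill
                    i += 2
                else:
                    merged.append(tokens[i])
                    i += 1
            tokens = merged
        return skills

    acts = [0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.5]
    codes = discretize(acts, centers=[0.0, 0.5, 1.0])
    skills = bpe_merges(codes, num_merges=1)
    ```

    A policy would then act in the induced skill space, so one decision covers several primitive timesteps, which is exactly what helps exploration under sparse reward.
    
    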

    Towards Neural Numeric-To-Text Generation From Temporal Personal Health Data

    With an increased interest in the production of personal health technologies designed to track user data (e.g., nutrient intake, step counts), there is now more opportunity than ever to surface meaningful behavioral insights to everyday users in the form of natural language. This knowledge can increase their behavioral awareness and allow them to take action to meet their health goals. It can also bridge the gap between the vast collection of personal health data and the summary generation required to describe an individual's behavioral tendencies. Previous work has focused on rule-based time-series data summarization methods designed to generate natural language summaries of interesting patterns found within temporal personal health data. We examine recurrent, convolutional, and Transformer-based encoder-decoder models to automatically generate natural language summaries from numeric temporal personal health data. We showcase the effectiveness of our models on real user health data logged in MyFitnessPal and show that we can automatically generate high-quality natural language summaries. Our work serves as a first step towards the ambitious goal of automatically generating novel and meaningful temporal summaries from personal health data. Comment: 5 pages, 2 figures, 1 table.
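    The abstract contrasts its neural encoder-decoder models with the earlier rule-based time-series summarizers. A minimal rule-based baseline of that flavor might look like the sketch below; the thresholds, field names, and wording are illustrative, not taken from the prior work.

    ```python
    def summarize_intake(days, values, goal):
        """Toy rule-based summary of a numeric health time series:
        count goal hits and report the overall direction of change."""
        hits = sum(v <= goal for v in values)          # days at or under goal
        trend = "rose" if values[-1] > values[0] else "fell or held steady"
        return (f"Over the past {days} days, you met your goal on {hits} "
                f"of them; your intake {trend} overall.")

    text = summarize_intake(7, [2100, 1900, 2000, 1850, 1800, 1750, 1700],
                            goal=2000)
    ```

    The neural models the abstract studies replace these hand-written rules with a learned sequence-to-sequence mapping from the numeric series directly to summary text.
    
    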