2,595 research outputs found

    Reinforcement Learning

    Brains rule the world, and brain-like computation is increasingly used in computers and electronic devices. Brain-like computation is about processing and interpreting data, or about deciding on and performing actions, and learning is a central part of it. This book is about reinforcement learning, in which actions are performed in order to achieve a goal. The first 11 chapters describe and extend the scope of reinforcement learning; the remaining 11 chapters show that it is already widely used in numerous fields. Reinforcement learning can tackle control tasks that are too complex for traditional, hand-designed, non-learning controllers. As learning computers take on the technical complexities, the task of the human operator is reduced to specifying goals at increasingly higher levels. This book shows that reinforcement learning is a very dynamic area in both theory and applications, and it should stimulate and encourage new research in the field.
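
    To make the core idea concrete, here is a minimal sketch of tabular Q-learning, a canonical reinforcement learning algorithm of the kind the book covers; the toy chain environment and all parameter values are illustrative assumptions, not taken from the book.

        # Minimal tabular Q-learning sketch: an agent learns to reach the goal
        # at the right end of a small chain environment.
        import numpy as np

        n_states, n_actions = 5, 2           # 5-state chain; actions: left, right
        q = np.zeros((n_states, n_actions))  # action-value table
        alpha, gamma, epsilon = 0.1, 0.95, 0.1
        rng = np.random.default_rng(0)

        def step(state, action):
            """Move left (0) or right (1); reward 1 only at the right end."""
            nxt = max(0, min(n_states - 1, state + (1 if action == 1 else -1)))
            return nxt, float(nxt == n_states - 1), nxt == n_states - 1

        for episode in range(500):
            state = 0
            for _ in range(200):                     # step cap per episode
                if rng.random() < epsilon:
                    action = int(rng.integers(n_actions))
                else:                                # greedy, random tie-breaking
                    best = np.flatnonzero(q[state] == q[state].max())
                    action = int(rng.choice(best))
                nxt, reward, done = step(state, action)
                # temporal-difference update toward the best next-state value
                q[state, action] += alpha * (reward + gamma * q[nxt].max() - q[state, action])
                state = nxt
                if done:
                    break

        print(q)  # "go right" should dominate in every state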

    Code as Policies: Language Model Programs for Embodied Control

    Large language models (LLMs) trained on code completion have been shown to be capable of synthesizing simple Python programs from docstrings [1]. We find that these code-writing LLMs can be re-purposed to write robot policy code, given natural language commands. Specifically, policy code can express functions or feedback loops that process perception outputs (e.g., from object detectors [2], [3]) and parameterize control primitive APIs. When provided with several example language commands (formatted as comments) followed by corresponding policy code (via few-shot prompting), LLMs can take in new commands and autonomously re-compose API calls to generate new policy code. By chaining classic logic structures and referencing third-party libraries (e.g., NumPy, Shapely) to perform arithmetic, LLMs used in this way can write robot policies that (i) exhibit spatial-geometric reasoning, (ii) generalize to new instructions, and (iii) prescribe precise values (e.g., velocities) for ambiguous descriptions ("faster") depending on context (i.e., behavioral commonsense). This paper presents code as policies: a robot-centric formalization of language-model-generated programs (LMPs) that can represent reactive policies (e.g., impedance controllers) as well as waypoint-based policies (vision-based pick and place, trajectory-based control), demonstrated across multiple real robot platforms. Central to our approach is prompting hierarchical code generation (recursively defining undefined functions), which can write more complex code and also improves the state of the art on the HumanEval [1] benchmark, solving 39.8% of its problems. Code and videos are available at https://code-as-policies.github.io
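
    The prompting pattern is easy to sketch. In the snippet below, the perception and control primitives (detect_object, move_to) and the llm.complete client interface are hypothetical stand-ins, and the few-shot examples are illustrative rather than the paper's actual prompt.

        # Sketch of few-shot prompting for robot policy code. The primitives
        # detect_object/move_to and llm.complete are assumed, not real APIs.
        import textwrap

        FEW_SHOT_PROMPT = textwrap.dedent('''\
            # move to the blue block.
            block = detect_object("blue block")
            move_to(block.position)

            # move 10 cm to the right of the red bowl.
            bowl = detect_object("red bowl")
            move_to(bowl.position + np.array([0.10, 0.0, 0.0]))
            ''')

        def generate_policy_code(llm, command):
            """Append the new command as a comment; a code-completion LLM
            (llm.complete is an assumed client interface) writes the policy."""
            return llm.complete(FEW_SHOT_PROMPT + f"# {command}.\n")

        # The generated string is then executed against the robot's primitives:
        # exec(code, {"detect_object": detect_object, "move_to": move_to, "np": np})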

    Learning concurrent motor skills in versatile solution spaces

    Future robots need to autonomously acquire motor skills in order to reduce their reliance on human programming. Many motor-skill learning methods concentrate on learning a single solution for a given task. However, discarding information about additional solutions during learning unnecessarily limits autonomy. Favoring single solutions in this way often requires motor skills to be re-learned when the task, the environment, or the robot's body changes in a way that renders the learned solution infeasible. Future robots need to be able to adapt to such changes and, ideally, should have a large repertoire of movements to cope with them. In contrast to current methods, our approach simultaneously learns multiple distinct solutions for the same task, so that a partial degeneration of this solution space does not prevent successful completion of the task. In this paper, we present a complete framework that is capable of learning different solution strategies for a real-robot tetherball task.
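
    As a loose illustration of the idea (not the paper's actual algorithm), the sketch below runs episodic policy search that keeps several Gaussian solution candidates alive at once, each settling on its own optimum of an invented two-optimum toy task.

        # Illustrative sketch: keep several distinct Gaussian solutions for the
        # same task alive, updating each by reward-weighted averaging. The toy
        # reward has two optima, standing in for two ways to solve one task.
        import numpy as np

        rng = np.random.default_rng(0)

        def reward(theta):
            return (np.exp(-np.sum((theta - 2.0) ** 2))
                    + np.exp(-np.sum((theta + 2.0) ** 2)))

        n_options, dim, n_samples = 2, 2, 50
        means = rng.normal(scale=3.0, size=(n_options, dim))  # one mean per candidate

        for _ in range(100):
            for k in range(n_options):
                samples = means[k] + rng.normal(size=(n_samples, dim))
                w = np.array([reward(s) for s in samples])
                w /= w.sum() + 1e-12
                means[k] = w @ samples  # reward-weighted update keeps mode k local

        print(means)  # the candidates can settle on different optima of the same task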

    Automated Reinforcement Learning: An Overview

    Reinforcement learning (RL), and more recently deep reinforcement learning, are popular methods for solving sequential decision-making problems modeled as Markov decision processes (MDPs). Modeling a problem in RL and selecting algorithms and hyper-parameters require careful consideration, as different configurations can yield completely different performance. These considerations have mainly been the task of RL experts; however, RL is becoming progressively popular in other fields, where researchers and system designers are not RL experts. Moreover, many modeling decisions, such as defining the state and action spaces, the size of batches, the frequency of batch updates, and the number of timesteps, are typically made manually. For these reasons, automating the different components of the RL framework is of great importance, and it has attracted much attention in recent years. Automated RL provides a framework in which the components of RL, including MDP modeling, algorithm selection, and hyper-parameter optimization, are modeled and defined automatically. In this article, we explore the literature and present recent work that can be used in automated RL. Moreover, we discuss the challenges, open questions, and research directions in AutoRL.
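
    A minimal slice of this is hyper-parameter optimization by random search, sketched below; the toy chain environment, the search space, and the budget are illustrative assumptions, not taken from the article.

        # AutoRL sketch: random search over Q-learning hyper-parameters,
        # scoring each configuration by its average return on a toy chain task.
        import numpy as np

        rng = np.random.default_rng(1)

        def evaluate(alpha, gamma, epsilon, episodes=200):
            """Train tabular Q-learning; score is mean return of last 50 episodes."""
            n_states, n_actions = 5, 2
            q = np.zeros((n_states, n_actions))
            returns = []
            for _ in range(episodes):
                state, total = 0, 0.0
                for _ in range(50):                  # step limit per episode
                    if rng.random() < epsilon:
                        a = int(rng.integers(n_actions))
                    else:                            # greedy, random tie-breaking
                        a = int(rng.choice(np.flatnonzero(q[state] == q[state].max())))
                    s2 = max(0, min(n_states - 1, state + (1 if a == 1 else -1)))
                    r = 1.0 if s2 == n_states - 1 else 0.0
                    q[state, a] += alpha * (r + gamma * q[s2].max() - q[state, a])
                    state, total = s2, total + r
                    if r > 0:
                        break
                returns.append(total)
            return float(np.mean(returns[-50:]))

        configs = [{"alpha": rng.uniform(0.01, 0.5),
                    "gamma": rng.uniform(0.8, 0.999),
                    "epsilon": rng.uniform(0.01, 0.3)} for _ in range(20)]
        best = max(configs, key=lambda cfg: evaluate(**cfg))
        print(best)  # the configuration with the highest average return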

    Learning Representations from EEG with Deep Recurrent-Convolutional Neural Networks

    One of the challenges in modeling cognitive events from electroencephalogram (EEG) data is finding representations that are invariant to inter- and intra-subject differences, as well as to the inherent noise associated with such data. Herein, we propose a novel approach for learning such representations from multi-channel EEG time-series and demonstrate its advantages on a mental-load classification task. First, we transform EEG activity into a sequence of topology-preserving multi-spectral images, in contrast to standard EEG analysis techniques, which ignore such spatial information. Next, we train a deep recurrent-convolutional network, inspired by state-of-the-art video classification networks, to learn robust representations from the sequence of images. The proposed approach is designed to preserve the spatial, spectral, and temporal structure of the EEG, which leads to features that are less sensitive to variations and distortions within each dimension. Empirical evaluation on the cognitive-load classification task demonstrated significant improvements in classification accuracy over the current state-of-the-art approaches in this field.
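
    The image-construction step can be sketched as follows: per-electrode power in standard frequency bands is placed at the electrode's 2-D scalp coordinate, giving one image channel per band. The electrode layout, band edges, grid size, and nearest-neighbour projection (the paper interpolates) are simplifying assumptions for this sketch.

        # Build topology-preserving multi-spectral images from multi-channel EEG.
        import numpy as np

        fs = 128                     # sampling rate in Hz, assumed
        bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

        def band_powers(eeg):
            """eeg: (n_channels, n_samples) -> (n_channels, n_bands) band power."""
            freqs = np.fft.rfftfreq(eeg.shape[1], d=1.0 / fs)
            psd = np.abs(np.fft.rfft(eeg, axis=1)) ** 2
            return np.stack([psd[:, (freqs >= lo) & (freqs < hi)].sum(axis=1)
                             for lo, hi in bands.values()], axis=1)

        def to_image(powers, positions, grid=32):
            """Place each electrode's band powers at its grid cell; one image
            channel per frequency band (nearest-neighbour, for brevity)."""
            image = np.zeros((len(bands), grid, grid))
            cells = np.clip((positions * (grid - 1)).astype(int), 0, grid - 1)
            for (x, y), p in zip(cells, powers):
                image[:, y, x] = p
            return image  # a sequence of these feeds the recurrent-convolutional net

        # Example: 8 electrodes at assumed normalized 2-D scalp coordinates.
        positions = np.random.default_rng(0).uniform(0.2, 0.8, size=(8, 2))
        eeg = np.random.default_rng(1).normal(size=(8, fs * 2))  # 2 s of fake EEG
        img = to_image(band_powers(eeg), positions)
        print(img.shape)  # (3, 32, 32): one channel per frequency band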

    The evolution of case grammar

    Few linguistic phenomena have seduced linguists as skillfully as grammatical case. Ever since Panini (4th century BC), case has claimed a central role in linguistic theory, and it continues to do so today. However, despite centuries of research, case has yet to reveal its most important secrets. This book offers breakthrough explanations for the understanding of case through agent-based experiments in cultural language evolution. The experiments demonstrate that case systems may emerge because they have a selective advantage for communication: they reduce the cognitive effort that listeners need for semantic interpretation, while at the same time limiting the cognitive resources required to do so.
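
    The flavour of such agent-based experiments can be conveyed with a far simpler simulation, the minimal naming game sketched below, in which a shared convention emerges purely from the selective advantage of communicative success; it is an illustration only, not one of the book's case-marking experiments.

        # Minimal naming game: agents invent and align on words until one
        # convention wins, driven only by communicative success.
        import random

        random.seed(0)
        n_agents = 20
        agents = [set() for _ in range(n_agents)]  # each agent's word inventory

        for t in range(5000):
            speaker, hearer = random.sample(range(n_agents), 2)
            if not agents[speaker]:
                agents[speaker].add(f"w{t}")       # invent a new word
            word = random.choice(sorted(agents[speaker]))
            if word in agents[hearer]:
                # success: both agents discard their competing words
                agents[speaker] = {word}
                agents[hearer] = {word}
            else:
                agents[hearer].add(word)           # failure: hearer adopts the word

        print(set.union(*agents))  # usually a single shared word survives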

    Efficient Learning with Subgoals and Gaussian Process

    This thesis demonstrates how data efficiency in reinforcement learning can be improved through the use of subgoals and Gaussian processes. Data efficiency is extremely important in problems where gathering additional data is expensive, which tends to be the case whenever actual interaction with the physical world is involved, such as a robot kicking a ball, an autonomous vehicle driving, or a drone manoeuvring. State-of-the-art data efficiency is achieved on several well-researched problems. The systems that achieve this learn Gaussian-process models of the problem's state transitions. The model-based learner uses the state-transition model to learn which action to take in each state. The subgoal planner uses the state-transition model to build an explicit plan for solving the problem, and is improved through learned subgoals that aid navigation of the problem space. The resource-managed learner balances the cost of computation against the value of selecting better experiments in order to improve data efficiency: an active-learning system estimates the value of candidate experiments in terms of how much they may improve the current solution, and this estimate is compared with how much better an experiment found by expending additional computation would be, together with the cost of performing that computation. Finally, a theoretical framework around the use of subgoals in problem solving is presented. The framework provides insights into when and why subgoals are effective, along with avenues for future research, including a detailed proposal for a system built on the subgoal framework and intended to make full use of subgoals in an effective reinforcement-learning system.
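
    The first ingredient, a Gaussian-process model of the state-transition dynamics, can be sketched with scikit-learn; the toy dynamics, kernel choice, and data sizes below are illustrative assumptions, not the thesis's actual setup.

        # Fit a Gaussian-process dynamics model from a small, expensive-to-gather
        # dataset of (state, action) -> next_state transitions.
        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF

        rng = np.random.default_rng(0)

        def true_dynamics(state, action):          # unknown to the learner
            return state + 0.1 * action - 0.05 * np.sin(state)

        states = rng.uniform(-2, 2, size=30)
        actions = rng.uniform(-1, 1, size=30)
        X = np.column_stack([states, actions])
        y = true_dynamics(states, actions)

        gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-4).fit(X, y)

        # The predictive variance shows where the model is uncertain: the kind
        # of signal an active-learning component can weigh against compute cost.
        mean, std = gp.predict(np.array([[0.5, 0.2]]), return_std=True)
        print(mean, std)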