Reinforcement Learning
Brains rule the world, and brain-like computation is increasingly used in computers and electronic devices. Brain-like computation is about processing and interpreting data, or directly proposing and performing actions. Learning is a very important aspect. This book is about reinforcement learning, which involves performing actions to achieve a goal. The first 11 chapters describe and extend the scope of reinforcement learning; the remaining 11 chapters show that it is already widely used in numerous fields. Reinforcement learning can tackle control tasks that are too complex for traditional, hand-designed, non-learning controllers. As learning computers can deal with technical complexities, the task of human operators remains to specify goals at increasingly higher levels. This book shows that reinforcement learning is a very dynamic area in terms of both theory and applications, and it should stimulate and encourage new research in this field.
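The core loop described here, performing actions to achieve a goal and learning from the outcomes, can be illustrated with a minimal tabular Q-learning sketch on a toy chain environment (the MDP, hyper-parameters, and episode count below are illustrative, not from the book):

```python
import random

# Hypothetical 5-state chain MDP: move left/right, reward only at the last state.
N_STATES, ACTIONS, GOAL = 5, [0, 1], 4   # action 0 = left, 1 = right

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, eps = 0.5, 0.9, 0.1

random.seed(0)
for _ in range(500):                      # training episodes
    s, done = 0, False
    while not done:
        if random.random() < eps or Q[s][0] == Q[s][1]:
            a = random.choice(ACTIONS)    # explore / break ties randomly
        else:
            a = 0 if Q[s][0] > Q[s][1] else 1
        s2, r, done = step(s, a)
        target = r + (0.0 if done else gamma * max(Q[s2]))
        Q[s][a] += alpha * (target - Q[s][a])
        s = s2

# Greedy policy in every non-terminal state after learning.
greedy = [0 if Q[s][0] > Q[s][1] else 1 for s in range(N_STATES - 1)]
print(greedy)
```

After training, the greedy policy moves right in every non-terminal state, which is the shortest path to the rewarded goal state.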
Code as Policies: Language Model Programs for Embodied Control
Large language models (LLMs) trained on code completion have been shown to be
capable of synthesizing simple Python programs from docstrings [1]. We find
that these code-writing LLMs can be re-purposed to write robot policy code,
given natural language commands. Specifically, policy code can express
functions or feedback loops that process perception outputs (e.g., from object
detectors [2], [3]) and parameterize control primitive APIs. When provided as
input several example language commands (formatted as comments) followed by
corresponding policy code (via few-shot prompting), LLMs can take in new
commands and autonomously re-compose API calls to generate new policy code
accordingly. By chaining classic logic structures and referencing third-party
libraries (e.g., NumPy, Shapely) to perform arithmetic, LLMs used in this way
can write robot policies that (i) exhibit spatial-geometric reasoning, (ii)
generalize to new instructions, and (iii) prescribe precise values (e.g.,
velocities) to ambiguous descriptions ("faster") depending on context (i.e.,
behavioral commonsense). This paper presents code as policies: a robot-centric
formalization of language model generated programs (LMPs) that can represent
reactive policies (e.g., impedance controllers), as well as waypoint-based
policies (vision-based pick and place, trajectory-based control), demonstrated
across multiple real robot platforms. Central to our approach is prompting
hierarchical code-gen (recursively defining undefined functions), which can
write more complex code and also improves state-of-the-art to solve 39.8% of
problems on the HumanEval [1] benchmark. Code and videos are available at
https://code-as-policies.github.io
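The few-shot prompting pattern the abstract describes can be sketched as follows. All API names (`get_obj_pos`, `put_first_on_second`) are illustrative stand-ins for the paper's control primitives, and the LLM call is stubbed with a hard-coded completion:

```python
# Sketch of the Code-as-Policies prompt pattern. The primitive APIs below are
# hypothetical stubs; a real system would call a code-completion LLM.

CALLS = []  # log of control-primitive invocations

def get_obj_pos(name):
    """Stubbed perception output (e.g., from an object detector)."""
    return {"red block": (0.1, 0.2), "blue bowl": (0.4, 0.5)}[name]

def put_first_on_second(src, dst):
    """Stubbed control primitive."""
    CALLS.append(("put", src, dst))

# Few-shot prompt: language commands as comments, policy code underneath.
PROMPT = '''
# put the green block in the yellow bowl.
put_first_on_second("green block", "yellow bowl")
# move the red block to the left of the blue bowl.
x, y = get_obj_pos("blue bowl")
put_first_on_second("red block", (x - 0.1, y))
'''

def llm_complete(prompt):
    """Stand-in for a code-completion LLM; returns a plausible continuation."""
    return 'put_first_on_second("red block", "blue bowl")'

command = "# put the red block in the blue bowl."
policy_code = llm_complete(PROMPT + command + "\n")
exec(policy_code)          # run the generated policy against the robot APIs
print(CALLS)
```

The key design point is that the LLM never acts directly: it emits code that composes a small, trusted API surface, so the generated policy can be inspected before execution.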
Learning concurrent motor skills in versatile solution spaces
Future robots need to autonomously acquire motor
skills in order to reduce their reliance on human programming.
Many motor skill learning methods concentrate
on learning a single solution for a given task. However, discarding
information about additional solutions during learning
unnecessarily limits autonomy. Such favoring of single solutions
often requires re-learning of motor skills when the task, the
environment or the robot’s body changes in a way that renders
the learned solution infeasible. Future robots need to be able to
adapt to such changes and, ideally, have a large repertoire of
movements to cope with such problems. In contrast to current
methods, our approach simultaneously learns multiple distinct
solutions for the same task, such that a partial degeneration of
this solution space does not prevent the successful completion
of the task. In this paper, we present a complete framework
that is capable of learning different solution strategies for a
real robot Tetherball task.
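The fallback behaviour that motivates learning multiple solutions can be sketched in a few lines. The strategy names, parameters, and scores below are purely illustrative, not the paper's actual solution representation:

```python
# Sketch: maintain a repertoire of distinct solutions so the task still
# succeeds when part of the solution space degenerates (all names illustrative).

solutions = {                       # parameter vectors for distinct strategies
    "swing_left":  [0.8, -0.3],
    "swing_right": [0.7,  0.4],
    "lob":         [0.2,  0.9],
}
scores = {"swing_left": 0.9, "swing_right": 0.85, "lob": 0.6}

def feasible(name, blocked):
    # e.g., an environment change now renders that strategy infeasible
    return name not in blocked

def select(blocked=()):
    """Pick the best-scoring solution that is still feasible."""
    candidates = [n for n in solutions if feasible(n, blocked)]
    return max(candidates, key=scores.get)

print(select())                        # best overall strategy
print(select(blocked={"swing_left"}))  # falls back to the next-best strategy
```

A learner that kept only the single best solution would have nothing to fall back on in the second call and would need to re-learn from scratch.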
Automated Reinforcement Learning: An Overview
Reinforcement Learning and recently Deep Reinforcement Learning are popular
methods for solving sequential decision making problems modeled as Markov
Decision Processes. RL modeling of a problem and selecting algorithms and
hyper-parameters require careful considerations as different configurations may
entail completely different performances. These considerations are mainly the
task of RL experts; however, RL is progressively becoming popular in other
fields where the researchers and system designers are not RL experts. Besides,
many modeling decisions, such as defining state and action space, size of
batches and frequency of batch updating, and number of timesteps are typically
made manually. For these reasons, automating different components of RL
framework is of great importance and it has attracted much attention in recent
years. Automated RL provides a framework in which different components of RL
including MDP modeling, algorithm selection and hyper-parameter optimization
are modeled and defined automatically. In this article, we explore the
literature and present recent work that can be used in automated RL. Moreover,
we discuss the challenges, open questions and research directions in AutoRL.
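One AutoRL component mentioned above, hyper-parameter optimization, can be sketched as a simple grid search. The search space and the surrogate objective below are illustrative stand-ins for actual RL training runs, which would be far too expensive to enumerate this way:

```python
import itertools

# Illustrative RL hyper-parameter search space.
SPACE = {
    "lr":         [1e-4, 3e-4, 1e-3],
    "batch_size": [32, 64, 128],
    "gamma":      [0.95, 0.99],
}

def evaluate(cfg):
    """Stand-in for 'train an agent with cfg, return mean episode return'.
    Peaks at lr=3e-4, gamma=0.99 by construction."""
    return -abs(cfg["lr"] - 3e-4) * 1000 - abs(cfg["gamma"] - 0.99) * 10

# Exhaustive grid search over all configurations.
best_cfg = max(
    (dict(zip(SPACE, values)) for values in itertools.product(*SPACE.values())),
    key=evaluate,
)
print(best_cfg)
```

Real AutoRL systems replace the exhaustive loop with sample-efficient strategies (Bayesian optimization, population-based training) precisely because each `evaluate` call is a full training run.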
Learning Representations from EEG with Deep Recurrent-Convolutional Neural Networks
One of the challenges in modeling cognitive events from electroencephalogram
(EEG) data is finding representations that are invariant to inter- and
intra-subject differences, as well as to inherent noise associated with such
data. Herein, we propose a novel approach for learning such representations
from multi-channel EEG time-series, and demonstrate its advantages in the
context of a mental load classification task. First, we transform EEG activities
into a sequence of topology-preserving multi-spectral images, as opposed to
standard EEG analysis techniques that ignore such spatial information. Next, we
train a deep recurrent-convolutional network inspired by state-of-the-art video
classification to learn robust representations from the sequence of images. The
proposed approach is designed to preserve the spatial, spectral, and temporal
structure of EEG which leads to finding features that are less sensitive to
variations and distortions within each dimension. Empirical evaluation on the
cognitive load classification task demonstrated significant improvements in
classification accuracy over current state-of-the-art approaches in this field.
Comment: To be published as a conference paper at ICLR 2016
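The first step, turning multi-channel EEG into a sequence of multi-spectral images, can be sketched with NumPy. The electrode layout, grid size, and band edges below are illustrative; the paper interpolates band powers over a 2-D projection of actual electrode positions:

```python
import numpy as np

# Sketch of the EEG-to-image transform (layout and bands are illustrative).
rng = np.random.default_rng(0)
fs, n_ch, win = 128, 4, 128                   # sampling rate, channels, window
eeg = rng.standard_normal((n_ch, 4 * win))    # 4 windows of synthetic EEG

# Approximate 2-D scalp coordinates (row, col) on a 3x3 grid per channel.
layout = [(0, 1), (1, 0), (1, 2), (2, 1)]

BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}   # Hz

def window_to_image(x):
    """One multi-spectral 3x3 'image' (bands x rows x cols) per time window."""
    freqs = np.fft.rfftfreq(x.shape[1], d=1 / fs)
    power = np.abs(np.fft.rfft(x, axis=1)) ** 2
    img = np.zeros((len(BANDS), 3, 3))
    for b, (lo, hi) in enumerate(BANDS.values()):
        band = power[:, (freqs >= lo) & (freqs < hi)].sum(axis=1)
        for ch, (r, c) in enumerate(layout):
            img[b, r, c] = band[ch]           # band power at the electrode site
    return img

# Sequence of images, later fed to a recurrent-convolutional network.
sequence = np.stack([window_to_image(eeg[:, i * win:(i + 1) * win])
                     for i in range(4)])
print(sequence.shape)   # windows x bands x height x width
```

Because channel positions map to fixed pixel locations, spatial relationships between electrodes survive the transform, which is what lets a convolutional network exploit them.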
The evolution of case grammar
There are few linguistic phenomena that have seduced linguists so skillfully as grammatical case has done. Ever since Panini (4th Century BC), case has claimed a central role in linguistic theory and continues to do so today. However, despite centuries worth of research, case has yet to reveal its most important secrets. This book offers breakthrough explanations for the understanding of case through agent-based experiments in cultural language evolution. The experiments demonstrate that case systems may emerge because they have a selective advantage for communication: they reduce the cognitive effort that listeners need for semantic interpretation, while at the same time limiting the cognitive resources required for doing so.
Efficient Learning with Subgoals and Gaussian Process
This thesis demonstrates how data efficiency in reinforcement learning can be improved through the use of subgoals and Gaussian processes. Data efficiency is extremely important in a range of problems in which gathering additional data is expensive. This tends to be the case in most problems that involve actual interactions with the physical world, such as a robot kicking a ball, an autonomous vehicle driving or a drone manoeuvring.
State-of-the-art data efficiency is achieved on several well-researched problems. The systems that achieve this learn Gaussian process state-transition models of the problem. The model-based learner system uses the state-transition model to learn the action to take in each state. The subgoal planner makes use of the state-transition model to build an explicit plan to solve the problem. The subgoal planner is improved through the use of learned subgoals to aid navigation of the problem space.
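The Gaussian process regression at the heart of such transition models can be sketched in a few lines of NumPy. The kernel, length-scale, and toy dynamics below are illustrative, not the thesis's actual setup:

```python
import numpy as np

# Minimal GP regression sketch for a 1-D state-transition model s' = f(s).
def rbf(A, B, ls=1.0):
    """Squared-exponential (RBF) kernel between two 1-D point sets."""
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

# Observed transitions from illustrative toy dynamics s' = 0.9*s + 0.5.
S = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
S_next = 0.9 * S + 0.5

noise = 1e-6                                  # jitter for numerical stability
K = rbf(S, S) + noise * np.eye(len(S))
alpha = np.linalg.solve(K, S_next)            # precomputed K^{-1} y

def predict(s):
    """GP posterior mean of the next state at a query state s."""
    return float((rbf(np.atleast_1d(s), S) @ alpha)[0])

print(predict(0.5))   # close to the true 0.9*0.5 + 0.5 = 0.95
```

Because the model interpolates between observed transitions rather than requiring dense coverage of the state space, each real-world interaction carries more information, which is the data-efficiency argument made above.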
The resource managed learner balances the costs of computation against the value of selecting better experiments in order to improve data efficiency. An active learning system is used to estimate the value of the experiments in terms of how much they may improve the current solution. This is compared to an estimate of how much better an experiment found by expending additional computation will be along with the costs of performing that computation.
A theoretical framework around the use of subgoals in problem solving is presented. This framework provides insights into when and why subgoals are effective, along with avenues for future research. This includes a detailed proposal for a system built on the subgoal theory framework, intended to make full use of subgoals to create an effective reinforcement learning system.