
    Efficient Intrinsically Motivated Robotic Grasping with Learning-Adaptive Imagination in Latent Space

    Combining model-based and model-free deep reinforcement learning has shown great promise for improving sample efficiency on complex control tasks while still retaining high performance. Incorporating imagination is a recent effort in this direction inspired by human mental simulation of motor behavior. We propose a learning-adaptive imagination approach which, unlike previous approaches, takes into account the reliability of the learned dynamics model used for imagining the future. Our approach learns an ensemble of disjoint local dynamics models in latent space and derives an intrinsic reward based on learning progress, motivating the controller to take actions leading to data that improves the models. The learned models are used to generate imagined experiences, augmenting the training set of real experiences. We evaluate our approach on learning vision-based robotic grasping and show that it significantly improves sample efficiency and achieves near-optimal performance in a sparse reward environment. Comment: In: Proceedings of the Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics (ICDL-EpiRob), Oslo, Norway, Aug. 19-22, 201

    State Representation Learning for Control: An Overview

    Representation learning algorithms are designed to learn abstract features that characterize data. State representation learning (SRL) focuses on a particular kind of representation learning where learned features are low-dimensional, evolve through time, and are influenced by the actions of an agent. The representation is learned to capture the variation in the environment generated by the agent's actions; this kind of representation is particularly suitable for robotics and control scenarios. In particular, the low dimensionality of the representation helps to overcome the curse of dimensionality, provides easier interpretation and utilization by humans, and can help improve performance and speed in policy learning algorithms such as reinforcement learning. This survey aims at covering the state of the art in state representation learning in the most recent years. It reviews different SRL methods that involve interaction with the environment, their implementations, and their applications in robotics control tasks (simulated or real). In particular, it highlights how generic learning objectives are exploited differently in the reviewed algorithms. Finally, it discusses evaluation methods to assess the learned representation and summarizes current and future lines of research.

    Learning to reach and reaching to learn: a unified approach to path planning and reactive control through reinforcement learning

    The next generation of intelligent robots will need to be able to plan reaches. Not just ballistic point to point reaches, but reaches around things such as the edge of a table, a nearby human, or any other known object in the robot’s workspace. Planning reaches may seem easy to us humans, because we do it so intuitively, but it has proven to be a challenging problem, which continues to limit the versatility of what robots can do today. In this document, I propose a novel intrinsically motivated RL system that draws on both Path/Motion Planning and Reactive Control. Through Reinforcement Learning, it tightly integrates these two previously disparate approaches to robotics. The RL system is evaluated on a task, which is as yet unsolved by roboticists in practice. That is to put the palm of the iCub humanoid robot on arbitrary target objects in its workspace, starting from arbitrary initial configurations. Such motions can be generated by planning, or searching the configuration space, but this typically results in some kind of trajectory, which must then be tracked by a separate controller, and such an approach offers a brittle runtime solution because it is inflexible. Purely reactive systems are robust to many problems that render a planned trajectory infeasible, but lacking the capacity to search, they tend to get stuck behind constraints, and therefore do not replace motion planners. The planner/controller proposed here is novel in that it deliberately plans reaches without the need to track trajectories. Instead, reaches are composed of sequences of reactive motion primitives, implemented by my Modular Behavioral Environment (MoBeE), which provides (fictitious) force control with reactive collision avoidance by way of a realtime kinematic/geometric model of the robot and its workspace.
Thus, to the best of my knowledge, mine is the first reach planning approach to simultaneously offer the best of both the Path/Motion Planning and Reactive Control approaches. By controlling the real, physical robot directly, and feeling the influence of the constraints imposed by MoBeE, the proposed system learns a stochastic model of the iCub’s configuration space. Then, the model is exploited as a multiple query path planner to find sensible pre-reach poses, from which to initiate reaching actions. Experiments show that the system can autonomously find practical reaches to target objects in workspace and offers excellent robustness to changes in the workspace configuration as well as noise in the robot’s sensory-motor apparatus.

    A Search For Principles of Basal Ganglia Function

    The basal ganglia are a group of subcortical nuclei that contain about 100 million neurons in humans. Different modes of basal ganglia dysfunction lead to Parkinson's disease and Huntington's disease, which have debilitating motor and cognitive symptoms. However, despite intensive study, both the internal computational mechanisms of the basal ganglia, and their contribution to normal brain function, have been elusive. The goal of this thesis is to identify basic principles that underlie basal ganglia function, with a focus on signal representation, computation, dynamics, and plasticity. This process begins with a review of two current hypotheses of normal basal ganglia function, one being that they automatically select actions on the basis of past reinforcement, and the other that they compress cortical signals that tend to occur in conjunction with reinforcement. It is argued that a wide range of experimental data are consistent with these mechanisms operating in series, and that in this configuration, compression makes selection practical in natural environments. Although experimental work is outside the present scope, an experimental means of testing this proposal in the future is suggested. The remainder of the thesis builds on Eliasmith & Anderson's Neural Engineering Framework (NEF), which provides an integrated theoretical account of computation, representation, and dynamics in large neural circuits. The NEF provides considerable insight into basal ganglia function, but its explanatory power is potentially limited by two assumptions that the basal ganglia violate. First, like most large-network models, the NEF assumes that neurons integrate multiple synaptic inputs in a linear manner. However, synaptic integration in the basal ganglia is nonlinear in several respects. 
Three modes of nonlinearity are examined, including nonlinear interactions between dendritic branches, nonlinear integration within terminal branches, and nonlinear conductance-current relationships. The first mode is shown to affect neuron tuning. The other two modes are shown to enable alternative computational mechanisms that facilitate learning, and make computation more flexible, respectively. Secondly, while the NEF assumes that the feedforward dynamics of individual neurons are dominated by the dynamics of post-synaptic current, many basal ganglia neurons also exhibit prominent spike-generation dynamics, including adaptation, bursting, and hysteresis. Of these, it is shown that the NEF theory of network dynamics applies fairly directly to certain cases of firing-rate adaptation. However, more complex dynamics, including nonlinear dynamics that are diverse across a population, can be described using the NEF equations for representation. In particular, a neuron's response can be characterized in terms of a more complex function that extends over both present and past inputs. It is therefore straightforward to apply NEF methods to interpret the effects of complex cell dynamics at the network level. The role of spike timing in basal ganglia function is also examined. Although the basal ganglia have been interpreted in the past to perform computations on the basis of mean firing rates (over windows of tens or hundreds of milliseconds), it has recently become clear that patterns of spikes on finer timescales are also functionally relevant. Past work has shown that precise spike times in sensory systems contain stimulus-related information, but there has been little study of how post-synaptic neurons might use this information. It is shown that essentially any neuron can use this information to perform flexible computations, and that these computations do not require spike timing that is very precise.
As a consequence, irregular and highly-variable firing patterns can drive behaviour with which they have no detectable correlation. Most of the projection neurons in the basal ganglia are inhibitory, and the effect of one nucleus on another is classically interpreted as subtractive or divisive. Theoretically, very flexible computations can be performed within a projection if each presynaptic neuron can both excite and inhibit its targets, but this is hardly ever the case physiologically. However, it is shown here that equivalent computational flexibility is supported by inhibitory projections in the basal ganglia, as a simple consequence of inhibitory collaterals in the target nuclei. Finally, the relationship between population coding and synaptic plasticity is discussed. It is shown that Hebbian plasticity, in conjunction with lateral connections, determines both the dimension of the population code and the tuning of neuron responses within the coded space. These results permit a straightforward interpretation of the effects of synaptic plasticity on information processing at the network level. Together with the NEF, these new results provide a rich set of theoretical principles through which the dominant physiological factors that affect basal ganglia function can be more clearly understood

    Brain Computations and Connectivity [2nd edition]

    This is an open access title available under the terms of a CC BY-NC-ND 4.0 International licence. It is free to read on the Oxford Academic platform and offered as a free PDF download from OUP and selected open access locations. Brain Computations and Connectivity is about how the brain works. In order to understand this, it is essential to know what is computed by different brain systems; and how the computations are performed. The aim of this book is to elucidate what is computed in different brain systems; and to describe current biologically plausible computational approaches and models of how each of these brain systems computes. Understanding the brain in this way has enormous potential for understanding ourselves better in health and in disease. Potential applications of this understanding are to the treatment of the brain in disease; and to artificial intelligence which will benefit from knowledge of how the brain performs many of its extraordinarily impressive functions. This book is pioneering in taking this approach to brain function: to consider what is computed by many of our brain systems; and how it is computed. It updates, with much new evidence including the connectivity of the human brain, the earlier book: Rolls (2021) Brain Computations: What and How, Oxford University Press. Brain Computations and Connectivity will be of interest to all scientists interested in brain function and how the brain works, whether they are from neuroscience, or from medical sciences including neurology and psychiatry, or from the area of computational science including machine learning and artificial intelligence, or from areas such as theoretical physics.