
    Intrinsic Motivation and Mental Replay enable Efficient Online Adaptation in Stochastic Recurrent Networks

    Autonomous robots need to interact with unknown, unstructured and changing environments, constantly facing novel challenges. Continuous online adaptation for lifelong learning, together with sample-efficient mechanisms to adapt to changes in the environment, the constraints, the tasks, or the robot itself, is therefore crucial. In this work, we propose a novel framework for probabilistic online motion planning with online adaptation, based on a bio-inspired stochastic recurrent neural network. By using learning signals that mimic the intrinsic motivation signal of cognitive dissonance, together with a mental replay strategy to intensify experiences, the stochastic recurrent network can learn from few physical interactions and adapt to novel environments within seconds. We evaluate our online planning and adaptation framework on an anthropomorphic KUKA LWR arm. The rapid online adaptation is demonstrated by learning unknown workspace constraints sample-efficiently from few physical interactions while following given waypoints. Comment: accepted in Neural Networks
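    As a rough illustration of the mechanism described above, the sketch below (Python, not the authors' code) scales updates to a toy forward model by a "cognitive dissonance" signal, i.e. the mismatch between prediction and observation, and intensifies the few real interactions through mental replay of a small experience buffer. The linear model, the replay count and the dissonance scaling are illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(0)

        # Toy linear forward model predicting the next state from (state, action);
        # it stands in for the stochastic recurrent network, purely for illustration.
        W = rng.normal(scale=0.1, size=(2, 4))
        replay_buffer = []          # stored interactions used for mental replay

        def dissonance(pred, observed):
            # Intrinsic-motivation-style signal: mismatch between expectation and
            # observation, used to scale how strongly an experience is learned.
            return np.linalg.norm(pred - observed)

        def online_update(state, action, observed, lr=0.05, n_replays=5):
            global W
            replay_buffer.append((state, action, observed))
            # Mental replay: re-present a few stored experiences to intensify
            # learning from sparse physical interactions.
            for _ in range(n_replays):
                s, a, o = replay_buffer[rng.integers(len(replay_buffer))]
                x = np.concatenate([s, a])
                pred = W @ x
                W += lr * dissonance(pred, o) * np.outer(o - pred, x)

        # One simulated interaction with an unknown environment dynamics matrix.
        true_A = rng.normal(size=(2, 4))
        state, action = rng.normal(size=2), rng.normal(size=2)
        observed = true_A @ np.concatenate([state, action])
        online_update(state, action, observed)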

    A brief review of neural networks based learning and control and their applications for robots

    As an imitation of biological nervous systems, neural networks (NNs), which are characterized by their powerful learning ability, have been employed in a wide range of applications, such as control of complex nonlinear systems, optimization, system identification and pattern recognition. This article provides a brief review of state-of-the-art NNs for complex nonlinear systems. Recent progress of NNs in both theoretical developments and practical applications is investigated and surveyed. Specifically, NN-based robot learning and control applications are further reviewed, including NN-based robot manipulator control, NN-based human-robot interaction, and NN-based behavior recognition and generation.

    Towards a unified approach

    "Decision-making in the presence of uncertainty is a pervasive computation. Latent variable decoding—inferring hidden causes underlying visible effects—is commonly observed in nature, and it is an unsolved challenge in modern machine learning. On many occasions, animals need to base their choices on uncertain evidence; for instance, when deciding whether to approach or avoid an obfuscated visual stimulus that could be either a prey or a predator. Yet, their strategies are, in general, poorly understood. In simple cases, these problems admit an optimal, explicit solution. However, in more complex real-life scenarios, it is difficult to determine the best possible behavior. The most common approach in modern machine learning relies on artificial neural networks—black boxes that map each input to an output. This input-output mapping depends on a large number of parameters, the weights of the synaptic connections, which are optimized during learning.(...)

    Minimizing Control for Credit Assignment with Strong Feedback

    The success of deep learning ignited interest in whether the brain learns hierarchical representations using gradient-based learning. However, current biologically plausible methods for gradient-based credit assignment in deep neural networks need infinitesimally small feedback signals, which is problematic in biologically realistic noisy environments and at odds with experimental evidence in neuroscience showing that top-down feedback can significantly influence neural activity. Building upon deep feedback control (DFC), a recently proposed credit assignment method, we combine strong feedback influences on neural activity with gradient-based learning and show that this naturally leads to a novel view of neural network optimization. Instead of gradually changing the network weights towards configurations with low output loss, weight updates gradually minimize the amount of feedback required from a controller that drives the network to the supervised output label. Moreover, we show that the use of strong feedback in DFC allows learning forward and feedback connections simultaneously, using learning rules fully local in space and time. We complement our theoretical results with experiments on standard computer-vision benchmarks, showing competitive performance to backpropagation as well as robustness to noise. Overall, our work presents a fundamentally novel view of learning as control minimization, while sidestepping biologically unrealistic assumptions.
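    The "learning as control minimization" idea can be caricatured in a few lines: a controller injects feedback until the network's output matches the label, and local weight updates then absorb that feedback so less control is needed on the next presentation. The following Python sketch is a loose simplification, not the DFC algorithm of the paper; the integral controller, the feedback pathway (here simply the transpose of the output weights) and the update rules are assumptions made for illustration.

        import numpy as np

        rng = np.random.default_rng(1)
        W1 = rng.normal(scale=0.1, size=(8, 4))   # hidden weights of a toy linear net
        W2 = rng.normal(scale=0.1, size=(2, 8))   # output weights

        def train_step(x, target, eta=0.1, gain=0.2, ctrl_steps=100):
            global W1, W2
            u1, u2 = np.zeros(8), np.zeros(2)      # controller feedback per layer
            for _ in range(ctrl_steps):
                h = W1 @ x + u1                    # strongly controlled activities
                y = W2 @ h + u2
                err = target - y
                u2 += gain * err                   # integral controller on the output
                u1 += gain * (W2.T @ err)          # ...and, via a feedback path, on hidden
            h = W1 @ x + u1
            # Local, control-minimizing updates: fold the feedback into the weights
            # so the feedforward sweep needs less controller help next time.
            W2 += eta * np.outer(u2, h)
            W1 += eta * np.outer(u1, x)

        # One supervised example (shapes chosen to match the toy network).
        train_step(x=rng.normal(size=4), target=np.array([1.0, -1.0]))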

    Model-based reinforcement learning and navigation in animals and machines

    For decades, neuroscientists and psychologists have observed that animal performance on spatial navigation tasks suggests an internal learned map of the environment. More recently, map-based (or model-based) reinforcement learning has become a highly active research area in machine learning. With a learned model of their environment, both animals and artificial agents can generalize between tasks and learn rapidly. In this thesis, I present approaches for developing efficient model-based behaviour in machines and explaining model-based behaviour in animals. From a neuroscience perspective, I focus on the hippocampus, believed to be a major substrate of model-based behaviour in the brain. I consider how hippocampal connectivity enables path-finding between different locations in an environment. The model describes how environments with boundaries and barriers can be represented in recurrent neural networks (i.e. attractor networks), and how the transient activity in these networks, after being stimulated with a goal location, could be used for determining a path to the goal. I also propose how the connectivity of these map-like networks can be learned from the spatial firing patterns observed in the input pathway to the hippocampus (i.e. grid cells and border cells). From a machine learning perspective, I describe a reinforcement learning model that integrates model-based methods and "episodic control", an approach to reinforcement learning based on episodic memory. According to episodic control, the agent learns how to act in the environment by storing snapshot-like memories of its observations, then comparing its current observations to similar snapshot memories where it took an action that resulted in high reward. In our approach, the agent augments these real-world memories with episodes simulated offline using a learned model of the environment. These "simulated memories" allow the agent to adapt faster when the reward locations change. Next, I describe Variational State Tabulation (VaST), a model-based method for learning quickly with continuous and high-dimensional observations (like those found in 3D navigation tasks). The VaST agent learns to map its observations to a limited number of discrete abstract states, and build a transition model over those abstract states. The long-term values of different actions in each state are updated continuously and efficiently in the background as the agent explores the environment. I show how the VaST agent can learn faster than other state-of-the-art algorithms, even changing its policy after a single new experience, and how it can respond quickly to changing rewards in complex 3D environments. The models I present allow the agent to rapidly adapt to changing goals and rewards, a key component of intelligence. They use a combination of features attributed to model-based and episodic controllers, suggesting that the division between the two fields is not strict. I therefore also consider the consequences of these findings on theories of model-based learning, episodic control and hippocampal function.
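    As a concrete illustration of one piece of the above, here is a minimal Python sketch of episodic control augmented with model-simulated episodes: real snapshots of (observation, action, return) are stored, actions are chosen by comparing the current observation to nearby snapshots, and a learned model is rolled out offline to add "simulated memories". The nearest-neighbour lookup, the model(obs, action) interface and the return bookkeeping are illustrative assumptions, not the thesis implementation.

        import numpy as np

        rng = np.random.default_rng(2)
        memory = []    # episodic store of (observation, action, return) snapshots

        def store(obs, action, ret):
            memory.append((np.asarray(obs, dtype=float), int(action), float(ret)))

        def act(obs, n_actions=4, k=5):
            # Episodic control: look up the k most similar stored observations and
            # pick the action whose nearby memories achieved the highest return.
            if not memory:
                return int(rng.integers(n_actions))
            obs_mat = np.array([m[0] for m in memory])
            nearest = np.argsort(np.linalg.norm(obs_mat - obs, axis=1))[:k]
            values = np.full(n_actions, -np.inf)
            for i in nearest:
                _, a, ret = memory[i]
                values[a] = max(values[a], ret)
            return int(np.argmax(values))

        def simulate_and_store(model, start_obs, horizon=10, gamma=0.9):
            # "Simulated memories": roll out a learned transition model offline and
            # store the resulting episode as if it had been experienced.
            obs, transitions = np.asarray(start_obs, dtype=float), []
            for _ in range(horizon):
                a = act(obs)
                next_obs, r = model(obs, a)        # hypothetical learned model
                transitions.append((obs, a, r))
                obs = next_obs
            ret = 0.0
            for o, a, r in reversed(transitions):  # discounted return from each step
                ret = r + gamma * ret
                store(o, a, ret)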