
    Online hyper-evolution of controllers in multirobot systems

    In this paper, we introduce online hyper-evolution (OHE) to accelerate and increase the performance of online evolution of robotic controllers. Robots executing OHE use the different sources of feedback information traditionally associated with controller evaluation to find effective evolutionary algorithms and controllers online during task execution. We present two approaches: OHE-fitness, which uses the fitness score of controllers as the criterion to select promising algorithms over time, and OHE-diversity, which relies on the behavioural diversity of controllers for algorithm selection. Both OHE-fitness and OHE-diversity are distributed across groups of robots that evolve in parallel. We assess the performance of OHE-fitness and of OHE-diversity in two foraging tasks of differing complexity, and in five configurations of a dynamic phototaxis task with varying evolutionary pressures. Results show that our OHE approaches: (i) outperform multiple state-of-the-art algorithms, as they yield controllers with superior performance and faster evolution of solutions, and (ii) can increase effectiveness at different stages of evolution by combining the benefits of multiple algorithms over time. Overall, our study shows that OHE is an effective new paradigm for the synthesis of controllers for robots.
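The idea of selecting among evolutionary algorithms by the fitness of the controllers they produce can be illustrated with a small sketch. This is not the OHE-fitness procedure from the paper: the class name `FitnessBasedSelector`, the epsilon-greedy rule, and the algorithm names are all assumptions introduced here for illustration only.

```python
import random


class FitnessBasedSelector:
    """Hypothetical selector: prefers the algorithm whose controllers scored best so far."""

    def __init__(self, algorithm_names, epsilon=0.1, seed=0):
        # One list of observed controller fitness scores per candidate algorithm.
        self.scores = {name: [] for name in algorithm_names}
        self.epsilon = epsilon            # probability of trying a random algorithm
        self.rng = random.Random(seed)

    def record(self, name, fitness):
        """Store the fitness of a controller produced by algorithm `name`."""
        self.scores[name].append(fitness)

    def select(self):
        """Pick the next algorithm to run (assumed epsilon-greedy rule)."""
        untried = [n for n, s in self.scores.items() if not s]
        if untried:
            return untried[0]             # try every algorithm at least once
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.scores))
        # Otherwise exploit: highest mean controller fitness.
        return max(self.scores, key=lambda n: sum(self.scores[n]) / len(self.scores[n]))
```

A robot would call `select()` before each evolutionary step and `record()` after evaluating the resulting controller; a diversity-based variant would record a behavioural-diversity measure instead of fitness.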

    Using Parameterized Black-Box Priors to Scale Up Model-Based Policy Search for Robotics

    The most data-efficient algorithms for reinforcement learning in robotics are model-based policy search algorithms, which alternate between learning a dynamical model of the robot and optimizing a policy to maximize the expected return given the model and its uncertainties. Among the few proposed approaches, the recently introduced Black-DROPS algorithm exploits a black-box optimization algorithm to achieve both high data-efficiency and good computation times when several cores are used; nevertheless, like all model-based policy search approaches, Black-DROPS does not scale to high-dimensional state/action spaces. In this paper, we introduce a new model learning procedure in Black-DROPS that leverages parameterized black-box priors to (1) scale up to high-dimensional systems, and (2) be robust to large inaccuracies of the prior information. We demonstrate the effectiveness of our approach with the "pendubot" swing-up task in simulation and with a physical hexapod robot (48D state space, 18D action space) that has to walk forward as fast as possible. The results show that our new algorithm is more data-efficient than previous model-based policy search algorithms (with and without priors) and that it can allow a physical 6-legged robot to learn new gaits in only 16 to 30 seconds of interaction time.
    Comment: Accepted at ICRA 2018; 8 pages, 4 figures, 2 algorithms, 1 table; Video at https://youtu.be/HFkZkhGGzTo ; Spotlight ICRA presentation at https://youtu.be/_MZYDhfWeL
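The alternation between model learning and policy optimization described above can be sketched on a toy problem. This is not Black-DROPS: the one-dimensional linear system, the least-squares model, the grid search over a single policy gain, and every function name below are assumptions chosen to keep the example short, and model uncertainty is ignored.

```python
import random


def true_step(x, u):
    # Hypothetical "real" system, unknown to the learner: x' = 0.9 x + 0.5 u
    return 0.9 * x + 0.5 * u


def collect_episode(gain, rng, x0=1.0, steps=10):
    """Run the policy u = -gain * x (plus exploration noise) on the real system."""
    data, x = [], x0
    for _ in range(steps):
        u = -gain * x + rng.gauss(0.0, 0.1)
        x_next = true_step(x, u)
        data.append((x, u, x_next))
        x = x_next
    return data


def fit_model(data):
    """Least-squares fit of x' = a x + b u by solving the 2x2 normal equations."""
    sxx = sum(x * x for x, u, y in data)
    sxu = sum(x * u for x, u, y in data)
    suu = sum(u * u for x, u, y in data)
    sxy = sum(x * y for x, u, y in data)
    suy = sum(u * y for x, u, y in data)
    det = sxx * suu - sxu * sxu
    return (sxy * suu - suy * sxu) / det, (suy * sxx - sxy * sxu) / det


def predicted_cost(a, b, gain, x0=1.0, horizon=10):
    """Sum of squared states when the policy is rolled out on the learned model."""
    x, cost = x0, 0.0
    for _ in range(horizon):
        x = a * x + b * (-gain * x)
        cost += x * x
    return cost


def optimize_policy(a, b):
    """Grid search over the gain, using only the learned model (no real rollouts)."""
    candidates = [i / 10 for i in range(31)]
    return min(candidates, key=lambda g: predicted_cost(a, b, g))


def model_based_policy_search(iterations=3, seed=0):
    rng = random.Random(seed)
    gain, data = 0.0, []
    for _ in range(iterations):
        data += collect_episode(gain, rng)   # 1. interact with the real system
        a, b = fit_model(data)               # 2. learn the dynamics model
        gain = optimize_policy(a, b)         # 3. improve the policy on the model
    return gain, (a, b)
```

Only step 1 consumes interaction time on the robot; steps 2 and 3 are pure computation, which is why this family of methods is described as data-efficient.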

    Meta Reinforcement Learning with Latent Variable Gaussian Processes

    Learning from small data sets is critical in many practical applications where data collection is time consuming or expensive, e.g., robotics, animal experiments or drug design. Meta learning is one way to increase the data efficiency of learning algorithms by generalizing learned concepts from a set of training tasks to unseen, but related, tasks. Often, this relationship between tasks is hard coded or relies in some other way on human expertise. In this paper, we frame meta learning as a hierarchical latent variable model and infer the relationship between tasks automatically from data. We apply our framework in a model-based reinforcement learning setting and show that our meta-learning model effectively generalizes to novel tasks by identifying how new tasks relate to prior ones from minimal data. This results in up to a 60% reduction in the average interaction time needed to solve tasks compared to strong baselines.
    Comment: 11 pages, 7 figures

    Evolutionary online behaviour learning and adaptation in real robots

    Online evolution of behavioural control on real robots is an open-ended approach to autonomous learning and adaptation: robots have the potential to automatically learn new tasks and to adapt to changes in environmental conditions, or to failures in sensors and/or actuators. However, studies have so far almost exclusively been carried out in simulation because evolution in real hardware has required several days or weeks to produce capable robots. In this article, we successfully evolve neural network-based controllers in real robotic hardware to solve two single-robot tasks and one collective robotics task. Controllers are evolved either from random solutions or from solutions pre-evolved in simulation. In all cases, capable solutions are found in a timely manner (1 h or less). Results show that more accurate simulations may lead to higher-performing controllers, and that completing the optimization process in real robots is meaningful, even if solutions found in simulation differ from solutions in reality. We furthermore demonstrate for the first time the adaptive capabilities of online evolution in real robotic hardware, including robots able to overcome faults injected in the motors of multiple units simultaneously, and to modify their behaviour in response to changes in the task requirements. We conclude by assessing the contribution of each algorithmic component to the performance of the underlying evolutionary algorithm.

    Active Inference for Integrated State-Estimation, Control, and Learning

    This work presents an approach for control, state-estimation and learning of model (hyper)parameters for robotic manipulators. It is based on the active inference framework, prominent in computational neuroscience as a theory of the brain, where behaviour arises from minimizing variational free-energy. The robotic manipulator shows adaptive and robust behaviour compared to state-of-the-art methods. Additionally, we show the exact relationship to classic methods such as PID control. Finally, we show that by learning a temporal parameter and model variances, our approach can deal with unmodelled dynamics, damp oscillations, and remain robust against disturbances and poor initial parameters. The approach is validated on the 'Franka Emika Panda' 7 DoF manipulator.
    Comment: 7 pages, 6 figures, accepted for presentation at the International Conference on Robotics and Automation (ICRA) 202