13 research outputs found

Sample Efficiency Improvement on Neuroevolution via Estimation-Based Elimination Strategy (Extended Abstract)

Author: Hirotaka Moriguchi
Shengbo Xu
Shinichi Honiden
Yuki Inoue
Publication date: 11/04/2020

In this paper, we propose an estimation-based elimination strategy that improves the sample efficiency of NeuroEvolution (NE) algorithms. The fitness of a new individual is estimated from the fitness of individuals evaluated in past generations: the estimate is the average fitness of those individuals that correlate highly with the new individual. The strategy then avoids evaluating individuals with low estimated fitness. We adapt the estimation-based elimination strategy to state-of-the-art NE algorithms: CMA-NeuroES and CMA-TWEANN. Experimental results on pole-balancing benchmark tasks show that the proposed strategy improves the sample efficiency of these NE algorithms.
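The estimation step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes genomes are fixed-length weight vectors, that "correlation" means the Pearson correlation between two genomes, and that the threshold `min_corr` and the elimination `cutoff` are free parameters. All function names are hypothetical.

```python
from math import sqrt
from typing import List, Optional, Sequence, Tuple

# An archive entry pairs a previously evaluated genome with its measured fitness.
Archive = List[Tuple[Sequence[float], float]]


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def estimate_fitness(candidate: Sequence[float], archive: Archive,
                     min_corr: float = 0.9) -> Optional[float]:
    """Average the fitness of archived genomes that correlate highly with the candidate.

    Returns None when no archived genome is similar enough to support an estimate.
    """
    similar = [fit for genome, fit in archive if pearson(candidate, genome) >= min_corr]
    if not similar:
        return None
    return sum(similar) / len(similar)


def should_evaluate(candidate: Sequence[float], archive: Archive,
                    cutoff: float, min_corr: float = 0.9) -> bool:
    """Skip the costly evaluation only when the estimated fitness falls below the cutoff."""
    estimate = estimate_fitness(candidate, archive, min_corr)
    return estimate is None or estimate >= cutoff
```

In this sketch, a candidate with no sufficiently similar archived genome is always evaluated, so the elimination only applies where past evaluations give some evidence.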

CiteSeerX

A comparison of action selection methods for implicit policy method reinforcement learning in continuous action-space

Author: Nichols Barry D.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/11/2016

In this paper I investigate methods of applying reinforcement learning to continuous state- and action-space problems without a policy function. I compare the performance of four methods: one discretises the action-space, and the other three are optimisation techniques that find the greedy action without discretisation. The optimisation methods I apply are gradient descent, Nelder-Mead and Newton's Method. The action selection methods are applied in conjunction with the SARSA algorithm, with a multilayer perceptron used to approximate the value function. The approaches are applied to two simulated continuous state- and action-space control problems: Cart-Pole and double Cart-Pole. The results are compared in terms of both action selection time and the number of trials required to train on the benchmark problems.
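The contrast between discretised and optimisation-based greedy action selection can be sketched as below. This is an illustrative assumption-laden example, not the paper's code: `toy_q` is a hypothetical quadratic stand-in for the multilayer perceptron value function, the action is one-dimensional, and the gradient is taken numerically. Maximising Q by gradient ascent is equivalent to gradient descent on the negated value.

```python
from typing import Callable

QFunction = Callable[[float, float], float]  # (state, action) -> estimated value


def greedy_action_discretised(q: QFunction, state: float, low: float, high: float,
                              n_bins: int = 11) -> float:
    """Evaluate Q on an evenly spaced grid of actions and return the best grid point."""
    step = (high - low) / (n_bins - 1)
    candidates = [low + i * step for i in range(n_bins)]
    return max(candidates, key=lambda a: q(state, a))


def greedy_action_gradient(q: QFunction, state: float, low: float, high: float,
                           start: float = 0.0, lr: float = 0.1,
                           iters: int = 200, eps: float = 1e-5) -> float:
    """Climb Q with respect to the action, clipping to the valid action range."""
    action = min(max(start, low), high)
    for _ in range(iters):
        # Central-difference estimate of dQ/da at the current action.
        grad = (q(state, action + eps) - q(state, action - eps)) / (2 * eps)
        action = min(max(action + lr * grad, low), high)
    return action


def toy_q(state: float, action: float) -> float:
    """Hypothetical value function whose best action is 0.37 * state."""
    return -(action - 0.37 * state) ** 2
```

With `toy_q`, the grid search can only return the nearest grid point to the optimum, while the gradient-based search can settle between grid points at the cost of repeated value-function evaluations per action choice.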

Crossref

Middlesex University Research Repository

A comparison of action selection methods for implicit policy method reinforcement learning in continuous action-space

Author: Nichols B.
Publication venue: IEEE
Publication date: 01/01/2016

Middlesex University Research Repository

Fast Damage Recovery in Robotics with the T-Resilience Algorithm

Author: Cully Antoine
Koos Sylvain
Mouret Jean-Baptiste
Publication venue: 'SAGE Publications'
Publication date: 02/02/2013

Damage recovery is critical for autonomous robots that need to operate for a long time without assistance. Most current methods are complex and costly because they require anticipating each potential damage in order to have a contingency plan ready. As an alternative, we introduce the T-Resilience algorithm, a new algorithm that allows robots to quickly and autonomously discover compensatory behaviors in unanticipated situations. This algorithm equips the robot with a self-model and discovers new behaviors by learning to avoid those that perform differently in the self-model and in reality. Our algorithm thus does not identify the damaged parts but implicitly searches for efficient behaviors that do not use them. We evaluate the T-Resilience algorithm on a hexapod robot that needs to adapt to leg removal, broken legs and motor failures; we compare it to stochastic local search, policy gradient and the self-modeling algorithm proposed by Bongard et al. The behavior of the robot is assessed on-board using an RGB-D sensor and a SLAM algorithm. Using only 25 tests on the robot and an overall running time of 20 minutes, T-Resilience consistently leads to substantially better results than the other approaches.

arXiv.org e-Print Archive

Crossref

HAL Descartes

Spiral - Imperial College Digital Repository

Hal-Diderot

Evolving the behavior of machines: from micro to macroevolution

Author: Mouret Jean-Baptiste
Publication venue: 'Elsevier BV'
Publication date: 01/10/2020

Evolution gave rise to creatures that are arguably more sophisticated than the greatest human-designed systems. This feat has inspired computer scientists since the advent of computing and led to optimization tools that can evolve complex neural networks for machines, an approach known as "neuroevolution". After a few successes in designing evolvable representations for high-dimensional artifacts, the field has recently been revitalized by going beyond optimization: to many, the wonder of evolution is less in the perfect optimization of each species than in the creativity of such a simple iterative process, that is, in the diversity of species. This modern view of artificial evolution is moving the field away from microevolution, following a fitness gradient in a niche, to macroevolution, filling many niches with highly different species. It has already opened promising applications, like evolving gait repertoires, video game levels for different tastes, and diverse designs for aerodynamic bikes.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL-Rennes 1

Modeling the Evolution of Beliefs Using an Attentional Focus Mechanism

Author: Bossaerts Peter
Gläscher Jan
Kiebel Stefan J.
Marković Dimitrije
O'Doherty John P.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/10/2015

To make decisions in everyday life, we often first have to infer the set of environmental features that are relevant for the current task. Here we investigated the computational mechanisms underlying the evolution of beliefs about the relevance of environmental features in a dynamical and noisy environment. For this purpose we designed a probabilistic Wisconsin card sorting task (WCST) with belief solicitation, in which subjects were presented with stimuli composed of multiple visual features. At each moment in time a particular feature was relevant for obtaining reward, and participants had to infer which feature was relevant and report their beliefs accordingly. To test the hypothesis that attentional focus modulates the belief update process, we derived and fitted several probabilistic and non-probabilistic behavioral models, which incorporate either a dynamical model of attentional focus, in the form of a hierarchical winner-take-all neuronal network, or a diffusive model without attention-like features. We used Bayesian model selection to identify the most likely generative model of subjects’ behavior and found that attention-like features in the behavioral model are essential for explaining subjects’ responses. Furthermore, we demonstrate a method for integrating both connectionist and Bayesian models of decision making within a single framework that allowed us to infer hidden belief processes of human subjects.

Directory of Open Access Journals

PubMed Central

Caltech Authors

MPG.PuRe

University of Melbourne Institutional Repository

FigShare

Evolutionary Reinforcement Learning: A Survey

Author: Bai Hui
Cheng Ran
Jin Yaochu
Publication date: 10/03/2023

Reinforcement learning (RL) is a machine learning approach that trains agents to maximize cumulative rewards through interactions with environments. The integration of RL with deep learning has recently resulted in impressive achievements in a wide range of challenging tasks, including board games, arcade games, and robot control. Despite these successes, there remain several crucial challenges, including brittle convergence properties caused by sensitive hyperparameters, difficulties in temporal credit assignment with long time horizons and sparse rewards, a lack of diverse exploration, especially in continuous search space scenarios, difficulties in credit assignment in multi-agent reinforcement learning, and conflicting objectives for rewards. Evolutionary computation (EC), which maintains a population of learning agents, has demonstrated promising performance in addressing these limitations. This article presents a comprehensive survey of state-of-the-art methods for integrating EC into RL, referred to as evolutionary reinforcement learning (EvoRL). We categorize EvoRL methods according to key research fields in RL, including hyperparameter optimization, policy search, exploration, reward shaping, meta-RL, and multi-objective RL. We then discuss future research directions in terms of efficient methods, benchmarks, and scalable platforms. This survey serves as a resource for researchers and practitioners interested in the field of EvoRL, highlighting the important challenges and opportunities for future research. With the help of this survey, researchers and practitioners can develop more efficient methods and tailored benchmarks for EvoRL, further advancing this promising cross-disciplinary research field.

arXiv.org e-Print Archive