10 research outputs found

    Guest Editorial Active Learning and Intrinsically Motivated Exploration in Robots: Advances and Challenges

    LEARNING techniques are increasingly being used in today's complex robotic systems. Robots are expected to deal with a large variety of tasks using their high-dimensional and complex bodies, to manipulate objects and also to interact with humans in an intuitive and friendly way. In this new setting, not all relevant information is available at design time, and robots should typically be able to learn, through self-experimentation or through human–robot interaction, how to tune their innate perceptual-motor skills or to learn, cumulatively, novel skills that were not preprogrammed initially. In a word, robots need to have the capacity to develop in an open-ended manner and in an open-ended environment, in a way that is analogous to human development, which combines genetic and epigenetic factors. This challenge is at the center of the developmental robotics field. Among the various technical challenges raised by these issues, exploration is paramount: self-experimentation and learning by interacting with the physical and social world are essential to acquire new knowledge and skills.
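    The intrinsically motivated exploration discussed above is often formalized as an intrinsic reward added to (or replacing) the task reward. Below is a minimal sketch of one common formalization, a prediction-error ("curiosity") bonus; the editorial does not prescribe this particular scheme, and the model, names, and toy dynamics here are illustrative assumptions.

```python
# Minimal sketch of a prediction-error ("curiosity") intrinsic reward.
# ForwardModel, intrinsic_reward, and the toy dynamics are illustrative,
# not taken from the editorial.
import numpy as np

class ForwardModel:
    """Linear model of the dynamics: predicts s' from [s; a]."""
    def __init__(self, state_dim, action_dim, lr=1e-2):
        self.W = np.zeros((state_dim, state_dim + action_dim))
        self.lr = lr

    def predict(self, state, action):
        return self.W @ np.concatenate([state, action])

    def update(self, state, action, next_state):
        """One gradient step on squared prediction error; returns the error."""
        x = np.concatenate([state, action])
        err = self.predict(state, action) - next_state
        self.W -= self.lr * np.outer(err, x)
        return float(np.sum(err ** 2))

def intrinsic_reward(model, s, a, s_next):
    """Curiosity bonus: large where the robot's model is still wrong."""
    return model.update(s, a, s_next)

# Toy rollout with a random policy in made-up dynamics. Early transitions
# are surprising (high bonus); as the model improves, the bonus shrinks.
rng = np.random.default_rng(0)
model = ForwardModel(state_dim=4, action_dim=2)
s = rng.normal(size=4)
for step in range(200):
    a = rng.normal(size=2)
    s_next = 0.9 * s + 0.1 * np.tanh(a).repeat(2)   # illustrative dynamics
    bonus = intrinsic_reward(model, s, a, s_next)
    s = s_next
    if step % 50 == 0:
        print(step, round(bonus, 4))
```

    The bonus is largest where the agent's forward model is still wrong, so maximizing it drives the robot toward poorly understood regions of its sensorimotor space, which is the behavior the editorial's notion of self-experimentation aims at.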

    Evolutionary Reinforcement Learning: A Survey

    Reinforcement learning (RL) is a machine learning approach that trains agents to maximize cumulative rewards through interactions with environments. The integration of RL with deep learning has recently resulted in impressive achievements in a wide range of challenging tasks, including board games, arcade games, and robot control. Despite these successes, several crucial challenges remain: brittle convergence caused by sensitive hyperparameters, difficult temporal credit assignment under long time horizons and sparse rewards, a lack of diverse exploration (especially in continuous search spaces), difficult credit assignment in multi-agent RL, and conflicting reward objectives. Evolutionary computation (EC), which maintains a population of learning agents, has demonstrated promising performance in addressing these limitations. This article presents a comprehensive survey of state-of-the-art methods for integrating EC into RL, referred to as evolutionary reinforcement learning (EvoRL). We categorize EvoRL methods according to key research fields in RL, including hyperparameter optimization, policy search, exploration, reward shaping, meta-RL, and multi-objective RL. We then discuss future research directions in terms of efficient methods, benchmarks, and scalable platforms. This survey serves as a resource for researchers and practitioners interested in EvoRL, highlighting the important challenges and opportunities for future research, and should help them develop more efficient methods and tailored benchmarks, further advancing this promising cross-disciplinary field.
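    One of the survey's categories, policy search, can be illustrated in a few lines. The sketch below is a simple evolution-strategies loop in the spirit of the methods surveyed, assuming a quadratic toy objective in place of a real episode return; the function names and constants are illustrative, not taken from the article.

```python
# Minimal sketch of evolutionary policy search: perturb policy parameters,
# score each perturbation by episode return, and move the parameters along
# a return-weighted combination of the perturbations.
import numpy as np

rng = np.random.default_rng(0)

def episode_return(theta):
    """Toy stand-in for an RL rollout: the return peaks at a hidden optimum."""
    target = np.array([0.5, -1.0, 2.0])
    return -np.sum((theta - target) ** 2)

def evolve(pop_size=50, sigma=0.1, alpha=0.05, generations=200):
    theta = np.zeros(3)                          # "policy" parameters
    for _ in range(generations):
        noise = rng.normal(size=(pop_size, theta.size))
        returns = np.array([episode_return(theta + sigma * n) for n in noise])
        # Centered returns yield a Monte Carlo estimate of the return gradient,
        # using nothing but black-box rollouts.
        advantages = returns - returns.mean()
        theta += alpha / (pop_size * sigma) * noise.T @ advantages
    return theta

print(evolve())   # ends near the hidden optimum [0.5, -1.0, 2.0]
```

    Because the update needs only whole-episode returns, it sidesteps per-step credit assignment and parallelizes naturally over the population, which is why EC is attractive for the sparse-reward and exploration challenges listed above.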

    Using Reinforcement Learning in the tuning of Central Pattern Generators

    Master's dissertation in Informatics Engineering. This work applies Reinforcement Learning techniques to robot learning and locomotion tasks. Reinforcement Learning is a useful technique for legged robot locomotion because it emphasizes direct interaction between the agent and the environment, and because it requires neither supervision nor complete models, in contrast with classical approaches. Its aim is to decide which actions to take so as to maximize a cumulative reward or reinforcement signal, taking into account that decisions may affect not only immediate rewards but also future ones. The dissertation presents the Reinforcement Learning framework and its application to the tuning of Central Pattern Generators, with the aim of generating optimized, adaptive robot locomotion. To investigate the strengths and capabilities of Reinforcement Learning, and to demonstrate the learning process of such algorithms in a simple way, two case studies based on the state of the art were implemented. Regarding the main purpose of the thesis, two different solutions are addressed: the first based on Natural Actor-Critic methods, and the second on the Cross-Entropy Method. The latter proved capable of handling the integration of the two proposed approaches. The integration solutions were tested and validated using the Webots simulator and the DARwIN-OP robot model.
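    To make the second approach concrete, the sketch below applies the Cross-Entropy Method to tune the parameters of a toy one-joint CPG (a single sinusoidal oscillator) against a reference trajectory. The oscillator, fitness function, and reference gait are illustrative assumptions; the dissertation instead evaluates candidate parameters on the DARwIN-OP model in the Webots simulator.

```python
# Minimal sketch of Cross-Entropy Method tuning of toy CPG parameters.
# The CPG, fitness, and reference gait are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0, 200)                  # 2 s of simulated time

def cpg_trajectory(params):
    """Toy one-joint CPG: amplitude, frequency (Hz), and offset of a sine."""
    amplitude, frequency, offset = params
    return offset + amplitude * np.sin(2.0 * np.pi * frequency * t)

def fitness(params):
    """Score a candidate by closeness to a hypothetical reference gait."""
    reference = 0.3 * np.sin(2.0 * np.pi * 1.5 * t)
    return -np.mean((cpg_trajectory(params) - reference) ** 2)

def cross_entropy_method(pop_size=60, n_elite=10, iterations=50):
    mean = np.array([0.1, 1.0, 0.1])            # initial (amplitude, freq, offset)
    std = np.array([0.5, 0.5, 0.5])
    for _ in range(iterations):
        samples = rng.normal(mean, std, size=(pop_size, 3))
        scores = np.array([fitness(s) for s in samples])
        elite = samples[np.argsort(scores)[-n_elite:]]   # keep the best candidates
        mean = elite.mean(axis=0)                        # refit the Gaussian
        std = elite.std(axis=0) + 1e-3                   # jitter avoids collapse
    return mean

print(cross_entropy_method())   # should approach (0.3, 1.5, 0.0)
```

    Each iteration refits a Gaussian over the CPG parameters to the highest-scoring samples, so the search distribution contracts around parameter settings that reproduce the desired gait.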