373 research outputs found

    Evolutionary Reinforcement Learning: A Survey

    Full text link
    Reinforcement learning (RL) is a machine learning approach that trains agents to maximize cumulative rewards through interactions with environments. The integration of RL with deep learning has recently resulted in impressive achievements in a wide range of challenging tasks, including board games, arcade games, and robot control. Despite these successes, there remain several crucial challenges, including brittle convergence properties caused by sensitive hyperparameters, difficulties in temporal credit assignment with long time horizons and sparse rewards, a lack of diverse exploration, especially in continuous search space scenarios, difficulties in credit assignment in multi-agent reinforcement learning, and conflicting objectives for rewards. Evolutionary computation (EC), which maintains a population of learning agents, has demonstrated promising performance in addressing these limitations. This article presents a comprehensive survey of state-of-the-art methods for integrating EC into RL, referred to as evolutionary reinforcement learning (EvoRL). We categorize EvoRL methods according to key research fields in RL, including hyperparameter optimization, policy search, exploration, reward shaping, meta-RL, and multi-objective RL. We then discuss future research directions in terms of efficient methods, benchmarks, and scalable platforms. This survey serves as a resource for researchers and practitioners interested in the field of EvoRL, highlighting the important challenges and opportunities for future research. With the help of this survey, researchers and practitioners can develop more efficient methods and tailored benchmarks for EvoRL, further advancing this promising cross-disciplinary research field

    Self-adaptive fitness in evolutionary processes

    Get PDF
    Most optimization algorithms or methods in artificial intelligence can be regarded as evolutionary processes. They start from (basically) random guesses and produce increasingly better results with respect to a given target function, which is defined by the process's designer. The value of the achieved results is communicated to the evolutionary process via a fitness function that is usually somewhat correlated with the target function but does not need to be exactly the same. When the values of the fitness function change purely for reasons intrinsic to the evolutionary process, i.e., even though the externally motivated goals (as represented by the target function) remain constant, we call that phenomenon self-adaptive fitness. We trace the phenomenon of self-adaptive fitness back to emergent goals in artificial chemistry systems, for which we develop a new variant based on neural networks. We perform an in-depth analysis of diversity-aware evolutionary algorithms as a prime example of how to effectively integrate self-adaptive fitness into evolutionary processes. We sketch the concept of productive fitness as a new tool to reason about the intrinsic goals of evolution. We introduce the pattern of scenario co-evolution, which we apply to a reinforcement learning agent competing against an evolutionary algorithm to improve performance and generate hard test cases and which we also consider as a more general pattern for software engineering based on a solid formal framework. Multiple connections to related topics in natural computing, quantum computing and artificial intelligence are discovered and may shape future research in the combined fields.Die meisten Optimierungsalgorithmen und die meisten Verfahren in Bereich künstlicher Intelligenz können als evolutionäre Prozesse aufgefasst werden. Diese beginnen mit (prinzipiell) zufällig geratenen Lösungskandidaten und erzeugen dann immer weiter verbesserte Ergebnisse für gegebene Zielfunktion, die der Designer des gesamten Prozesses definiert hat. Der Wert der erreichten Ergebnisse wird dem evolutionären Prozess durch eine Fitnessfunktion mitgeteilt, die normalerweise in gewissem Rahmen mit der Zielfunktion korreliert ist, aber auch nicht notwendigerweise mit dieser identisch sein muss. Wenn die Werte der Fitnessfunktion sich allein aus für den evolutionären Prozess intrinsischen Gründen ändern, d.h. auch dann, wenn die extern motivierten Ziele (repräsentiert durch die Zielfunktion) konstant bleiben, nennen wir dieses Phänomen selbst-adaptive Fitness. Wir verfolgen das Phänomen der selbst-adaptiven Fitness zurück bis zu künstlichen Chemiesystemen (artificial chemistry systems), für die wir eine neue Variante auf Basis neuronaler Netze entwickeln. Wir führen eine tiefgreifende Analyse diversitätsbewusster evolutionärer Algorithmen durch, welche wir als Paradebeispiel für die effektive Integration von selbst-adaptiver Fitness in evolutionäre Prozesse betrachten. Wir skizzieren das Konzept der produktiven Fitness als ein neues Werkzeug zur Untersuchung von intrinsischen Zielen der Evolution. Wir führen das Muster der Szenarien-Ko-Evolution (scenario co-evolution) ein und wenden es auf einen Agenten an, der mittels verstärkendem Lernen (reinforcement learning) mit einem evolutionären Algorithmus darum wetteifert, seine Leistung zu erhöhen bzw. härtere Testszenarien zu finden. Wir erkennen dieses Muster auch in einem generelleren Kontext als formale Methode in der Softwareentwicklung. Wir entdecken mehrere Verbindungen der besprochenen Phänomene zu Forschungsgebieten wie natural computing, quantum computing oder künstlicher Intelligenz, welche die zukünftige Forschung in den kombinierten Forschungsgebieten prägen könnten

    Machine learning into metaheuristics: A survey and taxonomy of data-driven metaheuristics

    Get PDF
    During the last years, research in applying machine learning (ML) to design efficient, effective and robust metaheuristics became increasingly popular. Many of those data driven metaheuristics have generated high quality results and represent state-of-the-art optimization algorithms. Although various appproaches have been proposed, there is a lack of a comprehensive survey and taxonomy on this research topic. In this paper we will investigate different opportunities for using ML into metaheuristics. We define uniformly the various ways synergies which might be achieved. A detailed taxonomy is proposed according to the concerned search component: target optimization problem, low-level and high-level components of metaheuristics. Our goal is also to motivate researchers in optimization to include ideas from ML into metaheuristics. We identify some open research issues in this topic which needs further in-depth investigations

    A Data-Driven Evolutionary Transfer Optimization for Expensive Problems in Dynamic Environments

    Full text link
    Many real-world problems are usually computationally costly and the objective functions evolve over time. Data-driven, a.k.a. surrogate-assisted, evolutionary optimization has been recognized as an effective approach for tackling expensive black-box optimization problems in a static environment whereas it has rarely been studied under dynamic environments. This paper proposes a simple but effective transfer learning framework to empower data-driven evolutionary optimization to solve dynamic optimization problems. Specifically, it applies a hierarchical multi-output Gaussian process to capture the correlation between data collected from different time steps with a linearly increased number of hyperparameters. Furthermore, an adaptive source task selection along with a bespoke warm staring initialization mechanisms are proposed to better leverage the knowledge extracted from previous optimization exercises. By doing so, the data-driven evolutionary optimization can jump start the optimization in the new environment with a strictly limited computational budget. Experiments on synthetic benchmark test problems and a real-world case study demonstrate the effectiveness of our proposed algorithm against nine state-of-the-art peer algorithms

    Evolutionary computation for expensive optimization: a survey

    Get PDF
    Expensive optimization problem (EOP) widely exists in various significant real-world applications. However, EOP requires expensive or even unaffordable costs for evaluating candidate solutions, which is expensive for the algorithm to find a satisfactory solution. Moreover, due to the fast-growing application demands in the economy and society, such as the emergence of the smart cities, the internet of things, and the big data era, solving EOP more efficiently has become increasingly essential in various fields, which poses great challenges on the problem-solving ability of optimization approach for EOP. Among various optimization approaches, evolutionary computation (EC) is a promising global optimization tool widely used for solving EOP efficiently in the past decades. Given the fruitful advancements of EC for EOP, it is essential to review these advancements in order to synthesize and give previous research experiences and references to aid the development of relevant research fields and real-world applications. Motivated by this, this paper aims to provide a comprehensive survey to show why and how EC can solve EOP efficiently. For this aim, this paper firstly analyzes the total optimization cost of EC in solving EOP. Then, based on the analysis, three promising research directions are pointed out for solving EOP, which are problem approximation and substitution, algorithm design and enhancement, and parallel and distributed computation. Note that, to the best of our knowledge, this paper is the first that outlines the possible directions for efficiently solving EOP by analyzing the total expensive cost. Based on this, existing works are reviewed comprehensively via a taxonomy with four parts, including the above three research directions and the real-world application part. Moreover, some future research directions are also discussed in this paper. It is believed that such a survey can attract attention, encourage discussions, and stimulate new EC research ideas for solving EOP and related real-world applications more efficiently
    • …
    corecore