
    Fuzzy and tile coding approximation techniques for coevolution in reinforcement learning

    This thesis investigates reinforcement learning algorithms suitable for learning in large state space problems and coevolution. In order to learn in large state spaces, the state space must be collapsed to a computationally feasible size and then generalised over. This thesis presents two new implementations of the classic temporal difference (TD) reinforcement learning algorithm Sarsa that utilise fuzzy logic principles for approximation, FQ Sarsa and Fuzzy Sarsa. The effectiveness of these two fuzzy reinforcement learning algorithms is investigated in the context of an agent marketplace. The thesis presents a practical investigation into the design of fuzzy membership functions and tile coding schemas. A critical comparison of the fuzzy algorithms with a related function approximation technique, a coarse coding approach called tile coding, is given in the context of three different simulation environments: the mountain-car problem, a predator/prey gridworld and an agent marketplace. A further comparison between Fuzzy Sarsa and tile coding in the context of the nonstationary environments of the agent marketplace and predator/prey gridworld is presented. This thesis shows that the Fuzzy Sarsa algorithm achieves a significant reduction of state space over traditional Sarsa, without the loss of finer detail that the FQ Sarsa algorithm experiences. It also shows that Fuzzy Sarsa and gradient descent Sarsa(λ) with tile coding learn similar levels of distinction against a stationary strategy. Finally, this thesis demonstrates that Fuzzy Sarsa performs better in a competitive multiagent domain than the tile coding solution.
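    A minimal, illustrative sketch of the baseline technique named above — linear Sarsa with tile coding over a two-dimensional continuous state — may clarify the mechanics. This is not the thesis's Fuzzy Sarsa or FQ Sarsa; the default bounds correspond to the mountain-car task, and every name and parameter here is an assumption.

```python
import numpy as np

class TileCoder:
    """Grid tile coder: maps a 2-D continuous state to one active tile
    per tiling, where each tiling is a uniformly offset copy of the grid."""
    def __init__(self, n_tilings=8, n_tiles=8,
                 low=(-1.2, -0.07), high=(0.6, 0.07)):  # mountain-car bounds
        self.n_tilings, self.n_tiles = n_tilings, n_tiles
        self.low = np.asarray(low, float)
        self.tile_width = (np.asarray(high, float) - self.low) / n_tiles

    def n_features(self):
        return self.n_tilings * (self.n_tiles + 1) ** 2

    def active_tiles(self, state):
        s = (np.asarray(state, float) - self.low) / self.tile_width
        tiles = []
        for t in range(self.n_tilings):
            idx = np.floor(s + t / self.n_tilings).astype(int)  # offset grid
            idx = np.clip(idx, 0, self.n_tiles)
            tiles.append(t * (self.n_tiles + 1) ** 2
                         + idx[0] * (self.n_tiles + 1) + idx[1])
        return tiles

def sarsa_update(w, coder, s, a, r, s_next, a_next, alpha=0.1, gamma=1.0):
    """One linear Sarsa update on binary tile features; w is (n_actions, n_features)."""
    q = sum(w[a, f] for f in coder.active_tiles(s))
    q_next = sum(w[a_next, f] for f in coder.active_tiles(s_next))
    delta = r + gamma * q_next - q
    for f in coder.active_tiles(s):
        w[a, f] += (alpha / coder.n_tilings) * delta  # step size scaled per tiling

# Illustrative setup: three actions, zero-initialized weights.
coder = TileCoder()
w = np.zeros((3, coder.n_features()))
```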

    Hybrid approaches for mobile robot navigation

    The work described in this thesis contributes to the efficient solution of mobile robot navigation problems. A series of new evolutionary approaches is presented. Two novel evolutionary planners have been developed that reduce the computational overhead in generating plans of mobile robot movements. In comparison with the best-performing evolutionary scheme reported in the literature, the first of the planners significantly reduces the plan calculation time in static environments. The second planner was able to generate avoidance strategies in response to unexpected events arising from the presence of moving obstacles. To overcome limitations in responsiveness and the unrealistic assumptions regarding a priori knowledge that are inherent in planner-based navigation systems, subsequent work concentrated on hybrid approaches. These included a reactive component to rapidly and autonomously identify environmental features, which were represented by a small number of critical waypoints. Not only is memory usage dramatically reduced by such a simplified representation, but the calculation time to determine new plans is also significantly reduced. Further significant enhancements of this work were, firstly, dynamic avoidance to limit the likelihood of potential collisions with moving obstacles and, secondly, exploration to identify statistically the dynamic characteristics of the environment. Finally, by retaining more extensive environmental knowledge gained during previous navigation activities, the capability of the hybrid navigation system was enhanced to allow planning to be performed for any start point and goal point.
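    As a sketch of the general approach (not a reconstruction of the thesis's planners), an evolutionary planner can encode a plan as a fixed-length list of waypoints and score it by path length plus a collision penalty. The obstacle model, operators and parameters below are all assumptions.

```python
import math
import random

def plan_cost(path, obstacles, clearance=0.5):
    """Lower is better: path length plus a heavy penalty per near-collision."""
    length = sum(math.dist(path[i], path[i + 1]) for i in range(len(path) - 1))
    penalty = sum(100.0 for p in path for o in obstacles
                  if math.dist(p, o) < clearance)
    return length + penalty

def evolve_plan(start, goal, obstacles, pop_size=50, n_way=5, gens=200):
    """Fixed-length waypoint plans, one-point crossover, Gaussian mutation."""
    rand_pt = lambda: (random.uniform(0, 10), random.uniform(0, 10))
    pop = [[start] + [rand_pt() for _ in range(n_way)] + [goal]
           for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda p: plan_cost(p, obstacles))
        survivors = pop[:pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randint(1, n_way)            # keep endpoints fixed
            child = a[:cut] + b[cut:]
            i = random.randint(1, n_way)              # mutate one waypoint
            child[i] = (child[i][0] + random.gauss(0, 0.5),
                        child[i][1] + random.gauss(0, 0.5))
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda p: plan_cost(p, obstacles))

best = evolve_plan((0, 0), (10, 10), obstacles=[(5, 5), (6, 4)])  # toy world
```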

    Self-adaptive fitness in evolutionary processes

    Most optimization algorithms or methods in artificial intelligence can be regarded as evolutionary processes. They start from (basically) random guesses and produce increasingly better results with respect to a given target function, which is defined by the process's designer. The value of the achieved results is communicated to the evolutionary process via a fitness function that is usually somewhat correlated with the target function but does not need to be exactly the same. When the values of the fitness function change purely for reasons intrinsic to the evolutionary process, i.e., even though the externally motivated goals (as represented by the target function) remain constant, we call that phenomenon self-adaptive fitness. We trace the phenomenon of self-adaptive fitness back to emergent goals in artificial chemistry systems, for which we develop a new variant based on neural networks. We perform an in-depth analysis of diversity-aware evolutionary algorithms as a prime example of how to effectively integrate self-adaptive fitness into evolutionary processes. We sketch the concept of productive fitness as a new tool to reason about the intrinsic goals of evolution. We introduce the pattern of scenario co-evolution, which we apply to a reinforcement learning agent competing against an evolutionary algorithm to improve performance and generate hard test cases, and which we also consider as a more general pattern for software engineering based on a solid formal framework. Multiple connections to related topics in natural computing, quantum computing and artificial intelligence are discovered and may shape future research in the combined fields.
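    One standard mechanism that yields exactly this effect — the fitness the process sees shifting while the externally defined target stays constant — is fitness sharing in a diversity-aware evolutionary algorithm. The sketch below is illustrative, not the dissertation's formal framework; target() and all parameters are assumptions.

```python
import math
import random

def target(x):
    """The externally defined target function; it never changes."""
    return math.exp(-(x - 3.0) ** 2)

def shared_fitness(x, population, sigma=1.0):
    """Fitness sharing: target value divided by niche crowding. The value
    the process optimizes shifts as the population moves, even though
    target() stays constant -- a simple self-adaptive fitness effect."""
    niche = sum(max(0.0, 1.0 - abs(x - y) / sigma) for y in population)
    return target(x) / max(niche, 1e-9)

# Tiny (mu + lambda)-style loop selecting on the shared fitness.
pop = [random.uniform(-5.0, 5.0) for _ in range(20)]
for _ in range(100):
    snapshot = list(pop)  # freeze the niche landscape for this generation
    pop.sort(key=lambda x: shared_fitness(x, snapshot), reverse=True)
    pop = pop[:10] + [p + random.gauss(0.0, 0.3) for p in pop[:10]]
```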

    Evolutionary Computation 2020

    Intelligent optimization is based on the mechanisms of computational intelligence: refining a suitable feature model, designing an effective optimization algorithm, and then obtaining an optimal or satisfactory solution to a complex problem. Intelligent algorithms are key tools for ensuring global optimization quality, fast optimization efficiency and robust optimization performance. Intelligent optimization algorithms have been studied by many researchers, leading to improvements in the performance of algorithms such as the evolutionary algorithm, whale optimization algorithm, differential evolution algorithm, and particle swarm optimization. Studies in this arena have also resulted in breakthroughs in solving complex problems, including the green shop scheduling problem, the severe nonlinear problem in one-dimensional geodesic electromagnetic inversion, error and bug finding in software, the 0-1 knapsack problem, the traveling salesman problem, and the logistics distribution center siting problem. The editors are confident that this book can open a new avenue for further improvement and discoveries in the area of intelligent algorithms. The book is a valuable resource for researchers interested in understanding the principles and design of intelligent algorithms.
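    For readers who want a concrete anchor for the algorithm family named above, the following is a minimal differential evolution (DE/rand/1/bin) sketch minimizing a toy sphere function; it is not any particular variant from the book, and the parameters are conventional defaults.

```python
import random

def differential_evolution(f, bounds, pop_size=20, F=0.8, CR=0.9, gens=100):
    """Minimal DE/rand/1/bin for minimizing f over box bounds
    (bound handling after mutation is omitted for brevity)."""
    dim = len(bounds)
    pop = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(gens):
        for i in range(pop_size):
            a, b, c = random.sample([p for j, p in enumerate(pop) if j != i], 3)
            j_rand = random.randrange(dim)   # ensure at least one mutated gene
            trial = [a[k] + F * (b[k] - c[k])
                     if (random.random() < CR or k == j_rand) else pop[i][k]
                     for k in range(dim)]
            if f(trial) <= f(pop[i]):        # greedy one-to-one replacement
                pop[i] = trial
    return min(pop, key=f)

best = differential_evolution(lambda x: sum(v * v for v in x), [(-5, 5)] * 3)
```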

    Evolution of Control Programs for a Swarm of Autonomous Unmanned Aerial Vehicles

    Unmanned aerial vehicles (UAVs) are rapidly becoming a critical military asset. In the future, advances in miniaturization are going to drive the development of insect-sized UAVs. New approaches to controlling these swarms are required. The goal of this research is to develop a controller to direct a swarm of UAVs in accomplishing a given mission. While previous efforts have largely been limited to a two-dimensional model, a three-dimensional model has been developed for this project. Models of UAV capabilities including sensors, actuators and communications are presented. Genetic programming uses the principles of Darwinian evolution to generate computer programs to solve problems. A genetic programming approach is used to evolve control programs for UAV swarms. Evolved controllers are compared with a hand-crafted solution using quantitative and qualitative methods. Visualization and statistical methods are used to analyze solutions. Results indicate that genetic programming is capable of producing effective solutions to multi-objective control problems.
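    A heavily simplified, mutation-only genetic programming sketch may make the approach concrete: expression trees over illustrative sensor terminals are evolved against a toy control target that stands in for the swarm mission objective. The research itself would use subtree crossover and a far richer primitive set; every name below is an assumption.

```python
import operator
import random

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}
TERMINALS = ['dx', 'dy', 'dz', 1.0]  # illustrative sensor inputs: target offsets

def random_tree(depth=3):
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    return (random.choice(list(OPS)), random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, env):
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    return env[tree] if isinstance(tree, str) else tree  # variable or constant

def mutate(tree, p=0.1):
    if random.random() < p or not isinstance(tree, tuple):
        return random_tree(2)                # replace subtree with a fresh one
    op, left, right = tree
    return (op, mutate(left, p), mutate(right, p))

def error(tree, cases):
    """Toy stand-in objective: the program should output dx + 0.5 * dy."""
    return sum(abs(evaluate(tree, env) - (env['dx'] + 0.5 * env['dy']))
               for env in cases)

cases = [{'dx': random.uniform(-1, 1), 'dy': random.uniform(-1, 1), 'dz': 0.0}
         for _ in range(20)]
pop = [random_tree() for _ in range(100)]
for _ in range(50):
    pop.sort(key=lambda t: error(t, cases))  # lower error first
    pop = pop[:50] + [mutate(random.choice(pop[:50])) for _ in range(50)]
```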

    Using MapReduce Streaming for Distributed Life Simulation on the Cloud

    Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway's Life according to a general MR streaming pattern. We chose Life because it is simple enough as a testbed for MR's applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms' performance on Amazon's Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
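    A minimal sketch of how one Life generation can be phrased as a Hadoop Streaming map/reduce pair over a sparse list of live cells. This is illustrative rather than the authors' optimized algorithm; in particular, the strip-partitioning optimization evaluated above is omitted.

```python
#!/usr/bin/env python3
# mapper.py -- reads one live cell "x y" per line; emits a liveness marker
# for the cell itself and one neighbor vote for each of its 8 neighbors.
import sys

for line in sys.stdin:
    x, y = map(int, line.split())
    print(f"{x},{y}\tS")                        # S: this cell is currently alive
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx or dy:
                print(f"{x + dx},{y + dy}\t1")  # one neighbor vote
```

    The reducer sees each cell's records grouped (Hadoop sorts by key) and applies the standard B3/S23 rule:

```python
#!/usr/bin/env python3
# reducer.py -- sums neighbor votes per cell and emits next-generation cells.
import sys
from itertools import groupby

def parse(line):
    key, val = line.rstrip("\n").split("\t")
    return key, val

for key, group in groupby(map(parse, sys.stdin), key=lambda kv: kv[0]):
    vals = [v for _, v in group]
    alive = "S" in vals
    neighbors = sum(1 for v in vals if v == "1")
    if neighbors == 3 or (alive and neighbors == 2):  # birth on 3, survive on 2/3
        print(key.replace(",", " "))
```

    Each generation is then one streaming job, e.g. hadoop jar hadoop-streaming.jar -mapper mapper.py -reducer reducer.py -input gen0 -output gen1 (paths illustrative), with the job output fed back in as the next generation's input.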

    Perspectives on adaptive dynamical systems

    Adaptivity is a dynamical feature that is omnipresent in nature, socio-economics, and technology. For example, adaptive couplings appear in various real-world systems like the power grid, social, and neural networks, and they form the backbone of closed-loop control strategies and machine learning algorithms. In this article, we provide an interdisciplinary perspective on adaptive systems. We reflect on the notion and terminology of adaptivity in different disciplines and discuss which role adaptivity plays for various fields. We highlight common open challenges and give perspectives on future research directions, looking to inspire interdisciplinary approaches.
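    As one concrete instance of the adaptive couplings mentioned above, consider a Kuramoto oscillator network with a Hebbian-like plasticity rule in which each coupling slowly relaxes toward the phase coherence of its pair; the rule and all parameters below are illustrative and not taken from the article.

```python
import numpy as np

def adaptive_kuramoto(n=20, eps=0.05, dt=0.01, steps=20000, seed=0):
    """Kuramoto phases with slowly adapting couplings: links between
    oscillators that stay in phase strengthen (cos -> 1), others decay."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, n)   # oscillator phases
    omega = rng.normal(0.0, 0.5, n)            # natural frequencies
    k = rng.uniform(0.0, 1.0, (n, n))          # coupling matrix
    for _ in range(steps):
        diff = theta[None, :] - theta[:, None]      # theta_j - theta_i
        theta = theta + dt * (omega + (k * np.sin(diff)).mean(axis=1))
        k = k + dt * eps * (np.cos(diff) - k)       # Hebbian-like adaptation
    return theta, k
```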

    Learning shepherding behavior

    Artificial shepherding strategies, i.e. using robots to move individuals to given locations, have many applications. For example, people can be guided by mobile robots from dangerous places, or swimming robots may help to clean up oil spills. This thesis uses a multiagent system to model the robots and sheep. We analyze the complexity of the shepherding task and present a greedy algorithm that only needs linear time to compute a solution that is proven to be close to optimal. Additionally, we analyze to what extent such strategies can be learned, as learning usually provides powerful solutions. This thesis focuses on reinforcement learning (RL) as the learning method. To enable RL agents to use their knowledge efficiently in continuous or large state spaces (as, e.g., in the shepherding task), methods to transfer knowledge to unseen but similar situations are required. The approaches developed in this thesis, GNG-Q and I-GNG-Q, combine RL with adaptive neural algorithms and enable the agent to learn behavior in parallel with its representation. Both are based upon the growing neural gas, an unsupervised learning approach that learns a vector quantization by placing and adjusting units in the input space. GNG-Q groups states that are spatially close and share the same behavior, while I-GNG-Q combines the learned behavior from a larger area of the approximation, which results in smoother value functions. Thus, GNG-Q performs a state-space abstraction and I-GNG-Q approximates the value function. Both methods monitor the agent's policy during learning to find regions of the approximation that have to be refined. Amongst many others, the core advantages of our approaches are that they do not need a model of the environment and that the resolution of the approximation is determined automatically. The experimental evaluation underlines that the behaviors learned using our approaches are highly efficient and effective.
    Michael Baumann. Date of defense: 22.01.2016. Fakultät für Elektrotechnik, Informatik und Mathematik, Universität Paderborn, Dissertation, 2016.
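    For readers unfamiliar with the underlying quantizer, below is a minimal growing neural gas sketch in the spirit of Fritzke's algorithm; GNG-Q and I-GNG-Q extend it with the policy monitoring and value-function handling described above, which are not reproduced here. Unit removal is omitted for brevity, and every parameter is illustrative.

```python
import numpy as np

class GrowingNeuralGas:
    """Minimal GNG: moves units toward samples, connects the two nearest
    units, and inserts a new unit where accumulated error is largest.
    A GNG-Q-style agent could treat each unit as one abstract state."""
    def __init__(self, dim, eps_b=0.05, eps_n=0.005, max_age=50,
                 insert_every=100, alpha=0.5, decay=0.995):
        self.units = [np.random.rand(dim), np.random.rand(dim)]
        self.error = [0.0, 0.0]
        self.edges = {}  # (i, j) with i < j -> age
        self.eps_b, self.eps_n, self.max_age = eps_b, eps_n, max_age
        self.insert_every, self.alpha, self.decay = insert_every, alpha, decay
        self.t = 0

    def adapt(self, x):
        x = np.asarray(x, float)
        dists = [np.linalg.norm(x - u) for u in self.units]
        s1, s2 = (int(i) for i in np.argsort(dists)[:2])
        self.error[s1] += dists[s1] ** 2
        self.units[s1] += self.eps_b * (x - self.units[s1])   # move winner
        for (i, j) in list(self.edges):
            if s1 in (i, j):
                self.edges[(i, j)] += 1                       # age incident edges
                other = j if i == s1 else i
                self.units[other] += self.eps_n * (x - self.units[other])
        self.edges[(min(s1, s2), max(s1, s2))] = 0            # refresh winner edge
        self.edges = {e: a for e, a in self.edges.items() if a <= self.max_age}
        self.t += 1
        if self.t % self.insert_every == 0:
            self._insert()
        self.error = [e * self.decay for e in self.error]
        return s1                                             # abstract state index

    def _insert(self):
        q = int(np.argmax(self.error))
        nbrs = [j if i == q else i for (i, j) in self.edges if q in (i, j)]
        f = max(nbrs, key=lambda n: self.error[n]) if nbrs else q
        self.error[q] *= self.alpha
        if f != q:
            self.error[f] *= self.alpha
        self.units.append((self.units[q] + self.units[f]) / 2.0)
        self.error.append(self.error[q])
        new = len(self.units) - 1
        self.edges.pop((min(q, f), max(q, f)), None)
        self.edges[(min(q, new), max(q, new))] = 0
        if f != q:
            self.edges[(min(f, new), max(f, new))] = 0
```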

    BNAIC 2008: Proceedings of BNAIC 2008, the Twentieth Belgian-Dutch Artificial Intelligence Conference
