
    Adaptive dynamic programming with eligibility traces and complexity reduction of high-dimensional systems

    This dissertation investigates the application of a variety of computational intelligence techniques, particularly clustering and adaptive dynamic programming (ADP) designs, especially heuristic dynamic programming (HDP) and dual heuristic programming (DHP). Moreover, one-step temporal-difference (TD(0)) and n-step TD (TD(λ)) learning, along with their gradients, are used to train and online-adapt the families of ADP. The dissertation is organized into seven papers. The first paper demonstrates the robustness of model order reduction (MOR) for simulating complex dynamical systems. Agglomerative hierarchical clustering based on performance evaluation is introduced for MOR. This method computes the reduced-order denominator of the transfer function by clustering system poles in a hierarchical dendrogram. Several reduction techniques from the literature are used as numerical examples for comparison with our work. In the second paper, HDP is combined with the Dyna algorithm for path planning. The third paper uses DHP with an eligibility-trace parameter (λ) to track a reference trajectory under uncertainties for a nonholonomic mobile robot, using a first-order Sugeno fuzzy neural network structure for the critic and actor networks. The fourth and fifth papers present a stability analysis for a model-free action-dependent HDP(λ) with batch and online learning implementations, respectively. The sixth work combines two different gradient-prediction levels of critic networks, and a convergence proof is provided. The seventh paper develops two hybrid recurrent fuzzy neural network structures for the critic and actor networks. They use a novel n-step gradient temporal difference (the gradient of TD(λ)) in an advanced ADP algorithm called value-gradient learning (VGL(λ)), and convergence proofs are given. Furthermore, the seventh paper is the first to combine the single network adaptive critic with VGL(λ). --Abstract, page iv
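    The TD(λ) learning rule named in this abstract can be sketched in a few lines. This is a minimal tabular illustration with accumulating eligibility traces, not the dissertation's ADP implementation; the 3-state chain, rewards, and hyperparameters below are hypothetical.

```python
import numpy as np

n_states = 3
V = np.zeros(n_states)            # state-value estimates
e = np.zeros(n_states)            # eligibility traces
alpha, gamma, lam = 0.1, 0.9, 0.8

def td_lambda_step(s, r, s_next, done):
    """One online TD(lambda) update for the observed transition."""
    target = r + (0.0 if done else gamma * V[s_next])
    delta = target - V[s]         # one-step TD error
    e[s] += 1.0                   # accumulate the trace of the visited state
    V[:] = V + alpha * delta * e  # credit all recently visited states
    e[:] = gamma * lam * e        # decay traces (lam = 0 recovers TD(0))
    if done:
        e[:] = 0.0

# Repeatedly walk the chain 0 -> 1 -> 2, with reward 1 on reaching the end.
for _ in range(500):
    td_lambda_step(0, 0.0, 1, False)
    td_lambda_step(1, 0.0, 2, False)
    td_lambda_step(2, 1.0, 2, True)
```

    Because the trace vector spreads each TD error back over recently visited states, the values converge toward the discounted returns (here roughly 0.81, 0.9, and 1.0) faster than TD(0) would on the same data.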

    Ball and Beam Control using Adaptive PID based on Q-Learning

    The ball and beam system is one of the most widely used systems for benchmarking controller response because of its nonlinear and unstable characteristics. Furthermore, in line with the growing availability of computational power and the intensity of artificial intelligence research, especially in the reinforcement learning field, many researchers are now working on learning-based approaches to control. Accordingly, in this paper an adaptive PID controller based on Q-learning (Q-PID) is used to control the ball position in the ball and beam system. The simulation results show that Q-PID outperforms the conventional PID and heuristic PID tuning techniques, with a swifter settling time and a lower overshoot percentage.
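    The Q-PID idea can be sketched roughly as follows: a tabular Q-learning agent nudges the PID gains online, rewarded by the negative squared tracking error. Everything here is a hypothetical stand-in for the paper's setup: the first-order plant, the error discretization, and the gain-step actions are toy assumptions, not the ball-and-beam model.

```python
import random

random.seed(0)

# Actions: small adjustments to (Kp, Ki, Kd); the last action keeps gains fixed.
ACTIONS = [(-0.1, 0, 0), (0.1, 0, 0), (0, -0.01, 0),
           (0, 0.01, 0), (0, 0, -0.05), (0, 0, 0.05), (0, 0, 0)]
Q = {}                                  # Q-table keyed by (state, action)
alpha, gamma, eps = 0.2, 0.9, 0.1

def state(err):
    """Discretize the tracking error into a small number of bins."""
    return max(-5, min(5, round(err * 5)))

def policy(s):
    if random.random() < eps:           # epsilon-greedy exploration
        return random.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: Q.get((s, a), 0.0))

def run_episode(Kp=1.0, Ki=0.0, Kd=0.0, setpoint=1.0):
    """Adapt the gains with Q-learning while controlling a toy plant."""
    y, integ, prev_err = 0.0, 0.0, setpoint
    for _ in range(100):
        err = setpoint - y
        s = state(err)
        a = policy(s)
        dKp, dKi, dKd = ACTIONS[a]
        Kp, Ki, Kd = max(0.0, Kp + dKp), max(0.0, Ki + dKi), max(0.0, Kd + dKd)
        integ += err
        u = Kp * err + Ki * integ + Kd * (err - prev_err)
        u = max(-5.0, min(5.0, u))      # actuator saturation keeps y bounded
        y += 0.1 * (u - y)              # first-order plant step
        prev_err = err
        r = -err * err                  # reward: negative squared error
        s2 = state(setpoint - y)
        best = max(Q.get((s2, b), 0.0) for b in range(len(ACTIONS)))
        Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (r + gamma * best - Q.get((s, a), 0.0))
    return abs(setpoint - y)

final_errors = [run_episode() for _ in range(30)]
```

    The key design choice is that the action space contains gain increments rather than control signals, so the PID structure is preserved while the agent only retunes it.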

    Reinforcement Learning Algorithms in Humanoid Robotics


    Creating Virtual Animals Through Machine Learning

    Approximately 42 percent of threatened or endangered species are at risk due to invasive species. Some invasive species reach new habitats by themselves during migrations; others are misplaced by humans, whether by mistake or by necessity. This project aims to create a virtual habitat populated by intelligent agents that represent the animals present in it. Programmers and scientists can add invasive species and simulate what might happen, allowing a more proactive response to this type of crisis. Different data-driven models are explored in order to find the best one for the problem at hand. Game engines are also discussed: they have improved greatly over the last decade, are accessible to everyone, and are reliable tools for building simple or complex prototypes with graphics that can be photorealistic.

    Adaptive Critic Design Based Neuro-Fuzzy Controller for a Static Compensator in a Multimachine Power System

    This paper presents a novel nonlinear optimal controller for a static compensator (STATCOM) connected to a power system, using artificial neural networks and fuzzy logic. Action-dependent heuristic dynamic programming, a member of the adaptive critic designs family, is used to design the STATCOM neuro-fuzzy controller. This neuro-fuzzy controller provides optimal control based on reinforcement learning and approximate dynamic programming. Using a proportional-integral approach, the proposed controller is capable of dealing with actual rather than deviation signals. The STATCOM is connected to a multimachine power system. Two multimachine systems are considered in this study: a 10-bus system and a 45-bus network (a section of the Brazilian power system). Simulation results show that the proposed controller outperforms a conventional PI controller for large-scale faults as well as small disturbances.

    Learning Agent for a Heat-Pump Thermostat With a Set-Back Strategy Using Model-Free Reinforcement Learning

    The conventional control paradigm for a heat pump with a less efficient auxiliary heating element is to keep its temperature set point constant during the day. This constant set point ensures that the heat pump operates in its more efficient heat-pump mode and minimizes the risk of activating the less efficient auxiliary heating element. As an alternative to a constant set-point strategy, this paper proposes a learning agent for a thermostat with a set-back strategy. This set-back strategy relaxes the set-point temperature during convenient moments, e.g. when the occupants are not at home. Finding an optimal set-back strategy requires solving a sequential decision-making problem under uncertainty, which presents two challenges. The first is that for most residential buildings a description of the thermal characteristics of the building is unavailable and challenging to obtain. The second is that the relevant state information, i.e. the state of the building envelope, cannot be measured directly by the learning agent. To overcome these two challenges, our paper proposes an auto-encoder coupled with a batch reinforcement learning technique. The proposed approach is validated for two building types with different thermal characteristics, for heating in the winter and cooling in the summer. The simulation results indicate that the proposed learning agent can reduce energy consumption by 4-9% during 100 winter days and by 9-11% during 80 summer days compared to the conventional constant set-point strategy.
    Comment: Submitted to Energies - MDPI.com
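    The batch-learning half of such an approach can be illustrated with fitted Q-iteration over a fixed set of transitions. This is only a toy sketch under stated assumptions: a 1-D "indoor temperature" state with made-up dynamics and costs, and a discretized table standing in for a learned regressor; the paper's auto-encoder state encoding is omitted entirely.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy thermostat: action 1 heats (+0.5 C, small energy cost), action 0
# relaxes (-0.5 C, free). Comfort target is 21 C. All numbers hypothetical.
def step(s, a):
    s2 = s + (0.5 if a == 1 else -0.5) + rng.normal(0.0, 0.05)
    r = -abs(s2 - 21.0) - (0.2 if a == 1 else 0.0)
    return s2, r

# Collect a fixed batch of random-policy transitions (s, a, r, s').
batch, s = [], 20.0
for _ in range(5000):
    a = int(rng.integers(0, 2))
    s2, r = step(s, a)
    batch.append((s, a, r, s2))
    s = s2 if 15.0 < s2 < 27.0 else 20.0

# Piecewise-constant Q-function over temperature bins (stand-in regressor).
bins = np.linspace(15.0, 27.0, 25)
def idx(x):
    return int(np.clip(np.digitize(x, bins), 0, len(bins) - 1))

Q, gamma = np.zeros((len(bins), 2)), 0.95
for _ in range(50):                     # fitted Q-iteration sweeps
    tot, cnt = np.zeros_like(Q), np.zeros_like(Q)
    for (s, a, r, s2) in batch:
        tot[idx(s), a] += r + gamma * Q[idx(s2)].max()
        cnt[idx(s), a] += 1
    Q = np.where(cnt > 0, tot / np.maximum(cnt, 1.0), Q)

greedy_cold = int(Q[idx(18.0)].argmax())   # heat when too cold
greedy_hot = int(Q[idx(24.0)].argmax())    # relax when too warm
```

    The point of the batch formulation is that all learning happens offline from logged transitions, which matches the setting where the building cannot be probed freely during operation.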

    Advances in Reinforcement Learning

    Reinforcement Learning (RL) is a very dynamic area in terms of theory and application. This book brings together many different aspects of current research on the several fields associated with RL, which has been growing rapidly and producing a wide variety of learning algorithms for different applications. Across 24 chapters, it covers a broad variety of topics in RL and their application in autonomous systems. A set of chapters provides a general overview of RL, while the other chapters focus mostly on applications of RL paradigms: Game Theory, Multi-Agent Theory, Robotics, Networking Technologies, Vehicular Navigation, Medicine, and Industrial Logistics.

    Optimal Neuro-Fuzzy External Controller for a STATCOM in the 12-Bus Benchmark Power System

    An optimal neuro-fuzzy external controller is designed in this paper for a static compensator (STATCOM) in the 12-bus benchmark power system. The controller provides an auxiliary reference signal for the STATCOM in such a way that it improves the damping of the rotor speed deviations of its neighboring generators. A Mamdani fuzzy rule base constitutes the core of the controller. A heuristic dynamic programming-based approach is used to further train the controller and enable it to provide nonlinear optimal control at different operating conditions of the power system. Simulation results indicate that the proposed neuro-fuzzy external controller is more effective than a linear external controller for damping out the speed deviations of the generators. In addition, the two controllers are compared in terms of the control effort each generates during various disturbances, and the proposed neuro-fuzzy controller proves more effective with a smaller control effort.

    Reinforcement Learning

    Brains rule the world, and brain-like computation is increasingly used in computers and electronic devices. Brain-like computation is about processing and interpreting data or directly putting forward and performing actions, and learning is a very important aspect of it. This book is on reinforcement learning, which involves performing actions to achieve a goal. The first 11 chapters of this book describe and extend the scope of reinforcement learning. The remaining 11 chapters show that there is already wide usage in numerous fields. Reinforcement learning can tackle control tasks that are too complex for traditional, hand-designed, non-learning controllers. As learning computers can deal with technical complexities, the task of human operators remains to specify goals at increasingly higher levels. This book shows that reinforcement learning is a very dynamic area in terms of theory and applications, and it should stimulate and encourage new research in this field.