1,624 research outputs found

    Adaptive dynamic programming with eligibility traces and complexity reduction of high-dimensional systems

    This dissertation investigates the application of a variety of computational intelligence techniques, particularly clustering and adaptive dynamic programming (ADP) designs, especially heuristic dynamic programming (HDP) and dual heuristic programming (DHP). Moreover, one-step temporal-difference (TD(0)) and n-step TD (TD(λ)) learning, together with their gradients, are utilized as learning algorithms to train and adapt online the families of ADP. The dissertation is organized into seven papers. The first paper demonstrates the robustness of model order reduction (MOR) for simulating complex dynamical systems. Agglomerative hierarchical clustering based on performance evaluation is introduced for MOR. This method computes the reduced-order denominator of the transfer function by clustering system poles in a hierarchical dendrogram. Several numerical examples of reduction techniques are taken from the literature for comparison with our work. In the second paper, HDP is combined with the Dyna algorithm for path planning. The third paper uses DHP with an eligibility trace parameter (λ) to track a reference trajectory under uncertainties for a nonholonomic mobile robot, using a first-order Sugeno fuzzy neural network structure for the critic and actor networks. In the fourth and fifth papers, a stability analysis for a model-free action-dependent HDP(λ) is demonstrated with batch and online learning implementations, respectively. The sixth work combines two different gradient prediction levels of critic networks and provides convergence proofs. The seventh paper develops two hybrid recurrent fuzzy neural network structures for both critic and actor networks. They use a novel n-step gradient temporal-difference (gradient of TD(λ)) of an advanced ADP algorithm called value-gradient learning (VGL(λ)), and convergence proofs are given. Furthermore, the seventh paper is the first to combine the single network adaptive critic with VGL(λ). --Abstract, page iv
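
    To illustrate how the eligibility trace parameter λ mentioned in this abstract enters a temporal-difference update, here is a minimal tabular sketch. It is not the dissertation's method, which trains neural critic and actor networks; the function name `td_lambda_episode`, the episode format, and the default step sizes are assumptions made only for this example.

```python
def td_lambda_episode(values, episode, alpha=0.1, gamma=1.0, lam=0.8):
    """Tabular TD(lambda) with accumulating traces (illustrative sketch).

    values  : list of state-value estimates, updated in place
    episode : list of (state, reward, next_state); next_state is None
              when the transition ends the episode
    """
    traces = [0.0] * len(values)
    for state, reward, next_state in episode:
        next_value = 0.0 if next_state is None else values[next_state]
        # One-step TD error, computed from the current estimates.
        td_error = reward + gamma * next_value - values[state]
        # Decay every trace, then mark the state just visited.
        traces = [gamma * lam * e for e in traces]
        traces[state] += 1.0
        # Every state is updated in proportion to its trace.
        for s in range(len(values)):
            values[s] += alpha * td_error * traces[s]
    return values


# Two-state chain: 0 -> 1 -> terminal, with reward 1 on the last step.
chain = [(0, 0.0, 1), (1, 1.0, None)]
print(td_lambda_episode([0.0, 0.0], chain, alpha=0.5, lam=0.0))  # TD(0)
print(td_lambda_episode([0.0, 0.0], chain, alpha=0.5, lam=1.0))
```

    With `lam=0.0` the update reduces to TD(0) and only the state just visited changes; with larger `lam`, the final reward is also credited to earlier states in the same pass.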

    Self-Learning Hot Data Prediction: Where Echo State Network Meets NAND Flash Memories

    © 2019 IEEE. A good understanding of the access behavior of hot data is important for NAND flash memory because of its crucial impact on the efficiency of garbage collection (GC) and wear leveling (WL), which respectively dominate the performance and life span of an SSD. Generally, both GC and WL rely greatly on the recognition accuracy of hot data identification (HDI). In this paper, however, we propose for the first time a novel concept of hot data prediction (HDP), under which conventional HDI becomes unnecessary. First, we develop a hybrid optimized echo state network (HOESN), in which sufficiently unbiased and continuously shrunk output weights are learnt by a sparse regression based on L2 and L1/2 regularization. Second, quantum-behaved particle swarm optimization (QPSO) is employed to compute the reservoir parameters (i.e., global scaling factor, reservoir size, scaling coefficient and sparsity degree) to further improve prediction accuracy and reliability. Third, in a test on a chaotic benchmark (Rössler), the HOESN performs better than six recent state-of-the-art methods. Finally, simulation results for six typical metrics on five real disk workloads, together with on-chip experimental results from an actual SSD prototype, indicate that our HOESN-based HDP can reliably improve the access performance and endurance of NAND flash memories. Peer reviewed.

    The Importance of Clipping in Neurocontrol by Direct Gradient Descent on the Cost-to-Go Function and in Adaptive Dynamic Programming

    In adaptive dynamic programming, neurocontrol and reinforcement learning, the objective is for an agent to learn to choose actions so as to minimise a total cost function. In this paper we show that when discretized time is used to model the motion of the agent, it can be very important to do "clipping" on the motion of the agent in the final time step of the trajectory. By clipping we mean that the final time step of the trajectory is truncated so that the agent stops exactly at the first terminal state reached, and goes no further. We demonstrate that when clipping is omitted, learning performance can fail to reach the optimum, and that when clipping is done properly, learning performance can improve significantly. The clipping problem we describe affects algorithms which use explicit derivatives of the model functions of the environment to calculate a learning gradient. These include Backpropagation Through Time for Control, and methods based on Dual Heuristic Dynamic Programming. However, the clipping problem does not significantly affect methods based on Heuristic Dynamic Programming, Temporal Differences or Policy Gradient Learning algorithms. Similarly, the clipping problem does not affect fixed-length finite-horizon problems.
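
    The clipping idea described in this abstract can be sketched on a toy problem. The example below is an illustration under stated assumptions, not the paper's implementation: a one-dimensional agent moves at a constant positive speed `v` toward a terminal line at `target`, and the cost is the elapsed time. The names `step` and `rollout` and all parameters are hypothetical.

```python
def step(x, v, dt, target, clip=True):
    """Advance the agent by one time step; return (new_x, time_used, done)."""
    dx = v * dt
    new_x = x + dx
    if new_x < target:
        return new_x, dt, False
    if clip and dx > 0:
        # Truncate the final step so the agent stops exactly on the target.
        frac = (target - x) / dx
        return target, frac * dt, True
    # No clipping: the agent overshoots and is charged for the full step.
    return new_x, dt, True


def rollout(x0, v, dt, target, clip=True, max_steps=1000):
    """Run until the terminal line is reached; return (final_x, total_time)."""
    x, total_time = x0, 0.0
    for _ in range(max_steps):
        x, used, done = step(x, v, dt, target, clip)
        total_time += used
        if done:
            break
    return x, total_time


print(rollout(0.0, 0.75, 1.0, 2.0, clip=True))   # stops at x = 2.0
print(rollout(0.0, 0.75, 1.0, 2.0, clip=False))  # overshoots to x = 2.25
```

    In this toy setting the unclipped total time is 3.0 for both `v = 0.7` and `v = 0.8`, so the cost is locally flat in `v` and gives a derivative-based learner no signal, whereas the clipped total time equals `target / v` and varies smoothly with the speed.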

    Modelling the evaporation of thin films of colloidal suspensions using Dynamical Density Functional Theory

    Recent experiments have shown that various structures may be formed during the evaporative dewetting of thin films of colloidal suspensions. Nano-particle deposits of strongly branched 'flower-like', labyrinthine and network structures are observed. They are caused by the different transport processes and the rich phase behaviour of the system. We develop a model for the system, based on a dynamical density functional theory, which reproduces these structures. The model is employed to determine the influences of the solvent evaporation and of the diffusion of the colloidal particles and of the liquid over the surface. Finally, we investigate the conditions needed for 'liquid-particle' phase separation to occur and discuss its effect on the self-organised nano-structures.

    Advanced control techniques for modern inertia based inverters

    “In this research, three artificial intelligence (AI)-based techniques are proposed to regulate the voltage and frequency of a grid-connected inverter. The increase in the penetration of renewable energy sources (RESs) into the power grid has led to an increase in the penetration of fast-responding, inertia-less power converters. The growing share of these power electronics converters changes the nature of the conventional grid, in which the kinetic inertia of the rotating parts of the enormous generators plays a vital role. The virtual inertia control scheme is proposed to make the behavior of grid-connected inverters more similar to that of synchronous generators by mimicking the mechanical behavior of a synchronous generator. Conventional control techniques fail to perform optimally in nonlinear, uncertain, inaccurate power grids. Besides, the decoupled control assumption in conventional virtual synchronous generators (VSGs) makes them nonoptimal in resistive grids. The neural network predictive controller, the heuristic dynamic programming, and the dual heuristic dynamic programming techniques are presented in this research to overcome the drawbacks of conventional VSGs. The nonlinear characteristics of neural networks and the online training enable the proposed methods to perform as robust and optimal controllers. Simulation results and experimental results from a laboratory prototype are provided to demonstrate the effectiveness of the proposed techniques”--Abstract, page iv

    A Heuristic Dynamic Programming Based Power System Stabilizer for a Turbogenerator in a Single Machine Power System

    Power system stabilizers (PSS) are used to generate supplementary control signals for the excitation system in order to damp low-frequency power system oscillations. To overcome the drawbacks of the conventional PSS (CPSS), numerous techniques have been proposed in the literature. Based on an analysis of existing techniques, a novel PSS design based on heuristic dynamic programming (HDP) is proposed in this paper. HDP, which combines the concepts of dynamic programming and reinforcement learning, is used in the design of a nonlinear optimal power system stabilizer. The proposed HDP-based PSS is evaluated against the conventional power system stabilizer and an indirect-adaptive-neurocontrol-based PSS under small and large disturbances in a single-machine infinite-bus power system setup. Results are presented to show the effectiveness of this new technique.

    A Heuristic-Dynamic-programming-Based Power System Stabilizer for a Turbogenerator in a Single-Machine Power System

    Power system stabilizers (PSSs) are used to generate supplementary control signals for the excitation system in order to damp low-frequency power system oscillations. To overcome the drawbacks of a conventional PSS (CPSS), numerous techniques have been proposed in the literature. Based on an analysis of existing techniques, a novel design based on heuristic dynamic programming (HDP) is presented in this paper. HDP, combining the concepts of dynamic programming and reinforcement learning, is used in the design of a nonlinear optimal power system stabilizer. Results show the effectiveness of this new technique. The performance of the HDP-based PSS is compared with the CPSS and the indirect-adaptive-neurocontrol-based PSS under small and large disturbances. In addition, the impact of different discount factors on the HDP PSS's performance is presented.

    Optimal Dynamic Neurocontrol of a Gate-Controlled Series Capacitor in a Multi-Machine Power System

    This paper presents the design of an optimal dynamic neurocontroller for a new type of FACTS device, the gate-controlled series capacitor (GCSC), incorporated in a multi-machine power system. The optimal neurocontroller is developed based on the heuristic dynamic programming (HDP) approach. In addition, a dynamic identifier/model and controller structure using a recurrent neural network trained with backpropagation through time (BPTT) is employed. Simulation results are presented to show the effectiveness of the dynamic neurocontroller, and its performance is compared with that of a conventional PI controller under small and large disturbances.