5,096 research outputs found

    Reinforcement Learning using Augmented Neural Networks

    Neural networks allow Q-learning reinforcement learning agents such as deep Q-networks (DQN) to approximate complex mappings from state spaces to value functions. However, this also brings drawbacks compared to other function approximators such as tile coding or its generalisation, radial basis functions (RBF), because neural networks introduce instability as a side effect of their globalised updates. This instability does not vanish even in neural networks that have no hidden layers. In this paper, we show that simple modifications to the structure of the neural network can improve the stability of DQN learning when a multi-layer perceptron is used for function approximation. Comment: 7 pages; two columns; 4 figures
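
    The contrast between local and globalised updates mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes a hypothetical one-dimensional state, a linear approximator over Gaussian RBF features, and illustrative values for the centre grid, width, and learning rate. A single gradient step at one state changes the prediction there while leaving a distant state essentially untouched.

```python
import math

def rbf_features(x, centers, width):
    """Gaussian radial basis features of a scalar state x."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]

def predict(weights, feats):
    """Linear value estimate: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, feats))

def gradient_step(weights, feats, target, lr):
    """One gradient step on squared error for a linear approximator."""
    error = target - predict(weights, feats)
    return [w + lr * error * f for w, f in zip(weights, feats)]

# Hypothetical setup (assumption): 11 centres on [0, 1], narrow width.
centers = [i / 10.0 for i in range(11)]
width = 0.05
weights = [0.0] * len(centers)

near, far = 0.2, 0.9  # state that gets updated, and a distant state

# Update the estimate at `near` towards a target of 1.0.
weights = gradient_step(weights, rbf_features(near, centers, width),
                        target=1.0, lr=0.5)

value_near = predict(weights, rbf_features(near, centers, width))
value_far = predict(weights, rbf_features(far, centers, width))

print(f"value at updated state: {value_near:.3f}")
print(f"value at distant state: {value_far:.3e}")
```

    With a multi-layer perceptron the same step would adjust weights shared across all inputs, so the estimate at the distant state would generally move as well.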

    Evolutionary model type selection for global surrogate modeling

    Due to the scale and computational complexity of currently used simulation codes, global surrogate models (metamodels) have become indispensable tools for exploring and understanding the design space. Due to their compact formulation they are cheap to evaluate and thus readily facilitate visualization, design space exploration, rapid prototyping, and sensitivity analysis. They can also be used as accurate building blocks in design packages or larger simulation environments. Consequently, there is great interest in techniques that facilitate the construction of such approximation models while minimizing the computational cost and maximizing model accuracy. Many surrogate model types exist (Support Vector Machines, Kriging, Neural Networks, etc.) but no type is optimal in all circumstances. Nor is there any hard theory available that can help make this choice. In this paper we present an automatic approach to the model type selection problem. We describe an adaptive global surrogate modeling environment with adaptive sampling, driven by speciated evolution. Different model types are evolved cooperatively using a Genetic Algorithm (heterogeneous evolution) and compete to approximate the iteratively selected data. In this way the optimal model type and complexity for a given data set or simulation code can be dynamically determined. Its utility and performance are demonstrated on a number of problems where it outperforms traditional sequential execution of each model type.

    Design of Predictive Controllers by Dynamic Programming and Neural Networks

    This paper proposes a method for the design of predictive controllers for nonlinear systems. The method consists of two phases, a solution phase and a learning phase. In the solution phase, dynamic programming is applied to obtain a closed-loop control law. In the learning phase, neural networks are used to simulate the control law. This phase overcomes the curse of dimensionality problem that has often hindered the implementation of control laws generated by dynamic programming. Experimental results demonstrate the effectiveness of the method.

    Curses, Tradeoffs, and Scalable Management: Advancing Evolutionary Multiobjective Direct Policy Search to Improve Water Reservoir Operations

    Optimal management policies for water reservoir operation are generally designed via stochastic dynamic programming (SDP). Yet, the adoption of SDP in complex real-world problems is challenged by the three curses of dimensionality, modeling, and multiple objectives. These three curses considerably limit SDP’s practical application. Alternatively, this study focuses on the use of evolutionary multiobjective direct policy search (EMODPS), a simulation-based optimization approach that combines direct policy search, nonlinear approximating networks, and multiobjective evolutionary algorithms to design Pareto-approximate closed-loop operating policies for multipurpose water reservoirs. This analysis explores the technical and practical implications of using EMODPS through a careful diagnostic assessment of the effectiveness and reliability of the overall EMODPS solution design as well as of the resulting Pareto-approximate operating policies. The EMODPS approach is evaluated using the multipurpose Hoa Binh water reservoir in Vietnam, where water operators are seeking to balance the conflicting objectives of maximizing hydropower production and minimizing flood risks. A key choice in the EMODPS approach is the selection of alternative formulations for flexibly representing reservoir operating policies. This study distinguishes between the relative performance of two widely used nonlinear approximating networks, namely artificial neural networks (ANNs) and radial basis functions (RBFs). The results show that RBF solutions are more effective than ANN ones in designing Pareto-approximate policies for the Hoa Binh reservoir. Given the approximate nature of EMODPS, the diagnostic benchmarking uses SDP to evaluate the overall quality of the attained Pareto-approximate results. Although the Hoa Binh test case’s relative simplicity should maximize the potential value of SDP, the results demonstrate that EMODPS successfully dominates the solutions derived via SDP.