Reinforcement Learning: A Survey
This paper surveys the field of reinforcement learning from a
computer-science perspective. It is written to be accessible to researchers
familiar with machine learning. Both the historical basis of the field and a
broad selection of current work are summarized. Reinforcement learning is the
problem faced by an agent that learns behavior through trial-and-error
interactions with a dynamic environment. The work described here has a
resemblance to work in psychology, but differs considerably in the details and
in the use of the word ``reinforcement.'' The paper discusses central issues of
reinforcement learning, including trading off exploration and exploitation,
establishing the foundations of the field via Markov decision theory, learning
from delayed reinforcement, constructing empirical models to accelerate
learning, making use of generalization and hierarchy, and coping with hidden
state. It concludes with a survey of some implemented systems and an assessment
of the practical utility of current methods for reinforcement learning.
Comment: See http://www.jair.org/ for any accompanying file
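Two of the central issues the survey names, trading off exploration and exploitation and learning from delayed reinforcement, can be illustrated with a minimal tabular Q-learning sketch. The toy chain MDP, the parameter values, and the epsilon-greedy rule below are illustrative assumptions, not taken from the survey:

```python
import random

# Toy 5-state chain: reward arrives only at the far-right state, so the
# agent must learn from delayed reinforcement. Epsilon-greedy action
# selection handles the exploration/exploitation trade-off.
N_STATES = 5
ACTIONS = [0, 1]                       # 0: step left, 1: step right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1     # learning rate, discount, exploration rate

def step(s, a):
    """Deterministic chain; reward 1.0 only on reaching the right end."""
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0), s2 == N_STATES - 1

random.seed(0)
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def greedy(s):
    best = max(Q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(s, a)] == best])

for _ in range(500):                   # episodes
    s, done = 0, False
    while not done:
        # explore with small probability, otherwise exploit current estimates
        a = random.choice(ACTIONS) if random.random() < EPS else greedy(s)
        s2, r, done = step(s, a)
        target = r + (0.0 if done else GAMMA * max(Q[(s2, b)] for b in ACTIONS))
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

policy = [greedy(s) for s in range(N_STATES)]
print(policy[:4])   # after training, interior states should all prefer "right"
```

Although the reward is delayed to the final step, bootstrapped targets propagate value backwards through the chain, which is the mechanism the survey discusses under learning from delayed reinforcement.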
Modelling and control of chaotic processes through their Bifurcation Diagrams generated with the help of Recurrent Neural Network models: Part 1 – simulation studies
Many real-world processes tend to be chaotic and do not lend themselves to satisfactory analytical modelling. It is shown here that for such chaotic processes, represented through short chaotic noisy time-series, a multi-input multi-output recurrent neural network model can be built which is capable of capturing the process trends and predicting future values from any given starting condition. It is further shown that the recurrent neural network model achieves this capability when it is trained to a very low value of mean squared error. Such a model can then be used for constructing the bifurcation diagram of the process, leading to the determination of desirable operating conditions. Further, this multi-input multi-output model makes the process accessible to control using open-loop/closed-loop approaches, bifurcation control, etc. All these studies have been carried out using a low-dimensional discrete chaotic system, the Hénon map, as a representative of some real-world processes.
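The Hénon map used in the study, and the idea of sweeping a parameter to build a bifurcation diagram, fit in a few lines. The parameter values swept and the transient length here are illustrative choices, and this iterates the true map directly rather than the paper's RNN surrogate:

```python
# Iterate the Henon map (x, y) -> (1 - a*x^2 + y, b*x) and collect a crude
# bifurcation diagram by sweeping a with b fixed at 0.3: for each a, keep
# the attractor's x-coordinates after discarding a transient.
def henon_orbit(a, b=0.3, x0=0.0, y0=0.0, n_transient=1000, n_keep=100):
    x, y = x0, y0
    for _ in range(n_transient):               # discard transient behaviour
        x, y = 1.0 - a * x * x + y, b * x
    pts = []
    for _ in range(n_keep):                    # sample the attractor
        x, y = 1.0 - a * x * x + y, b * x
        pts.append(x)
    return pts

# One column of the bifurcation diagram per parameter value.
diagram = {a: henon_orbit(a) for a in [0.5, 1.0, 1.2, 1.4]}
# At a=0.5 the orbit settles onto a period-2 cycle; at the classic
# a=1.4, b=0.3 it is chaotic, so the 100 samples are (almost) all distinct.
print(len(set(round(v, 6) for v in diagram[0.5])))
```

Plotting each column of `diagram` against its `a` value gives the bifurcation diagram from which desirable operating conditions can be read off, which is the use the abstract describes for the RNN-based model.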
Feedback control by online learning an inverse model
A model, predictor, or error estimator is often used by a feedback controller to control a plant. Creating such a model is difficult when the plant exhibits nonlinear behavior. In this paper, a novel online learning control framework is proposed that does not require explicit knowledge about the plant. This framework uses two learning modules, one for creating an inverse model and the other for actually controlling the plant. Except for their inputs, they are identical. The inverse model learns from the exploration performed by the not-yet-fully-trained controller, while the actual controller is based on the currently learned model. The proposed framework allows fast online learning of an accurate controller. The controller can be applied to a broad range of tasks with different dynamic characteristics. We validate this claim by applying our control framework to several control tasks: 1) the heating tank problem (slow nonlinear dynamics); 2) flight pitch control (slow linear dynamics); and 3) the balancing problem of a double inverted pendulum (fast linear and nonlinear dynamics). The results of these experiments show that fast learning and accurate control can be achieved. Furthermore, a comparison is made with some classical control approaches, and observations concerning convergence and stability are made.
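The two-module structure, identical modules that differ only in their inputs, can be sketched on a hypothetical scalar linear plant. The plant, the normalized-LMS update, the noise level, and all constants below are assumptions for illustration, not the paper's method:

```python
import random

# Two modules share one weight vector w for the inverse model u ~ w0*x + w1*x'.
# The inverse-model module trains on (x, x_next) -> u pairs generated by the
# still-imperfect controller; the controller module applies the same weights
# to (x, target). Toy plant: x_next = A*x + B*u, with A, B unknown to the learner.
A, B = 0.8, 0.5
w = [0.0, 0.0]
ETA = 0.5                       # normalized-LMS step size
random.seed(1)

x, target = 1.0, 0.0            # task: regulate the state to zero
for _ in range(3000):
    # controller module: inverse model on (x, target), plus exploration noise
    u = w[0] * x + w[1] * target + random.gauss(0.0, 0.2)
    x_next = A * x + B * u                       # plant step
    # inverse-model module: same weights, inputs (x, x_next), prediction target u
    err = u - (w[0] * x + w[1] * x_next)
    norm = x * x + x_next * x_next + 1e-3
    w[0] += ETA * err * x / norm
    w[1] += ETA * err * x_next / norm
    x = x_next

# Exact inverse for this plant: u = (x' - A*x)/B, i.e. w -> (-1.6, 2.0)
print(round(w[0], 2), round(w[1], 2))
```

Because the exploration noise of the imperfect controller is exactly the excitation the inverse model needs, the two modules bootstrap each other, which is the mechanism the abstract credits for fast online learning.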
Exponential stability of delayed recurrent neural networks with Markovian jumping parameters
In this Letter, the global exponential stability analysis problem is considered for a class of recurrent neural networks (RNNs) with time delays and Markovian jumping parameters. The jumping parameters are generated by a continuous-time, discrete-state homogeneous Markov process with a finite state space. The purpose of the problem addressed is to derive easy-to-test conditions under which the dynamics of the neural network is stochastically exponentially stable in the mean square, independent of the time delay. By employing a new Lyapunov–Krasovskii functional, a linear matrix inequality (LMI) approach is developed to establish the desired sufficient conditions; global exponential stability in the mean square for the delayed RNNs can then be checked with the numerically efficient Matlab LMI toolbox, with no tuning of parameters required. A numerical example is given to show the usefulness of the derived LMI-based stability conditions. This work was supported in part by the Engineering and Physical Sciences Research Council (EPSRC) of the UK under Grant GR/S27658/01, the Nuffield Foundation of the UK under Grant NAL/00630/G, and the Alexander von Humboldt Foundation of Germany.
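A common textbook way to write this class of systems, with notation assumed here for illustration rather than copied from the Letter, is a delayed RNN whose parameters switch with a finite-state Markov chain $r(t)$:

```latex
\dot{x}(t) = -A\bigl(r(t)\bigr)\,x(t)
             + W_0\bigl(r(t)\bigr)\,g\bigl(x(t)\bigr)
             + W_1\bigl(r(t)\bigr)\,g\bigl(x(t-\tau)\bigr),
```

where $g$ is the neuron activation, $\tau$ the delay, and $A, W_0, W_1$ mode-dependent matrices. Mean-square exponential stability then asks for constants $\alpha, \beta > 0$ such that

```latex
\mathbb{E}\,\|x(t)\|^2 \;\le\; \alpha\, e^{-\beta t}
  \sup_{-\tau \le s \le 0} \mathbb{E}\,\|\varphi(s)\|^2 ,
```

for every initial function $\varphi$, which is the property the LMI conditions certify independently of the delay.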
Delayed Dynamical Systems: Networks, Chimeras and Reservoir Computing
We present a systematic approach to reveal the correspondence between time
delay dynamics and networks of coupled oscillators. After early demonstrations
of the usefulness of spatio-temporal representations of time-delay system
dynamics, extensive research on optoelectronic feedback loops has revealed
their immense potential for realizing complex system dynamics such as chimeras
in rings of coupled oscillators and applications to reservoir computing.
Delayed dynamical systems have been enriched in recent years through the
application of digital signal processing techniques. Very recently, we have
shown that one can significantly extend the capabilities and implement
networks with arbitrary topologies through the use of field programmable gate
arrays (FPGAs). This architecture allows the design of appropriate filters and
multiple time delays which greatly extend the possibilities for exploring
synchronization patterns in arbitrary topological networks. This has enabled us
to explore complex dynamics on networks with nodes that can be perfectly
identical, introduce parameter heterogeneities and multiple time delays, as
well as change network topologies to control the formation and evolution of
patterns of synchrony.
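The correspondence between a delay loop and a ring of virtual nodes can be made concrete with a space-time folding of a pure delay map. The feedback function and all constants below are illustrative assumptions; in the hardware the text describes, it is the filters that couple the virtual nodes:

```python
import math

# A single scalar delay loop x[t] = f(x[t-d]) behaves like d virtual nodes:
# folding the time series into consecutive rounds of length d puts one node
# per column. With a pure delay (no filter) the columns evolve independently.
d = 16                                   # delay = number of virtual nodes
def f(u):
    return 0.9 * math.sin(2.0 * u)       # illustrative bounded nonlinear feedback

x = [0.1 * (i + 1) for i in range(d)]    # initial history
for t in range(d, 20 * d):
    x.append(f(x[t - d]))

# Space-time representation: row n is the state of all d nodes at round n.
rounds = [x[n * d:(n + 1) * d] for n in range(len(x) // d)]
print(len(rounds), len(rounds[0]))       # 20 rounds of 16 virtual nodes
```

Introducing filters or multiple delays mixes neighbouring columns, which is exactly how the FPGA architecture described above realizes arbitrary coupling topologies between the virtual nodes.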
Differential Dynamic Programming for time-delayed systems
Trajectory optimization considers the problem of deciding how to control a
dynamical system to move along a trajectory which minimizes some cost function.
Differential Dynamic Programming (DDP) is an optimal control method which
utilizes a second-order approximation of the problem to find the control. It is
fast enough to allow real-time control and has been shown to work well for
trajectory optimization in robotic systems. Here we extend classic DDP to
systems with multiple time-delays in the state. Being able to find optimal
trajectories for time-delayed systems with DDP opens up the possibility to use
richer models for system identification and control, including recurrent neural
networks with multiple timesteps in the state. We demonstrate the algorithm on
a two-tank continuous stirred tank reactor. We also demonstrate the algorithm
on a recurrent neural network trained to model an inverted pendulum with
position information only.
Comment: 7 pages, 6 figures; 2016 IEEE 55th Conference on Decision and Control (CDC)
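The key enabler for applying trajectory optimization to delayed dynamics is that a system with delayed state can be rewritten in standard delay-free form by stacking past states, after which DDP-style methods apply. The toy dynamics and delay below are illustrative assumptions, not the paper's algorithm:

```python
# A system x[t+1] = g(x[t], x[t-k], u[t]) becomes Markovian in the
# augmented state z[t] = (x[t], x[t-1], ..., x[t-k]).
def g(x, x_delayed, u):
    return 0.9 * x + 0.2 * x_delayed + u     # toy scalar dynamics with delay k

def augmented_step(z, u, k=3):
    """One step of the delay-free augmented system."""
    x_next = g(z[0], z[k], u)
    return (x_next,) + z[:-1]                # shift the history window

z = (1.0, 1.0, 1.0, 1.0)                     # constant initial history, k = 3
traj = [z]
for _ in range(5):
    z = augmented_step(z, u=0.0)
    traj.append(z)
print(round(traj[-1][0], 5))
```

The price of augmentation is a state dimension that grows with the delay; extending DDP to handle the delays directly, as the paper does, avoids paying the full quadratic cost of that blow-up in the second-order approximation.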