Reinforcement Learning: A Survey
This paper surveys the field of reinforcement learning from a
computer-science perspective. It is written to be accessible to researchers
familiar with machine learning. Both the historical basis of the field and a
broad selection of current work are summarized. Reinforcement learning is the
problem faced by an agent that learns behavior through trial-and-error
interactions with a dynamic environment. The work described here has a
resemblance to work in psychology, but differs considerably in the details and
in the use of the word ``reinforcement.'' The paper discusses central issues of
reinforcement learning, including trading off exploration and exploitation,
establishing the foundations of the field via Markov decision theory, learning
from delayed reinforcement, constructing empirical models to accelerate
learning, making use of generalization and hierarchy, and coping with hidden
state. It concludes with a survey of some implemented systems and an assessment
of the practical utility of current methods for reinforcement learning.
Comment: See http://www.jair.org/ for any accompanying file
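Two of the central issues the survey names, trading off exploration and exploitation and learning from delayed reinforcement, can be illustrated with a minimal tabular Q-learning sketch. The toy chain MDP, the parameter values, and the epsilon-greedy rule below are illustrative assumptions, not taken from the survey:

```python
import random

# Toy 5-state chain: reward arrives only at the far-right state, so the
# agent must learn from delayed reinforcement. Epsilon-greedy action
# selection handles the exploration/exploitation trade-off.
N_STATES = 5
ACTIONS = [0, 1]                       # 0: step left, 1: step right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1     # learning rate, discount, exploration rate

def step(s, a):
    """Deterministic chain; reward 1.0 only on reaching the right end."""
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0), s2 == N_STATES - 1

random.seed(0)
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def greedy(s):
    best = max(Q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(s, a)] == best])

for _ in range(500):                   # episodes
    s, done = 0, False
    while not done:
        # explore with small probability, otherwise exploit current estimates
        a = random.choice(ACTIONS) if random.random() < EPS else greedy(s)
        s2, r, done = step(s, a)
        target = r + (0.0 if done else GAMMA * max(Q[(s2, b)] for b in ACTIONS))
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

policy = [greedy(s) for s in range(N_STATES)]
print(policy[:4])   # after training, interior states should all prefer "right"
```

Although the reward is delayed to the final step, bootstrapped targets propagate value backwards through the chain, which is the mechanism the survey discusses under learning from delayed reinforcement.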
Modelling and control of chaotic processes through their Bifurcation Diagrams generated with the help of Recurrent Neural Network models: Part 1 – simulation studies
Many real-world processes tend to be chaotic and do not lend themselves to satisfactory analytical modelling. It is shown here that for such chaotic processes, represented through short chaotic noisy time-series, a multi-input multi-output recurrent neural network model can be built which is capable of capturing the process trends and predicting future values from any given starting condition. It is further shown that the recurrent neural network model achieves this capability when it is trained to a very low value of mean squared error. Such a model can then be used for constructing the bifurcation diagram of the process, leading to the determination of desirable operating conditions. Further, this multi-input multi-output model makes the process accessible to control using open-loop/closed-loop approaches, bifurcation control, etc. All these studies have been carried out using a low-dimensional discrete chaotic system, the Hénon map, as a representative of some real-world processes.
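The Hénon map used in the study, and the idea of sweeping a parameter to build a bifurcation diagram, fit in a few lines. The parameter values swept and the transient length here are illustrative choices, and this iterates the true map directly rather than the paper's RNN surrogate:

```python
# Iterate the Henon map (x, y) -> (1 - a*x^2 + y, b*x) and collect a crude
# bifurcation diagram by sweeping a with b fixed at 0.3: for each a, keep
# the attractor's x-coordinates after discarding a transient.
def henon_orbit(a, b=0.3, x0=0.0, y0=0.0, n_transient=1000, n_keep=100):
    x, y = x0, y0
    for _ in range(n_transient):               # discard transient behaviour
        x, y = 1.0 - a * x * x + y, b * x
    pts = []
    for _ in range(n_keep):                    # sample the attractor
        x, y = 1.0 - a * x * x + y, b * x
        pts.append(x)
    return pts

# One column of the bifurcation diagram per parameter value.
diagram = {a: henon_orbit(a) for a in [0.5, 1.0, 1.2, 1.4]}
# At a=0.5 the orbit settles onto a period-2 cycle; at the classic
# a=1.4, b=0.3 it is chaotic, so the 100 samples are (almost) all distinct.
print(len(set(round(v, 6) for v in diagram[0.5])))
```

Plotting each column of `diagram` against its `a` value gives the bifurcation diagram from which desirable operating conditions can be read off, which is the use the abstract describes for the RNN-based model.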
Feedback control by online learning an inverse model
A model, predictor, or error estimator is often used by a feedback controller to control a plant. Creating such a model is difficult when the plant exhibits nonlinear behavior. In this paper, a novel online learning control framework is proposed that does not require explicit knowledge about the plant. This framework uses two learning modules, one for creating an inverse model and the other for actually controlling the plant. Except for their inputs, they are identical. The inverse model learns from the exploration performed by the not-yet-fully-trained controller, while the actual controller is based on the currently learned model. The proposed framework allows fast online learning of an accurate controller. The controller can be applied to a broad range of tasks with different dynamic characteristics. We validate this claim by applying our control framework to several control tasks: 1) the heating tank problem (slow nonlinear dynamics); 2) flight pitch control (slow linear dynamics); and 3) the balancing problem of a double inverted pendulum (fast linear and nonlinear dynamics). The results of these experiments show that fast learning and accurate control can be achieved. Furthermore, a comparison is made with some classical control approaches, and observations concerning convergence and stability are made.
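The two-module structure, identical modules that differ only in their inputs, can be sketched on a hypothetical scalar linear plant. The plant, the normalized-LMS update, the noise level, and all constants below are assumptions for illustration, not the paper's method:

```python
import random

# Two modules share one weight vector w for the inverse model u ~ w0*x + w1*x'.
# The inverse-model module trains on (x, x_next) -> u pairs generated by the
# still-imperfect controller; the controller module applies the same weights
# to (x, target). Toy plant: x_next = A*x + B*u, with A, B unknown to the learner.
A, B = 0.8, 0.5
w = [0.0, 0.0]
ETA = 0.5                       # normalized-LMS step size
random.seed(1)

x, target = 1.0, 0.0            # task: regulate the state to zero
for _ in range(3000):
    # controller module: inverse model on (x, target), plus exploration noise
    u = w[0] * x + w[1] * target + random.gauss(0.0, 0.2)
    x_next = A * x + B * u                       # plant step
    # inverse-model module: same weights, inputs (x, x_next), prediction target u
    err = u - (w[0] * x + w[1] * x_next)
    norm = x * x + x_next * x_next + 1e-3
    w[0] += ETA * err * x / norm
    w[1] += ETA * err * x_next / norm
    x = x_next

# Exact inverse for this plant: u = (x' - A*x)/B, i.e. w -> (-1.6, 2.0)
print(round(w[0], 2), round(w[1], 2))
```

Because the exploration noise of the imperfect controller is exactly the excitation the inverse model needs, the two modules bootstrap each other, which is the mechanism the abstract credits for fast online learning.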
Exponential stability of delayed recurrent neural networks with Markovian jumping parameters
In this Letter, the global exponential stability analysis problem is considered for a class of recurrent neural networks (RNNs) with time delays and Markovian jumping parameters. The jumping parameters are generated by a continuous-time, discrete-state homogeneous Markov process with a finite state space. The purpose of the problem addressed is to derive easy-to-test conditions under which the dynamics of the neural network is stochastically exponentially stable in the mean square, independent of the time delay. By employing a new Lyapunov–Krasovskii functional, a linear matrix inequality (LMI) approach is developed to establish the desired sufficient conditions; global exponential stability in the mean square for the delayed RNNs can then be checked with the numerically efficient Matlab LMI toolbox, with no tuning of parameters required. A numerical example is given to show the usefulness of the derived LMI-based stability conditions. This work was supported in part by the Engineering and Physical Sciences Research Council (EPSRC) of the UK under Grant GR/S27658/01, the Nuffield Foundation of the UK under Grant NAL/00630/G, and the Alexander von Humboldt Foundation of Germany.
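A common textbook way to write this class of systems, with notation assumed here for illustration rather than copied from the Letter, is a delayed RNN whose parameters switch with a finite-state Markov chain $r(t)$:

```latex
\dot{x}(t) = -A\bigl(r(t)\bigr)\,x(t)
             + W_0\bigl(r(t)\bigr)\,g\bigl(x(t)\bigr)
             + W_1\bigl(r(t)\bigr)\,g\bigl(x(t-\tau)\bigr),
```

where $g$ is the neuron activation, $\tau$ the delay, and $A, W_0, W_1$ mode-dependent matrices. Mean-square exponential stability then asks for constants $\alpha, \beta > 0$ such that

```latex
\mathbb{E}\,\|x(t)\|^2 \;\le\; \alpha\, e^{-\beta t}
  \sup_{-\tau \le s \le 0} \mathbb{E}\,\|\varphi(s)\|^2 ,
```

for every initial function $\varphi$, which is the property the LMI conditions certify independently of the delay.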
Delayed Dynamical Systems: Networks, Chimeras and Reservoir Computing
We present a systematic approach to reveal the correspondence between time
delay dynamics and networks of coupled oscillators. After early demonstrations
of the usefulness of spatio-temporal representations of time-delay system
dynamics, extensive research on optoelectronic feedback loops has revealed
their immense potential for realizing complex system dynamics such as chimeras
in rings of coupled oscillators and applications to reservoir computing.
Delayed dynamical systems have been enriched in recent years through the
application of digital signal processing techniques. Very recently, we have
shown that one can significantly extend the capabilities and implement
networks with arbitrary topologies through the use of field programmable gate
arrays (FPGAs). This architecture allows the design of appropriate filters and
multiple time delays which greatly extend the possibilities for exploring
synchronization patterns in arbitrary topological networks. This has enabled us
to explore complex dynamics on networks with nodes that can be perfectly
identical, introduce parameter heterogeneities and multiple time delays, as
well as change network topologies to control the formation and evolution of
patterns of synchrony.
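The correspondence between a delay loop and a ring of virtual nodes can be made concrete with a space-time folding of a pure delay map. The feedback function and all constants below are illustrative assumptions; in the hardware the text describes, it is the filters that couple the virtual nodes:

```python
import math

# A single scalar delay loop x[t] = f(x[t-d]) behaves like d virtual nodes:
# folding the time series into consecutive rounds of length d puts one node
# per column. With a pure delay (no filter) the columns evolve independently.
d = 16                                   # delay = number of virtual nodes
def f(u):
    return 0.9 * math.sin(2.0 * u)       # illustrative bounded nonlinear feedback

x = [0.1 * (i + 1) for i in range(d)]    # initial history
for t in range(d, 20 * d):
    x.append(f(x[t - d]))

# Space-time representation: row n is the state of all d nodes at round n.
rounds = [x[n * d:(n + 1) * d] for n in range(len(x) // d)]
print(len(rounds), len(rounds[0]))       # 20 rounds of 16 virtual nodes
```

Introducing filters or multiple delays mixes neighbouring columns, which is exactly how the FPGA architecture described above realizes arbitrary coupling topologies between the virtual nodes.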
Differential Dynamic Programming for time-delayed systems
Trajectory optimization considers the problem of deciding how to control a
dynamical system to move along a trajectory which minimizes some cost function.
Differential Dynamic Programming (DDP) is an optimal control method which
utilizes a second-order approximation of the problem to find the control. It is
fast enough to allow real-time control and has been shown to work well for
trajectory optimization in robotic systems. Here we extend classic DDP to
systems with multiple time-delays in the state. Being able to find optimal
trajectories for time-delayed systems with DDP opens up the possibility to use
richer models for system identification and control, including recurrent neural
networks with multiple timesteps in the state. We demonstrate the algorithm on
a two-tank continuous stirred tank reactor. We also demonstrate the algorithm
on a recurrent neural network trained to model an inverted pendulum with
position information only.
Comment: 7 pages, 6 figures; 2016 IEEE 55th Conference on Decision and Control (CDC)
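The key enabler for applying trajectory optimization to delayed dynamics is that a system with delayed state can be rewritten in standard delay-free form by stacking past states, after which DDP-style methods apply. The toy dynamics and delay below are illustrative assumptions, not the paper's algorithm:

```python
# A system x[t+1] = g(x[t], x[t-k], u[t]) becomes Markovian in the
# augmented state z[t] = (x[t], x[t-1], ..., x[t-k]).
def g(x, x_delayed, u):
    return 0.9 * x + 0.2 * x_delayed + u     # toy scalar dynamics with delay k

def augmented_step(z, u, k=3):
    """One step of the delay-free augmented system."""
    x_next = g(z[0], z[k], u)
    return (x_next,) + z[:-1]                # shift the history window

z = (1.0, 1.0, 1.0, 1.0)                     # constant initial history, k = 3
traj = [z]
for _ in range(5):
    z = augmented_step(z, u=0.0)
    traj.append(z)
print(round(traj[-1][0], 5))
```

The price of augmentation is a state dimension that grows with the delay; extending DDP to handle the delays directly, as the paper does, avoids paying the full quadratic cost of that blow-up in the second-order approximation.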