267 research outputs found

    Scaling reinforcement learning to the unconstrained multi-agent domain

    Reinforcement learning is a machine learning technique designed to mimic the way animals learn by receiving rewards and punishments. It is designed to train intelligent agents when very little is known about the agent’s environment, and consequently the agent’s designer is unable to hand-craft an appropriate policy. Using reinforcement learning, the designer can merely reward the agent when it does something right, and the algorithm will craft an appropriate policy automatically. In many situations it is desirable to use this technique to train systems of agents (for example, to train robots to play RoboCup soccer in a coordinated fashion). Unfortunately, several significant computational issues arise when using this technique to train systems of agents. This dissertation introduces a suite of techniques that overcome many of these difficulties in various common situations. First, we show how multi-agent reinforcement learning can be made more tractable by forming coalitions out of the agents and training each coalition separately. Coalitions are formed using information-theoretic techniques, and we find that a coalition-based approach makes the computational complexity of reinforcement learning linear in the total number of agents in the system. Next we look at ways to integrate domain knowledge into the reinforcement learning process, and how this can significantly improve policy quality in multi-agent situations. Specifically, we find that integrating domain knowledge into a reinforcement learning process can overcome training-data deficiencies and allow the learner to converge to acceptable solutions when a lack of training data would otherwise have prevented convergence. We then show how to train policies over continuous action spaces, which can reduce problem complexity for domains that require continuous actions (analog controllers) by eliminating the need to finely discretize the action space. Finally, we look at ways to perform reinforcement learning on modern GPUs and show how this allows us to tackle significantly larger problems. We find that by offloading some of the RL computation to the GPU, we can achieve a speedup factor of almost 4.5 in the total training process.
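    A minimal sketch of the coalition idea, assuming independent tabular Q-learners (the class name and toy dimensions are illustrative, not the dissertation's implementation): each coalition learns over its own joint state/action space, so table size depends on the coalition size rather than on the total agent count.

```python
import numpy as np

class CoalitionQLearner:
    """Independent tabular Q-learner for one coalition of agents.

    Each coalition learns over its own joint state/action space, so
    its table grows with the coalition size k, not the agent count n.
    """

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.95, eps=0.1):
        self.q = np.zeros((n_states, n_actions))
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def act(self, s, rng):
        # epsilon-greedy selection over the coalition's joint actions
        if rng.random() < self.eps:
            return int(rng.integers(self.q.shape[1]))
        return int(np.argmax(self.q[s]))

    def update(self, s, a, r, s_next):
        # standard one-step Q-learning update
        td_target = r + self.gamma * self.q[s_next].max()
        self.q[s, a] += self.alpha * (td_target - self.q[s, a])

# One learner per coalition: with bounded coalition size, total memory
# and per-step compute grow with the number of coalitions, i.e. linearly
# in the agent count, instead of exponentially as for one joint learner.
rng = np.random.default_rng(0)
coalitions = [CoalitionQLearner(n_states=16, n_actions=4) for _ in range(8)]
```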

    OBSERVER-BASED-CONTROLLER FOR INVERTED PENDULUM MODEL

    This paper presents a state-space control technique for the inverted pendulum system. The system is a classical control problem that has been widely used to test control algorithms because of its nonlinear and unstable behavior. Full state feedback based on pole placement and on optimal control is applied to the inverted pendulum system to achieve the desired design specifications, namely a 4-second settling time and 5% overshoot. The simulation and optimization of the full state feedback controller based on pole placement and optimal control techniques, as well as a performance comparison between these techniques, are described comprehensively. The comparison is made to choose the technique best suited to the system, that is, the one with the best trade-off between settling time and overshoot. In addition, the observer design is analyzed to examine the effect of pole location and of noise present in the system.
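    As a worked illustration of how such specifications translate into pole locations under the standard second-order approximation (the numbers below are derived here from the stated specs, not quoted from the paper):

```latex
% 5% overshoot fixes the damping ratio:
%   zeta = -ln(0.05) / sqrt(pi^2 + ln^2(0.05)) ~ 0.69
% 4 s settling time (2% criterion) fixes the real part of the poles:
%   t_s ~ 4 / (zeta * omega_n) = 4 s  =>  zeta * omega_n = 1,
%   omega_n ~ 1.45 rad/s
\[
s_{1,2} \;=\; -\zeta\omega_n \,\pm\, j\,\omega_n\sqrt{1-\zeta^{2}}
\;\approx\; -1 \pm j\,1.05
\]
```

    The state-feedback poles are placed at or near this dominant pair, and a common rule of thumb is to place the observer poles three to ten times faster so that estimation errors decay well before they affect the regulated response.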

    A Review of Resonant Converter Control Techniques and The Performances

    This paper first discusses each control technique and then gives experimental results and/or performance figures to highlight its merits. The resonant converter used as a case study is not restricted to a single topology; instead, several topologies are used, namely the series-parallel resonant converter (SPRC), the LCC resonant converter and the parallel resonant converter (PRC). The control techniques presented in this paper are self-sustained phase shift modulation (SSPSM) control, self-oscillating power factor control, magnetic control and the H-∞ robust control technique.
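    To make concrete the frequency-dependent gain that such control techniques regulate, here is a minimal sketch, assuming ideal components and a first-harmonic approximation, of the open-loop voltage gain of a parallel resonant (PRC) tank; the component values are arbitrary assumptions, not taken from the paper.

```python
import numpy as np

# PRC tank under the first-harmonic approximation: series inductor L
# feeding capacitor C in parallel with an equivalent AC load R.
L, C, R = 100e-6, 100e-9, 50.0          # H, F, ohms (arbitrary examples)
f0 = 1 / (2 * np.pi * np.sqrt(L * C))   # undamped resonant frequency

f = np.linspace(0.5 * f0, 1.5 * f0, 201)
w = 2 * np.pi * f
z_c = 1 / (1j * w * C)
z_par = (R * z_c) / (R + z_c)                # C in parallel with load R
gain = np.abs(z_par / (z_par + 1j * w * L))  # voltage divider with L

print(f"f0 = {f0/1e3:.1f} kHz, peak gain = {gain.max():.2f} "
      f"at {f[np.argmax(gain)]/1e3:.1f} kHz")
```

    The reviewed techniques differ mainly in which handle they use to move the converter's operating point along such a gain curve, for example the phase shift in SSPSM or a controllable magnetic element in magnetic control.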

    State-Feedback Controller Based on Pole Placement Technique for Inverted Pendulum System

    This paper presents a state-space control technique for the inverted pendulum system using simulation and real experiments via MATLAB/Simulink software. The inverted pendulum is a difficult system to control and one of the most important classical control problems because of its nonlinear characteristics and unstable dynamics. It exhibits three properties that complicate control applications: it is nonlinear, unstable and non-minimum-phase. This project applies a state-feedback controller based on the pole placement technique, which is capable of stabilizing the practical inverted pendulum in the vertical position. The desired design specifications, a 4-second settling time and 5% overshoot, are imposed on the full state feedback controller designed by pole placement. First, the mathematical model of the inverted pendulum system is derived to obtain a state-space representation of the system. The state-feedback controller is then designed after the nonlinear equations are linearized, with the aid of mathematical software such as Mathcad. After that, the design is simulated using MATLAB/Simulink. The controller design is verified using both simulation and experimental tests. Finally, the controller is compared with a PID controller for benchmarking purposes.
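    A minimal sketch of the pole-placement step, assuming the standard frictionless point-mass cart-pendulum linearization and illustrative parameter values; scipy.signal.place_poles stands in for whatever MATLAB routine the authors used.

```python
import numpy as np
from scipy.signal import place_poles

# Linearized cart-pendulum about the upright equilibrium (frictionless,
# point-mass pendulum). State x = [cart pos, cart vel, angle, ang. vel].
M, m, l, g = 0.5, 0.2, 0.3, 9.81        # illustrative SI parameters
A = np.array([[0, 1, 0,                     0],
              [0, 0, -m * g / M,            0],
              [0, 0, 0,                     1],
              [0, 0, (M + m) * g / (M * l), 0]])
B = np.array([[0], [1 / M], [0], [-1 / (M * l)]])

# Dominant pair from the 4 s settling / 5% overshoot specs,
# plus two faster non-dominant real poles.
poles = [-1 + 1.05j, -1 - 1.05j, -5.0, -6.0]
K = place_poles(A, B, poles).gain_matrix   # full state feedback u = -K x

print("K =", np.round(K, 2))
print("closed-loop poles:", np.round(np.linalg.eigvals(A - B @ K), 2))
```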

    Evolutionary and Reinforcement Fuzzy Control

    Many modern and classical techniques exist for the design of control systems. However, many real-world applications are inherently complex, and the application of traditional design and control techniques is limited. In addition, no single design method exists which can be applied to all types of system. Due to this 'deficiency', recent years have seen an exponential increase in the use of methods loosely termed 'computational intelligence techniques' or 'soft-computing techniques'. Such techniques tend to solve problems using a population of individual elements or potential solutions, or the flexibility of a network, as opposed to a rigid, single point of computing. Through the use of computational redundancy, soft computing allows unmatched tractability in practical problem solving. The intelligent paradigm most successfully applied to control engineering is that of fuzzy logic, in the form of fuzzy control. The motivation for using fuzzy control is twofold. First, it allows one to incorporate heuristics into the control strategy, such as modelling operator actions. Second, it allows nonlinearities to be defined in an intuitive way using rules and interpolations. Although it is an attractive tool, there still exist many problems to be solved in fuzzy control. To date most applications have been limited to relatively simple problems of low dimensionality. This is primarily because the design process is very much a trial-and-error one and is heavily dependent on the quality of expert knowledge provided by the operator. In addition, fuzzy control design is virtually ad hoc, lacking a systematic design procedure. Other problems include those associated with the curse of dimensionality and the inability to learn and improve from experience. While much work has been carried out to alleviate most of these difficulties, the last of these points remains under-explored. The objective of this thesis is to develop an automated, systematic procedure for optimally learning fuzzy logic controllers (FLCs), which provides for autonomous and simple implementations. In pursuit of this goal, a hybrid method is proposed that combines the advantages of artificial neural networks (ANNs), evolutionary algorithms (EAs) and reinforcement learning (RL). This overcomes the deficiencies of conventional EAs, which may fail to represent parts of a variable's operating range and which in practice do not achieve fine learning. The method also allows backpropagation when necessary or feasible. It is termed the Evolutionary NeuroFuzzy Learning Intelligent Control technique (ENFLICT) model. Unlike other hybrids, ENFLICT permits both global structural learning and local offline or online learning, and the global EA and local neural learning processes are not separated: the EA learns and optimises the ENFLICT structure while ENFLICT learns the network parameters. The EA used here is an improved version of a technique known as the messy genetic algorithm (mGA), which utilises flexible cellular chromosomes for structural optimisation. Compared with other flexible-length EAs, the mGA enables issues such as the curse of dimensionality and redundant genetic information to be addressed.
    Enhancements to the algorithm lie in the coding and decoding of the genetic information to represent a growing and shrinking network; in the definition of network properties such as neuron activation type and network connectivity; and in the fact that all of this information is represented in a single gene. Another step forward taken in this thesis on neurofuzzy learning is that of learning online. Online here refers to learning unsupervised and adapting to real-time changes in system parameters. It is much more attractive because the alternative, supervised offline learning, demands quality learning data which is often expensive to obtain, unrepresentative of the real environment, and inaccurate. First, the learning algorithm is developed for the case of a given model of the system, where the system dynamics are available or can be obtained through, for example, system identification. This naturally leads to the development of a method for learning by directly interacting with the environment. The motivation for this is that real-world applications tend to be large and complex, and obtaining a mathematical model of the plant is not always possible. For this purpose the reinforcement learning paradigm is utilised, the primary learning method of biological systems, which can adapt to their environment and experiences. In this thesis, the reinforcement learning algorithm is based on the advantage learning method and has been extended to deal with continuous-time systems and online implementation, without using a lookup table. This means that large databases containing the system behaviour need not be constructed, and the procedure can work online using only the information of the immediate situation. For complex systems of higher dimension, and where identifying the system model is difficult, a hierarchical method has been developed, based on a hybrid of all the other methods developed. In particular, the procedure makes use of a method developed to work directly with the plant step response, thus avoiding the need for mathematical model fitting, which may be time-consuming and inaccurate. All techniques developed and contributions made in the thesis are illustrated by several case studies and validated through simulations.
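    A minimal sketch of the kind of controller being learned: a zero-order Takagi-Sugeno FLC with Gaussian membership functions, whose parameter vector is exactly what an evolutionary or neural update would tune. This is a generic illustration under those assumptions, not the ENFLICT encoding itself.

```python
import numpy as np

def fuzzy_controller(x, centers, widths, consequents):
    """Zero-order Takagi-Sugeno FLC: one Gaussian membership per rule.

    x           : (d,) crisp input (e.g. error and error rate)
    centers     : (n_rules, d) membership centers
    widths      : (n_rules, d) membership widths
    consequents : (n_rules,) crisp rule outputs
    All arrays form the tunable parameter vector a learner would adapt.
    """
    # rule firing strengths: product of per-dimension Gaussian memberships
    w = np.exp(-0.5 * np.sum(((x - centers) / widths) ** 2, axis=1))
    # defuzzify: firing-strength-weighted average of rule consequents
    return float(np.dot(w, consequents) / (np.sum(w) + 1e-12))

rng = np.random.default_rng(1)
n_rules, d = 9, 2
params = dict(centers=rng.uniform(-1, 1, (n_rules, d)),
              widths=np.full((n_rules, d), 0.5),
              consequents=rng.uniform(-1, 1, n_rules))
u = fuzzy_controller(np.array([0.2, -0.1]), **params)
```

    In the hybrid scheme described above, the EA searches over the discrete structure (how many rules exist and which inputs they connect to) while neural-style updates tune the centers, widths and consequents, and the reinforcement signal replaces supervised targets when no training data or plant model is available.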