
    Computation Approaches for Continuous Reinforcement Learning Problems

    Optimisation theory is at the heart of any control process, where we seek to control the behaviour of a system through a set of actions. Linear control problems have been extensively studied, and optimal control laws have been identified. But the world around us is highly non-linear and unpredictable. For these dynamic systems, which lack the convenient mathematical properties of their linear counterparts, classic control theory breaks down and other methods have to be employed. Nature, however, thrives by optimising non-linear and highly complex systems. Evolutionary Computing (EC) methods exploit nature's way by imitating the evolutionary process, avoiding the need to solve the control problem analytically. Reinforcement Learning (RL), on the other hand, regards the optimal control problem as a sequential one. At every discrete time step an action is applied, and the transition of the system to a new state is accompanied by a single numerical value, the "reward", which designates the quality of the control action. Even though the feedback is limited to a single real number, the introduction of the Temporal Difference method made it possible to obtain accurate predictions of the value functions. This paved the way to optimising complex structures, such as Neural Networks, which are used to approximate the value functions. In this thesis we investigate the solution of continuous Reinforcement Learning control problems by EC methodologies. The reward accumulated over an episode suffices as information to formulate the required measure, fitness, with which to optimise a population of candidate solutions. In particular, we explore the limits of applicability of a specific branch of EC, Genetic Programming (GP). In GP the evolving population is comprised of individuals that translate directly into mathematical functions, which can serve as control laws. The major contribution of this thesis is the proposed unification of these disparate Artificial Intelligence paradigms: the information provided by the system is exploited on a step-by-step basis by the RL part of the proposed scheme and on an episodic basis by the GP part. This makes it possible to augment the function set of the GP scheme with adaptable Neural Networks. To achieve stable behaviour of the RL part of the system, a modification of the Actor-Critic algorithm was implemented. Finally, we successfully apply the GP method to multi-action control problems, extending the spectrum of problems that this method has been shown to solve. We also investigate the capability of GP on problems from the food industry, which likewise exhibit non-linearity and lack a definite model describing their behaviour.
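
    The episodic-fitness idea above is easy to make concrete. Below is a minimal sketch, under assumptions (a toy double-integrator plant and random linear "individuals" standing in for GP expression trees; all helper names are invented), of scoring a population of candidate control laws by accumulated episode reward:

```python
# Illustrative sketch only: the toy plant, the random linear "individuals",
# and all helper names are assumptions standing in for GP expression trees.
import random

def make_random_policy():
    """A stand-in for a GP individual: a random linear law u = a*x + b*v."""
    a, b = random.uniform(-2, 2), random.uniform(-2, 2)
    return lambda x, v: a * x + b * v

def episodic_fitness(policy, steps=200, dt=0.05):
    """Accumulated reward over one episode on a toy double-integrator task."""
    x, v, total = 1.0, 0.0, 0.0
    for _ in range(steps):
        u = max(-1.0, min(1.0, policy(x, v)))    # saturated control action
        v += u * dt                               # double-integrator dynamics
        x += v * dt
        total += -(x * x + 0.1 * u * u)           # step reward, summed as fitness
    return total

population = [make_random_policy() for _ in range(50)]
fitnesses = [episodic_fitness(p) for p in population]
print(f"best fitness in generation 0: {max(fitnesses):.3f}")
```

    In a full GP scheme the policies would be expression trees subject to crossover and mutation, with selection driven by exactly this kind of episodic fitness.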

    Maximum Power Point Tracker Controller for Solar Photovoltaic Based on Reinforcement Learning Agent with a Digital Twin

    Photovoltaic (PV) energy, representing a renewable source of energy, plays a key role in the reduction of greenhouse gas emissions and the achievement of a sustainable mix of energy generation. To achieve the maximum solar energy harvest, PV power systems require the implementation of Maximum Power Point Tracking (MPPT). Traditional MPPT controllers, such as P&O, are easy to implement, but they are inherently slow and oscillate around the MPP, losing efficiency. This work presents a Reinforcement Learning (RL)-based control to increase the speed and efficiency of the controller. Deep Deterministic Policy Gradient (DDPG), the selected RL algorithm, works with continuous action and state spaces to achieve a stable output at the MPP. A Digital Twin (DT) enables training in simulation, which accelerates the process and allows it to proceed independently of weather conditions. In addition, we use the maximum power achieved in the DT to adjust the reward function, making the training more efficient. The RL control is compared with a traditional P&O controller to validate the speed and efficiency increase in both simulations and real implementations. The results show an improvement of 10.45% in total power output and a settling time 24.54 times faster in simulations. Moreover, real-time tests show an improvement of 51.45% in total power output and a settling time of 0.25 s for the DDPG controller compared with 4.26 s for the P&O.
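
    A hedged sketch of the reward-shaping idea: the digital twin's reported maximum power normalises the measured PV power, so the agent is rewarded for closing the gap to the true MPP while being penalised for oscillating the duty cycle. The function name and the penalty weight are assumptions, not the authors' exact formulation:

```python
# Hedged sketch: names and the penalty weight are assumptions, not the
# paper's exact reward function.
def mppt_reward(p_measured: float, p_mpp_twin: float,
                duty_change: float, smooth_weight: float = 0.1) -> float:
    """Fraction of the achievable power reported by the digital twin,
    minus an oscillation penalty on the change in duty cycle (action)."""
    power_ratio = p_measured / max(p_mpp_twin, 1e-9)  # guard against divide-by-zero
    return power_ratio - smooth_weight * abs(duty_change)

# Example: 182 W harvested while the twin reports 200 W achievable.
print(mppt_reward(182.0, 200.0, duty_change=0.02))  # ~0.908
```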

    Walking Motion Generation and Neuro-Fuzzy Control with Push Recovery for Humanoid Robot

    Push recovery is an essential requirement for a humanoid robot that must safely perform tasks within a real dynamic environment. In this environment, the robot is susceptible to external disturbances that are in some cases inevitable, requiring push recovery strategies to avoid falls and damage to humans and the environment. In this paper, a novel push recovery approach to counteract disturbances from any direction and in any walking phase is developed. It presents a walking pattern generator that can be modified according to the push recovery strategy. The result is a humanoid robot that can maintain its balance in the presence of strong disturbances, taking into account their magnitude and determining the best push recovery strategy. Push recovery experiments with different disturbance directions have been performed using a 20 DOF Darwin-OP robot. The adaptability and low computational cost of the whole scheme allow its incorporation into an embedded system.
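
    For illustration only: a common way to choose among the classic ankle, hip, and step recovery strategies from the humanoid literature is by disturbance magnitude, in the spirit of the scheme above. The thresholds and the helper name below are assumptions, not taken from the paper:

```python
# Illustrative only: thresholds and helper name are assumptions; magnitudes
# are normalised to the robot's capturability limits.
def select_push_recovery_strategy(disturbance_magnitude: float,
                                  ankle_limit: float = 0.3,
                                  hip_limit: float = 0.7) -> str:
    """Return the mildest strategy able to absorb the push."""
    if disturbance_magnitude <= ankle_limit:
        return "ankle"   # small push: ankle torque alone restores balance
    if disturbance_magnitude <= hip_limit:
        return "hip"     # moderate push: add upper-body angular momentum
    return "step"        # strong push: modify the walking pattern and step

for m in (0.1, 0.5, 0.9):
    print(m, "->", select_push_recovery_strategy(m))
```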

    Development and Implementation of Novel Intelligent Motor Control for Performance Enhancement of PMSM Drive in Electrified Vehicle Application

    The demand for electrified vehicles has grown significantly over the last decade, causing a shift in the automotive industry from traditional gasoline vehicles to electric vehicles (EVs). With the growing evolution of EVs, high power density and high efficiency of electric powertrains (e-drives) are of the utmost importance for achieving an extended driving range. However, achieving an extended driving range with enhanced e-drive performance remains a bottleneck. The control algorithm of an e-drive plays a vital role in its performance and reliability over time. Artificial intelligence (AI)- and machine learning (ML)-based intelligent control methods have proven their continued success in fault determination and analysis of motor-drive systems. Considering the potential of intelligent control, this thesis investigates the legacy space vector modulation (SVM) strategy for wide-bandgap (WBG) inverters and the conventional current PI controller for permanent magnet synchronous motor (PMSM) control, with the aim of reducing switching loss and computation time and enhancing transient performance relative to state-of-the-art e-drive systems. The thesis converges on AI- and ML-based control for e-drives, enhancing performance by reducing switching loss with an ANN-based modulation technique for a GaN-based inverter and improving the transient performance of the PMSM with an ML-based parameter-independent controller.
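
    As a rough sketch of the ANN-based modulation idea (untrained placeholder weights; the network size and all names are assumptions), a small MLP can map the reference voltage vector in the alpha-beta frame to three-phase duty cycles, replacing the piecewise sector logic of classic SVM:

```python
# Sketch with untrained placeholder weights; network size and names are
# assumptions. In practice the MLP would be trained offline against
# loss-optimised switching targets.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((16, 2)) * 0.5, np.zeros(16)   # hidden layer
W2, b2 = rng.standard_normal((3, 16)) * 0.5, np.zeros(3)    # duty-cycle outputs

def ann_modulator(v_alpha: float, v_beta: float) -> np.ndarray:
    """Forward pass: reference voltage vector in, per-phase duty cycles out."""
    h = np.tanh(W1 @ np.array([v_alpha, v_beta]) + b1)
    return 1.0 / (1.0 + np.exp(-(W2 @ h + b2)))  # sigmoid keeps duties in [0, 1]

print(ann_modulator(0.5, -0.2))  # three duty cycles for one switching period
```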

    Self-Learning Longitudinal Control for On-Road Vehicles

    Advanced Driver Assistance Systems are an important selling point for passenger cars, but they incur high development costs. In particular, the parametrisation of longitudinal control, an important building block of driver assistance systems, requires much time and money to strike the right balance between occupant comfort and control performance. Reinforcement Learning appears to be a promising approach for automating this. So far, however, this class of algorithms has mainly been applied to simulated tasks that take place under ideal conditions and allow practically unlimited training time. Among the greatest challenges for applying Reinforcement Learning in a real vehicle are trajectory-tracking control and incomplete state information due to only partially observed dynamics. Moreover, an algorithm deployed on a real system has to reach a result within minutes, and the control objective may change arbitrarily at run time, which poses an additional difficulty for Reinforcement Learning methods. This work presents two algorithms that require little computing power and overcome these hurdles. On the one hand, a model-free Reinforcement Learning approach is proposed, based on the actor-critic architecture, that uses a special structure in the state-action value function so that it can be applied to partially observed systems. To learn a feedforward control, a controller is proposed that relies on a projection and on training-data manipulation. On the other hand, a model-based algorithm built on policy search is proposed, complemented by an automated design method for an inversion-based feedforward controller. The proposed algorithms are compared in a series of scenarios in which they learn online, i.e. while driving and in closed loop, in a real vehicle. Although the algorithms respond somewhat differently to different boundary conditions, both learn robustly and quickly and are able to adapt to different operating points, such as speeds and gears, even when disturbances act during training. To the best of the author's knowledge, this is the first successful application of a Reinforcement Learning algorithm that learns online in a real vehicle.
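
    A minimal sketch of the kind of actor-critic update named above, on a toy velocity-tracking task; the linear features, gains, and plant model are illustrative assumptions, not the dissertation's algorithm:

```python
# Hedged sketch: one-step actor-critic with linear function approximation on
# a toy longitudinal (velocity-tracking) task. All numbers are assumptions.
import random

alpha_critic, alpha_actor, gamma = 0.05, 0.01, 0.95
w = [0.0, 0.0]       # critic weights on features (error, 1)
theta = 0.0          # actor gain: u = theta * error + exploration noise

def features(err):
    return [err, 1.0]

def value(err):
    return sum(wi * fi for wi, fi in zip(w, features(err)))

v_ref, v, dt = 20.0, 0.0, 0.1
for step in range(2000):
    err = v_ref - v
    noise = random.gauss(0.0, 0.2)
    u = theta * err + noise                    # exploratory acceleration command
    v = v + (u - 0.05 * v) * dt                # toy longitudinal dynamics with drag
    next_err = v_ref - v
    r = -abs(next_err)                         # reward: small tracking error
    td = r + gamma * value(next_err) - value(err)   # temporal-difference error
    w = [wi + alpha_critic * td * fi for wi, fi in zip(w, features(err))]
    theta += alpha_actor * td * noise * err    # Gaussian-policy likelihood-ratio step

print(f"final speed {v:.2f} m/s vs. reference {v_ref} m/s, learned gain {theta:.3f}")
```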

    Hybrid Modeling Approaches Integrating Physics-Based Models with Machine Learning for Predictive Control of Biological and Chemical Processes

    Recently, there has been growing interest in data-based modeling as the amount of available data has increased tremendously. One such method is the Dynamic Mode Decomposition with Control (DMDc) technique, which builds temporally local linear models from data, but its limited domain of applicability (DA) hinders its use for prediction. To overcome this challenge, we proposed an algorithm that utilizes multiple "local" training datasets, and it was applied successfully to hydraulic fracturing. Although data-based modeling offers simplicity and ease of construction, it lacks the robustness and parametric interpretability of first-principles modeling. To balance the advantages and disadvantages of data-based and first-principles models, hybrid modeling was proposed using artificial neural networks (ANNs). Since then, Machine Learning (ML) has advanced to the point where deep neural networks (DNNs) with more than three layers can be trained to approximate any function accurately. In this work, we proposed a deep hybrid modeling (DHM) framework that integrates first principles with DNNs and successfully applied it to two complex processes: hydraulic fracturing and a full-scale fermentation reactor. Similarly, Universal Differential Equations (UDEs) were proposed in ML, in which DNNs are embedded within ODEs and solved using ODE solvers. We utilized UDEs to successfully build a DHM from simulation and experimental data for batch production of β-carotene. One limitation of DHM is that its DA is determined by the DNN within it, and its accuracy is high only within that DA; it is therefore important to account for the DA when designing a model-based controller. To this end, we proposed a Control Lyapunov-Barrier Function (CLBF)-MPC to stabilize the closed-loop system and ensure that it stays within the DA of the DHM. Theoretical guarantees were provided for the CLBF-MPC controller, and it was successfully implemented on a CSTR. The idea of integrating physics with ML extends to Reinforcement Learning (RL): for cases where model-based controller design is not possible, we proposed a model-free Deep RL (DRL) controller that utilizes prior knowledge in its reward function to speed up learning. This DRL controller was successfully applied to hydraulic fracturing, wherein Nolte's law was included in the reward function for fast convergence.
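
    The hybrid-modeling idea lends itself to a compact sketch: known first-principles dynamics plus a small neural correction term, integrated here with explicit Euler as a stand-in for the ODE solvers used with UDEs. The physics term (first-order decay), network size, and weights below are assumptions:

```python
# Minimal self-contained sketch: dx/dt = known physics + learned residual.
# The decay constant, network size, and (untrained) weights are assumptions.
import numpy as np

rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((8, 1)) * 0.1, np.zeros(8)
W2, b2 = rng.standard_normal((1, 8)) * 0.1, np.zeros(1)

def nn_residual(x: np.ndarray) -> np.ndarray:
    """Data-driven correction for unmodelled kinetics (untrained placeholder)."""
    return W2 @ np.tanh(W1 @ x + b1) + b2

def hybrid_rhs(x: np.ndarray, k: float = 0.5) -> np.ndarray:
    """Hybrid right-hand side: first-order decay plus neural residual."""
    return -k * x + nn_residual(x)

x, dt = np.array([1.0]), 0.01
for _ in range(500):                 # 5 s of simulated time
    x = x + dt * hybrid_rhs(x)       # explicit Euler step
print("state after 5 s:", x)
```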
