
8 research outputs found

Learning Throttle Valve Control Using Policy Search

Author: Alois Knoll
Bastian Bischoff
Duy Nguyen-tuong
Heiner Markert
Torsten Koller
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

The throttle valve is a technical device used for regulating a fluid or a gas flow. Throttle valve control is a challenging task, due to its complex dynamics and demanding constraints for the controller. With state-of-the-art throttle valve control, such as model-free PID controllers, time-consuming manual tuning of the controller is necessary. In this paper, we investigate how reinforcement learning (RL) can help to alleviate the effort of manual controller design by automatically learning a control policy from experience. In order to obtain a valid control policy for the throttle valve, several constraints need to be addressed, such as no overshoot. Furthermore, the learned controller must be able to follow given desired trajectories while moving the valve from any start position to any goal position; thus, multi-target policy learning needs to be considered for RL. In this study, we employ a policy search RL approach, PILCO [2], to learn a throttle valve control policy. We adapt the PILCO algorithm to take into account the practical requirements and constraints for the controller. For evaluation, we employ the resulting algorithm to solve several control tasks in simulation, as well as on a physical throttle valve system. The results show that policy search RL is able to learn a consistent control policy for complex, real-world systems.

CiteSeerX

Crossref

Model-Based Policy Search for Automatic Tuning of Multivariate PID Controllers

Author: Doerr Andreas
Marco Alonso
Nguyen-Tuong Duy
Schaal Stefan
Trimpe Sebastian
Publication date: 08/03/2017

PID control architectures are widely used in industrial applications. Despite their low number of open parameters, tuning multiple, coupled PID controllers can become tedious in practice. In this paper, we extend PILCO, a model-based policy search framework, to automatically tune multivariate PID controllers purely based on data observed on an otherwise unknown system. The system's state is extended appropriately to frame the PID policy as a static state feedback policy. This renders PID tuning possible as the solution of a finite horizon optimal control problem without further a priori knowledge. The framework is applied to the task of balancing an inverted pendulum on a seven degree-of-freedom robotic arm, thereby demonstrating its capabilities of fast and data-efficient policy learning, even on complex real-world problems. Comment: Accepted final version to appear in 2017 IEEE International Conference on Robotics and Automation (ICRA)
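The idea of framing a PID law as static state feedback can be illustrated with a minimal sketch. It is not the paper's PILCO-based tuning procedure: the first-order plant, the gain values, and the names `pid_feedback` and `simulate` are all assumptions chosen for illustration. The point is only that once the state is augmented to z = [error, integral of error, derivative of error], the PID output is a fixed linear map u = K · z.

```python
import numpy as np


def pid_feedback(gains, z):
    """Static linear feedback u = K . z on the augmented state z."""
    return float(np.dot(gains, z))


def simulate(gains, target=1.0, dt=0.01, steps=3000):
    """Closed-loop Euler simulation on a hypothetical plant x' = -x + u.

    gains = [Kp, Ki, Kd]; the plant is an assumption, not from the paper.
    """
    x = 0.0
    integral = 0.0
    prev_error = target - x
    for _ in range(steps):
        error = target - x
        integral += error * dt
        derivative = (error - prev_error) / dt
        # Augmented state: the PID policy is memoryless in z.
        z = np.array([error, integral, derivative])
        u = pid_feedback(gains, z)
        x += dt * (-x + u)  # explicit Euler step of the assumed plant
        prev_error = error
    return x


if __name__ == "__main__":
    final_x = simulate(np.array([2.0, 1.0, 0.0]))
    print(f"final output: {final_x:.4f}")
```

Because the policy is just the gain vector K acting on z, a tuner can in principle treat K as the parameters of a state feedback policy and optimize them against a finite-horizon cost, which is the framing the abstract describes.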

arXiv.org e-Print Archive

Crossref

An Innovative MIMO Iterative Learning Control Approach for the Position Control of a Hydraulic Press

Author: Elorza Iker
Irigoyen Gordo Eloy
Pujana Arrese Aron
Sorrosal Yarritu Gorka
Trojaola Bolinaga Ignacio
Publication venue: IEEE
Publication date: 01/01/2021

To improve the performance of hydraulic press position control and eliminate the need to manually define control signals, this paper proposes a multi-input-multi-output (MIMO) Iterative Learning Control (ILC) algorithm. The MIMO ILC algorithm design is based on the inversion of the known low frequency dynamics of the hydraulic press, whereas the unknown and uncertain high frequency dynamics are discarded due to their low influence on the learning transient. Moreover, for the MIMO ILC convergence condition, a graphical method is proposed, in which the ILC learning filter eigenvalues are analyzed. This method allows studying the stability and convergence rate of the algorithm intuitively. Theoretical analysis and results prove that with the MIMO ILC algorithm the position control is automated and that high precision in the position tracking is gained. A comparison with other model inverse ILC approaches is carried out and it is shown that the proposed MIMO ILC algorithm outperforms the existing algorithms, reducing the number of iterations required to converge while guaranteeing system stability. Furthermore, experimental results in a hydraulic test rig are presented and compared to those obtained with a conventional PI controller. This work was supported in part by the Department of Development and Infrastructures of the Government of the Basque Country via Industrial Doctorate Program BIKAINTEK under Grant 20-AF-W2-2018-00015.

Directory of Open Access Journals

Archivo Digital para la Docencia y la Investigación

Flexible and robust control of heavy duty diesel engine airpath using data driven disturbance observers and GPR models

Author: Aran Volkan
Publication date: 12/07/2019

Diesel engine airpath control is crucial for modern engine development due to increasingly stringent emission regulations. This thesis aims to develop and validate a flexible and robust control approach to this problem, specifically for heavy-duty engines. It focuses on estimation and control algorithms that are implementable on current and next generation commercial electronic control units (ECU). To this end, targeting the control units in service, a data driven disturbance observer (DOB) is developed and applied for mass air flow (MAF) and manifold absolute pressure (MAP) tracking control via exhaust gas recirculation (EGR) valve and variable geometry turbine (VGT) vane. Its performance benefits are demonstrated on the physical engine model for concept evaluation. The proposed DOB integrated with a discrete-time sliding mode controller is applied to the serial level engine control unit. Real engine performance is validated with the legal emission test cycle (WHTC - World Harmonized Transient Cycle) for heavy-duty engines and comparison with a commercially available controller is performed, and far better tracking results are obtained. Further studies are conducted in order to utilize capabilities of the next generation control units. Gaussian process regression (GPR) models are popular in automotive industry especially for emissions modeling but have not found widespread applications in airpath control yet. This thesis presents a GPR modeling of diesel engine airpath components as well as controller designs and their applications based on the developed models. Proposed GPR based feedforward and feedback controllers are validated with available physical engine models and the results have been very promising.

Sabanci University Research Database

Probabilistic models for data efficient reinforcement learning

Author: Kamthe Sanket
Publication venue: Computing, Imperial College London
Publication date: 01/11/2021

Trial-and-error based reinforcement learning (RL) has seen rapid advancements in recent times, especially with the advent of deep neural networks. However, the standard deep learning methods often overlook the progress made in control theory by treating systems as black boxes. We propose a model-based RL framework based on probabilistic Model Predictive Control (MPC). In particular, we propose to learn a probabilistic transition model using Gaussian Processes (GPs) to incorporate model uncertainty into long-term predictions, thereby reducing the impact of model errors. We provide theoretical guarantees for first-order optimality in the GP-based transition models with deterministic approximate inference for long-term planning. We demonstrate that our approach not only achieves state-of-the-art data efficiency, but also is a principled way for RL in constrained environments. When the true state of the dynamical system cannot be fully observed, the standard model-based methods cannot be directly applied. For these systems an additional step of state estimation is needed. We propose distributed message passing for state estimation in non-linear dynamical systems. In particular, we propose to use expectation propagation (EP) to iteratively refine the state estimate, i.e., the Gaussian posterior distribution on the latent state. We show two things: (a) Classical Rauch-Tung-Striebel (RTS) smoothers, such as the extended Kalman smoother (EKS) or the unscented Kalman smoother (UKS), are special cases of our message passing scheme; (b) running the message passing scheme more than once can lead to significant improvements over the classical RTS smoothers. We show the explicit connection between message passing with EP and well-known RTS smoothers and provide a practical implementation of the suggested algorithm. Furthermore, we address convergence issues of EP by generalising this framework to damped updates and the consideration of general α-divergences.
Probabilistic models can also be used to generate synthetic data. In model-based RL we use "synthetic" data as a proxy for real environments in order to achieve high data efficiency. The ability to generate high-fidelity synthetic data is crucial when available (real) data is limited, as in RL, or where privacy and data protection standards allow only for limited use of the given data, e.g., in medical and financial data-sets. Current state-of-the-art methods for synthetic data generation are based on generative models, such as Generative Adversarial Networks (GANs). Even though GANs have achieved remarkable results in synthetic data generation, they are often challenging to interpret. Furthermore, GAN-based methods can suffer when used with mixed real and categorical variables. Moreover, the loss function (discriminator loss) design itself is problem specific, i.e., the generative model may not be useful for tasks it was not explicitly trained for. In this paper, we propose to use a probabilistic model as a synthetic data generator. Learning the probabilistic model for the data is equivalent to estimating the density of the data. Based on the copula theory, we divide the density estimation task into two parts, i.e., estimating univariate marginals and estimating the multivariate copula density over the univariate marginals. We use normalising flows to learn both the copula density and univariate marginals. We benchmark our method on both simulated and real data-sets in terms of density estimation as well as the ability to generate high-fidelity synthetic data.

Spiral - Imperial College Digital Repository

Volume 1 – Symposium

Publication venue: Technische Universität Dresden
Publication date: 22/06/2020

We are pleased to present the conference proceedings for the 12th edition of the International Fluid Power Conference (IFK). The IFK is one of the world’s most significant scientific conferences on fluid power control technology and systems. It offers a common platform for the presentation and discussion of trends and innovations to manufacturers, users and scientists. The Chair of Fluid-Mechatronic Systems at the TU Dresden is organizing and hosting the IFK for the sixth time. Supporting hosts are the Fluid Power Association of the German Engineering Federation (VDMA), Dresdner Verein zur Förderung der Fluidtechnik e. V. (DVF) and GWT-TUD GmbH. The organization and the conference location alternates every two years between the Chair of Fluid-Mechatronic Systems in Dresden and the Institute for Fluid Power Drives and Systems in Aachen. The symposium on the first day is dedicated to presentations focused on methodology and fundamental research. The two following conference days offer a wide variety of application and technology orientated papers about the latest state of the art in fluid power. It is this combination that makes the IFK a unique and excellent forum for the exchange of academic research and industrial application experience. A simultaneously ongoing exhibition offers the possibility to get product information and to have individual talks with manufacturers. The theme of the 12th IFK is “Fluid Power – Future Technology”, covering topics that enable the development of 5G-ready, cost-efficient and demand-driven structures, as well as individual decentralized drives. Another topic is the real-time data exchange that allows the application of numerous predictive maintenance strategies, which will significantly increase the availability of fluid power systems and their elements and ensure their improved lifetime performance. We create an atmosphere for casual exchange by offering a vast frame and cultural program. 
This includes a get-together, a conference banquet, laboratory festivities and some physical activities such as jogging in Dresden’s old town.
Group A: Materials; Group B: System design & integration; Group C: Novel system solutions; Group D: Additive manufacturing; Group E: Components; Group F: Intelligent control; Group G: Fluids; Group H | K: Pumps; Group I | L: Mobile applications; Group J: Fundamental

Technische Universität Dresden: Qucosa