Developmental acquisition of entrainment skills in robot swinging using van der Pol oscillators
In this study we investigated the effects of different morphological configurations on a robot swinging task using van der Pol oscillators. The task was examined using two separate degrees of freedom (DoF), both in the presence and absence of neural entrainment. Neural entrainment stabilises the system, reduces the time to steady state, and relaxes the requirement for strong coupling with the environment in order to achieve mechanical entrainment. It was found that staged release of the distal DoF has no benefit over using both DoF from the onset of experimentation; on the contrary, it is less efficient, both with respect to the time needed to reach a stable oscillatory regime and the maximum amplitude it can achieve. The same neural architecture is successful in achieving neuromechanical entrainment for a robotic walking task.
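The van der Pol dynamics underlying the entrainment study can be illustrated with a minimal simulation. This is a sketch only, assuming unit damping parameter and no robot coupling; the function names and numbers are illustrative, not the paper's model:

```python
import numpy as np

def vdp_deriv(state, mu=1.0, forcing=0.0):
    """Time derivative of an (optionally forced) van der Pol oscillator:
    x' = v,  v' = mu*(1 - x^2)*v - x + forcing."""
    x, v = state
    return np.array([v, mu * (1.0 - x ** 2) * v - x + forcing])

def simulate(mu=1.0, dt=0.01, steps=5000, x0=0.1, v0=0.0):
    """Fixed-step RK4 integration; returns the (steps, 2) trajectory."""
    traj = np.empty((steps, 2))
    s = np.array([x0, v0])
    for i in range(steps):
        k1 = vdp_deriv(s, mu)
        k2 = vdp_deriv(s + 0.5 * dt * k1, mu)
        k3 = vdp_deriv(s + 0.5 * dt * k2, mu)
        k4 = vdp_deriv(s + dt * k3, mu)
        s = s + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        traj[i] = s
    return traj

traj = simulate()  # any small start grows onto the limit cycle (|x| ~ 2)
```

The self-sustained limit cycle is what makes these oscillators attractive for entrainment: regardless of the initial condition, the state settles onto the same stable oscillation, and a periodic forcing term (the mechanical coupling in the paper) can pull its phase and frequency.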
Online optimal and adaptive integral tracking control for varying discrete-time systems using reinforcement learning
The conventional closed-form solution to the optimal control problem via optimal control theory is available only under the assumption that the system dynamics are known and described by differential equations. Without such models, reinforcement learning (RL) has been successfully applied as a candidate technique to iteratively solve the optimal control problem for unknown or varying systems. For the optimal tracking control problem, existing RL techniques in the literature assume either a predetermined feedforward input for the tracking control, restrictive assumptions on the reference model dynamics, or discounted tracking costs. Furthermore, with discounted tracking costs, zero steady-state error cannot be guaranteed by the existing RL methods. This article therefore presents an optimal online RL tracking control framework for discrete-time (DT) systems which does not impose the restrictive assumptions of existing methods and guarantees zero steady-state tracking error. This is achieved by augmenting the original system dynamics with the integral of the error between the reference inputs and the tracked outputs for use in the online RL framework. It is further shown that the resulting value function for the DT linear quadratic tracker using the augmented formulation with integral control is also quadratic. This enables the development of Bellman equations which use only system measurements to solve the corresponding DT algebraic Riccati equation and obtain the optimal tracking control inputs online. Two RL strategies are then proposed, based on value function approximation and Q-learning respectively, along with bounds on excitation for the convergence of the parameter estimates. Simulation case studies show the effectiveness of the proposed approach.
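The integral-augmentation idea can be sketched on a scalar plant: augment the state with the accumulated tracking error, solve the DT algebraic Riccati equation, and observe that the resulting feedback drives the steady-state error to zero. This model-based sketch only illustrates the structure the article's RL scheme learns from measurements; the plant numbers are invented:

```python
import numpy as np

def solve_dare(A, B, Q, R, iters=500):
    """Solve the DT algebraic Riccati equation by fixed-point iteration."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return P, K

# Scalar plant x[k+1] = a*x[k] + b*u[k] tracking a constant reference r.
# Augmented state [x, z], where z accumulates the tracking error r - x.
a, b, r = 0.9, 0.5, 1.0
A = np.array([[a, 0.0],
              [-1.0, 1.0]])   # z[k+1] = z[k] + (r - x[k]), r exogenous
B = np.array([[b],
              [0.0]])
Q = np.diag([1.0, 1.0])
R = np.array([[0.1]])
P, K = solve_dare(A, B, Q, R)

# Closed loop: integral action forces the steady-state error to zero
# without any feedforward term or discounting.
x, z = 0.0, 0.0
for _ in range(500):
    u = (-K @ np.array([x, z])).item()
    e = r - x
    x = a * x + b * u
    z = z + e
```

The key point mirrored from the abstract: at any stable equilibrium the integrator state can only be stationary when the tracking error is exactly zero, which is why the augmented formulation avoids the steady-state bias of discounted-cost RL trackers.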
Optimal adaptive control of time-delay dynamical systems with known and uncertain dynamics
Delays are found in many industrial pneumatic and hydraulic systems, and as a result the performance of the overall closed-loop system deteriorates unless they are explicitly accounted for. The dynamics of such systems may also be uncertain. Optimal control of time-delay systems with known and uncertain dynamics using state and output feedback is therefore of paramount importance. In this research, a suite of novel optimal adaptive control (OAC) techniques is developed for linear and nonlinear continuous time-delay systems in the presence of uncertain system dynamics, using state and/or output feedback.
First, the optimal regulation of linear continuous-time systems with state and input delays, using a quadratic cost function over an infinite horizon, is addressed with state and output feedback. Next, optimal adaptive regulation is extended to uncertain linear continuous-time systems under the mild assumption that bounds on the system matrices are known. Subsequently, event-triggered optimal adaptive regulation of partially unknown linear continuous-time systems with state delay is addressed using integral reinforcement learning (IRL). It is demonstrated that the optimal control policy renders the closed-loop system asymptotically stable provided the linear time-delayed system is controllable and observable. The proposed event-triggered approach relaxes the need for continuous availability of the state vector and is proven to be Zeno-free.
Finally, OAC of uncertain nonlinear time-delay systems with input and state delays, using IRL neural-network-based control, is investigated. An identifier is proposed for nonlinear time-delay systems to approximate the system dynamics and relax the need for the control coefficient matrix in generating the control policy. Lyapunov analysis is utilized to design the optimal adaptive controller, derive the parameter/weight tuning laws, and verify stability of the closed-loop system. --Abstract, page iv
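The premise of this work, that unaccounted delay degrades an otherwise well-behaved loop, can be seen in a minimal scalar simulation. The plant, gain, and delay values below are invented purely for illustration:

```python
# Scalar plant x'(t) = u(t) with proportional feedback on a delayed
# measurement, u = -k * x(t - tau). For this plant the loop is stable
# iff k * tau < pi/2, so a gain that is fine without delay can
# destabilize the system once the delay grows.
def peak_response(k, tau, dt=0.001, T=20.0, x0=1.0):
    """Forward-Euler simulation; returns the peak |x| over the run."""
    delay_steps = int(round(tau / dt))
    hist = [x0] * (delay_steps + 1)   # past states, most recent last
    for _ in range(int(T / dt)):
        u = -k * hist[-delay_steps - 1]   # delayed state feedback
        hist.append(hist[-1] + u * dt)
    return max(abs(v) for v in hist)

stable_peak = peak_response(k=1.0, tau=0.5)    # k*tau = 0.5 < pi/2: decays
unstable_peak = peak_response(k=1.0, tau=2.0)  # k*tau = 2.0 > pi/2: grows
```

The controllers in the dissertation explicitly account for such delays in the design (and additionally handle uncertain dynamics), rather than treating the delay as a disturbance.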
Reinforcement Learning for UAV Attitude Control
Autopilot systems are typically composed of an "inner loop" providing stability and control, while an "outer loop" is responsible for mission-level objectives, e.g. waypoint navigation. Autopilot systems for UAVs are predominantly implemented using Proportional Integral Derivative (PID) control systems, which have demonstrated exceptional performance in stable environments. However, more sophisticated control is required to operate in unpredictable and harsh environments. Intelligent flight control systems are an active area of research addressing limitations of PID control, most recently through the use of reinforcement learning (RL), which has had success in other applications such as robotics. However, previous work has focused primarily on using RL at the mission-level controller. In this work, we investigate the performance and accuracy of the inner control loop providing attitude control when using intelligent flight control systems trained with the state-of-the-art RL algorithms Deep Deterministic Policy Gradient (DDPG), Trust Region Policy Optimization (TRPO), and Proximal Policy Optimization (PPO). To investigate these unknowns we first developed an open-source high-fidelity simulation environment to train a flight controller for attitude control of a quadrotor through RL. We then use our environment to compare the performance of these controllers to that of a PID controller to identify whether using RL is appropriate in high-precision, time-critical flight control. Comment: 13 pages, 9 figures
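The PID baseline that the RL-trained controllers are compared against can be sketched for a single attitude axis, modelled as a torque-driven rigid body. The gains, inertia, and sample time below are invented for illustration; the paper's quadrotor model and tuning differ:

```python
class PID:
    """Textbook PID loop with a fixed sample time (illustrative baseline)."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def update(self, err):
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

# Single-axis attitude model: inertia * theta_ddot = torque.
dt, inertia = 0.002, 0.01          # s, kg*m^2 (hypothetical values)
pid = PID(kp=2.0, ki=0.5, kd=0.3, dt=dt)
theta, omega = 0.0, 0.0            # attitude (rad) and body rate (rad/s)
target = 0.5                       # step attitude command (rad)
for _ in range(20000):             # 40 s of simulated flight
    torque = pid.update(target - theta)
    omega += (torque / inertia) * dt
    theta += omega * dt
```

In a stable environment this kind of loop tracks a step command well, which is the paper's point of comparison: the question is whether RL-trained policies can match or exceed it under conditions where fixed PID gains struggle.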
Issues on Stability of ADP Feedback Controllers for Dynamical Systems
This paper traces the development of neural-network (NN)-based feedback controllers that are derived from the principle of adaptive/approximate dynamic programming (ADP) and discusses their closed-loop stability. Different versions of NN structures in the literature, which embed mathematical mappings related to solutions of the ADP-formulated problems, called "adaptive critic" or "actor-critic" networks, are discussed. The distinction between the two classes of ADP applications is pointed out. Furthermore, papers on "model-free" development and on model-based neurocontrollers are reviewed in terms of their contributions to stability issues. Recent literature suggests that work on ADP-based feedback controllers with assured stability is growing in diverse forms.
Reinforcement Learning, Intelligent Control and their Applications in Connected and Autonomous Vehicles
Reinforcement learning (RL) has attracted considerable attention over the past few years. Recently, we developed a data-driven algorithm to solve predictive cruise control (PCC) and game output regulation problems. This work integrates our recent contributions to the application of RL in game theory, output regulation problems, robust control, small-gain theory, and PCC. The algorithm was developed for adaptive optimal output regulation of uncertain linear systems and uncertain partially linear systems, to reject disturbances and force the output of the systems to asymptotically track a reference. In the PCC problem, we determined the reference velocity for each autonomous vehicle in the platoon using the traffic information broadcast from the traffic lights to reduce the vehicles' trip time. We then employed the algorithm to design an approximate optimal controller for the vehicles. This controller is able to regulate the headway, velocity, and acceleration of each vehicle to the desired values. Simulation results validate the effectiveness of the algorithms.