
    Developmental acquisition of entrainment skills in robot swinging using van der Pol oscillators

    In this study we investigated the effects of different morphological configurations on a robot swinging task using van der Pol oscillators. The task was examined using two separate degrees of freedom (DoF), both in the presence and absence of neural entrainment. Neural entrainment stabilises the system, reduces the time to steady state, and relaxes the requirement for strong coupling with the environment in order to achieve mechanical entrainment. It was found that staged release of the distal DoF has no benefit over using both DoF from the onset of the experiment. On the contrary, it is less efficient, both with respect to the time needed to reach a stable oscillatory regime and the maximum amplitude it can achieve. The same neural architecture also succeeds in achieving neuromechanical entrainment for a robotic walking task.
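As a minimal illustration of the oscillator family used above (not the paper's coupled robot model), the sketch below forward-Euler-integrates a single van der Pol oscillator; all parameter values are illustrative assumptions. The key property exploited for entrainment is that, from almost any initial condition, the state is drawn onto a self-sustained limit cycle of amplitude roughly 2.

```python
def van_der_pol(mu=1.0, x0=0.1, v0=0.0, dt=0.001, steps=60000):
    """Forward-Euler integration of the van der Pol equation
    x'' = mu * (1 - x**2) * x' - x  (illustrative parameters)."""
    x, v = x0, v0
    xs = []
    for _ in range(steps):
        a = mu * (1.0 - x * x) * v - x  # nonlinear damping term
        x, v = x + dt * v, v + dt * a
        xs.append(x)
    return xs

xs = van_der_pol()
# Even a tiny initial displacement is pulled out to the limit cycle.
late_amplitude = max(abs(x) for x in xs[-20000:])
```

This self-sustained oscillation is what lets a neural oscillator lock onto (entrain to) mechanical feedback without precise tuning of initial conditions.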

    Online optimal and adaptive integral tracking control for varying discrete‐time systems using reinforcement learning

    A conventional closed-form solution to the optimal control problem is available only when the system dynamics are known and described by differential equations. Without such models, reinforcement learning (RL) has been successfully applied as a candidate technique to iteratively solve the optimal control problem for unknown or varying systems. For the optimal tracking control problem, existing RL techniques in the literature assume either a predetermined feedforward input for the tracking control, restrictive assumptions on the reference model dynamics, or discounted tracking costs. Furthermore, with discounted tracking costs, zero steady-state error cannot be guaranteed by the existing RL methods. This article therefore presents an optimal online RL tracking control framework for discrete-time (DT) systems, which does not impose the restrictive assumptions of existing methods and guarantees zero steady-state tracking error. This is achieved by augmenting the original system dynamics with the integral of the error between the reference inputs and the tracked outputs for use in the online RL framework. It is further shown that the resulting value function for the DT linear quadratic tracker using the augmented formulation with integral control is also quadratic. This enables the development of Bellman equations that use only system measurements to solve the corresponding DT algebraic Riccati equation and obtain the optimal tracking control inputs online. Two RL strategies are then proposed, based on value function approximation and on Q-learning, along with excitation bounds for the convergence of the parameter estimates. Simulation case studies show the effectiveness of the proposed approach.
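The integral-augmentation idea can be sketched on a hypothetical scalar plant (the system, weights, and reference below are invented for illustration; the article's RL schemes replace this model-based Riccati iteration with measurement-driven updates):

```python
import numpy as np

# Hypothetical scalar plant x_{k+1} = 0.9 x_k + 0.5 u_k, output y = x.
a, b = 0.9, 0.5
# Augment with the integral of the tracking error: z_{k+1} = z_k + (r - y_k).
A = np.array([[a, 0.0], [-1.0, 1.0]])
B = np.array([[b], [0.0]])
Q, R = np.eye(2), np.array([[0.1]])

# Solve the DT algebraic Riccati equation by value iteration (model-based).
P = np.eye(2)
for _ in range(500):
    G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ A - A.T @ P @ B @ G
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # u = -K [x; z]

# Closed-loop simulation tracking a constant reference r = 2.
r, x, z = 2.0, 0.0, 0.0
for _ in range(300):
    u = (-K @ np.array([x, z])).item()
    x, z = a * x + b * u, z + (r - x)
```

Because the integrator state `z` can only be in equilibrium when `r - y = 0`, the tracking error is driven to zero without a discount factor or a feedforward term.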

    Optimal adaptive control of time-delay dynamical systems with known and uncertain dynamics

    Delays are found in many industrial pneumatic and hydraulic systems, and the performance of the overall closed-loop system deteriorates unless they are explicitly accounted for. The dynamics of such systems may also be uncertain. Optimal control of time-delay systems with known and uncertain dynamics, using state and output feedback, is therefore of paramount importance. In this research, a suite of novel optimal adaptive control (OAC) techniques is developed for linear and nonlinear continuous-time systems with delays, in the presence of uncertain system dynamics, using state and/or output feedback. First, the optimal regulation of linear continuous-time systems with state and input delays is addressed using state and output feedback, with a quadratic cost function over an infinite horizon. Next, optimal adaptive regulation is extended to uncertain linear continuous-time systems under the mild assumption that bounds on the system matrices are known. Subsequently, event-triggered optimal adaptive regulation of partially unknown linear continuous-time systems with state delay is addressed using integral reinforcement learning (IRL). It is demonstrated that the optimal control policy renders the closed-loop system asymptotically stable, provided the linear time-delayed system is controllable and observable. The proposed event-triggered approach relaxes the need for continuous availability of the state vector and is proven to be Zeno-free. Finally, OAC of uncertain nonlinear systems with input and state delays, using IRL and neural-network-based control, is investigated. An identifier is proposed for nonlinear time-delay systems to approximate the system dynamics and relax the need for the control coefficient matrix in generating the control policy. Lyapunov analysis is utilized to design the optimal adaptive controller, derive the parameter/weight tuning laws, and verify stability of the closed-loop system --Abstract, page iv
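One standard way to make a discrete state delay amenable to such optimal designs is to stack the delayed samples into an augmented, delay-free state; the sketch below does this for a hypothetical scalar plant with a 2-step delay and applies a model-based LQR (the thesis treats continuous-time delays and unknown dynamics, which this toy does not):

```python
import numpy as np

# Hypothetical plant with a 2-step state delay (open-loop unstable):
#   x_{k+1} = 1.1 x_k + 0.2 x_{k-2} + u_k
# Stacking [x_k, x_{k-1}, x_{k-2}] yields a delay-free augmented system.
A = np.array([[1.1, 0.0, 0.2],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
B = np.array([[1.0], [0.0], [0.0]])
Q, R = np.eye(3), np.array([[1.0]])

# Solve the DT algebraic Riccati equation by value iteration.
P = np.eye(3)
for _ in range(1000):
    G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ A - A.T @ P @ B @ G
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Regulate from x = 1 with zero delay history.
s = np.array([1.0, 0.0, 0.0])
for _ in range(200):
    u = (-K @ s).item()
    s = A @ s + B.ravel() * u
```

The price of this augmentation is state dimension growing with the delay length, which is one motivation for the delay-aware designs developed in the thesis.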

    Reinforcement Learning for UAV Attitude Control

    Autopilot systems are typically composed of an "inner loop" providing stability and control, and an "outer loop" responsible for mission-level objectives, e.g. waypoint navigation. Autopilot systems for UAVs are predominantly implemented using Proportional-Integral-Derivative (PID) control, which has demonstrated exceptional performance in stable environments; however, more sophisticated control is required to operate in unpredictable and harsh environments. Intelligent flight control is an active area of research addressing the limitations of PID control, most recently through the use of reinforcement learning (RL), which has had success in other applications such as robotics. Previous work, however, has focused primarily on using RL at the mission-level controller. In this work, we investigate the performance and accuracy of the inner control loop providing attitude control when using intelligent flight control systems trained with the state-of-the-art RL algorithms Deep Deterministic Policy Gradient (DDPG), Trust Region Policy Optimization (TRPO), and Proximal Policy Optimization (PPO). To investigate these unknowns, we first developed an open-source, high-fidelity simulation environment for training a quadrotor attitude controller through RL. We then use our environment to compare the trained controllers' performance to that of a PID controller, to identify whether RL is appropriate for high-precision, time-critical flight control.
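A minimal single-axis sketch of the PID inner-loop baseline such autopilots use (the gains, inertia, and toy plant below are illustrative assumptions, not the paper's simulation environment):

```python
class PID:
    """Minimal PID controller; gains are illustrative, not tuned values
    from the paper."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, setpoint, measurement):
        err = setpoint - measurement
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

# Toy single-axis attitude-rate plant: omega' = torque / inertia.
dt, inertia = 0.01, 0.05
pid = PID(kp=0.8, ki=0.4, kd=0.02, dt=dt)
omega, target = 0.0, 1.0  # current and commanded body rate (rad/s)
for _ in range(1000):
    torque = pid.step(target, omega)
    omega += dt * torque / inertia
```

The RL controllers studied in the paper replace this fixed-gain law with a learned policy mapping attitude error to actuator commands, trading tuning simplicity for adaptability.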

    Issues on Stability of ADP Feedback Controllers for Dynamical Systems

    This paper traces the development of neural-network (NN)-based feedback controllers derived from the principle of adaptive/approximate dynamic programming (ADP) and discusses their closed-loop stability. Different NN structures in the literature, which embed mathematical mappings related to solutions of ADP-formulated problems and are called “adaptive critic” or “actor-critic” networks, are discussed. The distinction between the two classes of ADP applications is pointed out. Furthermore, papers on “model-free” development and on model-based neurocontrollers are reviewed in terms of their contributions to stability issues. Recent literature suggests that work on ADP-based feedback controllers with assured stability is growing in diverse forms.
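The adaptive-critic structure can be illustrated with policy iteration on a hypothetical scalar LQR problem: a critic step evaluates the current policy's value function, and an actor step greedily improves the policy (all numbers are invented; practical ADP controllers perform both steps from data or NN approximations rather than from a known model):

```python
# Toy scalar plant x_{k+1} = a x_k + b u_k, cost sum of q x^2 + r u^2.
a, b, q, r = 2.0, 1.0, 1.0, 1.0

def evaluate(K):
    """Critic step: closed-form policy evaluation for u = -K x,
    i.e. the scalar discrete Lyapunov equation."""
    acl = a - b * K
    return (q + r * K * K) / (1.0 - acl * acl)

K = 1.5  # any initially stabilizing gain (|a - b*K| < 1)
for _ in range(50):
    P = evaluate(K)                     # critic: value of current policy
    K = (b * P * a) / (r + b * b * P)   # actor: greedy improvement
```

At convergence `P` satisfies the discrete algebraic Riccati equation, so the actor-critic pair recovers the optimal controller; stability hinges on the initial policy being stabilizing, which is one of the issues the paper surveys.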

    Reinforcement Learning, Intelligent Control and their Applications in Connected and Autonomous Vehicles

    Reinforcement learning (RL) has attracted considerable attention over the past few years. Recently, we developed a data-driven algorithm to solve predictive cruise control (PCC) and game-theoretic output regulation problems. This work integrates our recent contributions to the application of RL in game theory, output regulation problems, robust control, small-gain theory, and PCC. The algorithm was developed for H∞ adaptive optimal output regulation of uncertain linear systems and uncertain partially linear systems, to reject disturbances and force the output of the systems to asymptotically track a reference. In the PCC problem, we determined the reference velocity for each autonomous vehicle in the platoon using traffic information broadcast by the lights, to reduce the vehicles' trip time. We then employed the algorithm to design an approximate optimal controller for the vehicles. This controller is able to regulate the headway, velocity, and acceleration of each vehicle to the desired values. Simulation results validate the effectiveness of the algorithms.