
    Event-triggered near optimal adaptive control of interconnected systems

    Increased interest in complex interconnected systems such as the smart grid and cyber manufacturing has attracted researchers to develop optimal adaptive control schemes that elicit a desired performance when the complex system dynamics are uncertain. In this dissertation, motivated by the fact that aperiodic event sampling saves network resources while ensuring system stability, a suite of novel event-sampled distributed near-optimal adaptive control schemes is introduced for uncertain linear and affine nonlinear interconnected systems in a forward-in-time and online manner. First, a novel stochastic hybrid Q-learning scheme is proposed to generate the optimal adaptive control law and to accelerate the learning process in the presence of random delays and packet losses caused by the communication network in an uncertain linear interconnected system. Subsequently, a novel online reinforcement learning (RL) approach is proposed to solve the Hamilton-Jacobi-Bellman (HJB) equation using neural networks (NNs), generating distributed optimal control of nonlinear interconnected systems with state and output feedback. To relax the need for state vector measurements, distributed observers are introduced. Next, using RL, an improved NN learning rule is derived to solve the HJB equation for uncertain nonlinear interconnected systems with event-triggered feedback. Distributed NN identifiers are introduced both to approximate the uncertain nonlinear dynamics and to serve as a model for online exploration. Finally, the control policy and the event-sampling errors are treated as non-cooperative players, and a min-max optimization problem is formulated for linear and affine nonlinear systems using a zero-sum game approach, optimizing the control policy and the event-based sampling instants simultaneously. The net result is the development of optimal adaptive event-triggered control of uncertain dynamic systems --Abstract, page iv
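    The event-sampling idea underlying these schemes can be illustrated with a minimal sketch (system matrices, gain, and trigger threshold are illustrative, not the dissertation's design): the state is transmitted to the controller only when the measurement error exceeds a state-dependent threshold, and a zero-order-hold control is applied between events.

```python
import numpy as np

def simulate_event_triggered(A, B, K, x0, sigma=0.1, steps=50):
    """Simulate x_{k+1} = A x_k + B u_k with u_k = -K x_hat_k, where the
    held state x_hat is refreshed only at event instants, triggered when
    the measurement error exceeds a relative threshold:
        ||x - x_hat|| > sigma * ||x||."""
    x, x_hat = np.array(x0, float), np.array(x0, float)
    events = 0
    for _ in range(steps):
        if np.linalg.norm(x - x_hat) > sigma * np.linalg.norm(x):
            x_hat = x.copy()   # event: transmit the current state
            events += 1
        u = -K @ x_hat         # zero-order-hold control between events
        x = A @ x + B @ u
    return x, events
```

With a stabilizing gain, the loop still converges while transmitting the state at only a fraction of the sampling instants, which is the resource saving the abstract refers to.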

    Online optimal and adaptive integral tracking control for varying discrete‐time systems using reinforcement learning

    The conventional closed‐form solution to the optimal control problem via optimal control theory is available only when known system dynamics/models, described as differential equations, are assumed. Without such models, reinforcement learning (RL) has been successfully applied as a candidate technique to iteratively solve the optimal control problem for unknown or varying systems. For the optimal tracking control problem, existing RL techniques in the literature assume either a predetermined feedforward input for the tracking control, restrictive assumptions on the reference model dynamics, or discounted tracking costs; moreover, with discounted tracking costs, zero steady‐state error cannot be guaranteed by the existing RL methods. This article therefore presents an online optimal RL tracking control framework for discrete‐time (DT) systems that does not impose the restrictive assumptions of the existing methods and still guarantees zero steady‐state tracking error. This is achieved by augmenting the original system dynamics with the integral of the error between the reference inputs and the tracked outputs for use in the online RL framework. It is further shown that the resulting value function for the DT linear quadratic tracker under the augmented formulation with integral control is also quadratic. This enables the development of Bellman equations that use only system measurements to solve the corresponding DT algebraic Riccati equation and obtain the optimal tracking control inputs online. Two RL strategies are then proposed, based on value function approximation and on Q‐learning, along with bounds on excitation for the convergence of the parameter estimates. Simulation case studies show the effectiveness of the proposed approach
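    The augmented formulation can be sketched numerically (scalar plant and weights chosen purely for illustration; the article's RL schemes estimate the same quantities from measured data rather than from the model, which is used here as a stand-in):

```python
import numpy as np

# Illustrative scalar DT plant (values assumed for this sketch).
A, B, C = np.array([[0.9]]), np.array([[1.0]]), np.array([[1.0]])

# Augment with the integral of the tracking error: z = [x; q],
# q_{k+1} = q_k + (r_k - C x_k). The LQ cost on z needs no discounting,
# and the value function V(z) = z' P z remains quadratic.
Az = np.block([[A, np.zeros((1, 1))], [-C, np.eye(1)]])
Bz = np.vstack([B, np.zeros((1, 1))])
Q, R = np.diag([1.0, 10.0]), np.array([[1.0]])

# Model-based stand-in for the data-driven Bellman recursions:
# fixed-point iteration on the DT algebraic Riccati equation.
P = np.eye(2)
for _ in range(500):
    S = R + Bz.T @ P @ Bz
    P = Q + Az.T @ P @ Az - Az.T @ P @ Bz @ np.linalg.inv(S) @ Bz.T @ P @ Az
K = np.linalg.inv(R + Bz.T @ P @ Bz) @ Bz.T @ P @ Az

# Track a constant reference r = 1: the integrator drives the
# steady-state error to zero without any feedforward term.
z, r = np.zeros((2, 1)), 1.0
for _ in range(300):
    z = Az @ z + Bz @ (-K @ z) + np.array([[0.0], [r]])
```

At equilibrium the integrator row forces C x = r, which is the zero steady-state error property the augmented formulation is claimed to guarantee.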

    Suboptimal Safety-Critical Control for Continuous Systems Using Prediction-Correction Online Optimization

    This paper investigates control barrier function (CBF) based safety-critical control for continuous nonlinear control-affine systems using more efficient online algorithms derived from time-varying optimization methods. The idea is that when the quadratic programming (QP) or other convex optimization required by the CBF-based method is not computationally affordable, alternative suboptimal feasible solutions can be obtained more economically. Using the barrier-based interior-point method, the constrained CBF-QP problems are transformed into unconstrained ones whose suboptimal solutions are tracked by two continuous descent-based algorithms. To compensate for the lag inherent in tracking, a prediction step that exploits system information is added to the algorithms, achieving exponential convergence to the time-varying suboptimal solutions. The convergence and robustness of the designed methods, as well as the safety criteria of the algorithms, are studied theoretically. The effectiveness is illustrated by simulations on anti-swing and obstacle avoidance tasks
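    The barrier-transformation step can be sketched on a one-dimensional toy problem with a single affine safety constraint (the constants a, b, the barrier weight rho, and the plain gradient descent used here are all illustrative stand-ins for the paper's continuous descent dynamics):

```python
import numpy as np

def cbf_barrier_filter(u_nom, a, b, rho=10.0, lr=0.01, iters=2000):
    """Suboptimal safety filter sketch: replace the CBF-QP
        min_u (u - u_nom)^2   s.t.  a*u + b >= 0
    by the unconstrained log-barrier problem
        min_u (u - u_nom)^2 - (1/rho) * log(a*u + b)
    and track its minimizer by gradient descent."""
    # Start strictly inside the safe set (interior-point requirement).
    u = u_nom if a * u_nom + b > 0 else -b / a + 0.1 * np.sign(a)
    for _ in range(iters):
        grad = 2 * (u - u_nom) - a / (rho * (a * u + b))
        u -= lr * grad  # the barrier term repels u from the boundary
    return u
```

For an unsafe nominal input the filter returns a value just inside the safe set, close to (but not exactly at) the exact QP solution, which is the suboptimality trade-off the abstract describes.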

    Stochastic optimal adaptive controller and communication protocol design for networked control systems

    Networked control systems (NCS) are a recent topic of research wherein the feedback control loops are closed through a real-time communication network. Many design challenges surface in such systems due to network imperfections such as random delays, packet losses, and quantization effects. Since existing control techniques are unsuitable for such systems, in this dissertation a suite of novel stochastic optimal adaptive design methodologies is undertaken for both linear and nonlinear NCS in the presence of uncertain system dynamics and unknown network imperfections such as network-induced delays and packet losses. The design is introduced in five papers. In Paper 1, a stochastic optimal adaptive control design is developed for unknown linear NCS with uncertain system dynamics and unknown network imperfections. A value function is adjusted forward-in-time and online, and a novel update law is proposed for tuning the value function estimator parameters. Additionally, using the estimated value function, an optimal adaptive control law is derived based on the adaptive dynamic programming technique. This design methodology is subsequently extended in Paper 2 to solve stochastic optimal strategies of linear NCS zero-sum games. Since most systems are inherently nonlinear, a novel stochastic optimal adaptive control scheme is then developed in Paper 3 for nonlinear NCS with unknown network imperfections. In Paper 4, the network protocol behavior (e.g., TCP and UDP) is considered and the optimal adaptive control design is revisited using output feedback for linear NCS. Finally, Paper 5 explores a co-design framework in which the controller and the network scheduling protocol are designed jointly so that the proposed scheme can be implemented in next-generation cyber-physical systems --Abstract, page iv

    Interaction dynamics and autonomy in cognitive systems

    The concept of autonomy is of crucial importance for understanding life and cognition. Whereas cellular and organismic autonomy is grounded in the self-production of the material infrastructure sustaining the existence of living beings as such, we are interested in how biological autonomy can be expanded into forms of autonomous agency, where autonomy as a form of organization is extended into the behaviour of an agent in interaction with its environment (and not its material self-production). In this thesis, we focus on the development of operational models of sensorimotor agency, exploring the construction of a domain of interactions creating a dynamical interface between agent and environment. We present two main contributions to the study of autonomous agency. First, we contribute to the development of a modelling route for testing, comparing, and validating hypotheses about neurocognitive autonomy. Through the design and analysis of specific neurodynamical models embedded in robotic agents, we explore how an agent is constituted in a sensorimotor space as an autonomous entity able to adaptively sustain its own organization. Using two simulation models together with dynamical analyses and measurements of complex patterns in their behaviour, we tackle some theoretical obstacles to understanding sensorimotor autonomy and generate new predictions about the nature of autonomous agency in the neurocognitive domain. Second, we explore the extension of sensorimotor forms of autonomy into the social realm. We analyse two cases from an experimental perspective: the constitution of a collective subject in a sensorimotor social interactive task, and the emergence of an autonomous social identity in a large-scale technologically-mediated social system. Through the analysis of coordination mechanisms and emergent complex patterns, we gather experimental evidence indicating that in some cases social autonomy may emerge from mechanisms of coordinated sensorimotor activity and interaction, constituting forms of collective autonomous agency

    Human optional stopping in a heteroscedastic world

    When making decisions, animals must trade off the benefits of information harvesting against the opportunity cost of prolonged deliberation. Deciding when to stop accumulating information and commit to a choice is challenging in natural environments, where the reliability of decision-relevant information may itself vary unpredictably over time (variable variance or "heteroscedasticity"). We asked humans to perform a categorization task in which discrete, continuously valued samples (oriented gratings) arrived in series until the observer made a choice. Human behavior was best described by a model that adaptively weighted sensory signals by their inverse prediction error and integrated the resulting quantities with a linear urgency signal to a decision threshold. This model approximated the output of a Bayesian model that computed the full posterior probability of a correct response, and successfully predicted adaptive weighting of decision information in neural signals. Adaptive weighting of decision information may have evolved to promote optional stopping in heteroscedastic natural environments. (PsycInfo Database Record (c) 2021 APA, all rights reserved)
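    The winning model's ingredients can be sketched in a toy accumulator (all parameter values and the running-estimate prediction rule are illustrative, not the paper's fitted model): each sample is weighted by its inverse squared prediction error, the weighted evidence is accumulated, and a linearly growing urgency signal lowers the effective distance to the decision threshold.

```python
import numpy as np

def decide(samples, urgency_rate=0.05, threshold=3.0, eps=1e-6):
    """Toy sketch of adaptive-gain evidence accumulation with urgency:
    surprising samples (large prediction error) are down-weighted,
    and a linear urgency term promotes stopping as time passes.
    Returns (choice sign, stopping time)."""
    evidence, prediction = 0.0, 0.0
    for t, s in enumerate(samples, start=1):
        w = 1.0 / (eps + (s - prediction) ** 2)  # inverse-prediction-error gain
        evidence += w * s
        prediction = evidence / t                # crude running prediction
        if abs(evidence) + urgency_rate * t >= threshold:
            return np.sign(evidence), t          # commit to a choice
    return np.sign(evidence), len(samples)       # forced response at the end
```

Consistent samples drive a quick commitment, while the urgency term guarantees the model eventually stops even under noisy, heteroscedastic input.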

    Observer-based event-triggered and set-theoretic neuro-adaptive controls for constrained uncertain systems

    In this study, several new observer-based event-triggered and set-theoretic control schemes are presented to advance the state of the art in neuro-adaptive controls. In the first part, six new event-triggered neuro-adaptive control (ETNAC) schemes are presented for uncertain linear systems. These comprehensive designs offer the flexibility to choose a design depending upon system performance requirements. Stability proofs for each scheme are presented, and their performance is analyzed using benchmark examples. In the second part, the scope of the ETNAC is extended to uncertain nonlinear systems. It is applied to a case of precision formation flight of microsatellites at the Sun-Earth/Moon L1 libration point. This dynamic system is selected to evaluate the performance of the ETNAC techniques in a setting that is highly nonlinear and chaotic in nature. Moreover, factors such as restricted controls, response to uncertainties, and jittering make the controller design even trickier for maintaining tight formation precision. Lyapunov function-based stability analysis and numerical results are presented. Note that most real-world systems involve constraints due to hardware limitations, as well as disturbances, uncertainties, and nonlinearities, and cannot always be efficiently controlled using linearized models. To address all these issues simultaneously, a barrier Lyapunov function-based control architecture called the segregated prescribed performance guaranteeing neuro-adaptive control is developed and tested for constrained uncertain nonlinear systems in the third part. It guarantees strict performance bounds that can be independently prescribed for each individual state and/or error signal of the given system. Furthermore, the proposed technique can identify unknown dynamics/uncertainties online and provides a way to regulate the control input --Abstract, page iv
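    The barrier Lyapunov function idea behind the prescribed-performance design can be illustrated with one common log-type form (this specific form is a standard textbook choice used here for illustration; the dissertation's segregated architecture assigns one such barrier per state/error signal):

```python
import numpy as np

def blf(e, b):
    """Log-type barrier Lyapunov function V(e) = 0.5*ln(b^2 / (b^2 - e^2)):
    zero at e = 0, finite while |e| < b, and unbounded as |e| -> b.
    Keeping V bounded along trajectories therefore confines the error
    to its prescribed envelope |e| < b."""
    assert abs(e) < b, "error must lie strictly inside the prescribed bound"
    return 0.5 * np.log(b**2 / (b**2 - e**2))
```

Because V blows up at the envelope boundary, any controller that renders V non-increasing automatically enforces the prescribed performance bound on that error signal.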

    Adaptive and learning-based formation control of swarm robots

    Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation control can be performed by swarm robots with limited communication and perception (e.g., the Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between humans and swarm robots (e.g., the BristleBot) for artistic creation. In particular, we combine bio-inspired techniques (i.e., flocking and foraging) with learning-based control strategies (using artificial neural networks) for adaptive control of multiple robots. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We then present a novel flocking control for a UAV swarm using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP) and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy and each UAV performs actions based on the local information it collects. In addition, to avoid collisions among UAVs and to guarantee flocking and navigation, a reward function is designed that combines a global flocking-maintenance term, a mutual reward, and a collision penalty. We adapt the deep deterministic policy gradient (DDPG) algorithm with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in the arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walks to control the communication among a team of robots with swarming behavior for musical creation
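    The kind of shaped reward described above can be sketched for a single UAV (the navigation/cohesion/penalty structure follows the abstract, but the weights, safety radius, and specific distance terms are illustrative, not the thesis's actual coefficients):

```python
import numpy as np

def flocking_reward(positions, i, target, d_safe=0.5, w_nav=1.0,
                    w_flock=0.5, collision_penalty=10.0):
    """Toy shaped reward for UAV i: a navigation term toward the target
    (leader/goal tracking), a mutual cohesion term penalizing spread
    from the flock, and a hard penalty on near-collisions."""
    p = positions[i]
    others = np.delete(positions, i, axis=0)
    dists = np.linalg.norm(others - p, axis=1)
    r = -w_nav * np.linalg.norm(p - target)  # navigation toward the goal
    r -= w_flock * np.mean(dists)            # cohesion: stay with the flock
    if np.min(dists) < d_safe:
        r -= collision_penalty               # collision-avoidance penalty
    return r
```

Within a DDPG-style training loop, each UAV would receive this scalar at every step, so the learned policy trades off goal progress and cohesion against the large penalty for closing below the safety radius.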