
    Control of Complex Dynamic Systems by Neural Networks

    This paper considers the use of neural networks (NNs) in controlling a nonlinear, stochastic system with unknown process equations. The NN is used to model the resulting unknown control law. The approach is based on using the output error of the system to train the NN controller, without the need to construct a separate model (NN or otherwise) of the unknown process dynamics. Implementing such a direct adaptive control approach requires estimating the connection weights of the NN while the system is being controlled. Because the unknown process dynamics enter the feedback loop, however, the gradient of the loss function cannot be computed for use in standard (back-propagation-type) weight-estimation algorithms. This paper therefore considers a new stochastic approximation algorithm for the weight estimation, based on a "simultaneous perturbation" gradient approximation that requires only the system output error. It is shown that this algorithm can greatly improve efficiency over more standard stochastic approximation algorithms based on finite-difference gradient approximations.
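    The appeal of the simultaneous perturbation idea is that a full gradient estimate costs only two loss evaluations regardless of the number of weights, whereas finite differences cost two per weight. Below is a minimal sketch in Python; the toy loss, gain sequences a_k and c_k, and dimensions are illustrative assumptions, not taken from the paper (the gain-sequence exponents follow common SPSA practice).

        import numpy as np

        def spsa_step(w, loss, a_k, c_k, rng):
            """One SPSA update: estimate the gradient from two noisy loss evaluations."""
            delta = rng.choice([-1.0, 1.0], size=w.shape)  # Rademacher perturbation
            g_hat = (loss(w + c_k * delta) - loss(w - c_k * delta)) / (2.0 * c_k * delta)
            return w - a_k * g_hat  # stochastic gradient-descent step

        rng = np.random.default_rng(0)
        w = np.zeros(5)
        loss = lambda v: np.sum((v - 1.0) ** 2) + rng.normal(scale=0.01)  # noisy toy loss
        for k in range(1, 1001):
            w = spsa_step(w, loss, a_k=0.1 / k ** 0.602, c_k=0.1 / k ** 0.101, rng=rng)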

    Active Classification for POMDPs: a Kalman-like State Estimator

    The problem of state tracking with active observation control is considered for a system modeled by a discrete-time, finite-state Markov chain observed through conditionally Gaussian measurement vectors. The measurement statistics are shaped by the underlying state and by an exogenous control input that influences the quality of the observations. Exploiting an innovations approach, an approximate minimum mean-squared error (MMSE) filter is derived to estimate the Markov chain state. To optimize the control strategy, the associated mean-squared error is used as the optimization criterion in a partially observable Markov decision process formulation, and a stochastic dynamic programming algorithm is proposed to compute the optimal policy. To further improve the quality of the state estimates, approximate MMSE smoothing estimators are also derived. Finally, the performance of the proposed framework is illustrated on the problem of physical activity detection in wireless body sensing networks. The power of the framework lies in its ability to accommodate a broad spectrum of active classification applications, including sensor management for object classification and tracking, estimation of sparse signals, and radar scheduling.
    Comment: 38 pages, 6 figures
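    For intuition, filtering a finite-state Markov chain observed in Gaussian noise follows the standard forward recursion, with the MMSE state estimate given by the posterior mean over states. The sketch below uses assumed toy parameters and omits the observation-control input that the paper's filter accommodates.

        import numpy as np

        def hmm_filter(y, P, means, var, pi0):
            """Filtered posteriors p(x_k | y_1..k) for scalar Gaussian observations."""
            belief = pi0.copy()
            posts = []
            for yk in y:
                lik = np.exp(-(yk - means) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
                belief = lik * (P.T @ belief)  # predict with P, correct with likelihood
                belief /= belief.sum()
                posts.append(belief.copy())
            return np.array(posts)

        P = np.array([[0.9, 0.1], [0.2, 0.8]])  # assumed transition matrix
        posts = hmm_filter(np.array([0.1, 1.2, 0.9]), P, means=np.array([0.0, 1.0]),
                           var=0.25, pi0=np.array([0.5, 0.5]))
        mmse = posts @ np.array([0.0, 1.0])  # MMSE estimate = posterior mean over states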

    Actor-Critic Reinforcement Learning for Control with Stability Guarantee

    Reinforcement Learning (RL) and its integration with deep learning have achieved impressive performance in various robotic control tasks, ranging from motion planning and navigation to end-to-end visual manipulation. However, model-free RL that relies solely on data cannot guarantee stability. From a control-theoretic perspective, stability is the most important property of any control system, since it is closely related to the safety, robustness, and reliability of robotic systems. In this paper, we propose an actor-critic RL framework for control that guarantees closed-loop stability by employing the classic Lyapunov method from control theory. First, a data-based stability theorem is proposed for stochastic nonlinear systems modeled by a Markov decision process. We then show that the stability condition can be exploited as the critic in actor-critic RL to learn a controller/policy. Finally, the effectiveness of our approach is evaluated on several well-known 3-dimensional robot control tasks and a synthetic-biology gene-network tracking task, in three popular physics simulation platforms. As an empirical evaluation of the advantage of stability, we show that the learned policies enable the systems to recover, to a certain extent, to the equilibrium or way-points when perturbed by uncertainties such as parametric variations and external disturbances.
    Comment: IEEE RA-L + IROS 202
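    One common way to turn a Lyapunov condition into a learning signal is to penalize violations of an expected decrease condition along sampled transitions. The PyTorch sketch below is a minimal illustration under assumed dimensions, a fixed penalty weight, and a Lagrangian-style surrogate; it is not the authors' exact architecture or training procedure.

        import torch

        state_dim, act_dim, alpha, lam = 4, 2, 0.1, 1.0  # assumed toy values
        # Lyapunov critic L_c(s, a) and a deterministic policy pi(s)
        L_c = torch.nn.Sequential(torch.nn.Linear(state_dim + act_dim, 64),
                                  torch.nn.ReLU(), torch.nn.Linear(64, 1))
        pi = torch.nn.Sequential(torch.nn.Linear(state_dim, 64), torch.nn.ReLU(),
                                 torch.nn.Linear(64, act_dim), torch.nn.Tanh())
        opt = torch.optim.Adam(pi.parameters(), lr=3e-4)

        def policy_loss(s, a, s_next):
            """Penalize violations of E[L_c(s', pi(s')) - L_c(s, a)] <= -alpha * L_c(s, a)."""
            lc = L_c(torch.cat([s, a], dim=-1))
            lc_next = L_c(torch.cat([s_next, pi(s_next)], dim=-1))  # grads flow via pi
            return (lam * torch.relu(lc_next - lc + alpha * lc)).mean()

        s, a, s_next = torch.randn(32, 4), torch.randn(32, 2), torch.randn(32, 4)
        opt.zero_grad(); policy_loss(s, a, s_next).backward(); opt.step()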

    Analytic Regularity and GPC Approximation for Control Problems Constrained by Linear Parametric Elliptic and Parabolic PDEs

    This paper deals with linear-quadratic optimal control problems constrained by a parametric or stochastic elliptic or parabolic PDE. We address the (difficult) case in which the state equation depends on a countable number of parameters, i.e., on $\sigma_j$ with $j \in \mathbb{N}$, and in which the PDE operator may depend non-affinely on the parameters. We consider tracking-type functionals and both distributed and boundary controls. Building on recent results in [CDS1, CDS2], we show that the state and the control are analytic as functions of these parameters $\sigma_j$. We establish sparsity of generalized polynomial chaos (gpc) expansions of both state and control in terms of the stochastic coordinate sequence $\sigma = (\sigma_j)_{j \ge 1}$ of the random inputs, and prove convergence rates of best $N$-term truncations of these expansions. Such truncations are the key to subsequent computations, since they do \emph{not} assume that the stochastic input data have a finite expansion. In the follow-up paper [KS2], we describe two methods by which such best $N$-term truncations can be computed in practice: greedy-type algorithms as in [SG, Gi1], or multilevel Monte Carlo methods as in [KSS]. In conjunction with the adaptive wavelet Galerkin schemes developed in [DK, GK, K], the sparsity result allows for sparse, adaptive tensor discretizations of control problems constrained by linear elliptic and parabolic PDEs; see [KS2].
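    Schematically, a gpc expansion and its best $N$-term truncation take the following form; the notation is the standard one for this literature, and the rate $s$ (determined by the summability of the parametric inputs) is stated only generically here, since the paper's precise rates depend on its assumptions.

        \[
          u(\sigma) = \sum_{\nu \in \mathcal{F}} u_\nu \, \sigma^\nu,
          \qquad \sigma^\nu := \prod_{j \ge 1} \sigma_j^{\nu_j},
        \]
        \[
          \Big\| u - \sum_{\nu \in \Lambda_N} u_\nu \, \sigma^\nu \Big\| \le C \, N^{-s},
        \]

    where $\Lambda_N$ collects the $N$ indices $\nu$ with the largest coefficient norms $\|u_\nu\|$.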