9,534 research outputs found

    A review on analysis and synthesis of nonlinear stochastic systems with randomly occurring incomplete information

    Copyright © 2012 Hongli Dong et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. In the context of systems and control, incomplete information refers to a dynamical system in which knowledge about the system states is limited due to the difficulties in modeling complexity in a quantitative way. The well-known types of incomplete information include parameter uncertainties and norm-bounded nonlinearities. Recently, in response to the development of network technologies, the phenomenon of randomly occurring incomplete information has become increasingly prevalent. Such a phenomenon typically appears in a networked environment. Examples include, but are not limited to, randomly occurring uncertainties, randomly occurring nonlinearities, randomly occurring saturation, randomly missing measurements and randomly occurring quantization. Randomly occurring incomplete information, if not properly handled, would seriously deteriorate the performance of a control system. In this paper, we aim to survey some recent advances on the analysis and synthesis problems for nonlinear stochastic systems with randomly occurring incomplete information. The developments of the filtering, control and fault detection problems are systematically reviewed. Latest results on analysis and synthesis of nonlinear stochastic systems are discussed in great detail. In addition, various distributed filtering technologies over sensor networks are highlighted. Finally, some concluding remarks are given and some possible future research directions are pointed out.
This work was supported in part by the National Natural Science Foundation of China under Grants 61273156, 61134009, 61273201, 61021002, and 61004067, the Engineering and Physical Sciences Research Council (EPSRC) of the UK under Grant GR/S27658/01, the Royal Society of the UK, the National Science Foundation of the USA under Grant No. HRD-1137732, and the Alexander von Humboldt Foundation of Germany.
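
As a rough illustration of the "randomly missing measurements" mentioned in the abstract above, the sketch below simulates a scalar system whose sensor reading arrives only with some probability at each step. The Bernoulli arrival model, the function name `simulate_missing_measurements`, and all parameter values are assumptions chosen for illustration; they are not taken from the paper.

```python
import random


def simulate_missing_measurements(steps, arrival_prob, a=0.9, c=1.0,
                                  noise_std=0.1, seed=0):
    """Simulate x[k+1] = a*x[k] + w[k] with y[k] = gamma[k]*c*x[k] + v[k].

    gamma[k] is a Bernoulli variable (assumed model): 1 means the
    measurement arrived, 0 means it was lost and only noise is observed.
    """
    rng = random.Random(seed)
    x = 1.0  # hypothetical initial state
    records = []
    for _ in range(steps):
        gamma = 1 if rng.random() < arrival_prob else 0
        y = gamma * c * x + rng.gauss(0.0, noise_std)
        records.append((x, y, gamma))
        x = a * x + rng.gauss(0.0, noise_std)
    return records


if __name__ == "__main__":
    records = simulate_missing_measurements(steps=2000, arrival_prob=0.7)
    arrival_rate = sum(gamma for _, _, gamma in records) / len(records)
    print(f"empirical arrival rate: {arrival_rate:.3f}")
```

A filter designed for such a system would have to account for the steps where `gamma` is 0, since the output then carries no information about the state.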

    Agile Autonomous Driving using End-to-End Deep Imitation Learning

    We present an end-to-end imitation learning system for agile, off-road autonomous driving using only low-cost sensors. By imitating a model predictive controller equipped with advanced sensors, we train a deep neural network control policy to map raw, high-dimensional observations to continuous steering and throttle commands. Compared with recent approaches to similar tasks, our method requires neither state estimation nor on-the-fly planning to navigate the vehicle. Our approach relies on, and experimentally validates, recent imitation learning theory. Empirically, we show that policies trained with online imitation learning overcome well-known challenges related to covariate shift and generalize better than policies trained with batch imitation learning. Built on these insights, our autonomous driving system demonstrates successful high-speed off-road driving, matching the state-of-the-art performance. Comment: 13 pages, Robotics: Science and Systems (RSS) 201
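
The online imitation learning described in this abstract can be pictured with a toy loop in the style of dataset aggregation: the learner drives, the expert labels the states the learner actually visited, and the policy is refit on all data collected so far. The sketch below is only that general pattern on a one-dimensional linear system. The expert gain, the dynamics, and every function name are hypothetical, and nothing here reproduces the paper's neural network policy or its model predictive controller.

```python
import random


def expert_action(x):
    """Hypothetical expert: a fixed linear feedback law."""
    return -0.5 * x


def fit_gain(data):
    """Least-squares fit of u = k * x through the origin."""
    numerator = sum(x * u for x, u in data)
    denominator = sum(x * x for x, _ in data)
    return numerator / denominator if denominator > 0 else 0.0


def rollout(gain, steps, rng):
    """Run the learner's own policy and return the states it visits."""
    x = rng.uniform(-1.0, 1.0)
    states = []
    for _ in range(steps):
        states.append(x)
        x = x + gain * x + rng.gauss(0.0, 0.05)  # assumed toy dynamics
    return states


def online_imitation(iterations=5, steps=20, seed=0):
    """Aggregate expert labels on learner-visited states, then refit."""
    rng = random.Random(seed)
    data = []
    gain = 0.0  # untrained learner
    for _ in range(iterations):
        states = rollout(gain, steps, rng)
        data.extend((x, expert_action(x)) for x in states)
        gain = fit_gain(data)
    return gain
```

Batch imitation would instead fit once on states visited by the expert alone; the online loop differs in that training states come from the learner's own rollouts, which is the setting where covariate shift is addressed.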

    Optimal control of nonlinear partially-unknown systems with unsymmetrical input constraints and its applications to the optimal UAV circumnavigation problem

    Aimed at solving the optimal control problem for nonlinear systems with unsymmetrical input constraints, we present an online adaptive approach for partially unknown control systems/dynamics. The designed algorithm converges online to the optimal control solution without knowledge of the internal system dynamics. The optimality of the obtained control policy and the stability for the closed-loop dynamic optimality are proved theoretically. The proposed method greatly relaxes the assumption on the form of the internal dynamics and input constraints in previous works. Besides, the control design framework proposed in this paper offers a new approach to solve the optimal circumnavigation problem involving a moving target for a fixed-wing unmanned aerial vehicle (UAV). The control performance of our method is compared with that of the existing circumnavigation control law in a numerical simulation, and the simulation results validate the effectiveness of our algorithm.
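
To make "unsymmetrical input constraints" concrete: the admissible input lies in an interval whose lower and upper bounds are not mirror images, for example between -1 and 3. One common way to respect such bounds is to pass an unconstrained signal through a shifted and scaled tanh. The sketch below shows only that generic device; the function name `squash_to_bounds` and the numeric bounds are assumptions, and the paper's own construction may differ.

```python
import math


def squash_to_bounds(v, u_min, u_max):
    """Map an unconstrained value v into the interval [u_min, u_max].

    The interval need not be symmetric about zero: the midpoint shifts
    the tanh curve and the half-width scales it.
    """
    if u_min >= u_max:
        raise ValueError("u_min must be strictly less than u_max")
    center = 0.5 * (u_max + u_min)
    half_width = 0.5 * (u_max - u_min)
    return center + half_width * math.tanh(v)


if __name__ == "__main__":
    for v in (-5.0, 0.0, 5.0):
        print(v, squash_to_bounds(v, u_min=-1.0, u_max=3.0))
```

With symmetric bounds the midpoint is zero and the mapping reduces to the usual scaled tanh saturation.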

    Certified Reinforcement Learning with Logic Guidance

    This paper proposes the first model-free Reinforcement Learning (RL) framework to synthesise policies for unknown, continuous-state Markov Decision Processes (MDPs), such that a given linear temporal property is satisfied. We convert the given property into a Limit Deterministic Büchi Automaton (LDBA), namely a finite-state machine expressing the property. Exploiting the structure of the LDBA, we shape a synchronous reward function on-the-fly, so that an RL algorithm can synthesise a policy resulting in traces that probabilistically satisfy the linear temporal property. This probability (certificate) is also calculated in parallel with policy learning when the state space of the MDP is finite: as such, the RL algorithm produces a policy that is certified with respect to the property. Under the assumption of finite state space, theoretical guarantees are provided on the convergence of the RL algorithm to an optimal policy, maximising the above probability. We also show that our method produces "best available" control policies when the logical property cannot be satisfied. In the general case of a continuous state space, we propose a neural network architecture for RL and we empirically show that the algorithm finds satisfying policies, if such policies exist. The performance of the proposed framework is evaluated via a set of numerical examples and benchmarks, where we observe an improvement of one order of magnitude in the number of iterations required for the policy synthesis, compared to existing approaches whenever available. Comment: This article draws from arXiv:1801.08099, arXiv:1809.0782
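
The idea of shaping a reward from an automaton while the agent runs can be sketched with a much simpler stand-in than an LDBA. Below, a small deterministic automaton tracks the hypothetical property "eventually see label a, and after that eventually see label b", and a reward of 1 is issued on the step where the automaton first enters its accepting state. The transition table, the labels, and the reward rule are all illustrative assumptions; the paper's construction handles Büchi acceptance, which this reachability-style sketch does not.

```python
# Hypothetical automaton: state 0 waits for "a", state 1 waits for "b",
# state 2 is accepting. Any unlisted (state, label) pair is a self-loop.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 2,
}
ACCEPTING = {2}


def step_automaton(state, label):
    """Advance the automaton by one observed label."""
    return TRANSITIONS.get((state, label), state)


def shaped_rewards(labels):
    """Run the automaton alongside a trace of labels.

    Returns the per-step rewards and the final automaton state. The
    reward is 1.0 only on the step that first reaches an accepting state.
    """
    state = 0
    rewards = []
    for label in labels:
        next_state = step_automaton(state, label)
        entered = next_state in ACCEPTING and state not in ACCEPTING
        rewards.append(1.0 if entered else 0.0)
        state = next_state
    return rewards, state
```

In an RL loop the automaton state would be appended to the environment state, so that the learned policy can depend on how much of the property has been satisfied so far.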