770 research outputs found

    Estimation and stability of nonlinear control systems under intermittent information with applications to multi-agent robotics

    This dissertation investigates the role of intermittent information in estimation and control problems and applies the obtained results to multi-agent tasks in robotics. First, we develop a stochastic hybrid model of mobile networks able to capture a large variety of heterogeneous multi-agent problems and phenomena. This model is applied to a case study where a heterogeneous mobile sensor network cooperatively detects and tracks mobile targets based on intermittent observations. When these observations form a satisfactory target trajectory, a mobile sensor is switched to the pursuit mode and deployed to capture the target. The cost of operating the sensors is determined from the geometric properties of the network, environment and probability of target detection. The above case study is motivated by the Marco Polo game played by children in swimming pools. Second, we develop adaptive sampling of target positions in order to minimize energy consumption while satisfying performance guarantees such as increased probability of detection over time and no-escape conditions. A parsimonious predictor-corrector tracking filter that uses geometrical properties of targets' tracks to estimate their positions from imperfect and intermittent measurements is presented. It is shown that this filter requires substantially less information and processing power than the Unscented Kalman Filter and Sampling Importance Resampling Particle Filter, while providing comparable estimation performance in the presence of intermittent information. Third, we investigate stability of nonlinear control systems under intermittent information. We replace the traditional periodic paradigm, where the up-to-date information is transmitted and control laws are executed in a periodic fashion, with the event-triggered paradigm. Building on the small gain theorem, we develop input-output triggered control algorithms yielding stable closed-loop systems. 
In other words, based on the currently available (but outdated) measurements of the outputs and external inputs of a plant, a mechanism triggering when to obtain new measurements and update the control inputs is provided. Depending on the noise environment, the developed algorithm yields stable, asymptotically stable, and Lp-stable (with bias) closed-loop systems. Control loops are modeled as interconnections of hybrid systems for which novel results on Lp-stability are presented. Prediction of a triggering event is achieved by employing Lp-gains over a finite horizon in the small gain theorem. By resorting to convex programming, a method to compute Lp-gains over a finite horizon is devised. Next, we investigate optimal intermittent feedback for nonlinear control systems. Using the currently available measurements from a plant, we develop a methodology that outputs when to update the control law with new measurements such that a given cost function is minimized. Our cost function captures trade-offs between the performance and energy consumption of the control system. The optimization problem is formulated as a Dynamic Programming problem, and Approximate Dynamic Programming is employed to solve it. Instead of advocating a particular approximation architecture for Approximate Dynamic Programming, we formulate properties that successful approximation architectures satisfy. In addition, we consider problems with partially observable states, and propose Particle Filtering to deal with partially observable states and intermittent feedback. Finally, we investigate a decentralized output synchronization problem of heterogeneous linear systems. We develop a self-triggered output broadcasting policy for the interconnected systems. Broadcasting time instants adapt to the current communication topology. For a fixed topology, our broadcasting policy yields global exponential output synchronization, and Lp-stable output synchronization in the presence of disturbances. 
Employing a converse Lyapunov theorem for impulsive systems, we provide an average dwell time condition that yields disturbance-to-state stable output synchronization in case of switching topology. Our approach is applicable to directed and unbalanced communication topologies.
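
The event-triggered paradigm described in this abstract can be illustrated with a toy simulation. The sketch below is not the dissertation's input-output triggering mechanism (which relies on Lp-gains and the small gain theorem). It assumes a scalar linear plant, a static feedback gain, and a simple relative-threshold trigger. The function name `simulate_event_triggered` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_event_triggered(a=1.0, b=1.0, k=3.0, sigma=0.2,
                             x0=1.0, dt=1e-3, t_end=5.0):
    """Euler simulation of dx/dt = a*x + b*u with u = -k * x_held.

    x_held is the last transmitted measurement. It is refreshed only when
    the measurement error |x - x_held| exceeds sigma * |x| (assumed rule).
    Returns the state trajectory and the number of transmissions.
    """
    steps = int(round(t_end / dt))
    x, x_held = x0, x0
    events = 1                      # the initial transmission at t = 0
    traj = np.empty(steps + 1)
    traj[0] = x
    for i in range(steps):
        # Trigger: transmit a fresh measurement only when the error is large.
        if abs(x - x_held) > sigma * abs(x):
            x_held = x
            events += 1
        u = -k * x_held             # control uses the (possibly outdated) sample
        x = x + dt * (a * x + b * u)
        traj[i + 1] = x
    return traj, events

traj, events = simulate_event_triggered()
print(f"final |x| = {abs(traj[-1]):.2e}, transmissions = {events} of {len(traj) - 1} steps")
```

With these assumed values the closed loop satisfies dx/dt <= (a - b*k + b*k*sigma) * |x| = -1.4 * |x| between transmissions, so the state decays even though the controller sees a new measurement at only a small fraction of the simulation steps.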

    Event-triggered robust control for multi-player nonzero-sum games with input constraints and mismatched uncertainties

    In this article, an event-triggered robust control (ETRC) method is investigated for multi-player nonzero-sum games of continuous-time input-constrained nonlinear systems with mismatched uncertainties. By constructing an auxiliary system and designing an appropriate value function, the robust control problem of input-constrained nonlinear systems is transformed into an optimal regulation problem. Then, a critic neural network (NN) is adopted to approximate the value function of each player for solving the event-triggered coupled Hamilton-Jacobi equation and obtaining control laws. Based on a designed event-triggering condition, control laws are updated only when events occur. Thus, both computational burden and communication bandwidth are reduced. Using Lyapunov's direct method, we prove that the weight approximation errors of the critic NNs and the closed-loop uncertain multi-player system states are all uniformly ultimately bounded. Finally, two examples are provided to demonstrate the effectiveness of the developed ETRC method.

    Machine Learning

    Machine learning can be defined in various ways, all relating to a scientific domain concerned with the design and development of theoretical and implementation tools for building systems that exhibit some human-like intelligent behavior. More specifically, machine learning addresses the ability to improve automatically through experience.

    Kernel-based approximate dynamic programming using Bellman residual elimination

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2010. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student submitted PDF version of thesis. Includes bibliographical references (p. 207-221). Many sequential decision-making problems related to multi-agent robotic systems can be naturally posed as Markov Decision Processes (MDPs). An important advantage of the MDP framework is the ability to utilize stochastic system models, thereby allowing the system to make sound decisions even if there is randomness in the system evolution over time. Unfortunately, the curse of dimensionality prevents most MDPs of practical size from being solved exactly. One main focus of the thesis is on the development of a new family of algorithms for computing approximate solutions to large-scale MDPs. Our algorithms are similar in spirit to Bellman residual methods, which attempt to minimize the error incurred in solving Bellman's equation at a set of sample states. However, by exploiting kernel-based regression techniques (such as support vector regression and Gaussian process regression) with nondegenerate kernel functions as the underlying cost-to-go function approximation architecture, our algorithms are able to construct cost-to-go solutions for which the Bellman residuals are explicitly forced to zero at the sample states. For this reason, we have named our approach Bellman residual elimination (BRE). In addition to developing the basic ideas behind BRE, we present multi-stage and model-free extensions to the approach. The multi-stage extension allows for automatic selection of an appropriate kernel for the MDP at hand, while the model-free extension can use simulated or real state trajectory data to learn an approximate policy when a system model is unavailable. 
We present theoretical analysis of all BRE algorithms proving convergence to the optimal policy in the limit of sampling the entire state space, and show computational results on several benchmark problems. Another challenge in implementing control policies based on MDPs is that there may be parameters of the system model that are poorly known and/or vary with time as the system operates. System performance can suffer if the model used to compute the policy differs from the true model. To address this challenge, we develop an adaptive architecture that allows for online MDP model learning and simultaneous re-computation of the policy. As a result, the adaptive architecture allows the system to continuously re-tune its control policy to account for better model information obtained through observations of the actual system in operation, and react to changes in the model as they occur. Planning in complex, large-scale multi-agent robotic systems is another focus of the thesis. In particular, we investigate the persistent surveillance problem, in which one or more unmanned aerial vehicles (UAVs) and/or unmanned ground vehicles (UGVs) must provide sensor coverage over a designated location on a continuous basis. This continuous coverage must be maintained even in the event that agents suffer failures over the course of the mission. The persistent surveillance problem is pertinent to a number of applications, including search and rescue, natural disaster relief operations, urban traffic monitoring, etc. Using both simulations and actual flight experiments conducted in the MIT RAVEN indoor flight facility, we demonstrate the successful application of the BRE algorithms and the adaptive MDP architecture in achieving high mission performance despite the random occurrence of failures. Furthermore, we demonstrate performance benefits of our approach over a deterministic planning approach that does not account for these failures. by Brett M. Bethke. Ph.D.
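
The central idea of this abstract, a kernel-based cost-to-go function whose Bellman residuals are forced to zero at the sample states, can be sketched on a toy problem. The code below is a simplified illustration under stated assumptions, not the thesis's algorithm: it evaluates one fixed policy on a small chain MDP, uses a Gaussian kernel, and replaces support vector or Gaussian process regression with a plain linear solve. The helper names (`chain_mdp`, `rbf_kernel`, `bre_policy_evaluation`, `bellman_residual`) and all numbers are hypothetical.

```python
import numpy as np

def chain_mdp(n=6):
    """Toy chain under a fixed policy: move left w.p. 0.7, right w.p. 0.3.
    The stage cost is the state index (assumed for illustration)."""
    states = np.arange(n, dtype=float)
    P = np.zeros((n, n))
    for s in range(n):
        P[s, max(s - 1, 0)] += 0.7
        P[s, min(s + 1, n - 1)] += 0.3
    return P, states.copy(), states

def rbf_kernel(x, y, length_scale=1.0):
    """Gaussian kernel matrix between two 1-D arrays of states."""
    d = x[:, None] - y[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def bre_policy_evaluation(P, g, states, sample_idx, gamma=0.9):
    """Fit J(s) = sum_j alpha_j k(s_j, s) so that the Bellman residual
    J(s_i) - g(s_i) - gamma * sum_s' P[s_i, s'] J(s') vanishes at each sample."""
    K_all = rbf_kernel(states, states[sample_idx])   # k(s, s_j) for every state s
    K_samp = K_all[sample_idx, :]                    # rows restricted to the samples
    M = K_samp - gamma * P[sample_idx, :] @ K_all    # residual operator applied to k
    alpha = np.linalg.solve(M, g[sample_idx])
    return K_all @ alpha                             # J evaluated at every state

def bellman_residual(P, g, J, gamma=0.9):
    """Fixed-policy Bellman residual at every state."""
    return J - g - gamma * P @ J

P, g, states = chain_mdp()
sample_idx = np.array([0, 2, 5])
J = bre_policy_evaluation(P, g, states, sample_idx)
print(np.round(bellman_residual(P, g, J), 6))
```

In this sketch the residual is zero (up to round-off) at the sampled states 0, 2 and 5 and generally nonzero elsewhere. When every state is sampled, the result coincides with the exact fixed-policy cost-to-go, which mirrors the abstract's statement about the limit of sampling the entire state space.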

    Adaptive dynamic programming with eligibility traces and complexity reduction of high-dimensional systems

    This dissertation investigates the application of a variety of computational intelligence techniques, particularly clustering and adaptive dynamic programming (ADP) designs, especially heuristic dynamic programming (HDP) and dual heuristic programming (DHP). Moreover, a one-step temporal-difference (TD(0)) and n-step TD (TD(λ)) with their gradients are utilized as learning algorithms to train and online-adapt the families of ADP. The dissertation is organized into seven papers. The first paper demonstrates the robustness of model order reduction (MOR) for simulating complex dynamical systems. Agglomerative hierarchical clustering based on performance evaluation is introduced for MOR. This method computes the reduced order denominator of the transfer function by clustering system poles in a hierarchical dendrogram. Several numerical examples of reducing techniques are taken from the literature to compare with our work. In the second paper, an HDP is combined with the Dyna algorithm for path planning. The third paper uses DHP with an eligibility trace parameter (λ) to track a reference trajectory under uncertainties for a nonholonomic mobile robot by using a first-order Sugeno fuzzy neural network structure for the critic and actor networks. In the fourth and fifth papers, a stability analysis for a model-free action-dependent HDP(λ) is demonstrated with batch- and online-implementation learning, respectively. The sixth work combines two different gradient prediction levels of critic networks. In this work, we provide convergence proofs. The seventh paper develops two-hybrid recurrent fuzzy neural network structures for both critic and actor networks. They use a novel n-step gradient temporal-difference (gradient of TD(λ)) of an advanced ADP algorithm called value-gradient learning (VGL(λ)), and convergence proofs are given. Furthermore, the seventh paper is the first to combine the single network adaptive critic with VGL(λ). --Abstract, page iv
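
The role of the eligibility trace parameter λ mentioned in this abstract can be seen in a minimal tabular TD(λ) update. This is a textbook-style sketch, not the dissertation's method: the dissertation trains neural and fuzzy critic and actor networks, whereas the example below assumes a lookup-table value function and accumulating traces. The function name `td_lambda_episode` and the two-step episode are hypothetical.

```python
import numpy as np

def td_lambda_episode(V, transitions, alpha=0.1, gamma=1.0, lam=0.8):
    """Apply TD(lambda) with accumulating eligibility traces over one episode.

    transitions: list of (state, reward, next_state, done) tuples.
    Returns an updated copy of the value table V.
    """
    V = np.array(V, dtype=float)
    e = np.zeros_like(V)                       # one trace per state
    for s, r, s_next, done in transitions:
        target = r + (0.0 if done else gamma * V[s_next])
        delta = target - V[s]                  # one-step TD error
        e *= gamma * lam                       # decay all traces
        e[s] += 1.0                            # bump the trace of the visited state
        V += alpha * delta * e                 # credit every recently visited state
    return V

# A two-step episode: state 0 -> state 1 -> terminal, reward 1 on the last step.
episode = [(0, 0.0, 1, False), (1, 1.0, 2, True)]
print(td_lambda_episode(np.zeros(3), episode, alpha=0.5, lam=0.0))  # [0.   0.5  0. ]
print(td_lambda_episode(np.zeros(3), episode, alpha=0.5, lam=0.5))  # [0.25 0.5  0. ]
```

With λ = 0 only the state visited immediately before the reward is updated, which is the TD(0) case. With λ > 0 the trace carries part of the same TD error back to state 0 within the same episode.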

    Decision uncertainty minimization and autonomous information gathering

    Thesis: Ph. D., Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (pages 272-283). Over the past several decades, technologies for remote sensing and exploration have become increasingly powerful but continue to face limitations in the areas of information gathering and analysis. These limitations affect technologies that use autonomous agents, which are devices that can make routine decisions independent of operator instructions. Bandwidth and other communications limitations require that autonomous agents differentiate between relevant and irrelevant information in a computationally efficient manner. This thesis presents a novel approach to this problem by framing it as an adaptive sensing problem. Adaptive sensing allows agents to modify their information collection strategies in response to the information gathered in real time. We developed and tested optimization algorithms that apply information guides to Monte Carlo planners. Information guides provide a mechanism by which the algorithms may blend online (real-time) and offline (previously simulated) planning in order to incorporate uncertainty into the decision-making process. This greatly reduces computational operations as well as decisional and communications overhead. We begin by introducing a 3-level hierarchy that visualizes adaptive sensing at synoptic (global), mesoscale (intermediate) and microscale (close-up) levels (a spatial hierarchy). We then introduce new algorithms for decision uncertainty minimization (DUM) and representational uncertainty minimization (RUM). Finally, we demonstrate the utility of this approach to real-world sensing problems, including bathymetric mapping and disaster relief. We also examine its potential in space exploration tasks by describing its use in a hypothetical aerial exploration of Mars. 
Our ultimate goal is to facilitate future large-scale missions to extraterrestrial objects for the purposes of scientific advancement and human exploration. by Lawrence A. M. Bush. Ph. D.