    Optimal Control and Coordination of Small UAVs for Vision-based Target Tracking

    Small unmanned aerial vehicles (UAVs) are relatively inexpensive mobile sensing platforms capable of reliably and autonomously performing numerous tasks, including mapping, search and rescue, surveillance and tracking, and real-time monitoring. The general problem of interest that we address is that of using small, fixed-wing UAVs to perform vision-based target tracking, which entails that one or more camera-equipped UAVs are responsible for autonomously tracking a moving ground target. In the single-UAV setting, the underactuated UAV must maintain proximity and visibility of an unpredictable ground target while having a limited sensing region. We provide solutions from two different vantage points. The first regards the problem as a two-player zero-sum game and the second as a stochastic optimal control problem. The resulting control policies have been successfully field-tested, thereby verifying the efficacy of both approaches while highlighting the advantages of one approach over the other. When employing two UAVs, one can fuse vision-based measurements to improve the estimate of the target's position. Accordingly, the second part of this dissertation involves determining the optimal control policy for two UAVs to gather the best joint vision-based measurements of a moving ground target, which is first done in a simplified deterministic setting. The results in this setting show that the key optimal control strategy is the coordination of the UAVs' distances to the target and not of the viewing angles as is traditionally assumed, thereby showing the advantage of solving the optimal control problem over using heuristics. To generate a control policy robust to real-world conditions, we formulate the same control objective using higher order stochastic kinematic models. 
Since grid-based solutions are infeasible for a stochastic optimal control problem of this dimension, we employ a simulation-based dynamic programming technique that relies on regression to form the optimal policy maps, thereby demonstrating an effective solution to a multi-vehicle coordination problem that until recently seemed intractable on account of its dimension. The results show that distance coordination is again the key optimal control strategy and that the policy offers considerable advantages over uncoordinated optimal policies, namely reduced variability in the cost and a reduction in the severity and frequency of high-cost events.

    Adaptive and learning-based formation control of swarm robots

    Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face several open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation could be performed by swarm robots with limited communication and perception (e.g., Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between human and swarm robots (e.g., BristleBot) for artistic creation. In particular, we combine bio-inspired (i.e., flocking, foraging) techniques with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking controller for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP), and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy, and each UAV performs actions based on the local information it collects. In addition, to avoid collisions among UAVs and guarantee flocking and navigation, the reward function combines a global flocking-maintenance term, a mutual reward, and a collision penalty. 
We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walk to control the communication between a team of robots with swarming behavior for musical creation.
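
For readers unfamiliar with particle swarm optimization, the following is a minimal, generic sketch of the textbook update rule, not the thesis's robot-communication controller. The function names, the sphere test function, and the parameter values (inertia `w`, acceleration coefficients `c1` and `c2`) are illustrative assumptions.

```python
import random


def pso_minimize(f, dim, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook global-best PSO; all parameter values are assumptions."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]               # each particle's best position
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # swarm-wide best
    history = [gbest_val]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clip to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def sphere(x):
    """Simple convex test function with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_val, history = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_x, best_val)
```

The swarm-wide best value can only improve or stay the same from one iteration to the next, which is why `history` is non-increasing.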

    Emergency rapid mapping with drones: models and solution approaches for offline and online mission planning

    The availability of unmanned aerial vehicles (UAVs) and advances in the development of lightweight sensors open up new possibilities for using remote sensing technologies for rapid mapping in large-scale emergencies. After major fires, for example, they make it possible to provide emergency responders with an initial situational picture within a short time. However, the limited flight time of UAVs and the responders' need for a fast initial assessment mean that the affected areas can only be sampled at selected points. Combined with interpolation methods, these samples then allow the distribution of hazardous substances to be estimated. This thesis addresses the problem of planning UAV missions that maximize information gain in emergency operations. The problem is studied both in the offline variant, which determines missions before take-off, and in the online variant, in which plans are updated while the UAVs are in flight. The overarching goal is to design efficient models and methods that exploit information about spatial correlation in the observed area in order to determine solutions of high predictive quality in time-critical situations. For offline planning, the generalized correlated team orienteering problem is introduced and a two-stage heuristic for quickly determining exploratory UAV missions is proposed. An extensive study confirms the performance and competitiveness of the heuristic with respect to computation time and solution quality. Benchmark instances newly introduced in this thesis are used to show the higher information gain of the proposed models compared with related concepts. 
For online planning, the thesis investigates the combination of learning methods for modeling the contaminants with planning methods that use this knowledge to improve missions. To this end, a broad range of solution methods from different disciplines is classified and extended with new, efficient modeling variants for rapid mapping. The study, carried out in a discrete-event simulation, shows that comparatively simple approximations of spatial relationships yield high-quality solutions in very short time. In addition, the greater robustness of more accurate but more costly models and solution concepts is demonstrated.

    Kernel-based approximate dynamic programming using Bellman residual elimination

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2010. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student submitted PDF version of thesis. Includes bibliographical references (p. 207-221). Many sequential decision-making problems related to multi-agent robotic systems can be naturally posed as Markov Decision Processes (MDPs). An important advantage of the MDP framework is the ability to utilize stochastic system models, thereby allowing the system to make sound decisions even if there is randomness in the system evolution over time. Unfortunately, the curse of dimensionality prevents most MDPs of practical size from being solved exactly. One main focus of the thesis is on the development of a new family of algorithms for computing approximate solutions to large-scale MDPs. Our algorithms are similar in spirit to Bellman residual methods, which attempt to minimize the error incurred in solving Bellman's equation at a set of sample states. However, by exploiting kernel-based regression techniques (such as support vector regression and Gaussian process regression) with nondegenerate kernel functions as the underlying cost-to-go function approximation architecture, our algorithms are able to construct cost-to-go solutions for which the Bellman residuals are explicitly forced to zero at the sample states. For this reason, we have named our approach Bellman residual elimination (BRE). In addition to developing the basic ideas behind BRE, we present multi-stage and model-free extensions to the approach. The multistage extension allows for automatic selection of an appropriate kernel for the MDP at hand, while the model-free extension can use simulated or real state trajectory data to learn an approximate policy when a system model is unavailable. 
We present theoretical analysis of all BRE algorithms proving convergence to the optimal policy in the limit of sampling the entire state space, and show computational results on several benchmark problems. Another challenge in implementing control policies based on MDPs is that there may be parameters of the system model that are poorly known and/or vary with time as the system operates. System performance can suffer if the model used to compute the policy differs from the true model. To address this challenge, we develop an adaptive architecture that allows for online MDP model learning and simultaneous re-computation of the policy. As a result, the adaptive architecture allows the system to continuously re-tune its control policy to account for better model information obtained through observations of the actual system in operation, and react to changes in the model as they occur. Planning in complex, large-scale multi-agent robotic systems is another focus of the thesis. In particular, we investigate the persistent surveillance problem, in which one or more unmanned aerial vehicles (UAVs) and/or unmanned ground vehicles (UGVs) must provide sensor coverage over a designated location on a continuous basis. This continuous coverage must be maintained even in the event that agents suffer failures over the course of the mission. The persistent surveillance problem is pertinent to a number of applications, including search and rescue, natural disaster relief operations, urban traffic monitoring, etc. Using both simulations and actual flight experiments conducted in the MIT RAVEN indoor flight facility, we demonstrate the successful application of the BRE algorithms and the adaptive MDP architecture in achieving high mission performance despite the random occurrence of failures. Furthermore, we demonstrate performance benefits of our approach over a deterministic planning approach that does not account for these failures. by Brett M. Bethke. Ph.D.
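
The central idea of forcing Bellman residuals to zero at sample states can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the thesis's BRE algorithms: it evaluates a single fixed policy, writes the cost-to-go as a kernel expansion J(s) = sum_i alpha_i k(c_i, s) over the sample states, and solves a direct linear system for the weights instead of using support vector or Gaussian process regression. The five-state chain, the Gaussian kernel width, and all function names are hypothetical.

```python
import math


def gaussian_kernel(x, y, width=1.0):
    """Gaussian (nondegenerate) kernel on scalar states; width is assumed."""
    return math.exp(-((x - y) ** 2) / (2.0 * width ** 2))


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_cost_to_go(P, g, gamma, samples, kernel=gaussian_kernel):
    """Choose kernel weights so the Bellman residual is zero at each sample."""
    n = len(g)

    def coeff(s, c):
        # k(c, s) - gamma * E[k(c, s') | s] under the fixed policy.
        expected = sum(P[s][t] * kernel(c, t) for t in range(n))
        return kernel(c, s) - gamma * expected

    A = [[coeff(s, c) for c in samples] for s in samples]
    alpha = solve_linear(A, [g[s] for s in samples])
    return lambda s: sum(a * kernel(c, s) for a, c in zip(alpha, samples))


def bellman_residual(J, P, g, gamma, s):
    """J(s) - g(s) - gamma * E[J(s') | s] for the fixed policy."""
    return J(s) - g[s] - gamma * sum(P[s][t] * J(t) for t in range(len(g)))


# Hypothetical 5-state chain: advance with prob. 0.8, stay with prob. 0.2;
# the last state is absorbing and cost-free.
n = 5
P = [[0.0] * n for _ in range(n)]
for s in range(n - 1):
    P[s][s] = 0.2
    P[s][s + 1] = 0.8
P[n - 1][n - 1] = 1.0
g = [1.0, 1.0, 1.0, 1.0, 0.0]
gamma = 0.9
samples = [0, 2, 4]

J = fit_cost_to_go(P, g, gamma, samples)
for s in range(n):
    print(s, round(J(s), 4), round(bellman_residual(J, P, g, gamma, s), 6))
```

In this sketch the residual vanishes (up to floating-point error) at the sampled states 0, 2, and 4 but generally not at the unsampled states 1 and 3. Passing every state as a sample makes the residual vanish everywhere, so the fitted function coincides with the exact cost-to-go of the policy, which mirrors the limiting behavior the abstract describes.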

    Estimation and stability of nonlinear control systems under intermittent information with applications to multi-agent robotics

    This dissertation investigates the role of intermittent information in estimation and control problems and applies the obtained results to multi-agent tasks in robotics. First, we develop a stochastic hybrid model of mobile networks able to capture a large variety of heterogeneous multi-agent problems and phenomena. This model is applied to a case study where a heterogeneous mobile sensor network cooperatively detects and tracks mobile targets based on intermittent observations. When these observations form a satisfactory target trajectory, a mobile sensor is switched to the pursuit mode and deployed to capture the target. The cost of operating the sensors is determined from the geometric properties of the network, environment and probability of target detection. The above case study is motivated by the Marco Polo game played by children in swimming pools. Second, we develop adaptive sampling of target positions in order to minimize energy consumption, while satisfying performance guarantees such as increased probability of detection over time, and no-escape conditions. A parsimonious predictor-corrector tracking filter, that uses geometrical properties of targets' tracks to estimate their positions using imperfect and intermittent measurements, is presented. It is shown that this filter requires substantially less information and processing power than the Unscented Kalman Filter and Sampling Importance Resampling Particle Filter, while providing comparable estimation performance in the presence of intermittent information. Third, we investigate stability of nonlinear control systems under intermittent information. We replace the traditional periodic paradigm, where the up-to-date information is transmitted and control laws are executed in a periodic fashion, with the event-triggered paradigm. Building on the small gain theorem, we develop input-output triggered control algorithms yielding stable closed-loop systems. 
In other words, based on the currently available (but outdated) measurements of the outputs and external inputs of a plant, a mechanism triggering when to obtain new measurements and update the control inputs is provided. Depending on the noise environment, the developed algorithm yields stable, asymptotically stable, and Lp-stable (with bias) closed-loop systems. Control loops are modeled as interconnections of hybrid systems for which novel results on Lp-stability are presented. Prediction of a triggering event is achieved by employing Lp-gains over a finite horizon in the small gain theorem. By resorting to convex programming, a method to compute Lp-gains over a finite horizon is devised. Next, we investigate optimal intermittent feedback for nonlinear control systems. Using the currently available measurements from a plant, we develop a methodology that outputs when to update the control law with new measurements such that a given cost function is minimized. Our cost function captures trade-offs between the performance and energy consumption of the control system. The optimization problem is formulated as a Dynamic Programming problem, and Approximate Dynamic Programming is employed to solve it. Instead of advocating a particular approximation architecture for Approximate Dynamic Programming, we formulate properties that successful approximation architectures satisfy. In addition, we consider problems with partially observable states, and propose Particle Filtering to deal with partially observable states and intermittent feedback. Finally, we investigate a decentralized output synchronization problem of heterogeneous linear systems. We develop a self-triggered output broadcasting policy for the interconnected systems. Broadcasting time instants adapt to the current communication topology. For a fixed topology, our broadcasting policy yields global exponential output synchronization, and Lp-stable output synchronization in the presence of disturbances. 
Employing a converse Lyapunov theorem for impulsive systems, we provide an average dwell time condition that yields disturbance-to-state stable output synchronization in case of switching topology. Our approach is applicable to directed and unbalanced communication topologies.
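
The contrast between periodic and event-triggered updates can be made concrete with a toy simulation. The sketch below is an assumption-laden illustration rather than the dissertation's input-output triggered mechanism: it uses a scalar linear plant, full state measurement, and a simple relative-threshold rule (transmit a new measurement when the held value deviates from the true state by more than a fraction `sigma` of the state magnitude) in place of the small-gain and Lp-gain machinery described above. The plant parameters, the gain, and the threshold are all hypothetical.

```python
def simulate_event_triggered(a=1.0, b=1.0, k=2.0, sigma=0.2,
                             x0=1.0, dt=1e-3, t_end=10.0):
    """Euler simulation of x' = a*x + b*u with u = -k * (held measurement).

    A new measurement is transmitted only when the held value is off by
    at least sigma * |x| (a relative-threshold trigger; an assumption here).
    """
    x = x0
    x_held = x0           # last transmitted measurement
    updates = 1           # the initial transmission
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        if abs(x_held - x) >= sigma * abs(x):
            x_held = x    # event: refresh the measurement used by the controller
            updates += 1
        u = -k * x_held   # control is held constant between events
        x += dt * (a * x + b * u)
    return x, updates, n_steps


x_final, updates, n_steps = simulate_event_triggered()
print("final state:", x_final)
print("measurement updates:", updates, "of", n_steps, "simulation steps")
```

With these assumed values the open-loop plant is unstable, yet the state still decays toward zero while measurements are refreshed at only a small fraction of the simulation steps, which is the trade-off between performance and communication that event-triggered schemes aim to exploit.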

    Stochastic Real-time Optimal Control for Bearing-only Trajectory Planning

    A method is presented to simultaneously solve the optimal control problem and the optimal estimation problem for a bearing-only sensor. For bearing-only systems that require a minimum level of certainty in position relative to a source for mission accomplishment, some amount of maneuver is required to measure range. Traditional methods of trajectory optimization and optimal estimation minimize an information metric. This paper proposes constraining the final value of the information states with known time propagation dynamics relative to a given trajectory which allows for attainment of the required level of information with minimal deviation from a general performance index that can be tailored to a specific vehicle. The proposed method does not suffer from compression of the information metric into a scalar, and provides a route that will attain a particular target estimate quality while maneuvering to a desired relative point or set. An algorithm is created to apply the method in real-time, iteratively estimating target position with an Unscented Kalman Filter and updating the trajectory with an efficient pseudospectral method. Methods and tools required for hardware implementation are presented that apply to any real-time optimal control (RTOC) system. The algorithm is validated with both simulation and flight test, autonomously landing a quadrotor on a wire.

    Evolution of Control Programs for a Swarm of Autonomous Unmanned Aerial Vehicles

    Unmanned aerial vehicles (UAVs) are rapidly becoming a critical military asset. In the future, advances in miniaturization are going to drive the development of insect-sized UAVs. New approaches to controlling these swarms are required. The goal of this research is to develop a controller to direct a swarm of UAVs in accomplishing a given mission. While previous efforts have largely been limited to a two-dimensional model, a three-dimensional model has been developed for this project. Models of UAV capabilities including sensors, actuators and communications are presented. Genetic programming uses the principles of Darwinian evolution to generate computer programs to solve problems. A genetic programming approach is used to evolve control programs for UAV swarms. Evolved controllers are compared with a hand-crafted solution using quantitative and qualitative methods. Visualization and statistical methods are used to analyze solutions. Results indicate that genetic programming is capable of producing effective solutions to multi-objective control problems.