
    Deep Drone Acrobatics

    Performing acrobatic maneuvers with quadrotors is extremely challenging. Acrobatic flight requires high thrust and extreme angular accelerations that push the platform to its physical limits. Professional drone pilots often measure their level of mastery by flying such maneuvers in competitions. In this paper, we propose to learn a sensorimotor policy that enables an autonomous quadrotor to fly extreme acrobatic maneuvers with only onboard sensing and computation. We train the policy entirely in simulation by leveraging demonstrations from an optimal controller that has access to privileged information. We use appropriate abstractions of the visual input to enable transfer to a real quadrotor. We show that the resulting policy can be directly deployed in the physical world without any fine-tuning on real data. Our methodology has several favorable properties: it does not require a human expert to provide demonstrations, it cannot harm the physical system during training, and it can be used to learn maneuvers that are challenging even for the best human pilots. Our approach enables a physical quadrotor to fly maneuvers such as the Power Loop, the Barrel Roll, and the Matty Flip, during which it incurs accelerations of up to 3g.
    Comment: 8 pages + 2 pages references. Video: https://youtu.be/2N_wKXQ6MXA. Code: https://github.com/uzh-rpg/deep_drone_acrobatic
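    The training scheme described above, imitating an optimal controller that has access to privileged state, can be caricatured in a few lines. The setup below (expert gain, noise level, linear student) is entirely illustrative and is not the paper's actual pipeline:

```python
# Toy sketch of privileged-expert imitation (names and setup are
# invented for illustration): an "expert" that sees the full state
# labels actions, and a "student" that only sees a noisy abstraction
# of the state is regressed onto those labels.
import random

random.seed(0)

K_EXPERT = 1.5          # privileged expert gain (assumed known in sim)

def expert_action(x):
    """Optimal-controller stand-in: sees the true state x."""
    return -K_EXPERT * x

def observe(x):
    """Abstracted sensor: the student never sees x directly."""
    return x + random.gauss(0.0, 0.05)

# Collect (observation, expert action) pairs from simulated states.
data = []
for _ in range(200):
    x = random.uniform(-1.0, 1.0)
    data.append((observe(x), expert_action(x)))

# Fit a linear student policy u = w * o by least squares (closed form).
num = sum(o * u for o, u in data)
den = sum(o * o for o, _ in data)
w_student = num / den

# The student approximately recovers the expert gain without ever
# seeing the privileged state.
print(round(w_student, 2))
```

    The same separation, privileged teacher in simulation, abstraction-fed student at deployment, is what makes zero-shot sim-to-real transfer plausible.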

    Model Predictive Control for Micro Aerial Vehicles: A Survey

    This paper presents a review of the design and application of model predictive control strategies for Micro Aerial Vehicles, and specifically multirotor configurations such as quadrotors. The diverse set of works in the domain is organized based on whether the control law is optimized over linear or nonlinear dynamics, the integration of state and input constraints, possible fault-tolerant design, whether reinforcement learning methods have been utilized, and whether the controller addresses free flight or other tasks such as physical interaction or load transportation. A selected set of comparison results is also presented and serves to provide insight for the selection between linear and nonlinear schemes, the tuning of the prediction horizon, the importance of disturbance observer-based offset-free tracking, and the intrinsic robustness of such methods to parameter uncertainty. Furthermore, an overview of recent research trends on the combined application of modern deep reinforcement learning techniques and model predictive control for multirotor vehicles is presented. Finally, this review concludes with an explicit discussion of selected open-source software packages that deliver off-the-shelf model predictive control functionality applicable to a wide variety of Micro Aerial Vehicle configurations.
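    To make the linear-versus-nonlinear discussion concrete, here is a minimal receding-horizon controller for a scalar linear system. The plant and weights are made-up toy values; real MAV MPC stacks solve constrained optimization problems over the full multirotor dynamics:

```python
# Minimal receding-horizon (linear MPC) sketch for the scalar system
# x_{k+1} = a*x_k + b*u_k -- an illustrative toy, not any package's API.
# With no constraints, the finite-horizon problem reduces to a backward
# Riccati recursion whose first-step gain is applied at every step.

A, B, Q, R = 1.2, 1.0, 1.0, 0.1   # unstable plant, state/input weights
N = 10                            # prediction horizon

def mpc_gain(a, b, q, r, horizon):
    """Backward Riccati recursion; returns the first-step feedback gain."""
    p = q                          # terminal cost P_N = Q
    k = 0.0
    for _ in range(horizon):
        k = (b * p * a) / (r + b * p * b)
        p = q + a * p * (a - b * k)
    return k

k_mpc = mpc_gain(A, B, Q, R, N)

# Closed loop: re-solve at every step (the gain is constant here because
# the model is time-invariant, but the receding-horizon structure is the
# same one a constrained solver would follow).
x = 5.0
for _ in range(30):
    u = -k_mpc * x                 # apply only the first planned input
    x = A * x + B * u

print(abs(x) < 1e-3)
```

    Adding state and input constraints turns each horizon problem into a QP, which is where the surveyed solvers and the prediction-horizon tuning discussed above come in.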

    Adapt-to-learn policy transfer in reinforcement learning and deep model reference adaptive control

    Adaptation and learning from exploration have been key to biological learning: humans and animals do not learn every task in isolation, but rather quickly adapt learned behaviors between similar tasks and learn new skills when presented with new situations. Inspired by this, adaptation has been an important direction of research in control in the form of adaptive controllers. However, adaptive controllers such as the Model Reference Adaptive Controller are mainly model-based: they do not rely on exploration, but instead make informed decisions by exploiting the model's structure. Such controllers are therefore characterized by high sample efficiency and stability guarantees, and are suitable for safety-critical systems. On the other hand, we have learning-based optimal control algorithms such as reinforcement learning. Reinforcement learning is a trial-and-error method in which an agent explores the environment by taking random actions and maximizing the likelihood of those actions that result in a higher return. However, these exploration techniques are expected to fail many times before discovering an optimal policy. They are therefore highly sample-expensive, lack stability guarantees, and hence are not suitable for safety-critical systems. This thesis presents control algorithms for robotics in which the best of both worlds, "adaptation" and "learning from exploration", are brought together in new algorithms that can outperform their conventional counterparts. In this effort, we first present an Adapt-to-Learn policy transfer algorithm, in which we use control-theoretic ideas of adaptation to transfer a policy between two related but different tasks using the policy gradient method of reinforcement learning. Efficient and robust policy transfer remains a key challenge in reinforcement learning.
Policy transfer through warm initialization, imitation, or interaction over a large set of agents with randomized instances has commonly been applied to solve a variety of Reinforcement Learning (RL) tasks. However, this is far from how behavior transfer happens in the biological world. Here, we seek to answer the question: will learning to combine an adaptation reward with the environmental reward lead to a more efficient transfer of policies between domains? We introduce a principled mechanism that can "Adapt-to-Learn", that is, adapt the source policy to learn to solve a target task with significant transition differences and uncertainties. Through theory and experiments, we show that our method leads to a significantly reduced sample complexity when transferring policies between tasks. In the second part of this thesis, information-enabled learning-based adaptive controllers, the "Gaussian Process adaptive controller using Model Reference Generative Network" (GP-MRGeN) and the "Deep Model Reference Adaptive Controller" (DMRAC), are presented. Model reference adaptive control (MRAC) is a widely studied adaptive control methodology that aims to ensure that a nonlinear plant with significant model uncertainty behaves like a chosen reference model. MRAC methods adapt to changes by representing the system uncertainties as weighted combinations of known nonlinear functions, using a weight update law that moves the network weights in the direction that minimizes the instantaneous tracking error. However, most MRAC adaptive controllers use a shallow network and only instantaneous data for adaptation, restricting their representation capability and limiting their performance under fast-changing uncertainties and faults in the system. In this thesis, we propose a Gaussian process based adaptive controller called GP-MRGeN.
We present a new approach to the online supervised training of GP models using a new architecture termed the Model Reference Generative Network (MRGeN). Our architecture is loosely inspired by the recent success of generative neural network models; nevertheless, our contributions ensure that the inclusion of such a model in closed-loop control does not affect the stability properties. The GP-MRGeN controller, by using a generative network, is capable of achieving higher adaptation rates without losing the robustness properties of the controller, and is hence suitable for mitigating faults in fast-evolving systems. Further, in this thesis, we present a new neuroadaptive architecture: Deep Neural Network-based Model Reference Adaptive Control (DMRAC). This architecture utilizes deep neural network representations for modeling significant nonlinearities while marrying them with the boundedness guarantees that characterize MRAC-based controllers. We demonstrate through simulations and analysis that DMRAC can subsume previously studied learning-based MRAC methods, such as concurrent learning and GP-MRAC. This makes DMRAC a powerful architecture for high-performance control of nonlinear systems with long-term learning properties. Theoretical proofs of the controller's generalization capability over unseen data points and of the boundedness of the tracking error are also presented. Experiments with a quadrotor vehicle demonstrate the controller's performance in achieving reference model tracking in the presence of significant matched uncertainties. To achieve these results, a software and communication architecture is designed to ensure online real-time inference of the deep network on a high-bandwidth, computation-limited platform. These results demonstrate the efficacy of deep networks for high-bandwidth closed-loop attitude control of unstable and nonlinear robots operating in adverse situations. We expect that this work will benefit other closed-loop deep-learning control architectures for robotics.
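
    The instantaneous-tracking-error weight update that the thesis builds on can be illustrated with a scalar example. This is a hedged sketch of classical MRAC, not the DMRAC or GP-MRGeN architectures themselves; the plant and all gains are invented for illustration:

```python
# Scalar MRAC sketch: plant  x' = a*x + u  with a unknown; reference
# model  xm' = -xm + r.  The Lyapunov-derived update  k' = gamma*x*e
# drives the tracking error e = x - xm toward zero.

DT, A_TRUE, GAMMA = 0.001, 2.0, 5.0
x, xm, k_hat = 0.0, 0.0, 0.0
r = 1.0

for _ in range(20000):           # 20 s of Euler integration
    e = x - xm
    u = -k_hat * x + r           # certainty-equivalence control
    x += DT * (A_TRUE * x + u)
    xm += DT * (-xm + r)
    k_hat += DT * GAMMA * x * e  # MRAC weight update law

print(round(abs(x - xm), 3))     # tracking error after adaptation
```

    The deep-network variant replaces the single adapted gain with the output-layer weights of a DNN, which is what makes the boundedness analysis in the thesis nontrivial.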

    Perception-driven optimal motion planning under resource constraints

    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Applied Ocean Science & Engineering at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, February 2019.
    Over the past few years there has been a new wave of interest in fully autonomous robots operating in the real world, with applications from autonomous driving to search and rescue. These robots are expected to operate at high speeds in unknown, unstructured environments using only onboard sensing and computation, presenting significant challenges for high-performance autonomous navigation. To enable research in these challenging scenarios, the first part of this thesis focuses on the development of a custom high-performance research UAV capable of high-speed autonomous flight using only vision and inertial sensors. This research platform was used to develop state-of-the-art onboard visual-inertial state estimation at high speeds in challenging scenarios such as flying through window gaps. While this platform is capable of high-performance state estimation and control, its capabilities in unknown environments are severely limited by the computational cost of running traditional vision-based mapping and motion planning algorithms on an embedded platform. Motivated by these challenges, the second part of this thesis presents an algorithmic approach to the problem of motion planning in an unknown environment when the computational cost of mapping all available sensor data is prohibitively high. The algorithm is built around a tree of dynamically feasible, free-space-optimal trajectories to the goal state in configuration space. As the algorithm progresses, it iteratively switches between processing new sensor data and locally updating the search tree.
We show that the algorithm produces globally optimal motion plans, matching the optimal solution for the case with the full (unprocessed) sensor data, while only processing a subset of the data. The mapping and motion planning algorithm is demonstrated on a number of test systems, with a particular focus on a six-dimensional, thrust-limited model of a quadrotor.

    Safe planning and control via L1-adaptation and contraction theory

    Autonomous robots that are capable of operating safely in the presence of imperfect model knowledge or external disturbances are vital in safety-critical applications. The research presented in this dissertation aims to enable safe planning and control for nonlinear systems with uncertainties using robust adaptive control theory. To this end, we develop methods that (i) certify the collision risk for the planned trajectories of autonomous robots, (ii) ensure guaranteed tracking performance in the presence of uncertainties, (iii) learn the uncertainties in the model without sacrificing transient performance guarantees, and (iv) learn incremental stability certificates parameterized as neural networks. In motion planning problems for autonomous robots, such as self-driving cars, the robot must ensure that its planned path is not in close proximity to obstacles in the environment. However, the problem of evaluating this proximity is generally non-convex and serves as a significant computational bottleneck for motion planning algorithms. In this work, we present methods for a general class of absolutely continuous parametric curves to compute the minimum separating distance, tolerance verification, and collision detection with respect to obstacles in the environment. A planning algorithm is incomplete if the robot is unable to safely track the planned trajectory. We introduce a feedback motion planning approach using contraction theory-based L1-adaptive (CL1) control to certify that planned trajectories of nonlinear systems with matched uncertainties are tracked with desired performance requirements. We present a planner-agnostic framework to design and certify invariant tubes around planned trajectories that the robot is always guaranteed to remain inside. By leveraging recent results in contraction analysis and L1-adaptive control, we present an architecture that induces invariant tubes for nonlinear systems with state- and time-varying uncertainties.
Uncertainties caused by large modeling errors will significantly hinder the performance of any autonomous system. We adapt the CL1 framework to safely learn the uncertainties while simultaneously providing high-probability bounds on the tracking behavior. Any available data is incorporated into Gaussian process (GP) models of the uncertainties, while the error in the learned model is quantified and handled by the CL1 controller to ensure that control objectives are met safely. As learning improves, so does the overall tracking performance of the system. This way, the safe operation of the system is always guaranteed, even during the learning transients. The tracking performance guarantees for nonlinear systems rely on the existence of incremental stability certificates that are prohibitively difficult to search for. We leverage the function approximation capabilities of deep neural networks for learning the certificates and the associated control policies jointly. The incremental stability properties of the closed-loop system are verified using interval arithmetic. The domain of the system is iteratively refined into a collection of intervals that certify the satisfaction of the stability properties over the interval regions. Thus, we avoid entirely rejecting the learned certificates and control policies just because they violate the stability properties in certain parts of the domain. We provide numerical experimentation on an inverted pendulum to validate our proposed methodology.
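
    As a toy illustration of the minimum-separating-distance and tolerance-verification queries mentioned above, the sketch below checks a parametric curve against circular obstacles. The dissertation handles a general class of absolutely continuous curves exactly; dense sampling here is only an approximation, and the curve and obstacles are hypothetical:

```python
# Approximate minimum-separating-distance query for a parametric curve
# against circular obstacles, by dense sampling of the parameter. This
# is an illustrative stand-in, not the dissertation's exact method.
import math

def curve(t):
    """Hypothetical planned path on t in [0, 1]."""
    return (2.0 * t, math.sin(2.0 * t))

# Obstacles as (center, radius) pairs, invented for the example.
obstacles = [((1.0, 1.5), 0.4), ((3.0, -1.0), 0.3)]

def min_separating_distance(curve, obstacles, samples=1000):
    best = float("inf")
    for i in range(samples + 1):
        t = i / samples
        px, py = curve(t)
        for (cx, cy), rad in obstacles:
            d = math.hypot(px - cx, py - cy) - rad
            best = min(best, d)
    return best

d = min_separating_distance(curve, obstacles)
print(d > 0.0)   # tolerance verification: positive => collision-free
```

    In a planner, a positive margin above a required tolerance certifies the trajectory; the exact formulations in the dissertation avoid the sampling-resolution caveat this sketch carries.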

    A survey on active simultaneous localization and mapping: state of the art and new frontiers

    Active simultaneous localization and mapping (SLAM) is the problem of planning and controlling the motion of a robot to build the most accurate and complete model of the surrounding environment. Since the first foundational work in active perception appeared more than three decades ago, this field has received increasing attention across different scientific communities. This has brought about many different approaches and formulations, and makes a review of the current trends necessary and extremely valuable for both new and experienced researchers. In this article, we survey the state of the art in active SLAM and take an in-depth look at the open challenges that still require attention to meet the needs of modern applications. After providing a historical perspective, we present a unified problem formulation and review the well-established modular solution scheme, which decouples the problem into three stages that identify, select, and execute potential navigation actions. We then analyze alternative approaches, including belief-space planning and deep reinforcement learning techniques, and review related work on multirobot coordination. This article concludes with a discussion of new research directions, addressing reproducible research, active spatial perception, and practical applications, among other topics.
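
    The modular scheme the survey describes (identify candidate actions, select one, execute it) can be sketched as a simple utility maximization. The candidates, information gains, and costs below are invented placeholders; real systems derive them from the current map and pose belief:

```python
# Toy "select" stage of the modular active-SLAM pipeline: candidate
# viewpoints are scored by expected map-entropy reduction traded off
# against travel cost. All numbers are illustrative placeholders.

# (viewpoint id, expected information gain [bits], travel cost [m])
candidates = [("frontier_A", 12.0, 4.0),
              ("frontier_B", 15.0, 9.0),
              ("revisit_loop_closure", 9.0, 2.0)]

LAMBDA = 1.0   # cost weight: larger values favor nearby actions

def utility(gain, cost, lam=LAMBDA):
    return gain - lam * cost

best = max(candidates, key=lambda c: utility(c[1], c[2]))
print(best[0])
```

    Much of the literature the survey organizes differs precisely in how the gain term is computed (map entropy, pose-graph uncertainty, etc.) and in how candidates are generated.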

    Use of Unmanned Aerial Systems in Civil Applications

    Interest in drones has been growing exponentially in the last ten years, and these machines are often presented as the optimal solution in a huge number of civil applications (monitoring, agriculture, emergency management, etc.). However, the promises still do not match the data coming from the consumer market, suggesting that the only big field in which the use of small unmanned aerial vehicles is actually profitable is that of video makers. This may be explained partly by the strong limits imposed by existing (and often "obsolete") national regulations, but also, and perhaps mainly, by the lack of real autonomy. The vast majority of vehicles on the market nowadays are in fact autonomous only in the sense that they are able to follow a pre-determined list of latitude-longitude-altitude coordinates. The aim of this thesis is to demonstrate that complete autonomy for UAVs can be achieved only with high-performance control, reliable and flexible planning platforms, and strong perception capabilities. These topics are introduced and discussed by presenting the results of the main research activities performed by the candidate in the last three years, which have resulted in 1) the design, integration and control of a test bed for validating and benchmarking vision-based algorithms for space applications; 2) the implementation of a cloud-based platform for multi-agent mission planning; and 3) the on-board use of a multi-sensor fusion framework based on an Extended Kalman Filter architecture.
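
    As a toy illustration of the Extended Kalman Filter fusion framework mentioned in point 3), here is a minimal 1-D constant-velocity filter. With a linear model the EKF Jacobians reduce to the fixed matrices below; the noise values and model are invented, not the thesis's configuration:

```python
# Minimal 1-D constant-velocity Kalman filter (the linear special case
# of an EKF): predict with the motion model, then fuse noisy position
# measurements. State is [position, velocity].
import random

random.seed(1)
DT, Q_ACC, R_POS = 0.1, 0.5, 0.25

x = [0.0, 0.0]                       # state estimate [pos, vel]
P = [[1.0, 0.0], [0.0, 1.0]]         # estimate covariance

true_pos, true_vel = 0.0, 1.0
for _ in range(100):
    true_pos += true_vel * DT
    # Predict: x <- F x, P <- F P F^T + Q, with F = [[1, DT], [0, 1]].
    x = [x[0] + DT * x[1], x[1]]
    P = [[P[0][0] + DT * (P[1][0] + P[0][1]) + DT * DT * P[1][1] + Q_ACC * DT,
          P[0][1] + DT * P[1][1]],
         [P[1][0] + DT * P[1][1],
          P[1][1] + Q_ACC * DT]]
    # Update with a noisy position measurement (H = [1, 0]).
    z = true_pos + random.gauss(0.0, R_POS ** 0.5)
    s = P[0][0] + R_POS              # innovation covariance
    k = [P[0][0] / s, P[1][0] / s]   # Kalman gain
    innov = z - x[0]
    x = [x[0] + k[0] * innov, x[1] + k[1] * innov]
    P = [[(1 - k[0]) * P[0][0], (1 - k[0]) * P[0][1]],
         [P[1][0] - k[1] * P[0][0], P[1][1] - k[1] * P[0][1]]]

print(round(x[1], 1))   # estimated velocity, should be near 1.0
```

    The thesis's framework follows the same predict/update cycle, but fuses multiple heterogeneous sensors through nonlinear measurement models, which is where the "extended" linearization actually matters.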