    Long-term Informative Path Planning with Autonomous Soaring

    The ability of UAVs to cover large areas efficiently is valuable for information gathering missions. For long-term information gathering, a UAV may extend its endurance by accessing energy sources present in the atmosphere. Thermals are a favourable source of wind energy, and thermal soaring is adopted in this thesis to enable long-term information gathering. This thesis proposes energy-constrained path planning algorithms for a gliding UAV to maximise information gain given a mission time that greatly exceeds the UAV's endurance. The thesis is motivated by the problem of probabilistic target search performed by an energy-constrained UAV, which is tasked to simultaneously search for a lost ground target and explore for thermals to regain energy. This problem is termed informative soaring (IFS) and combines informative path planning (IPP) with energy constraints. IFS is shown to be NP-hard by showing that it has a similar problem structure to the weight-constrained shortest path problem with replenishments. While an optimal solution may not be computable in polynomial time, this thesis proposes path planning algorithms based on informed tree search to find high-quality plans with low computational cost. This thesis addresses complex probabilistic belief maps and presents three primary contributions:
    • First, IFS is formulated as a graph search problem by observing that any feasible long-term plan must alternate between 1) information gathering between thermals and 2) replenishing energy within thermals. This is a first step towards reducing the large search state space.
    • The second contribution is observing that a complex belief map can be viewed as a collection of information clusters and using a divide-and-conquer approach, cluster tree search (CTS), to efficiently find high-quality plans in the large search state space. In CTS, near-greedy tree search is used to find locally optimal plans, and two global planning versions are proposed to combine local plans into a full plan. Monte Carlo simulation studies show that CTS produces similar plans to variations of exhaustive search, but runs five to 20 times faster. The more computationally efficient version, CTSDP, uses dynamic programming (DP) to optimally combine local plans. CTSDP is executed in real time on board a UAV to demonstrate computational feasibility.
    • The third contribution is an extension of CTS to unknown drifting thermals. A thermal exploration map is created to detect new thermals that will eventually intercept clusters, and therefore be valuable to the mission. Time windows are computed for known thermals and an optimal cluster visit schedule is formed. A tree search algorithm called CTSDrift combines CTS and thermal exploration. Using 2400 Monte Carlo simulations, CTSDrift is evaluated against a Full Knowledge method that has full knowledge of the thermal field and a Greedy method. On average, CTSDrift outperforms Greedy in one-third of trials, and achieves similar performance to Full Knowledge when environmental conditions are favourable.
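
    The alternating structure described in the first contribution (gather information on a leg between thermals, then replenish energy inside a thermal) can be illustrated with a small sketch. This is not the thesis's CTS or CTSDP algorithm; it is a plain exhaustive depth-first search over a toy thermal graph. The function name `best_plan`, the `legs` data, the fixed `climb_time`, and the assumption that the UAV fully recharges in every thermal are all hypothetical choices made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical toy graph: legs[a][b] = (flight_time, energy_cost, info_gain)
Leg = Tuple[float, float, float]


def best_plan(legs: Dict[str, Dict[str, Leg]], start: str, energy_cap: float,
              mission_time: float, climb_time: float) -> Tuple[float, List[str]]:
    """Exhaustively search thermal-to-thermal plans for maximum information gain.

    Assumptions (not from the source): the UAV leaves every thermal with
    full energy, each climb takes `climb_time`, and the information on a
    leg can only be collected once.
    """
    best_gain, best_path = 0.0, [start]

    def dfs(node: str, time_left: float, gain: float,
            path: List[str], searched: FrozenSet[FrozenSet[str]]) -> None:
        nonlocal best_gain, best_path
        if gain > best_gain:
            best_gain, best_path = gain, list(path)
        for nxt, (flight_time, energy_cost, info) in legs.get(node, {}).items():
            total_time = flight_time + climb_time
            # A leg is feasible only if it fits the energy and time budgets.
            if energy_cost > energy_cap or total_time > time_left:
                continue
            leg = frozenset((node, nxt))
            reward = 0.0 if leg in searched else info
            path.append(nxt)
            dfs(nxt, time_left - total_time, gain + reward, path, searched | {leg})
            path.pop()

    dfs(start, mission_time, 0.0, [start], frozenset())
    return best_gain, best_path


legs = {
    "A": {"B": (10, 4, 5.0), "C": (10, 9, 8.0)},
    "B": {"A": (10, 4, 5.0), "C": (10, 3, 2.0)},
    "C": {"B": (10, 3, 2.0), "A": (10, 9, 8.0)},
}
gain, path = best_plan(legs, "A", energy_cap=5, mission_time=40, climb_time=5)
print(gain, path)
```

    In this toy case the direct A to C leg is richer in information but exceeds the energy capacity, so the search is forced through thermal B. Exhaustive search of this kind grows exponentially with mission length, which is the cost that the clustering and dynamic programming described in the abstract aim to avoid.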

    Energy-optimal kinodynamic planning for underwater gliders in flow fields

    We consider energy-optimal navigation planning in flow fields, which is a long-standing optimisation problem with no known analytical solution. Using the motivating example of an underwater glider subject to ocean currents, we present an asymptotically optimal planning framework that considers realistic vehicle dynamics and provably returns an optimal solution in the limit. One key idea that we introduce is to reformulate the dynamic control problem as a kinematic problem with trim states, which encapsulate the dynamics over suitably long distances. We report simulation examples that, surprisingly, contravene the use of regular 'sawtooth' paths currently in widespread use. We show that, when internal control mechanics are taken into account, energy-efficient paths do not necessarily follow a regular up-and-down pattern. Our work represents a principled planning framework for underwater gliders that will enable improved navigation capability for both commercial and defence applications.

    Learning to soar: exploration strategies in reinforcement learning for resource-constrained missions

    An unpowered aerial glider learning to soar in a wind field presents a new manifestation of the exploration-exploitation trade-off. This thesis proposes a directed, adaptive and nonmyopic exploration strategy in a temporal difference reinforcement learning framework for tackling the resource-constrained exploration-exploitation task of this autonomous soaring problem. The complete learning algorithm is developed in a SARSA(λ) framework, which uses a Gaussian process with a squared exponential covariance function to approximate the value function. The three key contributions of this thesis form the proposed exploration-exploitation strategy. Firstly, a new information measure is derived from the change in the variance volume surrounding the Gaussian process estimate. This measure of information gain is used to define the exploration reward of an observation. Secondly, a nonmyopic information value is presented that captures both the immediate exploration reward due to taking an action as well as future exploration opportunities that result. Finally, this information value is combined with the state-action value of SARSA(λ) through a dynamic weighting factor to produce an exploration-exploitation management scheme for resource-constrained learning systems. The proposed learning strategy encourages either exploratory or exploitative behaviour depending on the requirements of the learning task and the available resources. The performance of the learning algorithms presented in this thesis is compared against other SARSA(λ) methods. Results show that actively directing exploration to regions of the state-action space with high uncertainty improves the rate of learning, while dynamic management of the exploration-exploitation behaviour according to the available resources produces prudent learning behaviour in resource-constrained systems.
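
    As a rough illustration of the two ingredients named in the abstract, the sketch below shows a standard tabular SARSA update with accumulating eligibility traces, plus a simple blend of state-action value and information value. It swaps in a lookup table where the thesis uses a Gaussian process, and the linear weighting in `combined_value` is a hypothetical stand-in: the abstract does not state the form of the dynamic weighting factor.

```python
from collections import defaultdict


def sarsa_lambda_step(q, trace, s, a, r, s2, a2, alpha=0.1, gamma=0.9, lam=0.8):
    """One tabular SARSA update with accumulating eligibility traces.

    q and trace map (state, action) pairs to floats.
    """
    # Temporal-difference error for the observed transition.
    delta = r + gamma * q[(s2, a2)] - q[(s, a)]
    trace[(s, a)] += 1.0
    for key in list(trace):
        # Every recently visited pair shares in the update, then decays.
        q[key] += alpha * delta * trace[key]
        trace[key] *= gamma * lam
    return delta


def combined_value(q_value, info_value, energy_fraction):
    """Hypothetical weighting: more energy in reserve favours exploration."""
    w = max(0.0, min(1.0, energy_fraction))
    return (1.0 - w) * q_value + w * info_value


q = defaultdict(float)
trace = defaultdict(float)
delta = sarsa_lambda_step(q, trace, s=0, a=0, r=1.0, s2=1, a2=0)
print(delta, q[(0, 0)], trace[(0, 0)])
print(combined_value(q_value=2.0, info_value=4.0, energy_fraction=0.25))
```

    With low `energy_fraction` the blended score is dominated by the learned value, which mirrors the prudent behaviour the abstract reports for resource-constrained learning.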

    Provably-Correct Task Planning for Autonomous Outdoor Robots

    Autonomous outdoor robots should be able to accomplish complex tasks safely and reliably while considering constraints that arise from both the environment and the physical platform. Such tasks extend basic navigation capabilities to specify a sequence of events over time. For example, an autonomous aerial vehicle can be given a surveillance task with contingency plans while complying with rules in regulated airspace, or an autonomous ground robot may need to guarantee a given probability of success while searching for the quickest way to complete the mission. A promising approach for the automatic synthesis of trusted controllers for complex tasks is to employ techniques from formal methods. In formal methods, tasks are formally specified symbolically with temporal logic. The robot then synthesises a controller automatically to execute trusted behaviour that guarantees the satisfaction of specified tasks and regulations. However, a difficulty arises from the lack of expressivity, which means the constraints affecting outdoor robots cannot be specified naturally with temporal logic. The goal of this thesis is to extend the capabilities of formal methods to express the constraints that arise from outdoor applications and synthesise provably-correct controllers with trusted behaviours over time. This thesis focuses on two important types of constraints, resource and safety constraints, and presents three novel algorithms that express tasks with these constraints and synthesise controllers that satisfy the specification. Firstly, this thesis proposes an extension to probabilistic computation tree logic (PCTL) called resource threshold PCTL (RT-PCTL) that naturally defines the mission specification with continuous resource threshold constraints; furthermore, it synthesises an optimal control policy with respect to the probability of success. 
With RT-PCTL, a state with accumulated resource out of the specified bound is considered to be failed or saturated depending on the specification. The requirements on resource bounds are naturally encoded in the symbolic specification, followed by the automatic synthesis of an optimal controller with respect to the probability of success. Secondly, the thesis proposes an online algorithm called greedy Büchi algorithm (GBA) that reduces the synthesis problem size to avoid the scalability problem. A framework is then presented with realistic control dynamics and physical assumptions in the environment such as wind estimation and fuel constraints. The time and space complexity for the framework is polynomial in the size of the system state, which is efficient for online synthesis. Lastly, the thesis proposes a synthesis algorithm for an optimal controller with respect to completion time given the minimum safety constraints. The algorithm naturally balances between completion time and safety. This work proves an analytical relationship between the probability of success and the conditional completion time given the mission specification. The theoretical contributions in this thesis are validated through realistic simulation examples. This thesis identifies and solves two core problems that contribute to the overall vision of developing a theoretical basis for trusted behaviour in outdoor robots. These contributions serve as a foundation for further research in multi-constrained task planning where a number of different constraints are considered simultaneously within a single framework.
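
    The idea of maximising the probability of success under a resource threshold can be sketched on a tiny Markov decision process. The code below is not RT-PCTL and involves no temporal logic; it is a memoised recursion over (state, remaining resource) pairs, under the assumptions that resource costs are positive integers and that exceeding the budget counts as failure. The `transitions` table and all names in it are hypothetical.

```python
from functools import lru_cache


def max_success_probability(transitions, goal, start, budget):
    """Maximum probability of reaching `goal` without exceeding `budget`.

    transitions[state][action] = (resource_cost, [(probability, next_state), ...])
    Assumption (not from the source): every cost is a positive integer,
    so the recursion always terminates.
    """

    @lru_cache(maxsize=None)
    def value(state, remaining):
        if state == goal:
            return 1.0
        best = 0.0
        for cost, outcomes in transitions.get(state, {}).values():
            if cost > remaining:
                continue  # taking this action would break the resource bound
            p = sum(prob * value(nxt, remaining - cost) for prob, nxt in outcomes)
            best = max(best, p)
        return best

    return value(start, budget)


transitions = {
    "s0": {
        "safe": (2, [(1.0, "s1")]),
        "risky": (1, [(0.6, "goal"), (0.4, "s0")]),
    },
    "s1": {"go": (2, [(1.0, "goal")])},
}
print(max_success_probability(transitions, "goal", "s0", budget=3))
print(max_success_probability(transitions, "goal", "s0", budget=4))
```

    With a budget of 3 the certain two-step route is unaffordable, so the best policy repeats the risky action; one extra unit of resource makes the certain route feasible. Tracking the remaining resource as part of the state is what lets the bound be enforced during synthesis rather than checked afterwards.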

    Unmanned Vehicle Systems & Operations on Air, Sea, Land

    Unmanned Vehicle Systems & Operations On Air, Sea, Land is our fourth textbook in a series covering the world of Unmanned Aircraft Systems (UAS) and Counter Unmanned Aircraft Systems (CUAS). (Nichols R. K., 2018) (Nichols R. K., et al., 2019) (Nichols R., et al., 2020) The authors have expanded their purview beyond UAS / CUAS systems. Our title shows our concern for growth and unique cyber security unmanned vehicle technology and operations for unmanned vehicles in all theaters: Air, Sea and Land – especially maritime cybersecurity and China proliferation issues. Topics include: Information Advances, Remote ID, and Extreme Persistence ISR; Unmanned Aerial Vehicles & How They Can Augment Mesonet Weather Tower Data Collection; Tour de Drones for the Discerning Palate; Underwater Autonomous Navigation & other UUV Advances; Autonomous Maritime Asymmetric Systems; UUV Integrated Autonomous Missions & Drone Management; Principles of Naval Architecture Applied to UUV’s; Unmanned Logistics Operating Safely and Efficiently Across Multiple Domains; Chinese Advances in Stealth UAV Penetration Path Planning in Combat Environment; UAS, the Fourth Amendment and Privacy; UV & Disinformation / Misinformation Channels; Chinese UAS Proliferation along New Silk Road Sea / Land Routes; Automaton, AI, Law, Ethics, Crossing the Machine – Human Barrier and Maritime Cybersecurity. Unmanned Vehicle Systems are an integral part of the US national critical infrastructure. The authors have endeavored to bring a breadth and quality of information to the reader that is unparalleled in the unclassified sphere. Unmanned Vehicle (UV) Systems & Operations On Air, Sea, Land discusses state-of-the-art technology / issues facing U.S. UV system researchers / designers / manufacturers / testers. 
We trust our newest look at Unmanned Vehicles in Air, Sea, and Land will enrich our students' and readers' understanding of the purview of this wonderful technology we call UV. https://newprairiepress.org/ebooks/1035/thumbnail.jp

    Active End-Effector Pose Selection for Tactile Object Recognition through Monte Carlo Tree Search

    This paper considers the problem of active object recognition using touch only. The focus is on adaptively selecting a sequence of wrist poses that achieves accurate recognition by enclosure grasps. It seeks to minimize the number of touches and maximize recognition confidence. The actions are formulated as wrist poses relative to each other, making the algorithm independent of absolute workspace coordinates. The optimal sequence is approximated by Monte Carlo tree search. We demonstrate results in a physics engine and on a real robot. In the physics engine, most object instances were recognized in at most 16 grasps. On a real robot, our method recognized objects in 2 to 9 grasps and outperformed a greedy baseline. Comment: Accepted to International Conference on Intelligent Robots and Systems (IROS) 201
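
    Monte Carlo tree search typically decides which action to try next by balancing observed reward against how rarely an action has been sampled. The sketch below shows the widely used UCB1 selection rule as one plausible instance; the abstract does not say which selection rule the paper uses, and the pose names and statistics here are made up for illustration.

```python
import math


def uct_select(children, exploration=math.sqrt(2)):
    """Pick the next action with the UCB1 rule.

    children maps an action to (visit_count, total_reward).
    Unvisited actions are always tried first.
    """
    total_visits = sum(n for n, _ in children.values())

    def score(item):
        _, (n, total_reward) = item
        if n == 0:
            return math.inf
        mean = total_reward / n
        bonus = exploration * math.sqrt(math.log(total_visits) / n)
        return mean + bonus

    return max(children.items(), key=score)[0]


# Hypothetical statistics for three candidate wrist poses.
stats = {"pose_a": (10, 6.0), "pose_b": (10, 4.0), "pose_c": (0, 0.0)}
print(uct_select(stats))
```

    Once every candidate has been visited, the rule favours the action with the best mean reward unless a less-sampled alternative has a large enough exploration bonus.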
