64 research outputs found

    Recent Advances in Scaling Up Gaussian Process Predictive Models for Large Spatiotemporal Data

    The expressive power of Gaussian process (GP) models comes at the cost of poor scalability in the size of the data. To improve their scalability, this paper presents an overview of our recent progress in scaling up GP models for large spatiotemporally correlated data through parallelization on clusters of machines, online learning, and nonmyopic active sensing/learning. Singapore-MIT Alliance (Subaward Agreement No. 41); Singapore-MIT Alliance (Subaward Agreement No. 52

    Technical Report: A Receding Horizon Algorithm for Informative Path Planning with Temporal Logic Constraints

    This technical report is an extended version of the paper 'A Receding Horizon Algorithm for Informative Path Planning with Temporal Logic Constraints' accepted to the 2013 IEEE International Conference on Robotics and Automation (ICRA). This paper considers the problem of finding the most informative path for a sensing robot under temporal logic constraints, a richer set of constraints than have previously been considered in information gathering. An algorithm for informative path planning is presented that leverages tools from information theory and formal control synthesis, and is proven to give a path that satisfies the given temporal logic constraints. The algorithm uses a receding horizon approach in order to provide a reactive, on-line solution while mitigating computational complexity. Statistics compiled from multiple simulation studies indicate that this algorithm performs better than a baseline exhaustive search approach. Comment: Extended version of paper accepted to 2013 IEEE International Conference on Robotics and Automation (ICRA)
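    The receding horizon idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the report's algorithm: the temporal logic constraints are omitted entirely, and the grid world, the `info_map` values, and the function names are assumptions made for illustration. The planner enumerates every move sequence over a short horizon, executes only the first move, then replans.

```python
from itertools import product

MOVES = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up


def best_first_move(pos, info, horizon):
    """Enumerate all move sequences of length `horizon` and return the
    first move of the sequence that collects the most information."""
    rows, cols = len(info), len(info[0])
    best_value, best_move = -1.0, None
    for seq in product(MOVES, repeat=horizon):
        r, c = pos
        visited, value, feasible = set(), 0.0, True
        for dr, dc in seq:
            r, c = r + dr, c + dc
            if not (0 <= r < rows and 0 <= c < cols):
                feasible = False  # sequence leaves the grid
                break
            if (r, c) not in visited:  # each cell pays out once per plan
                value += info[r][c]
                visited.add((r, c))
        if feasible and value > best_value:
            best_value, best_move = value, seq[0]
    return best_move


def receding_horizon_path(start, info, horizon, steps):
    """Plan over a short horizon, execute one move, then replan."""
    info = [row[:] for row in info]  # work on a copy of the map
    pos, path, collected = start, [start], 0.0
    for _ in range(steps):
        move = best_first_move(pos, info, horizon)
        if move is None:
            break
        pos = (pos[0] + move[0], pos[1] + move[1])
        collected += info[pos[0]][pos[1]]
        info[pos[0]][pos[1]] = 0.0  # information here is now exhausted
        path.append(pos)
    return path, collected


# Hypothetical 3x3 information map (illustrative values only).
info_map = [[0.0, 1.0, 0.0],
            [0.0, 0.0, 5.0],
            [0.0, 0.0, 0.0]]
path, collected = receding_horizon_path((0, 0), info_map, horizon=2, steps=3)
```

    The cost of each planning step grows as 4^horizon rather than 4^steps, which is the sense in which a short horizon trades optimality for tractability.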

    Learning to soar: exploration strategies in reinforcement learning for resource-constrained missions

    An unpowered aerial glider learning to soar in a wind field presents a new manifestation of the exploration-exploitation trade-off. This thesis proposes a directed, adaptive and nonmyopic exploration strategy in a temporal difference reinforcement learning framework for tackling the resource-constrained exploration-exploitation task of this autonomous soaring problem. The complete learning algorithm is developed in a SARSA(λ) framework, which uses a Gaussian process with a squared exponential covariance function to approximate the value function. The three key contributions of this thesis form the proposed exploration-exploitation strategy. Firstly, a new information measure is derived from the change in the variance volume surrounding the Gaussian process estimate. This measure of information gain is used to define the exploration reward of an observation. Secondly, a nonmyopic information value is presented that captures both the immediate exploration reward due to taking an action as well as future exploration opportunities that result. Finally, this information value is combined with the state-action value of SARSA(λ) through a dynamic weighting factor to produce an exploration-exploitation management scheme for resource-constrained learning systems. The proposed learning strategy encourages either exploratory or exploitative behaviour depending on the requirements of the learning task and the available resources. The performance of the learning algorithms presented in this thesis is compared against other SARSA(λ) methods. Results show that actively directing exploration to regions of the state-action space with high uncertainty improves the rate of learning, while dynamic management of the exploration-exploitation behaviour according to the available resources produces prudent learning behaviour in resource-constrained systems
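    As a rough illustration of how a drop in Gaussian process variance can serve as an exploration reward, the sketch below conditions a squared exponential GP on a single noisy observation and sums the variance reduction over a grid. This is a simplified stand-in, not the thesis's variance-volume measure; the one-dimensional input, the hyperparameter values, and the function names are assumptions.

```python
import math


def sq_exp_kernel(x1, x2, signal_var=1.0, length_scale=1.0):
    """Squared exponential covariance between two scalar inputs."""
    return signal_var * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def posterior_variance(x_query, x_obs, noise_var=1e-2):
    """GP posterior variance at x_query after one noisy observation at x_obs."""
    k_qq = sq_exp_kernel(x_query, x_query)
    k_qo = sq_exp_kernel(x_query, x_obs)
    k_oo = sq_exp_kernel(x_obs, x_obs) + noise_var
    return k_qq - k_qo ** 2 / k_oo


def variance_reduction(x_obs, grid, noise_var=1e-2):
    """Total drop in variance over the grid caused by observing at x_obs.
    Used here as a crude proxy for an exploration reward (assumption)."""
    return sum(
        sq_exp_kernel(x, x) - posterior_variance(x, x_obs, noise_var)
        for x in grid
    )


grid = [i * 0.5 for i in range(11)]  # query points 0.0, 0.5, ..., 5.0
centre_gain = variance_reduction(2.5, grid)
edge_gain = variance_reduction(0.0, grid)
```

    Under these assumptions an observation in the middle of the grid reduces more total variance than one at the edge, because its correlation reaches query points on both sides.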

    Long-term Informative Path Planning with Autonomous Soaring

    The ability of UAVs to cover large areas efficiently is valuable for information gathering missions. For long-term information gathering, a UAV may extend its endurance by accessing energy sources present in the atmosphere. Thermals are a favourable source of wind energy and thermal soaring is adopted in this thesis to enable long-term information gathering. This thesis proposes energy-constrained path planning algorithms for a gliding UAV to maximise information gain given a mission time that greatly exceeds the UAV's endurance. This thesis is motivated by the problem of probabilistic target-search performed by an energy-constrained UAV, which is tasked to simultaneously search for a lost ground target and explore for thermals to regain energy. This problem is termed informative soaring (IFS) and combines informative path planning (IPP) with energy constraints. IFS is shown to be NP-hard by showing that it has a similar problem structure to the weight-constrained shortest path problem with replenishments. While an optimal solution may not be computable in polynomial time, this thesis proposes path planning algorithms based on informed tree search to find high quality plans with low computational cost. This thesis addresses complex probabilistic belief maps and three primary contributions are presented:
    • First, IFS is formulated as a graph search problem by observing that any feasible long-term plan must alternate between 1) information gathering between thermals and 2) replenishing energy within thermals. This is a first step to reducing the large search state space.
    • The second contribution is observing that a complex belief map can be viewed as a collection of information clusters and using a divide and conquer approach, cluster tree search (CTS), to efficiently find high-quality plans in the large search state space. In CTS, near-greedy tree search is used to find locally optimal plans and two global planning versions are proposed to combine local plans into a full plan. Monte Carlo simulation studies show that CTS produces similar plans to variations of exhaustive search, but runs five to 20 times faster. The more computationally efficient version, CTSDP, uses dynamic programming (DP) to optimally combine local plans. CTSDP is executed in real time on board a UAV to demonstrate computational feasibility.
    • The third contribution is an extension of CTS to unknown drifting thermals. A thermal exploration map is created to detect new thermals that will eventually intercept clusters, and therefore be valuable to the mission. Time windows are computed for known thermals and an optimal cluster visit schedule is formed. A tree search algorithm called CTSDrift combines CTS and thermal exploration. Using 2400 Monte Carlo simulations, CTSDrift is evaluated against a Full Knowledge method that has full knowledge of the thermal field and a Greedy method. On average, CTSDrift outperforms Greedy in one-third of trials, and achieves similar performance to Full Knowledge when environmental conditions are favourable

    Adaptive Informative Path Planning with Multimodal Sensing

    Adaptive Informative Path Planning (AIPP) problems model an agent tasked with obtaining information subject to resource constraints in unknown, partially observable environments. Existing work on AIPP has focused on representing observations about the world as a result of agent movement. We formulate the more general setting where the agent may choose between different sensors at the cost of some energy, in addition to traversing the environment to gather information. We call this problem AIPPMS (MS for Multimodal Sensing). AIPPMS requires reasoning jointly about the effects of sensing and movement in terms of both energy expended and information gained. We frame AIPPMS as a Partially Observable Markov Decision Process (POMDP) and solve it with online planning. Our approach is based on the Partially Observable Monte Carlo Planning framework with modifications to ensure constraint feasibility and a heuristic rollout policy tailored for AIPPMS. We evaluate our method on two domains: a simulated search-and-rescue scenario and a challenging extension to the classic RockSample problem. We find that our approach outperforms a classic AIPP algorithm that is modified for AIPPMS, as well as online planning using a random rollout policy. Comment: First two authors contributed equally; International Conference on Automated Planning and Scheduling (ICAPS) 202

    Adaptive sampling of transient environmental phenomena with autonomous mobile platforms

    Submitted in partial fulfillment of the requirements for the degree of Master of Science in Aeronautics and Astronautics at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, September 2019. In the environmental and earth sciences, hypotheses about transient phenomena have been universally investigated by collecting physical sample materials and performing ex situ analysis. Although this is the gold standard, logistical challenges limit its overall efficacy: the number of samples is limited to what can be stored and transported, human experts must be able to safely access or directly observe the target site, and time in the field and subsequently in the laboratory increases overall campaign expense. As a result, the temporal detail and spatial diversity in the samples may fail to capture insightful structure of the phenomenon of interest. The development of in situ instrumentation allows for near real-time analysis of physical phenomena through observational strategies (e.g., optical), and in combination with unmanned mobile platforms, has considerably impacted field operations in the sciences. In practice, mobile platforms are either remotely operated or perform guided, supervised autonomous missions specified as navigation between human-selected waypoints. Missions like these are useful for gaining insight about a particular target site, but can be sample-sparse in scientifically valuable regions, particularly in complex or transient distributions. A skilled human expert and pilot can dynamically adjust mission trajectories based on sensor information. Encoding their insight onto a vehicle to enable adaptive sampling behaviors can broadly increase the utility of mobile platforms in the sciences. This thesis presents three field campaigns conducted with a human-piloted marine surface vehicle, the ChemYak, to study the greenhouse gases methane (CH4) and carbon dioxide (CO2) in estuaries, rivers, and the open ocean. These studies illustrate the utility of mobile surface platforms for environmental research, and highlight key challenges of studying transient phenomena. This thesis then formalizes the maximum seek-and-sample (MSS) adaptive sampling problem, which requires a mobile vehicle to efficiently find and densely sample from the most scientifically valuable region in an a priori unknown, dynamic environment. The PLUMES algorithm (Plume Localization under Uncertainty using Maximum-ValuE information and Search) is subsequently presented, which addresses the MSS problem and overcomes key technical challenges with planning in natural environments. Theoretical performance guarantees are derived for PLUMES, and empirical performance is demonstrated against canonical uniform search and state-of-the-art baselines in simulation and field trials. Ultimately, this thesis examines the challenges of autonomous informative sampling in the environmental and earth sciences. In order to create useful systems that perform diverse scientific objectives in natural environments, approaches from robotics planning, field design, Bayesian optimization, machine learning, and the sciences must be drawn together. PLUMES captures the breadth and depth required to solve a specific objective within adaptive sampling, and this work as a whole highlights the potential for mobile technologies to perform intelligent autonomous science in the future

    Information-Based Hierarchical Planning for a Mobile Sensing Network in Environmental Mapping

    This article investigates the problem of information-based sampling design and path planning for a mobile sensing network to predict scalar fields of monitored environments. A hierarchical framework with a built-in Gaussian Markov random field model is proposed to provide adaptive sampling for efficient field reconstruction. In the proposed framework, a nonmyopic planner is operated at a sink to navigate the mobile sensing agents in the field to the sites that are most informative. Meanwhile, a myopic planner is carried out on board each agent. A tradeoff between computationally intensive global optimization and efficient local greedy search is incorporated into the system. The mobile sensing agents can be scheduled online through an anytime algorithm to visit and observe the high-information sites. Experiments on both synthetic and real-world datasets are used to demonstrate the feasibility and efficiency of the proposed planner in model exploitation and adaptive sampling for environmental field mapping