
    Anytime Point-Based Approximations for Large POMDPs

    The Partially Observable Markov Decision Process (POMDP) has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However, exact solutions in this framework are typically computationally intractable for all but the smallest problems. A well-known technique for speeding up POMDP solving involves performing value backups at specific belief points, rather than over the entire belief simplex. The efficiency of this approach, however, depends greatly on the selection of points. This paper presents a set of novel techniques for selecting informative belief points which work well in practice. The point selection procedure is combined with point-based value backups to form an effective anytime POMDP algorithm called Point-Based Value Iteration (PBVI). The first aim of this paper is to introduce this algorithm and present a theoretical analysis justifying the choice of belief selection technique. The second aim of this paper is to provide a thorough empirical comparison between PBVI and other state-of-the-art POMDP methods, in particular the Perseus algorithm, in an effort to highlight their similarities and differences. Evaluation is performed using both standard POMDP domains and realistic robotic tasks.
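
The value backup at a single belief point mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the generic textbook point-based backup, not the paper's implementation. The function name `point_based_backup` and the array layouts (`T[a, s, s']` for transitions, `Z[a, s', o]` for observations, `R[a, s]` for rewards) are assumptions made for this example, and the belief selection procedure that the paper contributes is not shown.

```python
import numpy as np

def point_based_backup(b, alphas, T, Z, R, gamma):
    """Return the backed-up alpha vector for one belief point b.

    Assumed (hypothetical) array layouts:
      b      : (S,)      belief over states
      alphas : (K, S)    current set of alpha vectors
      T      : (A, S, S) T[a, s, s'] = P(s' | s, a)
      Z      : (A, S, O) Z[a, s', o] = P(o | s', a)
      R      : (A, S)    immediate reward for action a in state s
    """
    n_actions = T.shape[0]
    n_obs = Z.shape[2]
    best_value, best_alpha = -np.inf, None
    for a in range(n_actions):
        g_a = R[a].astype(float).copy()
        for o in range(n_obs):
            # g[k, s] = sum_{s'} T[a, s, s'] * Z[a, s', o] * alphas[k, s']
            g = (alphas * Z[a, :, o]) @ T[a].T
            # Keep only the projected vector that is best at this belief.
            g_a += gamma * g[np.argmax(g @ b)]
        value = g_a @ b
        if value > best_value:
            best_value, best_alpha = value, g_a
    return best_alpha

# Toy 2-state, 2-action, 2-observation problem (made up for illustration).
T = np.stack([np.eye(2), np.eye(2)])   # both actions leave the state unchanged
Z = np.full((2, 2, 2), 0.5)            # observations carry no information
R = np.array([[1.0, 0.0], [0.0, 1.0]])
alphas = np.array([[1.0, 0.0]])
new_alpha = point_based_backup(np.array([0.9, 0.1]), alphas, T, Z, R, gamma=0.5)
```

Because only one alpha vector is produced per belief point, the cost of a sweep grows with the number of selected points rather than with the size of the belief simplex, which is why the choice of points matters so much.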

    Decision-Making Under Uncertainty: Beyond Probabilities

    This position paper reflects on the state of the art in decision-making under uncertainty. A classical assumption is that probabilities can sufficiently capture all uncertainty in a system. In this paper, the focus is on the uncertainty that goes beyond this classical interpretation, particularly by employing a clear distinction between aleatoric and epistemic uncertainty. The paper features an overview of Markov decision processes (MDPs) and extensions to account for partial observability and adversarial behavior. These models sufficiently capture aleatoric uncertainty but fail to account for epistemic uncertainty robustly. Consequently, we present a thorough overview of so-called uncertainty models that exhibit uncertainty in a more robust interpretation. We show several solution techniques for both discrete and continuous models, ranging from formal verification, over control-based abstractions, to reinforcement learning. As an integral part of this paper, we list and discuss several key challenges that arise when dealing with rich types of uncertainty in a model-based fashion.

    Environment Search Planning Subject to High Robot Localization Uncertainty

    As robots find applications in more complex roles, ranging from search and rescue to healthcare and services, they must be robust to greater levels of localization uncertainty and uncertainty about their environments. Without consideration for such uncertainties, robots will not be able to compensate accordingly, potentially leading to mission failure or injury to bystanders. This work addresses the task of searching a 2D area while reducing localization uncertainty. The environment provides low-uncertainty pose updates from short-range beacons that cover only part of the environment. Elsewhere, the robot localizes using dead reckoning, relying on wheel encoder and yaw rate information from a gyroscope. As such, outside of the regions with position updates, localization error grows without constraint over time. The work contributes a Belief Markov Decision Process formulation for solving the search problem and evaluates the performance using Partially Observable Monte Carlo Planning (POMCP). Additionally, the work contributes an approximate Markov Decision Process formulation and reduced-complexity state representation. The approximate problem is evaluated using value iteration. To provide a baseline, the Google OR-Tools package is used to solve the travelling salesman problem (TSP). Results are verified by simulating a differential drive robot in the Gazebo simulation environment. POMCP results indicate planning can be tuned to prioritize constraining uncertainty at the cost of increasing path length. The MDP formulation provides consistently lower uncertainty with minimal increases in path length over the TSP solution. Both formulations show improved coverage outcomes.

    Human-Centered Autonomy for UAS Target Search

    Current methods of deploying robots that operate in dynamic, uncertain environments, such as Uncrewed Aerial Systems in search & rescue missions, require nearly continuous human supervision for vehicle guidance and operation. These methods do not consider high-level mission context, resulting in cumbersome manual operation or inefficient exhaustive search patterns. We present a human-centered autonomous framework that infers geospatial mission context through dynamic feature sets, which then guides a probabilistic target search planner. Operators provide a set of diverse inputs, including priority definition, spatial semantic information about ad-hoc geographical areas, and reference waypoints, which are probabilistically fused with geographical database information and condensed into a geospatial distribution representing an operator's preferences over an area. An online, POMDP-based planner, optimized for target searching, is augmented with this reward map to generate an operator-constrained policy. Our results, simulated based on input from five professional rescuers, display effective task mental model alignment, 18% more victim finds, and 15 times more efficient guidance plans than current operational methods. Comment: Extended version of ICRA conference submission. 9 pages, 5 figures

    HARPS: An Online POMDP Framework for Human-Assisted Robotic Planning and Sensing

    Autonomous robots can benefit greatly from human-provided semantic characterizations of uncertain task environments and states. However, the development of integrated strategies which let robots model, communicate, and act on such 'soft data' remains challenging. Here, the Human Assisted Robotic Planning and Sensing (HARPS) framework is presented for active semantic sensing and planning in human-robot teams to address these gaps by formally combining the benefits of online sampling-based POMDP policies, multimodal semantic interaction, and Bayesian data fusion. This approach lets humans opportunistically impose model structure and extend the range of semantic soft data in uncertain environments by sketching and labeling arbitrary landmarks across the environment. Dynamic updating of the environment model during search allows robotic agents to actively query humans for novel and relevant semantic data, thereby improving beliefs of unknown environments and states for improved online planning. Simulations of a UAV-enabled target search application in a large-scale partially structured environment show significant improvements in time and belief state estimates required for interception versus conventional planning based solely on robotic sensing. Human subject studies in the same environment (n = 36) demonstrate an average doubling in dynamic target capture rate compared to the lone robot case, and highlight the robustness of active probabilistic reasoning and semantic sensing over a range of user characteristics and interaction modalities.

    Balancing exploration and exploitation: task-targeted exploration for scientific decision-making

    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution September 2022. How do we collect observational data that reveal fundamental properties of scientific phenomena? This is a key challenge in modern scientific discovery. Scientific phenomena are complex: they have high-dimensional and continuous state, exhibit chaotic dynamics, and generate noisy sensor observations. Additionally, scientific experimentation often requires significant time, money, and human effort. In the face of these challenges, we propose to leverage autonomous decision-making to augment and accelerate human scientific discovery. Autonomous decision-making in scientific domains faces an important and classical challenge: balancing exploration and exploitation when making decisions under uncertainty. This thesis argues that efficient decision-making in real-world, scientific domains requires task-targeted exploration, that is, exploration strategies that are tuned to a specific task. By quantifying the change in task performance due to exploratory actions, we enable decision-makers that can contend with highly uncertain real-world environments, performing exploration parsimoniously to improve task performance. The thesis presents three novel paradigms for task-targeted exploration that are motivated by and applied to real-world scientific problems. We first consider exploration in partially observable Markov decision processes (POMDPs) and present two novel planners that leverage task-driven information measures to balance exploration and exploitation. These planners drive robots in simulation and oceanographic field trials to robustly identify plume sources and track targets with stochastic dynamics. We next consider the exploration-exploitation trade-off in online learning paradigms, a robust alternative to POMDPs when the environment is adversarial or difficult to model. We present novel online learning algorithms that balance exploitative and exploratory plays optimally under real-world constraints, including delayed feedback, partial predictability, and short regret horizons. We use these algorithms to perform model selection for subseasonal temperature and precipitation forecasting, achieving state-of-the-art forecasting accuracy. The human scientific endeavor is poised to benefit from our emerging capacity to integrate observational data into the process of model development and validation. Realizing the full potential of these data requires autonomous decision-makers that can contend with the inherent uncertainty of real-world scientific domains. This thesis highlights the critical role that task-targeted exploration plays in efficient scientific decision-making and proposes three novel methods to achieve task-targeted exploration in real-world oceanographic and climate science applications. This material is based upon work supported by the NSF Graduate Research Fellowship Program and a Microsoft Research PhD Fellowship, as well as the Department of Energy / National Nuclear Security Administration under Award Number DE-NA0003921, the Office of Naval Research under Award Number N00014-17-1-2072, and DARPA under Award Number HR001120C0033.