
    HyP-DESPOT: A Hybrid Parallel Algorithm for Online Planning under Uncertainty

    Planning under uncertainty is critical for robust robot performance in uncertain, dynamic environments, but it incurs high computational cost. State-of-the-art online search algorithms, such as DESPOT, have vastly improved the computational efficiency of planning under uncertainty and made it a valuable tool for robotics in practice. This work takes one step further by leveraging both CPU and GPU parallelization to achieve near real-time online planning performance for complex tasks with large state, action, and observation spaces. Specifically, we propose Hybrid Parallel DESPOT (HyP-DESPOT), a massively parallel online planning algorithm that integrates CPU and GPU parallelism in a multi-level scheme. It performs parallel DESPOT tree search by simultaneously traversing multiple independent paths using multi-core CPUs, and performs parallel Monte-Carlo simulations at the leaf nodes of the search tree using GPUs. Experimental results show that HyP-DESPOT speeds up online planning by up to several hundred times compared with the original DESPOT algorithm on several challenging robotic tasks in simulation.
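
    The hybrid scheme the abstract describes can be pictured with a short sketch: CPU threads traverse independent search paths, and all leaf rollouts are batched so that a single GPU kernel could evaluate them together. The Python below is an illustrative sketch under assumed data structures (Node, select_path), not the authors' implementation; vectorized NumPy stands in for the CUDA rollout kernel.

        import random
        from concurrent.futures import ThreadPoolExecutor
        from dataclasses import dataclass, field

        import numpy as np

        @dataclass
        class Node:
            particles: np.ndarray                    # belief particles (same count per leaf)
            children: list = field(default_factory=list)
            value: float = 0.0

        def select_path(root):
            # CPU side: walk one independent path from the root down to a leaf.
            node = root
            while node.children:
                node = random.choice(node.children)  # placeholder for DESPOT's bound-based rule
            return node

        def hybrid_search_step(root, n_paths=8):
            # Multi-core CPU phase: expand several independent paths in parallel.
            with ThreadPoolExecutor(max_workers=n_paths) as pool:
                leaves = list(pool.map(lambda _: select_path(root), range(n_paths)))
            # GPU phase (emulated): batch every leaf's particles into one array so
            # a single kernel launch could roll them all out together; vectorized
            # NumPy with a toy return function plays that role here.
            states = np.stack([leaf.particles for leaf in leaves])
            returns = np.cos(states).sum(axis=-1)
            for leaf, vals in zip(leaves, returns):
                leaf.value = float(vals.mean())      # back up Monte-Carlo estimates

        root = Node(particles=np.zeros((4, 3)),
                    children=[Node(particles=np.random.rand(4, 3)) for _ in range(2)])
        hybrid_search_step(root, n_paths=4)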

    Tractable POMDP-planning for robots with complex non-linear dynamics

    Planning under partial observability is an essential capability of autonomous robots. While robots operate in the real world, they are inherently subject to various uncertainties such as control and sensing errors, and limited information regarding the operating environment. Conceptually, these types of planning problems can be solved in a principled manner when framed as a Partially Observable Markov Decision Process (POMDP). POMDPs model the aforementioned uncertainties as conditional probability functions and estimate the state of the system as probability functions over the state space, called beliefs. Instead of computing the best strategy with respect to single states, POMDP solvers compute the best strategy with respect to beliefs. Solving a POMDP exactly is computationally intractable in general. However, in the past two decades we have seen tremendous progress in the development of approximately optimal solvers that trade optimality for computational tractability. Despite this progress, approximately solving POMDPs for systems with complex non-linear dynamics remains challenging. Most state-of-the-art solvers rely on a large number of expensive forward simulations of the system to find an approximately optimal strategy. For systems with complex non-linear dynamics that admit no closed-form solution, this strategy can become prohibitively expensive. Another difficulty in applying POMDPs to physical robots with complex transition dynamics is that almost all implementations of state-of-the-art online POMDP solvers restrict the user to specific data structures for the POMDP model, and the model has to be hard-coded within the solver implementation. This, in turn, severely hinders the process of applying POMDPs to physical robots. In this thesis we aim to make POMDPs more practical for realistic robotic motion planning tasks under partial observability. We show that systematic approximations of complex, non-linear transition dynamics can be used to design online POMDP solvers that are more efficient than current solvers. Furthermore, we propose a new software framework that supports the user in modeling complex planning problems under uncertainty with minimal implementation effort.
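
    One concrete instance of the "systematic approximation" idea is to replace an expensive non-linear transition function with a first-order Taylor expansion around a nominal point, so the many forward simulations inside a solver reduce to cheap matrix-vector products. The sketch below is illustrative only; the unicycle model stands in for the complex dynamics, and the thesis is not tied to this particular approximation.

        import numpy as np

        def f(x, u, dt=0.05):
            # Stand-in non-linear transition: a unicycle with an Euler step.
            px, py, theta = x
            return np.array([px + dt * u[0] * np.cos(theta),
                             py + dt * u[0] * np.sin(theta),
                             theta + dt * u[1]])

        def linearize(f, x0, u0, eps=1e-5):
            # Finite-difference Jacobians A = df/dx and B = df/du at (x0, u0).
            n, m = len(x0), len(u0)
            A, B = np.zeros((n, n)), np.zeros((n, m))
            fx = f(x0, u0)
            for i in range(n):
                dx = np.zeros(n); dx[i] = eps
                A[:, i] = (f(x0 + dx, u0) - fx) / eps
            for j in range(m):
                du = np.zeros(m); du[j] = eps
                B[:, j] = (f(x0, u0 + du) - fx) / eps
            return A, B, fx

        x0, u0 = np.array([0.0, 0.0, 0.1]), np.array([1.0, 0.2])
        A, B, fx = linearize(f, x0, u0)
        x, u = x0 + np.array([0.01, -0.02, 0.005]), u0    # a nearby query point
        x_next = fx + A @ (x - x0) + B @ (u - u0)         # cheap approximate step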

    Online POMDP Planning for Vehicle Navigation in Densely Populated Area

    Master's thesis (Master of Science).

    Stochastic Motion Planning For Mobile Robots

    Stochastic motion planning is of crucial importance in robotic applications, not only because of imperfect models of robot dynamics and sensing but also because of the potentially unknown environment. Due to efficiency considerations, practical methods often introduce additional assumptions or heuristics, such as the use of the separation theorem, into the solution. However, there are intrinsic limitations of practical frameworks that prevent further improving the reliability and robustness of the system, and these cannot be addressed with minor tweaks. Therefore, it is necessary to develop theoretically justified solutions to stochastic motion planning problems. Despite the challenges in developing such solutions, the reward is unparalleled due to their wide impact on a majority of, if not all, robotic applications. The overall goal of this dissertation is to develop solutions for stochastic motion planning problems with theoretical justifications and demonstrate their superior performance in real-world applications. In the first part of this dissertation, we model the stochastic motion planning problem as a Partially Observable Markov Decision Process (POMDP) and propose two solutions featuring different optimization regimes, trading off model generality and efficiency. The first is a gradient-based solution built on the iterative Linear Quadratic Gaussian (iLQG) method, assuming explicit model formulations and Gaussian noise. The special structure of the problem allows a time-varying affine policy to be solved offline, leading to efficient online usage. The proposed algorithm addresses limitations of previous work on iLQG in handling nondifferentiable system models and sparse informative measurements. The second solution is a sampling-based general POMDP solver assuming mild conditions on the control space and measurement models. The generality of the problem formulation promises wide applications of the algorithm. The proposed solution addresses the degeneracy issue of Monte Carlo tree search when applied to continuous POMDPs, especially for systems with continuous measurement spaces. Through theoretical analysis, we show that the proposed algorithm is a valid Monte Carlo control algorithm alternating unbiased policy evaluation and policy improvement. In the second part of this dissertation, we apply the proposed solutions to different robotic applications where the dominant uncertainty comes either from the robot itself or from the external environment. We first consider mobile robot navigation in a known environment, where the major sources of uncertainty are the robot's dynamics and sensing noise. Although the problem is widely studied, little prior work has applied POMDP solutions to it. By demonstrating the superior performance of the proposed solutions on such a familiar application, the importance of stochastic motion planning may be better appreciated by the robotics community. We also apply the proposed solutions to autonomous driving, where the dominant uncertainty comes from the external environment, i.e. the unknown behavior of human drivers. In this work, we propose a data-driven model for the stochastic traffic dynamics in which we explicitly model the intentions of human drivers. To the best of our knowledge, this is the first work that applies POMDP solutions to data-driven traffic models. Through simulations, we show that the proposed solutions develop high-level intelligent behaviors and outperform similar methods that also consider uncertainties in the autonomous driving application.
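
    To make the offline/online split of the first (iLQG-based) solution concrete: the offline solve produces a nominal trajectory and time-varying feedback gains, and online execution is then a single matrix multiply per step. A minimal sketch, with placeholder arrays in place of an actual iLQG solve:

        import numpy as np

        T, n, m = 50, 4, 2                        # horizon, state dim, control dim
        x_bar = np.zeros((T, n))                  # nominal states   (from the offline solve)
        u_bar = np.zeros((T, m))                  # nominal controls (from the offline solve)
        K = np.zeros((T, m, n))                   # time-varying feedback gains

        def act(t, x_est):
            # Online: apply the time-varying affine policy around the nominal trajectory.
            return u_bar[t] + K[t] @ (x_est - x_bar[t])

        u0 = act(0, np.array([0.1, 0.0, -0.2, 0.05]))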

    Rule-Based Policy Interpretation and Shielding for Partially Observable Monte Carlo Planning

    Partially Observable Monte Carlo Planning (POMCP) is a powerful online algorithm that can generate approximate policies for large Partially Observable Markov Decision Processes. The online nature of this method supports scalability by avoiding a complete policy representation. However, the lack of an explicit representation of the policy hinders interpretability. In this thesis, we propose a methodology based on Maximum Satisfiability Modulo Theories (MAX-SMT) for analyzing POMCP policies by inspecting their traces, namely, sequences of belief-action pairs generated by the algorithm. The proposed method explores local properties of the policy to build a compact and informative summary of the policy's behaviour. This representation exploits a high-level description encoded using logical formulas that domain experts can provide. The final formula can be used to identify unexpected decisions, namely, decisions that violate the expert indications. We show that this identification process can be used offline (to improve the explainability of the policy and to identify anomalous behaviours) or online (to shield the decisions of the POMCP algorithm). We also present an active methodology that can effectively query a POMCP policy to quickly build more reliable descriptions. We extensively evaluate our methodologies on two standard benchmarks for POMDPs, namely tiger and rocksample, and on a problem related to velocity regulation in mobile robot navigation. Results show that our approach achieves good performance due to its capability to exploit experts' knowledge of the domains. Specifically, our approach can be used both to identify anomalous behaviours in faulty POMCPs and to improve the performance of the system by using the shielding mechanism. In the first case, we test the methodology against a state-of-the-art anomaly detection algorithm; in the second, we compare the performance of shielded and unshielded POMCPs. Our implementation is open-source and available at https://github.com/GiuMaz/XPOMCP.
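
    To illustrate the trace-analysis and shielding idea on the tiger domain, the sketch below uses the z3 solver's MAX-SMT mode to fit a confidence threshold to belief-action traces, then uses the learned rule to shield online decisions. The traces, rule template, and action names are invented for illustration; this is a sketch of the idea, not the XPOMCP code itself.

        from z3 import Optimize, Real, RealVal, Not, sat

        # Hypothetical traces: (probability of tiger-left, action chosen by POMCP).
        traces = [(0.50, "listen"), (0.15, "open_left"),
                  (0.88, "open_right"), (0.62, "listen")]

        threshold = Real("threshold")  # rule template: listen whenever confidence < threshold
        opt = Optimize()
        opt.add(threshold > 0, threshold < 1)
        for p, action in traces:
            confidence = RealVal(max(p, 1 - p))   # belief in the more likely state
            rule_fires = confidence < threshold
            # Soft constraints: MAX-SMT picks the threshold violating as few steps as possible.
            opt.add_soft(rule_fires if action == "listen" else Not(rule_fires))

        assert opt.check() == sat
        t = float(opt.model()[threshold].as_fraction())

        def shield(p_tiger_left, proposed):
            # Online shielding: override any decision that violates the learned rule.
            if max(p_tiger_left, 1 - p_tiger_left) < t and proposed != "listen":
                return "listen"
            return proposed

        print("listen if confidence <", t, "| shielded:", shield(0.55, "open_left"))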

    Multi-Policy Decision Making for Reliable Navigation in Dynamic Uncertain Environments

    Navigating everyday social environments in the presence of pedestrians and other dynamic obstacles remains one of the key challenges preventing mobile robots from leaving carefully designed spaces and entering our daily lives. The complex and tightly-coupled interactions between these agents make the environment dynamic and unpredictable, posing a formidable problem for robot motion planning. Trajectory planning methods, supported by models of typical human behavior and personal space, often produce reasonable behavior. However, they do not account for the future closed-loop interactions of other agents with the trajectory being constructed. As a consequence, the trajectories are unable to anticipate cooperative interactions (such as a human yielding) or adverse interactions (such as the robot blocking the way). Ideally, the robot must account for coupled agent-agent interactions while reasoning about possible future outcomes, and then take actions that advance it towards its navigational goal without inconveniencing nearby pedestrians. Multi-Policy Decision Making (MPDM) is a novel framework for autonomous navigation in dynamic, uncertain environments in which the robot's trajectory is not explicitly planned; instead, the robot dynamically switches between a set of candidate closed-loop policies, allowing it to adapt to different situations encountered in such environments. The candidate policies are evaluated based on short-term (five-second) forward simulations of samples drawn from the estimated distribution of the agents' current states. These forward simulations, and thereby the cost function, capture agent-agent interactions as well as agent-robot interactions, which depend on the ego-policy being evaluated. In this thesis, we propose MPDM as a new method for navigation amongst pedestrians by dynamically switching amongst a library of closed-loop policies. Due to real-time constraints, the robot's emergent behavior is directly affected by the quality of policy evaluation. Approximating how good a policy is based on only a few forward roll-outs is difficult, especially given the large space of possible pedestrian configurations and the sensitivity of the forward simulation to the sampled configurations. Traditional methods based on Monte-Carlo sampling often miss likely, high-cost outcomes, resulting in an over-optimistic evaluation of a policy and unreliable emergent behavior. By re-formulating policy evaluation as an optimization problem and enabling the quick discovery of potentially dangerous outcomes, we make MPDM more reliable and risk-aware. Even with the increased reliability, a major limitation is that MPDM requires the system designer to provide a set of carefully hand-crafted policies, as it can evaluate only a few policies reliably in real-time. We radically enhance the expressivity of MPDM by allowing policies to have continuous-valued parameters, while simultaneously satisfying real-time constraints by quickly discovering promising policy parameters through a novel iterative gradient-based algorithm. Overall, we reformulate the traditional motion planning problem in a very different light: as a bilevel optimization problem in which the robot repeatedly discovers likely high-cost outcomes and adapts its policy parameters to avoid them. We demonstrate significant performance benefits through extensive experiments in simulation as well as on a physical robot platform operating in a semi-crowded environment.
    PhD dissertation, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/150017/1/dhanvinm_1.pd
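
    The policy-election loop at the core of MPDM is compact enough to sketch: sample agent configurations from the current belief, forward-simulate each candidate closed-loop policy for a short horizon, and execute the lowest-cost policy. Everything below (dynamics, cost, and candidate policies) is a toy stand-in for the models the thesis actually uses.

        import numpy as np

        rng = np.random.default_rng(0)

        def simulate(policy, config, horizon=50):
            # Closed-loop rollout of one sampled world; 50 steps ~ five seconds at 10 Hz.
            cost = 0.0
            for _ in range(horizon):
                config = config + 0.1 * policy(config) + 0.01 * rng.normal(size=config.shape)
                cost += float(np.linalg.norm(config))   # stand-in for blame/progress terms
            return cost

        def elect(policies, belief_samples):
            # Score each candidate over all sampled configurations; pick the best.
            scores = [np.mean([simulate(p, s.copy()) for s in belief_samples])
                      for p in policies]
            return policies[int(np.argmin(scores))]

        go = lambda c: -c                        # toy candidate: drive the state to zero
        stop = lambda c: np.zeros_like(c)        # toy candidate: hold still
        best = elect([go, stop], [rng.normal(size=4) for _ in range(10)])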

    Adaptive Robotic Information Gathering via Non-Stationary Gaussian Processes

    Robotic Information Gathering (RIG) is a foundational research topic that asks how a robot (or robot team) collects informative data to efficiently build an accurate model of an unknown target function under robot embodiment constraints. RIG has many applications, including but not limited to autonomous exploration and mapping, 3D reconstruction and inspection, search and rescue, and environmental monitoring. A RIG system relies on a probabilistic model's prediction uncertainty to identify critical areas for informative data collection. Gaussian Processes (GPs) with stationary kernels have been widely adopted for spatial modeling. However, real-world spatial data is typically non-stationary: different locations do not have the same degree of variability. As a result, the prediction uncertainty does not accurately reveal the prediction error, limiting the success of RIG algorithms. We propose a family of non-stationary kernels named the Attentive Kernel (AK), which is simple and robust and can extend any existing kernel to a non-stationary one. We evaluate the new kernel on elevation mapping tasks, where the AK provides better accuracy and uncertainty quantification than the commonly used stationary kernels and the leading non-stationary kernels. The improved uncertainty quantification guides the downstream informative planner to collect more valuable data around high-error areas, further increasing prediction accuracy. A field experiment demonstrates that the proposed method can guide an Autonomous Surface Vehicle (ASV) to prioritize data collection in locations with significant spatial variations, enabling the model to characterize salient environmental features.
    Comment: International Journal of Robotics Research (IJRR).
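
    The construction behind the Attentive Kernel can be sketched in a few lines: mix several stationary RBF kernels with input-dependent, normalized (attention-style) weights, so the effective lengthscale varies across the input space. The weight function below is a hand-rolled stand-in for the learned attention network in the paper, and the lengthscales and features are illustrative.

        import numpy as np

        lengthscales = np.array([0.1, 0.5, 2.0])    # base stationary kernels to mix

        def rbf(x, y, ell):
            return np.exp(-0.5 * ((x - y) / ell) ** 2)

        def weights(x):
            # Input-dependent mixture weights, normalized like attention scores.
            logits = -np.abs(x - np.array([0.0, 2.0, 5.0]))   # illustrative features
            e = np.exp(logits - logits.max())
            return e / e.sum()

        def attentive_kernel(x, y):
            # k(x, y) = sum_m w_m(x) * w_m(y) * k_m(x, y): positive semi-definite,
            # since each summand is a PSD kernel scaled by a rank-one PSD weighting.
            wx, wy = weights(x), weights(y)
            return float(sum(wx[m] * wy[m] * rbf(x, y, lengthscales[m]) for m in range(3)))

        print(attentive_kernel(0.1, 0.2), attentive_kernel(4.9, 5.0))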