HyP-DESPOT: A Hybrid Parallel Algorithm for Online Planning under Uncertainty
Planning under uncertainty is critical for robust robot performance in
uncertain, dynamic environments, but it incurs high computational cost.
State-of-the-art online search algorithms, such as DESPOT, have vastly improved
the computational efficiency of planning under uncertainty and made it a
valuable tool for robotics in practice. This work takes one step further by
leveraging both CPU and GPU parallelization in order to achieve near real-time
online planning performance for complex tasks with large state, action, and
observation spaces. Specifically, we propose Hybrid Parallel DESPOT
(HyP-DESPOT), a massively parallel online planning algorithm that integrates
CPU and GPU parallelism in a multi-level scheme. It performs parallel DESPOT
tree search by simultaneously traversing multiple independent paths using
multi-core CPUs and performs parallel Monte-Carlo simulations at the leaf nodes
of the search tree using GPUs. Experimental results show that HyP-DESPOT speeds
up online planning by up to several hundred times, compared with the original
DESPOT algorithm, in several challenging robotic tasks in simulation.
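The two parallelism levels described above can be sketched in miniature. The names and dynamics below are hypothetical; the GPU batch evaluation is emulated here with vectorized NumPy, standing in for the batched Monte-Carlo rollouts HyP-DESPOT runs on an actual GPU, while the CPU side would expand independent tree paths in parallel workers.

```python
import numpy as np

def batched_leaf_rollouts(leaf_states, horizon, step_fn, rng):
    """Evaluate many leaf rollouts simultaneously (GPU-style batching)."""
    states = np.array(leaf_states, dtype=float)   # shape: (n_leaves,)
    returns = np.zeros(len(states))
    discount = 1.0
    for _ in range(horizon):
        states, rewards = step_fn(states, rng)    # one vectorized step for all leaves
        returns += discount * rewards
        discount *= 0.95
    return returns

def toy_step(states, rng):
    # Hypothetical 1-D dynamics: drift toward a goal at 0 with small noise;
    # reward is negative distance to the goal.
    states = states * 0.9 + rng.normal(0.0, 0.01, size=states.shape)
    return states, -np.abs(states)

rng = np.random.default_rng(0)
values = batched_leaf_rollouts([1.0, -2.0, 0.5], horizon=10,
                               step_fn=toy_step, rng=rng)
print(values)  # one estimated value per leaf, computed in a single batch
```

The design point is that all leaves advance in lockstep through vectorized operations, which is what makes the GPU mapping effective.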
Tractable POMDP-planning for robots with complex non-linear dynamics
Planning under partial observability is an essential capability of autonomous robots. While robots operate in the real world, they are inherently subject to various uncertainties such as control and sensing errors, and limited information regarding the operating environment. Conceptually, these types of planning problems can be solved in a principled manner when framed as a Partially Observable Markov Decision Process (POMDP). POMDPs model the aforementioned uncertainties as conditional probability functions and estimate the state of the system as probability functions over the state space, called beliefs. Instead of computing the best strategy with respect to single states, POMDP solvers compute the best strategy with respect to beliefs. Solving a POMDP exactly is computationally intractable in general. However, in the past two decades we have seen tremendous progress in the development of approximately optimal solvers that trade optimality for computational tractability.
Despite this progress, approximately solving POMDPs for systems with complex non-linear dynamics remains challenging. Most state-of-the-art solvers rely on a large number of expensive forward simulations of the system to find an approximately optimal strategy. For systems with complex non-linear dynamics that admit no closed-form solution, this strategy can become prohibitively expensive. Another difficulty in applying POMDPs to physical robots with complex transition dynamics is that almost all implementations of state-of-the-art online POMDP solvers restrict the user to specific data structures for the POMDP model, and the model has to be hard-coded within the solver implementation. This, in turn, severely hinders the process of applying POMDPs to physical robots.
In this thesis we aim to make POMDPs more practical for realistic robotic motion planning tasks under partial observability.
We show that systematic approximations of complex, non-linear transition dynamics can be used to design online POMDP solvers that are more efficient than current solvers. Furthermore, we propose a new software framework that supports the user in modeling complex planning problems under uncertainty with minimal implementation effort.
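The belief update at the core of the POMDP formulation above can be sketched for a tiny discrete problem. The two-state model below is hypothetical and for illustration only; after taking action a and observing o, the belief is updated as b'(s') ∝ O(o | s', a) · Σ_s T(s' | s, a) · b(s).

```python
import numpy as np

def belief_update(b, T, O, a, o):
    """Bayes filter update for a discrete POMDP.

    b: current belief, shape (n_states,)
    T: transition model, T[a, s, s'] = P(s' | s, a)
    O: observation model, O[a, s', o] = P(o | s', a)
    """
    predicted = b @ T[a]                  # sum_s T(s'|s,a) b(s)
    unnormalized = O[a, :, o] * predicted
    return unnormalized / unnormalized.sum()

# Two states, one action, two observations: observation 0 is more likely
# in state 0, so seeing it shifts the belief toward state 0.
T = np.array([[[0.9, 0.1], [0.1, 0.9]]])
O = np.array([[[0.8, 0.2], [0.2, 0.8]]])
b0 = np.array([0.5, 0.5])
b1 = belief_update(b0, T, O, a=0, o=0)
print(b1)  # → [0.8 0.2]
```

POMDP solvers plan over beliefs like `b1` rather than over individual states, which is also why expensive forward simulation of the transition model dominates solver cost.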
Online POMDP Planning for Vehicle Navigation in Densely Populated Area
Master of Science thesis
Stochastic Motion Planning For Mobile Robots
Stochastic motion planning is of crucial importance in robotic applications, not only because of the imperfect models for robot dynamics and sensing but also because of the potentially unknown environment. Due to efficiency considerations, practical methods often introduce additional assumptions or heuristics, such as the use of the separation theorem, into the solution. However, there are intrinsic limitations of practical frameworks that prevent further improving the reliability and robustness of the system, which cannot be addressed with minor tweaks. Therefore, it is necessary to develop theoretically justified solutions to stochastic motion planning problems. Despite the challenges in developing such solutions, the reward is unparalleled due to their wide impact on a majority of, if not all, robotic applications. The overall goal of this dissertation is to develop solutions for stochastic motion planning problems with theoretical justifications and demonstrate their superior performance in real-world applications.
In the first part of this dissertation, we model the stochastic motion planning problem as a Partially Observable Markov Decision Process (POMDP) and propose two solutions featuring different optimization regimes, trading off model generality and efficiency. The first is a gradient-based solution built on iterative Linear Quadratic Gaussian (iLQG), assuming explicit model formulations and Gaussian noise. The special structure of the problem allows a time-varying affine policy to be solved offline, leading to efficient online usage. The proposed algorithm addresses limitations of previous work on iLQG in handling nondifferentiable system models and sparse informative measurements. The second solution is a sampling-based general POMDP solver assuming mild conditions on the control space and measurement models. The generality of the problem formulation promises wide applications of the algorithm. The proposed solution addresses the degeneracy issue of Monte Carlo tree search when applied to continuous POMDPs, especially for systems with continuous measurement spaces. Through theoretical analysis, we show that the proposed algorithm is a valid Monte Carlo control algorithm alternating between unbiased policy evaluation and policy improvement.
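The time-varying affine policy mentioned above has the standard iLQG form u_t = u_ref[t] + K[t](x_t - x_ref[t]): a nominal trajectory plus feedback gains solved offline, so the online step is just a cheap lookup. The nominal trajectory and gain below are hypothetical placeholders for the output of the offline optimization.

```python
import numpy as np

def affine_policy(t, x, x_ref, u_ref, K):
    """Online lookup of a time-varying affine policy: all hard work was done offline."""
    return u_ref[t] + K[t] @ (x - x_ref[t])

# A 1-state, 1-control example: regulate the state toward a nominal trajectory.
horizon = 3
x_ref = [np.array([0.0])] * horizon        # nominal states (placeholder)
u_ref = [np.array([0.0])] * horizon        # nominal controls (placeholder)
K = [np.array([[-0.5]])] * horizon         # feedback gain pulls state back to nominal

u = affine_policy(1, np.array([2.0]), x_ref, u_ref, K)
print(u)  # deviation of +2.0 with gain -0.5 gives control [-1.0]
```

This structure is what makes the offline/online split efficient: at run time the controller performs only a matrix-vector product per step.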
In the second part of this dissertation, we apply the proposed solutions to different robotic applications where the dominant uncertainty comes either from the robot itself or from the external environment. We first consider the application of mobile robot navigation in a known environment, where the major sources of uncertainty are the robot's dynamics and sensing noise. Although the problem is widely studied, few works have applied POMDP solutions to this application. By demonstrating the superior performance of the proposed solutions on such a familiar application, the importance of stochastic motion planning may be better appreciated by the robotics community. We also apply the proposed solutions to autonomous driving, where the dominant uncertainty comes from the external environment, i.e., the unknown behavior of human drivers. In this work, we propose a data-driven model for the stochastic traffic dynamics in which we explicitly model the intentions of human drivers. To the best of our knowledge, this is the first work that applies POMDP solutions to data-driven traffic models. Through simulations, we show that the proposed solutions are able to develop high-level intelligent behaviors and outperform other similar methods that also consider uncertainties in the autonomous driving application.
Rule-Based Policy Interpretation and Shielding for Partially Observable Monte Carlo Planning
Partially Observable Monte Carlo Planning (POMCP) is a powerful online algorithm that can generate approximate policies for large Partially Observable Markov Decision Processes. The online nature of this method supports scalability by avoiding a complete policy representation. However, the lack of an explicit representation of the policy hinders interpretability. In this thesis, we propose a methodology based on Maximum Satisfiability Modulo Theory (MAX-SMT) for analyzing POMCP policies by inspecting their traces, namely, sequences of belief-action pairs generated by the algorithm. The proposed method explores local properties of the policy to build a compact and informative summary of the policy behaviour. This representation exploits a high-level description encoded using logical formulas that domain experts can provide. The final formula can be used to identify unexpected decisions, namely, decisions that violate the expert indications. We show that this identification process can be used offline (to improve the explainability of the policy and to identify anomalous behaviours) or online (to shield the decisions of the POMCP algorithm). We also present an active methodology that can effectively query a POMCP policy to quickly build more reliable descriptions. We extensively evaluate our methodologies on two standard benchmarks for POMDPs, namely, Tiger and RockSample, and on a problem related to velocity regulation in mobile robot navigation. Results show that our approach achieves good performance due to its capability to exploit experts' knowledge of the domains. Specifically, our approach can be used both to identify anomalous behaviours in faulty POMCPs and to improve the performance of the system through the shielding mechanism. In the first case, we test the methodology against a state-of-the-art anomaly detection algorithm; in the second, we compare the performance of shielded and unshielded POMCPs.
We implemented our methodology, and the code is open source and available at https://github.com/GiuMaz/XPOMCP
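The trace-inspection idea can be illustrated in a highly simplified form. The actual method uses MAX-SMT to learn rule parameters from traces; here the rule and its threshold are fixed by hand, and the belief and action names are hypothetical. Each trace entry is a (belief, action) pair, and the expert rule says: if the believed probability of an obstacle exceeds a threshold, the chosen speed must be "slow".

```python
def find_unexpected(trace, threshold=0.7):
    """Return indices of decisions that violate the hand-made expert rule."""
    violations = []
    for i, (belief, action) in enumerate(trace):
        if belief["obstacle"] > threshold and action != "slow":
            violations.append(i)
    return violations

trace = [
    ({"obstacle": 0.2}, "fast"),   # low obstacle probability: fast is fine
    ({"obstacle": 0.9}, "fast"),   # violates the rule: unexpected decision
    ({"obstacle": 0.8}, "slow"),   # complies with the rule
]
print(find_unexpected(trace))  # → [1]
```

Offline, the flagged indices point an expert at anomalous behaviour; online, the same check can act as a shield by overriding the violating action before execution.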
Multi-Policy Decision Making for Reliable Navigation in Dynamic Uncertain Environments
Navigating everyday social environments in the presence of pedestrians and other dynamic obstacles remains one of the key challenges preventing mobile robots from leaving carefully designed spaces and entering our daily lives. The complex and tightly coupled interactions between these agents make the environment dynamic and unpredictable, posing a formidable problem for robot motion planning. Trajectory planning methods, supported by models of typical human behavior and personal space, often produce reasonable behavior. However, they do not account for the future closed-loop interactions of other agents with the trajectory being constructed. As a consequence, the trajectories are unable to anticipate cooperative interactions (such as a human yielding) or adverse interactions (such as the robot blocking the way). Ideally, the robot must account for coupled agent-agent interactions while reasoning about possible future outcomes, and then take actions to advance towards its navigational goal without inconveniencing nearby pedestrians.
Multi-Policy Decision Making (MPDM) is a novel framework for autonomous navigation in dynamic, uncertain environments in which the robot's trajectory is not explicitly planned; instead, the robot dynamically switches between a set of candidate closed-loop policies, allowing it to adapt to different situations encountered in such environments. The candidate policies are evaluated based on short-term (five-second) forward simulations of samples drawn from the estimated distribution of the agents' current states. These forward simulations, and thereby the cost function, capture agent-agent interactions as well as agent-robot interactions, which depend on the ego-policy being evaluated.
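The policy-evaluation loop described above can be sketched in one dimension. Everything here is a hypothetical toy: the two candidate policies, the proximity cost, and the simple constant-speed dynamics stand in for MPDM's closed-loop policies and coupled multi-agent forward simulation.

```python
import numpy as np

def evaluate_policy(policy, samples, horizon, dt=0.1):
    """Average cost of a candidate policy over sampled pedestrian positions."""
    total = 0.0
    for ped_pos in samples:
        robot, cost = 0.0, 0.0
        for _ in range(horizon):
            robot += policy["speed"] * dt            # robot moves forward
            gap = abs(ped_pos - robot)
            cost += 1.0 / (gap + 0.1) ** 2           # penalize proximity sharply
            cost -= 0.05 * policy["speed"]           # reward progress
        total += cost
    return total / len(samples)

# Two hand-crafted candidate policies; the belief over the pedestrian's
# position is a cloud of samples just ahead of the robot.
policies = {"go": {"speed": 1.0}, "stop": {"speed": 0.0}}
rng = np.random.default_rng(1)
samples = rng.normal(0.5, 0.05, size=20)
best = min(policies, key=lambda name: evaluate_policy(
    policies[name], samples, horizon=50))
print(best)  # → stop: driving through the sampled pedestrian cloud is costly
```

Because the cost is computed over many sampled configurations, the selected policy reflects the belief over agent states rather than a single point estimate.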
In this thesis, we propose MPDM as a new method for navigation amongst pedestrians by dynamically switching from amongst a library of closed-loop policies. Due to real-time constraints, the robot's emergent behavior is directly affected by the quality of policy evaluation. Approximating how good a policy is based on only a few forward roll-outs is difficult, especially with the large space of possible pedestrian configurations and the sensitivity of the forward simulation to the sampled configurations. Traditional methods based on Monte-Carlo sampling often miss likely, high-cost outcomes, resulting in an over-optimistic evaluation of a policy and unreliable emergent behavior. By re-formulating policy evaluation as an optimization problem and enabling the quick discovery of potentially dangerous outcomes, we make MPDM more reliable and risk-aware.
Even with the increased reliability, a major limitation is that MPDM requires the system designer to provide a set of carefully hand-crafted policies, as it can evaluate only a few policies reliably in real-time. We radically enhance the expressivity of MPDM by allowing policies to have continuous-valued parameters, while simultaneously satisfying real-time constraints by quickly discovering promising policy parameters through a novel iterative gradient-based algorithm. Overall, we reformulate the traditional motion planning problem and paint it in a very different light: as a bilevel optimization problem where the robot repeatedly discovers likely high-cost outcomes and adapts its policy parameters to avoid these outcomes. We demonstrate significant performance benefits through extensive experiments in simulation as well as on a physical robot platform operating in a semi-crowded environment.
PhD thesis, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/150017/1/dhanvinm_1.pd
Adaptive Robotic Information Gathering via Non-Stationary Gaussian Processes
Robotic Information Gathering (RIG) is a foundational research topic that
answers how a robot (team) collects informative data to efficiently build an
accurate model of an unknown target function under robot embodiment
constraints. RIG has many applications, including but not limited to autonomous
exploration and mapping, 3D reconstruction or inspection, search and rescue,
and environmental monitoring. A RIG system relies on a probabilistic model's
prediction uncertainty to identify critical areas for informative data
collection. Gaussian Processes (GPs) with stationary kernels have been widely
adopted for spatial modeling. However, real-world spatial data is typically
non-stationary -- different locations do not have the same degree of
variability. As a result, the prediction uncertainty does not accurately reveal
prediction error, limiting the success of RIG algorithms. We propose a family
of non-stationary kernels named Attentive Kernel (AK), which is simple, robust,
and can extend any existing kernel to a non-stationary one. We evaluate the new
kernel in elevation mapping tasks, where AK provides better accuracy and
uncertainty quantification over the commonly used stationary kernels and the
leading non-stationary kernels. The improved uncertainty quantification guides
the downstream informative planner to collect more valuable data around the
high-error area, further increasing prediction accuracy. A field experiment
demonstrates that the proposed method can guide an Autonomous Surface Vehicle
(ASV) to prioritize data collection in locations with significant spatial
variations, enabling the model to characterize salient environmental features.
Comment: International Journal of Robotics Research (IJRR).
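The idea of a non-stationary kernel built from stationary bases can be sketched as follows. This is not the Attentive Kernel itself but a hand-made illustration in its spirit: two RBF bases with different lengthscales are mixed with input-dependent weights, so the correlation over the same separation differs by location. The weighting function below is a hypothetical placeholder for the learned one.

```python
import numpy as np

def rbf(x, y, lengthscale):
    return np.exp(-0.5 * (x - y) ** 2 / lengthscale ** 2)

def weights(x):
    """Hypothetical attention weights: short lengthscale near the origin
    (high variability), long lengthscale far away (smooth region)."""
    w_short = np.exp(-np.abs(x))
    w = np.array([w_short, 1.0 - w_short])
    return w / np.linalg.norm(w)           # unit norm keeps k(x, x) = 1

def nonstationary_kernel(x, y, lengthscales=(0.1, 2.0)):
    wx, wy = weights(x), weights(y)
    return sum(wx[m] * wy[m] * rbf(x, y, ls)
               for m, ls in enumerate(lengthscales))

# Same separation (0.5), different locations: the kernel is non-stationary,
# so predicted uncertainty can rise in the high-variability region.
near_origin = nonstationary_kernel(0.0, 0.5)   # short-lengthscale regime
far_away = nonstationary_kernel(5.0, 5.5)      # long-lengthscale regime
print(near_origin, far_away)
```

A GP with such a kernel reports higher uncertainty where the function varies quickly, which is exactly the signal an informative planner needs to target high-error areas.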
Belief-Space Planning for Resourceful Manipulation and Mobility
Robots are increasingly expected to work in partially observable and unstructured environments. They need to select actions that exploit perceptual and motor resourcefulness to manage uncertainty based on the demands of the task and environment. The research in this dissertation makes two primary contributions. First, it develops a new concept in resourceful robot platforms called the UMass uBot and introduces the sixth and seventh in the uBot series. uBot-6 introduces multiple postural configurations that enable different modes of mobility and manipulation to meet the needs of a wide variety of tasks and environmental constraints. uBot-7 extends this with the use of series elastic actuators (SEAs) to improve manipulation capabilities and support safer operation around humans. The resourcefulness of these robots is complemented with a belief-space planning framework that enables task-driven action selection in the context of the partially observable environment. The framework uses a compact but expressive state representation based on object models. We extend an existing affordance-based object model, called an aspect transition graph (ATG), with geometric information. This enables object-centric modeling of features and actions, making the model much more expressive without increasing the complexity. A novel task representation enables the belief-space planner to perform general object-centric tasks ranging from recognition to manipulation of objects. The approach supports the efficient handling of multi-object scenes. The combination of the physical platform and the planning framework is evaluated in two novel, challenging, partially observable planning domains. The ARcube domain provides a large population of objects that are highly ambiguous. Objects can only be differentiated using multi-modal sensor information and manual interactions.
In the dexterous mobility domain, a robot can employ multiple mobility modes to complete navigation tasks under a variety of possible environment constraints. The performance of the proposed approach is evaluated using experiments in simulation and on a real robot.
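The aspect transition graph structure mentioned above can be sketched as a small directed graph. The object, aspect names, and actions below are hypothetical: nodes are viewpoint-dependent "aspects" of an object, edges are actions that move the robot between them, and a planner can search the graph for an action sequence reaching an aspect that disambiguates the object.

```python
from collections import deque

# Hypothetical ATG for a cube-like object: a distinctive marker is only
# visible from the bottom face, so reaching that aspect resolves ambiguity.
atg = {
    "front_face": {"orbit_left": "left_face", "orbit_right": "right_face"},
    "left_face": {"orbit_right": "front_face"},
    "right_face": {"flip": "bottom_face", "orbit_left": "front_face"},
    "bottom_face": {},
}

def plan_to_aspect(start, goal):
    """Breadth-first search for the shortest action sequence to a goal aspect."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        aspect, actions = queue.popleft()
        if aspect == goal:
            return actions
        for action, nxt in atg[aspect].items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, actions + [action]))
    return None

print(plan_to_aspect("front_face", "bottom_face"))  # → ['orbit_right', 'flip']
```

In the full framework the same graph is searched under a belief over which object is present, so information-gathering actions and manipulation actions fall out of one planner.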