6 research outputs found

    Anytime Point-Based Approximations for Large POMDPs

    Full text link
    The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However exact solutions in this framework are typically computationally intractable for all but the smallest problems. A well-known technique for speeding up POMDP solving involves performing value backups at specific belief points, rather than over the entire belief simplex. The efficiency of this approach, however, depends greatly on the selection of points. This paper presents a set of novel techniques for selecting informative belief points which work well in practice. The point selection procedure is combined with point-based value backups to form an effective anytime POMDP algorithm called Point-Based Value Iteration (PBVI). The first aim of this paper is to introduce this algorithm and present a theoretical analysis justifying the choice of belief selection technique. The second aim of this paper is to provide a thorough empirical comparison between PBVI and other state-of-the-art POMDP methods, in particular the Perseus algorithm, in an effort to highlight their similarities and differences. Evaluation is performed using both standard POMDP domains and realistic robotic tasks

    Applying Metric-Trees to Belief-Point POMDPs

    No full text
    The recent development of grid-based and point-based approximation algorithms for POMDPs has greatly improved the tractability of POMDP planning. These approaches operate on sets of belief points by individually learning a value function for each point. In reality, belief points exist in a highly-structured metric simplex, but current POMDP algorithms do not exploit this property. This paper presents a new metric-tree algorithm which can be used in the context of POMDP planning to sort belief points spatially, and then perform fast value function updates over groups of points. We present results showing that this approach can significantly accelerate point-based POMDP algorithms for a wide range of problems

    Parametric POMDPs for planning in continuous state spaces

    Get PDF
    This thesis is concerned with planning and acting under uncertainty in partially-observable continuous domains. In particular, it focusses on the problem of mobile robot navigation given a known map. The dominant paradigm for robot localisation is to use Bayesian estimation to maintain a probability distribution over possible robot poses. In contrast, control algorithms often base their decisions on the assumption that a single state, such as the mode of this distribution, is correct. In scenarios involving significant uncertainty, this can lead to serious control errors. It is generally agreed that the reliability of navigation in uncertain environments would be greatly improved by the ability to consider the entire distribution when acting, rather than the single most likely state. The framework adopted in this thesis for modelling navigation problems mathematically is the Partially Observable Markov Decision Process (POMDP). An exact solution to a POMDP problem provides the optimal balance between reward-seeking behaviour and information-seeking behaviour, in the presence of sensor and actuation noise. Unfortunately, previous exact and approximate solution methods have had difficulty scaling to real applications. The contribution of this thesis is the formulation of an approach to planning in the space of continuous parameterised approximations to probability distributions. Theoretical and practical results are presented which show that, when compared with similar methods from the literature, this approach is capable of scaling to larger and more realistic problems. In order to apply the solution algorithm to real-world problems, a number of novel improvements are proposed. Specifically, Monte Carlo methods are employed to estimate distributions over future parameterised beliefs, improving planning accuracy without a loss of efficiency. Conditional independence assumptions are exploited to simplify the problem, reducing computational requirements. Scalability is further increased by focussing computation on likely beliefs, using metric indexing structures for efficient function approximation. Local online planning is incorporated to assist global offline planning, allowing the precision of the latter to be decreased without adversely affecting solution quality. Finally, the algorithm is implemented and demonstrated during real-time control of a mobile robot in a challenging navigation task. We argue that this task is substantially more challenging and realistic than previous problems to which POMDP solution methods have been applied. Results show that POMDP planning, which considers the evolution of the entire probability distribution over robot poses, produces significantly more robust behaviour when compared with a heuristic planner which considers only the most likely states and outcomes

    Parametric POMDPs for planning in continuous state spaces

    Get PDF
    This thesis is concerned with planning and acting under uncertainty in partially-observable continuous domains. In particular, it focusses on the problem of mobile robot navigation given a known map. The dominant paradigm for robot localisation is to use Bayesian estimation to maintain a probability distribution over possible robot poses. In contrast, control algorithms often base their decisions on the assumption that a single state, such as the mode of this distribution, is correct. In scenarios involving significant uncertainty, this can lead to serious control errors. It is generally agreed that the reliability of navigation in uncertain environments would be greatly improved by the ability to consider the entire distribution when acting, rather than the single most likely state. The framework adopted in this thesis for modelling navigation problems mathematically is the Partially Observable Markov Decision Process (POMDP). An exact solution to a POMDP problem provides the optimal balance between reward-seeking behaviour and information-seeking behaviour, in the presence of sensor and actuation noise. Unfortunately, previous exact and approximate solution methods have had difficulty scaling to real applications. The contribution of this thesis is the formulation of an approach to planning in the space of continuous parameterised approximations to probability distributions. Theoretical and practical results are presented which show that, when compared with similar methods from the literature, this approach is capable of scaling to larger and more realistic problems. In order to apply the solution algorithm to real-world problems, a number of novel improvements are proposed. Specifically, Monte Carlo methods are employed to estimate distributions over future parameterised beliefs, improving planning accuracy without a loss of efficiency. Conditional independence assumptions are exploited to simplify the problem, reducing computational requirements. Scalability is further increased by focussing computation on likely beliefs, using metric indexing structures for efficient function approximation. Local online planning is incorporated to assist global offline planning, allowing the precision of the latter to be decreased without adversely affecting solution quality. Finally, the algorithm is implemented and demonstrated during real-time control of a mobile robot in a challenging navigation task. We argue that this task is substantially more challenging and realistic than previous problems to which POMDP solution methods have been applied. Results show that POMDP planning, which considers the evolution of the entire probability distribution over robot poses, produces significantly more robust behaviour when compared with a heuristic planner which considers only the most likely states and outcomes
    corecore