596 research outputs found

    Decentralized Control of Partially Observable Markov Decision Processes using Belief Space Macro-actions

    Get PDF
    The focus of this paper is on solving multi-robot planning problems in continuous spaces with partial observability. Decentralized partially observable Markov decision processes (Dec-POMDPs) are general models for multi-robot coordination problems, but representing and solving Dec-POMDPs is often intractable for large problems. To allow for a high-level representation that is natural for multi-robot problems and scalable to large discrete and continuous problems, this paper extends the Dec-POMDP model to the decentralized partially observable semi-Markov decision process (Dec-POSMDP). The Dec-POSMDP formulation allows asynchronous decision-making by the robots, which is crucial in multi-robot domains. We also present an algorithm for solving this Dec-POSMDP which is much more scalable than previous methods since it can incorporate closed-loop belief space macro-actions in planning. These macro-actions are automatically constructed to produce robust solutions. The proposed method's performance is evaluated on a complex multi-robot package delivery problem under uncertainty, showing that our approach can naturally represent multi-robot problems and provide high-quality solutions for large-scale problems

    Improving Automated Driving through Planning with Human Internal States

    Full text link
    This work examines the hypothesis that partially observable Markov decision process (POMDP) planning with human driver internal states can significantly improve both safety and efficiency in autonomous freeway driving. We evaluate this hypothesis in a simulated scenario where an autonomous car must safely perform three lane changes in rapid succession. Approximate POMDP solutions are obtained through the partially observable Monte Carlo planning with observation widening (POMCPOW) algorithm. This approach outperforms over-confident and conservative MDP baselines and matches or outperforms QMDP. Relative to the MDP baselines, POMCPOW typically cuts the rate of unsafe situations in half or increases the success rate by 50%.Comment: Preprint before submission to IEEE Transactions on Intelligent Transportation Systems. arXiv admin note: text overlap with arXiv:1702.0085

    Anytime Point-Based Approximations for Large POMDPs

    Full text link
    The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However exact solutions in this framework are typically computationally intractable for all but the smallest problems. A well-known technique for speeding up POMDP solving involves performing value backups at specific belief points, rather than over the entire belief simplex. The efficiency of this approach, however, depends greatly on the selection of points. This paper presents a set of novel techniques for selecting informative belief points which work well in practice. The point selection procedure is combined with point-based value backups to form an effective anytime POMDP algorithm called Point-Based Value Iteration (PBVI). The first aim of this paper is to introduce this algorithm and present a theoretical analysis justifying the choice of belief selection technique. The second aim of this paper is to provide a thorough empirical comparison between PBVI and other state-of-the-art POMDP methods, in particular the Perseus algorithm, in an effort to highlight their similarities and differences. Evaluation is performed using both standard POMDP domains and realistic robotic tasks
    • …
    corecore