596 research outputs found
Decentralized Control of Partially Observable Markov Decision Processes using Belief Space Macro-actions
The focus of this paper is on solving multi-robot planning problems in
continuous spaces with partial observability. Decentralized partially
observable Markov decision processes (Dec-POMDPs) are general models for
multi-robot coordination problems, but representing and solving Dec-POMDPs is
often intractable for large problems. To allow for a high-level representation
that is natural for multi-robot problems and scalable to large discrete and
continuous problems, this paper extends the Dec-POMDP model to the
decentralized partially observable semi-Markov decision process (Dec-POSMDP).
The Dec-POSMDP formulation allows asynchronous decision-making by the robots,
which is crucial in multi-robot domains. We also present an algorithm for
solving this Dec-POSMDP which is much more scalable than previous methods since
it can incorporate closed-loop belief space macro-actions in planning. These
macro-actions are automatically constructed to produce robust solutions. The
proposed method's performance is evaluated on a complex multi-robot package
delivery problem under uncertainty, showing that our approach can naturally
represent multi-robot problems and provide high-quality solutions for
large-scale problems
Improving Automated Driving through Planning with Human Internal States
This work examines the hypothesis that partially observable Markov decision
process (POMDP) planning with human driver internal states can significantly
improve both safety and efficiency in autonomous freeway driving. We evaluate
this hypothesis in a simulated scenario where an autonomous car must safely
perform three lane changes in rapid succession. Approximate POMDP solutions are
obtained through the partially observable Monte Carlo planning with observation
widening (POMCPOW) algorithm. This approach outperforms over-confident and
conservative MDP baselines and matches or outperforms QMDP. Relative to the MDP
baselines, POMCPOW typically cuts the rate of unsafe situations in half or
increases the success rate by 50%.Comment: Preprint before submission to IEEE Transactions on Intelligent
Transportation Systems. arXiv admin note: text overlap with arXiv:1702.0085
Anytime Point-Based Approximations for Large POMDPs
The Partially Observable Markov Decision Process has long been recognized as
a rich framework for real-world planning and control problems, especially in
robotics. However exact solutions in this framework are typically
computationally intractable for all but the smallest problems. A well-known
technique for speeding up POMDP solving involves performing value backups at
specific belief points, rather than over the entire belief simplex. The
efficiency of this approach, however, depends greatly on the selection of
points. This paper presents a set of novel techniques for selecting informative
belief points which work well in practice. The point selection procedure is
combined with point-based value backups to form an effective anytime POMDP
algorithm called Point-Based Value Iteration (PBVI). The first aim of this
paper is to introduce this algorithm and present a theoretical analysis
justifying the choice of belief selection technique. The second aim of this
paper is to provide a thorough empirical comparison between PBVI and other
state-of-the-art POMDP methods, in particular the Perseus algorithm, in an
effort to highlight their similarities and differences. Evaluation is performed
using both standard POMDP domains and realistic robotic tasks
- …