
1,500 research outputs found

Restricted Value Iteration: Theory and Algorithms

Authors: Zhang N. L.; Zhang W.
Publication venue: 'AI Access Foundation'
Publication date: 30/06/2011

Value iteration is a popular algorithm for finding near-optimal policies for POMDPs. It is inefficient because it must account for the entire belief space, which requires solving large numbers of linear programs. In this paper, we study value iteration restricted to belief subsets. We show that, together with properly chosen belief subsets, restricted value iteration yields near-optimal policies, and we give a condition for determining whether a given belief subset would bring about savings in space and time. We also apply restricted value iteration to two interesting classes of POMDPs, namely informative POMDPs and near-discernible POMDPs.
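For readers unfamiliar with the term, a belief is a probability distribution over the hidden states of a POMDP, and the belief space is the set of all such distributions. The sketch below shows the standard Bayesian belief update that moves an agent from one belief to the next after an action and an observation. It is not taken from the paper; the function name, the table layouts `T[a][s][s2]` and `O[a][s2][z]`, and the two-state numbers are assumptions chosen for illustration.

```python
def belief_update(belief, action, observation, T, O):
    """Return the posterior belief after taking `action` and seeing `observation`.

    Assumed (hypothetical) layouts:
      T[a][s][s2] = P(s2 | s, a)   transition probabilities
      O[a][s2][z] = P(z | s2, a)   observation probabilities
    """
    n = len(belief)
    # Prediction step: push the current belief through the transition model.
    predicted = [
        sum(belief[s] * T[action][s][s2] for s in range(n)) for s2 in range(n)
    ]
    # Correction step: weight each predicted state by the observation likelihood.
    unnormalized = [O[action][s2][observation] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]


# Illustrative two-state model with a single action (made-up numbers).
T = [[[1.0, 0.0], [0.0, 1.0]]]   # the action leaves the state unchanged
O = [[[0.8, 0.2], [0.3, 0.7]]]   # observation 0 is more likely in state 0
posterior = belief_update([0.5, 0.5], action=0, observation=0, T=T, O=O)
```

Starting from a uniform belief, seeing observation 0 shifts probability mass toward state 0. The set of beliefs reachable through such updates is one natural example of a subset of the full belief space.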

arXiv.org e-Print Archive

Decision-Theoretic Planning with Person Trajectory Prediction for Social Navigation

Authors: Caballero Fernando; Capitán Jesús; Merino Luis; Pérez Hurtado de Mendoza Ignacio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Robots navigating in a social way should reason about people's intentions when acting. For instance, in applications like robot guidance or meeting with a person, the robot has to consider the goals of the people. Intentions are inherently non-observable, and thus we propose Partially Observable Markov Decision Processes (POMDPs) as a decision-making tool for these applications. One of the issues with POMDPs is that the prediction models are usually handcrafted. In this paper, we use machine learning techniques to build prediction models from observations. A novel technique is employed to discover points of interest (goals) in the environment, and a variant of Growing Hidden Markov Models (GHMMs) is used to learn the transition probabilities of the POMDP. The approach is applied to an autonomous telepresence robot.

Influence-Optimistic Local Values for Multiagent Planning --- Extended Version

Authors: Oliehoek Frans A.; Spaan Matthijs T. J.; Witwicki Stefan
Publication date: 20/07/2015

Recent years have seen the development of methods for multiagent planning under uncertainty that scale to tens or even hundreds of agents. However, most of these methods either make restrictive assumptions on the problem domain, or provide approximate solutions without any guarantees on quality. Methods in the former category typically build on heuristic search using upper bounds on the value function. Unfortunately, no techniques exist to compute such upper bounds for problems with non-factored value functions. To allow for meaningful benchmarking through measurable quality guarantees on a very general class of problems, this paper introduces a family of influence-optimistic upper bounds for factored decentralized partially observable Markov decision processes (Dec-POMDPs) that do not have factored value functions. Intuitively, we derive bounds on very large multiagent planning problems by subdividing them into sub-problems, and at each of these sub-problems making optimistic assumptions with respect to the influence that will be exerted by the rest of the system. We numerically compare the different upper bounds and demonstrate how we can achieve a non-trivial guarantee that a heuristic solution for problems with hundreds of agents is close to optimal. Furthermore, we provide evidence that the upper bounds may improve the effectiveness of heuristic influence search, and discuss further potential applications to multiagent planning.

Comment: Long version of IJCAI 2015 paper (and extended abstract at AAMAS 2015).

arXiv.org e-Print Archive

University of Liverpool Repository

CiteSeerX