Solving Continuous-State POMDPs via Density Projection
Research on numerical solution methods for partially observable Markov decision processes (POMDPs) has primarily focused on discrete-state models, and these algorithms do not generally extend to continuous-state POMDPs, due to the infinite dimensionality of the belief space. In this paper, we develop a computationally viable and theoretically sound method for solving continuous-state POMDPs by effectively reducing the dimensionality of the belief space via density projections. The density projection technique is also incorporated into particle filtering to provide a filtering scheme for online decision making. We provide an error bound between the value function induced by the policy obtained by our method and the true value function of the POMDP, and also an error bound between the projection particle filtering and the optimal filtering. Finally, we illustrate the effectiveness of our method through an inventory control problem.
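As a rough illustration of the projection idea (a minimal sketch, not the authors' exact algorithm), a weighted particle belief can be projected onto the Gaussian family by moment matching, and that projection can be folded into a particle-filter step. All function names here are hypothetical:

```python
import numpy as np

def project_to_gaussian(particles, weights):
    """Project a weighted particle belief onto the Gaussian family by
    moment matching (mean and variance) -- the simplest instance of an
    exponential-family density projection."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    x = np.asarray(particles, dtype=float)
    mean = np.sum(w * x)
    var = np.sum(w * (x - mean) ** 2)
    return mean, var

def projection_particle_filter_step(particles, weights, transition,
                                    likelihood, obs, rng):
    """One step of an illustrative projection particle filter:
    propagate, reweight by the observation likelihood, project the
    posterior onto a Gaussian, then resample from the projection."""
    x = transition(np.asarray(particles, dtype=float), rng)    # propagate
    w = np.asarray(weights, dtype=float) * likelihood(obs, x)  # reweight
    mean, var = project_to_gaussian(x, w)                      # project
    n = len(x)
    new_particles = rng.normal(mean, np.sqrt(var), size=n)     # resample
    return new_particles, np.full(n, 1.0 / n)
```

Resampling from the low-dimensional projected density, rather than from the raw particle set, is what keeps the belief representation finite-dimensional across filtering steps.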
Verification of Uncertain POMDPs Using Barrier Certificates
We consider a class of partially observable Markov decision processes
(POMDPs) with uncertain transition and/or observation probabilities. The
uncertainty takes the form of probability intervals. Such uncertain POMDPs can
be used, for example, to model autonomous agents with sensors with limited
accuracy, or agents undergoing a sudden component failure, or structural damage
[1]. Given an uncertain POMDP representation of the autonomous agent, our goal
is to propose a method for checking whether the system achieves optimal
performance while not violating safety requirements (e.g., limits on fuel
level or velocity). To this end, we cast the POMDP problem into a switched
system scenario. We then take advantage of this switched system
characterization and propose a method based on barrier certificates for
optimality and/or safety verification. We then show that the verification task
can be carried out computationally by sum-of-squares programming. We illustrate
the efficacy of our method by applying it to a Mars rover exploration example.
Comment: 8 pages, 4 figures
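The core of a sum-of-squares certificate can be sketched without an SDP solver (this is a simplified illustration, not the paper's construction): a polynomial p(x) is a sum of squares iff p(x) = z(x)ᵀ Q z(x) for some positive semidefinite Gram matrix Q over a monomial basis z(x), and checking PSD-ness of a candidate Q certifies p ≥ 0. In practice Q is found by a sum-of-squares programming tool; here a hand-constructed Q is verified with NumPy:

```python
import numpy as np

# p(x) = x^4 - 2x^2 + 1, monomial basis z(x) = [1, x, x^2].
# This Q satisfies z(x)^T Q z(x) = p(x); hand-constructed for illustration.
Q = np.array([[ 1.0, 0.0, -1.0],
              [ 0.0, 0.0,  0.0],
              [-1.0, 0.0,  1.0]])

def is_psd(M, tol=1e-9):
    """A symmetric matrix is PSD iff all eigenvalues are >= 0."""
    return bool(np.all(np.linalg.eigvalsh(M) >= -tol))

def p_via_gram(x, Q):
    """Evaluate z(x)^T Q z(x) for the basis z(x) = [1, x, x^2]."""
    z = np.array([1.0, x, x * x])
    return float(z @ Q @ z)
```

If `is_psd(Q)` holds, the factorization of Q yields an explicit decomposition of p into squares (here p(x) = (x² − 1)²), which is what makes the nonnegativity claim a checkable certificate rather than an exhaustive search.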
Compact Representation of Value Function in Partially Observable Stochastic Games
Value methods for solving stochastic games with partial observability model
the uncertainty about states of the game as a probability distribution over
possible states. The dimension of this belief space is the number of states.
For many practical problems, for example in security, there are exponentially
many possible states which causes an insufficient scalability of algorithms for
real-world problems. To this end, we propose an abstraction technique that
addresses this issue of the curse of dimensionality by projecting
high-dimensional beliefs to characteristic vectors of significantly lower
dimension (e.g., marginal probabilities). Our two main contributions are (1) a
novel compact representation of the uncertainty in partially observable
stochastic games and (2) a novel algorithm over this compact representation
that builds on existing state-of-the-art algorithms for solving stochastic
games with partial observability. Experimental evaluation confirms that the new
algorithm over the compact representation dramatically increases scalability
compared to the state of the art.
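The marginal projection can be sketched as follows (an illustrative reduction for states composed of binary features; the paper's exact characteristic vectors may differ). A belief over the 2ⁿ joint states of n binary features is compressed to the n marginal probabilities:

```python
import itertools
import numpy as np

def belief_to_marginals(belief, n):
    """Illustrative abstraction step: project a belief over the 2^n joint
    states of n binary features down to the n marginal probabilities,
    reducing the representation from exponential to linear dimension."""
    # Enumerate joint states in lexicographic order: (0,...,0) to (1,...,1).
    states = itertools.product([0, 1], repeat=n)
    marginals = np.zeros(n)
    for prob, state in zip(belief, states):
        for i, bit in enumerate(state):
            if bit:
                marginals[i] += prob
    return marginals
```

The projection is lossy by design: many joint beliefs map to the same characteristic vector, which is exactly the trade-off that buys scalability.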
Anytime Point-Based Approximations for Large POMDPs
The Partially Observable Markov Decision Process has long been recognized as
a rich framework for real-world planning and control problems, especially in
robotics. However, exact solutions in this framework are typically
computationally intractable for all but the smallest problems. A well-known
technique for speeding up POMDP solving involves performing value backups at
specific belief points, rather than over the entire belief simplex. The
efficiency of this approach, however, depends greatly on the selection of
points. This paper presents a set of novel techniques for selecting informative
belief points which work well in practice. The point selection procedure is
combined with point-based value backups to form an effective anytime POMDP
algorithm called Point-Based Value Iteration (PBVI). The first aim of this
paper is to introduce this algorithm and present a theoretical analysis
justifying the choice of belief selection technique. The second aim of this
paper is to provide a thorough empirical comparison between PBVI and other
state-of-the-art POMDP methods, in particular the Perseus algorithm, in an
effort to highlight their similarities and differences. Evaluation is performed
using both standard POMDP domains and realistic robotic tasks.
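The point-based value backup at the heart of PBVI-style algorithms can be sketched for a generic tabular POMDP (notation and array layout here are my own, not taken from the paper):

```python
import numpy as np

def point_based_backup(b, Gamma, T, O, R, gamma):
    """One point-based value backup at belief b.
    T[a, s, s']: transition probabilities; O[a, s', o]: observation
    probabilities; R[a, s]: immediate reward; Gamma: current list of
    alpha vectors. Returns the best backed-up alpha vector at b."""
    A, S, _ = T.shape
    nO = O.shape[2]
    best, best_val = None, -np.inf
    for a in range(A):
        alpha_a = R[a].astype(float).copy()
        for o in range(nO):
            # g(s) = sum_{s'} T[a,s,s'] O[a,s',o] alpha(s'), per alpha vector
            g = [T[a] @ (O[a, :, o] * alpha) for alpha in Gamma]
            # keep the candidate that is best at this particular belief
            alpha_a += gamma * max(g, key=lambda v: b @ v)
        if b @ alpha_a > best_val:
            best, best_val = alpha_a, b @ alpha_a
    return best
```

Performing this backup only at a selected set of belief points, rather than over the whole simplex, is what makes the approach anytime: more points give a better value approximation at more cost.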