Search CORE

389 research outputs found

Exploiting Anonymity in Approximate Linear Programming: Scaling to Large Multiagent MDPs (Extended Version)

Author: Kochenderfer Mykel J.
Oliehoek Frans A.
Robbel Philipp
Publication venue
Publication date: 29/11/2015
Field of study

Many exact and approximate solution methods for Markov Decision Processes (MDPs) attempt to exploit structure in the problem and are based on factorization of the value function. Especially multiagent settings, however, are known to suffer from an exponential increase in value component sizes as interactions become denser, meaning that approximation architectures are restricted in the problem sizes and types they can handle. We present an approach to mitigate this limitation for certain types of multiagent systems, exploiting a property that can be thought of as "anonymous influence" in the factored MDP. Anonymous influence summarizes joint variable effects efficiently whenever the explicit representation of variable identity in the problem can be avoided. We show how representational benefits from anonymity translate into computational efficiencies, both for general variable elimination in a factor graph but in particular also for the approximate linear programming solution to factored MDPs. The latter allows to scale linear programming to factored MDPs that were previously unsolvable. Our results are shown for the control of a stochastic disease process over a densely connected graph with 50 nodes and 25 agents.Comment: Extended version of AAAI 2016 pape

arXiv.org e-Print Archive

University of Liverpool Repository

Anytime Point-Based Approximations for Large POMDPs

Author: Gordon G.
Pineau J.
Thrun S.
Publication venue: 'AI Access Foundation'
Publication date: 04/10/2011
Field of study

The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However exact solutions in this framework are typically computationally intractable for all but the smallest problems. A well-known technique for speeding up POMDP solving involves performing value backups at specific belief points, rather than over the entire belief simplex. The efficiency of this approach, however, depends greatly on the selection of points. This paper presents a set of novel techniques for selecting informative belief points which work well in practice. The point selection procedure is combined with point-based value backups to form an effective anytime POMDP algorithm called Point-Based Value Iteration (PBVI). The first aim of this paper is to introduce this algorithm and present a theoretical analysis justifying the choice of belief selection technique. The second aim of this paper is to provide a thorough empirical comparison between PBVI and other state-of-the-art POMDP methods, in particular the Perseus algorithm, in an effort to highlight their similarities and differences. Evaluation is performed using both standard POMDP domains and realistic robotic tasks

arXiv.org e-Print Archive

Crossref

Perseus: Randomized Point-based Value Iteration for POMDPs

Author: Spaan M. T. J.
Vlassis N.
Publication venue: 'AI Access Foundation'
Publication date: 09/09/2011
Field of study

Partially observable Markov decision processes (POMDPs) form an attractive and principled framework for agent planning under uncertainty. Point-based approximate techniques for POMDPs compute a policy based on a finite set of points collected in advance from the agents belief space. We present a randomized point-based value iteration algorithm called Perseus. The algorithm performs approximate value backup stages, ensuring that in each backup stage the value of each point in the belief set is improved; the key observation is that a single backup may improve the value of many belief points. Contrary to other point-based methods, Perseus backs up only a (randomly selected) subset of points in the belief set, sufficient for improving the value of each belief point in the set. We show how the same idea can be extended to dealing with continuous action spaces. Experimental results show the potential of Perseus in large scale POMDP problems

arXiv.org e-Print Archive

Crossref

Decision-Making Under Uncertainty: Beyond Probabilities

Author: Badings Thom
Jansen Nils
Simão Thiago D.
Suilen Marnix
Publication venue
Publication date: 10/03/2023
Field of study

This position paper reflects on the state-of-the-art in decision-making under uncertainty. A classical assumption is that probabilities can sufficiently capture all uncertainty in a system. In this paper, the focus is on the uncertainty that goes beyond this classical interpretation, particularly by employing a clear distinction between aleatoric and epistemic uncertainty. The paper features an overview of Markov decision processes (MDPs) and extensions to account for partial observability and adversarial behavior. These models sufficiently capture aleatoric uncertainty but fail to account for epistemic uncertainty robustly. Consequently, we present a thorough overview of so-called uncertainty models that exhibit uncertainty in a more robust interpretation. We show several solution techniques for both discrete and continuous models, ranging from formal verification, over control-based abstractions, to reinforcement learning. As an integral part of this paper, we list and discuss several key challenges that arise when dealing with rich types of uncertainty in a model-based fashion

arXiv.org e-Print Archive

Probabilistic Inference Techniques for Scalable Multiagent Decision Making

Author: Akshat KUMAR
TOUSSAINT Marc
ZILBERSTEIN Shlomo
Publication venue: 'AI Access Foundation'
Publication date: 01/06/2015
Field of study

Decentralized POMDPs provide an expressive framework for multiagent sequential decision making. However, the complexity of these models—NEXP-Complete even for two agents—has limited their scalability. We present a promising new class of approxima-tion algorithms by developing novel connections between multiagent planning and machine learning. We show how the multiagent planning problem can be reformulated as inference in a mixture of dynamic Bayesian networks (DBNs). This planning-as-inference approach paves the way for the application of efficient inference techniques in DBNs to multiagent decision making. To further improve scalability, we identify certain conditions that are sufficient to extend the approach to multiagent systems with dozens of agents. Specifically, we show that the necessary inference within the expectation-maximization framework can be decomposed into processes that often involve a small subset of agents, thereby facilitating scalability. We further show that a number of existing multiagent planning models satisfy these conditions. Experiments on large planning benchmarks confirm the benefits of our approach in terms of runtime and scalability with respect to existing techniques

CiteSeerX

Institutional Knowledge at Singapore Management University

Nonapproximability Results for Partially Observable Markov Decision Processes

Author: Goldsmith J.
Lusena C.
Mundhenk M.
Publication venue: 'AI Access Foundation'
Publication date: 01/06/2011
Field of study

We show that for several variations of partially observable Markov decision processes, polynomial-time algorithms for finding control policies are unlikely to or simply don't have guarantees of finding policies within a constant factor or a constant summand of optimal. Here "unlikely" means "unless some complexity classes collapse," where the collapses considered are P=NP, P=PSPACE, or P=EXP. Until or unless these collapses are shown to hold, any control-policy designer must choose between such performance guarantees and efficient computation

arXiv.org e-Print Archive

Crossref

Exploiting Anonymity in Approximate Linear Programming: Scaling to Large Multiagent MDPs

Author: AAAI
Kochenderfer Mykel J
Oliehoek Frans A
Robbel Philipp
Publication venue
Publication date: 01/01/2016
Field of study

The Markov Decision Process (MDP) framework is a versatile method for addressing single and multiagent sequential decision making problems. Many exact and approximate solution methods attempt to exploit struc- ture in the problem and are based on value factoriza- tion. Especially multiagent settings (MAS), however, are known to suffer from an exponential increase in value component sizes as interactions become denser, meaning that approximation architectures are overly re- stricted in the problem sizes and types they can handle. We present an approach to mitigate this limitation for certain types of MASs, exploiting a property that can be thought of as ‘anonymous influence’ in the factored MDP. In particular, we show how anonymity can lead to representational and computational efficiencies, both for general variable elimination in a factor graph but also for the approximate linear programming solution to factored MDPs. The latter allows to scale linear pro- gramming to factored MDPs that were previously un- solvable. Our results are shown for a disease control do- main over a graph with 50 nodes that are each connected with up to 15 neighbors

University of Liverpool Repository

International Migration, Integration and Social Cohesion online publications

Association for the Advancement of Artificial Intelligence: AAAI Publications

Workshop on Rich Representations for Reinforcement Learning:Held in conjunction with the 22nd International Conference on Machine Learning, August 7, 2005, Bonn, Germany

Author: Driessens Kurt
Fern Alan
van Otterlo Martijn
Publication venue: University of Bonn
Publication date: 01/01/2005
Field of study

University of Twente Research Information