124 research outputs found

MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs

Author: Charpillet François
Szer Daniel
Zilberstein Shlomo
Publication date: 01/01/2012

We present multi-agent A* (MAA*), the first complete and optimal heuristic search algorithm for solving decentralized partially-observable Markov decision problems (DEC-POMDPs) with finite horizon. The algorithm is suitable for computing optimal plans for a cooperative group of agents that operate in a stochastic environment such as multi-robot coordination, network traffic control, or distributed resource allocation. Solving such problems effectively is a major challenge in the area of planning under uncertainty. Our solution is based on a synthesis of classical heuristic search and decentralized control theory. Experimental results show that MAA* has significant advantages. We introduce an anytime variant of MAA* and conclude with a discussion of promising extensions such as an approach to solving infinite horizon problems. Comment: Appears in Proceedings of the Twenty-First Conference on Uncertainty in Artificial Intelligence (UAI2005)

arXiv.org e-Print Archive

ScholarWorks@UMass Amherst

The MADP Toolbox: An Open-Source Library for Planning and Learning in (Multi-)Agent Systems

Author: Messias JV
Oliehoek FA
Robbel P
Spaan MTJ
Terwijn B
Publication date: 01/08/2017

This article describes the MultiAgent Decision Process (MADP) toolbox, a software library to support planning and learning for intelligent agents and multiagent systems in uncertain environments. Some of its key features are that it supports partially observable environments and stochastic transition models; has unified support for single- and multiagent systems; provides a large number of models for decision-theoretic decision making, including one-shot decision making (e.g., Bayesian games) and sequential decision making under various assumptions of observability and cooperation, such as Dec-POMDPs and POSGs; provides tools and parsers to quickly prototype new problems; provides an extensive range of planning and learning algorithms for single- and multiagent systems; and is written in C++ and designed to be extensible via the object-oriented paradigm

University of Liverpool Repository

TU Delft Repository

Value-Function Approximations for Partially Observable Markov Decision Processes

Author: Hauskrecht M.
Publication venue: 'AI Access Foundation'
Publication date: 01/06/2011

Partially observable Markov decision processes (POMDPs) provide an elegant mathematical framework for modeling complex decision and planning problems in stochastic domains in which states of the system are observable only indirectly, via a set of imperfect or noisy observations. The modeling advantage of POMDPs, however, comes at a price -- exact methods for solving them are computationally very expensive and thus applicable in practice only to very simple problems. We focus on efficient approximation (heuristic) methods that attempt to alleviate the computational problem and trade off accuracy for speed. We have two objectives here. First, we survey various approximation methods, analyze their properties and relations and provide some new insights into their differences. Second, we present a number of new approximation methods and novel refinements of existing techniques. The theoretical results are supported by experiments on a problem from the agent navigation domain

arXiv.org e-Print Archive

Crossref

Computing Convex Coverage Sets for Faster Multi-objective Coordination

Author: Oliehoek Frans
Roijers Diederik M
Whiteson Shimon
Publication venue: 'AI Access Foundation'
Publication date: 01/01/2015

In this article, we propose new algorithms for multi-objective coordination graphs (MO-CoGs). Key to the efficiency of these algorithms is that they compute a convex coverage set (CCS) instead of a Pareto coverage set (PCS). Not only is a CCS a sufficient solution set for a large class of problems, it also has important characteristics that facilitate more efficient solutions. We propose two main algorithms for computing a CCS in MO-CoGs. Convex multi-objective variable elimination (CMOVE) computes a CCS by performing a series of agent eliminations, which can be seen as solving a series of local multi-objective subproblems. Variable elimination linear support (VELS) iteratively identifies the single weight vector w that can lead to the maximal possible improvement on a partial CCS and calls variable elimination to solve a scalarized instance of the problem for w. VELS is faster than CMOVE for small and medium numbers of objectives and can compute an ε-approximate CCS in a fraction of the runtime. In addition, we propose variants of these methods that employ AND/OR tree search instead of variable elimination to achieve memory efficiency. We analyze the runtime and space complexities of these methods, prove their correctness, and compare them empirically against a naive baseline and an existing PCS method, both in terms of memory-usage and runtime. Our results show that, by focusing on the CCS, these methods achieve much better scalability in the number of agents than the current state of the art
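
The difference between a Pareto coverage set and a convex coverage set, as named in the abstract above, can be illustrated on a toy list of two-objective value vectors. The sketch below is not CMOVE or VELS and does not touch coordination graphs; it assumes the candidate vectors are already enumerated, assumes linear scalarization with weights (w, 1 - w), and approximates the CCS by sweeping w over a grid. The function names and the `steps` parameter are hypothetical.

```python
from typing import List, Tuple

Vec = Tuple[float, float]


def pareto_coverage_set(vectors: List[Vec]) -> List[Vec]:
    """Keep every vector that no other vector Pareto-dominates."""
    def dominates(a: Vec, b: Vec) -> bool:
        # a is at least as good everywhere and strictly better somewhere
        return all(x >= y for x, y in zip(a, b)) and any(
            x > y for x, y in zip(a, b)
        )

    return [v for v in vectors if not any(dominates(u, v) for u in vectors)]


def convex_coverage_set(vectors: List[Vec], steps: int = 100) -> List[Vec]:
    """Approximate CCS: vectors that maximize w*v[0] + (1-w)*v[1]
    for at least one weight w on a uniform grid over [0, 1].
    Assumption: a grid sweep, so very narrow optimality regions can be missed."""
    chosen: List[Vec] = []
    for i in range(steps + 1):
        w = i / steps
        best = max(vectors, key=lambda v: w * v[0] + (1 - w) * v[1])
        if best not in chosen:
            chosen.append(best)
    return chosen
```

On the vectors (0, 10), (10, 0), (4, 5) and (3, 3), the vector (4, 5) is Pareto-undominated but is never the best choice under any linear weighting, so it belongs to the PCS and not to the CCS. This is the sense in which a CCS can be smaller than a PCS.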

University of Liverpool Repository

CiteSeerX

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Mixed Integer Linear Programming For Exact Finite-Horizon Planning In Decentralized Pomdps

Author: Aras Raghav
Charpillet François
Dutech Alain
Publication date: 17/07/2007

We consider the problem of finding an n-agent joint-policy for the optimal finite-horizon control of a decentralized Pomdp (Dec-Pomdp). This is a problem of very high complexity (NEXP-hard in n >= 2). In this paper, we propose a new mathematical programming approach for the problem. Our approach is based on two ideas: First, we represent each agent's policy in the sequence-form and not in the tree-form, thereby obtaining a very compact representation of the set of joint-policies. Second, using this compact representation, we solve this problem as an instance of combinatorial optimization for which we formulate a mixed integer linear program (MILP). The optimal solution of the MILP directly yields an optimal joint-policy for the Dec-Pomdp. Computational experience shows that formulating and solving the MILP requires significantly less time to solve benchmark Dec-Pomdp problems than existing algorithms. For example, the multi-agent tiger problem for horizon 4 is solved in 72 secs with the MILP whereas existing algorithms require several hours to solve it

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

Contingent planning under uncertainty via stochastic satisfiability

Author: Littman Michael L.
Majercik Stephen M.
Publication venue: Bowdoin Digital Commons
Publication date: 01/07/2003

We describe a new planning technique that efficiently solves probabilistic propositional contingent planning problems by converting them into instances of stochastic satisfiability (SSAT) and solving these problems instead. We make fundamental contributions in two areas: the solution of SSAT problems and the solution of stochastic planning problems. This is the first work extending the planning-as-satisfiability paradigm to stochastic domains. Our planner, ZANDER, can solve arbitrary, goal-oriented, finite-horizon partially observable Markov decision processes (POMDPs). An empirical study comparing ZANDER to seven other leading planners shows that its performance is competitive on a range of problems. © 2003 Elsevier Science B.V. All rights reserved

Bowdoin College

Elsevier - Publisher Connector

Computing Convex Coverage Sets for Multi-Objective Coordination Graphs

Author: Oliehoek F.A.
Roijers D.M.
Whiteson S.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

International Migration, Integration and Social Cohesion online publications

Optimal and Approximate Q-value Functions for Decentralized POMDPs

Author: Oliehoek Frans A.
Spaan Matthijs T. J.
Vlassis Nikos
Publication venue: 'AI Access Foundation'
Publication date: 01/01/2008

Decision-theoretic planning is a popular approach to sequential decision making problems, because it treats uncertainty in sensing and acting in a principled way. In single-agent frameworks like MDPs and POMDPs, planning can be carried out by resorting to Q-value functions: an optimal Q-value function Q* is computed in a recursive manner by dynamic programming, and then an optimal policy is extracted from Q*. In this paper we study whether similar Q-value functions can be defined for decentralized POMDP models (Dec-POMDPs), and how policies can be extracted from such value functions. We define two forms of the optimal Q-value function for Dec-POMDPs: one that gives a normative description as the Q-value function of an optimal pure joint policy and another one that is sequentially rational and thus gives a recipe for computation. This computation, however, is infeasible for all but the smallest problems. Therefore, we analyze various approximate Q-value functions that allow for efficient computation. We describe how they relate, and we prove that they all provide an upper bound to the optimal Q-value function Q*. Finally, unifying some previous approaches for solving Dec-POMDPs, we describe a family of algorithms for extracting policies from such Q-value functions, and perform an experimental evaluation on existing test problems, including a new firefighting benchmark problem
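
The single-agent case that the abstract above starts from (compute Q* recursively by dynamic programming, then extract a policy from it) can be sketched for a small, fully observable, finite-horizon MDP. This is only the MDP baseline, not the Dec-POMDP Q-value functions studied in the paper. The dictionary layout for the transition model `T` and the reward model `R`, and both function names, are assumptions made for illustration.

```python
from typing import Dict, List

State = str
Action = str


def finite_horizon_q(
    states: List[State],
    actions: List[Action],
    T: Dict[State, Dict[Action, Dict[State, float]]],  # T[s][a][s2] = P(s2 | s, a)
    R: Dict[State, Dict[Action, float]],               # R[s][a] = immediate reward
    horizon: int,
) -> List[Dict[State, Dict[Action, float]]]:
    """Backward induction; result[t][s][a] is the optimal Q-value at step t."""
    V = {s: 0.0 for s in states}  # value after the final step
    q_by_stage = []
    for _ in range(horizon):
        Q = {
            s: {
                a: R[s][a] + sum(p * V[s2] for s2, p in T[s][a].items())
                for a in actions
            }
            for s in states
        }
        V = {s: max(Q[s].values()) for s in states}
        q_by_stage.append(Q)
    q_by_stage.reverse()  # index 0 now corresponds to the first decision step
    return q_by_stage


def greedy_policy(Q: Dict[State, Dict[Action, float]]) -> Dict[State, Action]:
    """Extract a policy by picking the highest-valued action in each state."""
    return {s: max(Q[s], key=Q[s].get) for s in Q}
```

Calling `greedy_policy` on each stage of the returned list gives a time-dependent policy for the MDP. In the decentralized setting the abstract describes, agents lack access to the joint state, which is why this recursion does not carry over directly.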

arXiv.org e-Print Archive

CiteSeerX

University of Liverpool Repository

Crossref

Open Repository and Bibliography - Luxembourg

UvA-DARE

International Migration, Integration and Social Cohesion online publications

MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs

Author: Charpillet François
Szer Daniel
Zilberstein Shlomo
Publication venue: HAL CCSD
Publication date: 26/07/2005

We present multi-agent A* (MAA*), the first complete and optimal heuristic search algorithm for solving decentralized partially-observable Markov decision problems (DEC-POMDPs) with finite horizon. The algorithm is suitable for computing optimal plans for a cooperative group of agents that operate in a stochastic environment such as multi-robot coordination, network traffic control, or distributed resource allocation. Solving such problems effectively is a major challenge in the area of planning under uncertainty. Our solution is based on a synthesis of classical heuristic search and decentralized control theory. Experimental results show that MAA* has significant advantages. We introduce an anytime variant of MAA* and conclude with a discussion of promising extensions such as an approach to solving infinite horizon problems

INRIA a CCSD electronic archive server

Decision-Theoretic Planning Under Anonymity in Agent Populations

Author: Chen Yingke
Doshi Prashant
Sonu Ekhlas
Publication venue: 'AI Access Foundation'
Publication date: 24/08/2017

Crossref

Teesside University's Research Repository