
    Producing efficient error-bounded solutions for transition independent decentralized MDPs

    There has been substantial progress on algorithms for single-agent sequential decision-making problems represented as partially observable Markov decision processes (POMDPs). A number of efficient POMDP algorithms share two desirable properties: error bounds and fast convergence rates. Despite significant efforts, no algorithms for solving decentralized POMDPs benefit from both properties, leading to either poor solution quality or limited scalability. This paper presents the first approach for solving transition-independent decentralized Markov decision processes (MDPs) that inherits these properties. Two related algorithms illustrate this approach. The first recasts the original problem as a finite-horizon, deterministic, and completely observable Markov decision process; in this form, the original problem is solved by combining heuristic search with constraint optimization to quickly converge to a near-optimal policy. This algorithm also provides the foundation for the first algorithm for solving infinite-horizon transition-independent decentralized MDPs. We demonstrate that both methods outperform state-of-the-art algorithms by multiple orders of magnitude, and that for infinite-horizon decentralized MDPs, the algorithm constructs more concise policies by searching cyclic policy graphs.
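    The recast problem is a finite-horizon, deterministic, fully observable MDP, which lends itself to classical heuristic search. As a hedged illustration of that ingredient (not the paper's algorithm, which additionally couples the search with constraint optimization), the sketch below runs best-first search guided by an admissible upper bound; the `successors` and `upper_bound` callables are assumptions of this sketch.

```python
import heapq
import itertools

def best_first_search(start, horizon, successors, upper_bound):
    """Best-first search on a finite-horizon deterministic MDP.

    successors(state) -> iterable of (action, next_state, reward) triples.
    upper_bound(state, steps_left) -> optimistic (admissible) estimate of
        the reward still achievable from `state`; both callables are
        assumptions of this sketch.
    Returns the best total reward found and the corresponding plan.
    """
    tie = itertools.count()  # tie-breaker so heapq never compares states
    best_value, best_plan = float("-inf"), []
    frontier = [(-upper_bound(start, horizon), next(tie), start, 0, 0.0, [])]
    while frontier:
        neg_f, _, state, t, g, plan = heapq.heappop(frontier)
        if -neg_f <= best_value:
            break  # optimistic value can't beat the incumbent; search is done
        if t == horizon:
            if g > best_value:
                best_value, best_plan = g, plan  # new incumbent solution
            continue
        for action, nxt, reward in successors(state):
            g2 = g + reward
            f = g2 + upper_bound(nxt, horizon - t - 1)
            if f > best_value:  # prune branches that provably can't win
                heapq.heappush(
                    frontier, (-f, next(tie), nxt, t + 1, g2, plan + [action]))
    return best_value, best_plan
```

    Because the frontier is expanded in decreasing order of the optimistic value, the gap between the incumbent and the best frontier value gives an error bound at any point, which is what makes this style of search anytime and error-bounded.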

    A CASE FOR DOMAIN-INDEPENDENT DETERMINISTIC MULTIAGENT PLANNING

    The notion of planning with multiple agents has been around since the very beginning of planning itself. It has been approached from various viewpoints, especially in the multiagent systems community. Recently, domain-independent multiagent planning has also gained attention in the automated planning community. In this paper, we briefly present the current state of the art, question some aspects of the research field, and discuss emerging challenges.

    Minimizing communication cost in a distributed Bayesian network using a decentralized MDP


    Perseus: Randomized Point-based Value Iteration for POMDPs

    Partially observable Markov decision processes (POMDPs) form an attractive and principled framework for agent planning under uncertainty. Point-based approximate techniques for POMDPs compute a policy based on a finite set of points collected in advance from the agent's belief space. We present a randomized point-based value iteration algorithm called Perseus. The algorithm performs approximate value backup stages, ensuring that in each backup stage the value of each point in the belief set is improved; the key observation is that a single backup may improve the value of many belief points. In contrast to other point-based methods, Perseus backs up only a (randomly selected) subset of points in the belief set, sufficient for improving the value of each belief point in the set. We show how the same idea can be extended to deal with continuous action spaces. Experimental results show the potential of Perseus in large-scale POMDP problems.
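    The key observation above translates directly into a short update loop. Below is a minimal sketch of a single Perseus backup stage, assuming the standard point-based POMDP backup operator is supplied externally (the `backup` callable and the variable names are illustrative, not the authors' implementation):

```python
import random
import numpy as np

def perseus_stage(beliefs, V, backup):
    """One Perseus value-update stage.

    beliefs : list of belief points (1-D numpy arrays).
    V       : current value function as a non-empty list of alpha vectors.
    backup  : callable backup(b, V) -> new alpha vector; the standard
              point-based POMDP backup, supplied externally in this sketch.
    Returns a new value function under which no belief's value decreases,
    even though only a random subset of points is actually backed up.
    """
    def value(b, vecs):
        return max(float(np.dot(a, b)) for a in vecs)

    V_new = []
    todo = list(beliefs)  # points whose value V_new has not yet improved
    while todo:
        b = random.choice(todo)
        alpha = backup(b, V)
        if np.dot(alpha, b) >= value(b, V):
            V_new.append(alpha)  # the fresh backup improves b's value
        else:
            # Keep b's old maximizing vector so its value never drops.
            V_new.append(max(V, key=lambda a: float(np.dot(a, b))))
        # Key step: a single backup may improve many points, so drop every
        # belief already covered by the partially built V_new.
        todo = [b2 for b2 in todo if value(b2, V_new) < value(b2, V)]
    return V_new
```

    Because each stage stops as soon as every belief point's value has improved, typically only a small random subset of points is backed up, which is the source of Perseus's speed.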

    Influence-Optimistic Local Values for Multiagent Planning --- Extended Version

    Recent years have seen the development of methods for multiagent planning under uncertainty that scale to tens or even hundreds of agents. However, most of these methods either make restrictive assumptions on the problem domain, or provide approximate solutions without any guarantees on quality. Methods in the former category typically build on heuristic search using upper bounds on the value function. Unfortunately, no techniques exist to compute such upper bounds for problems with non-factored value functions. To allow for meaningful benchmarking through measurable quality guarantees on a very general class of problems, this paper introduces a family of influence-optimistic upper bounds for factored decentralized partially observable Markov decision processes (Dec-POMDPs) that do not have factored value functions. Intuitively, we derive bounds on very large multiagent planning problems by subdividing them into sub-problems and, at each sub-problem, making optimistic assumptions with respect to the influence that will be exerted by the rest of the system. We numerically compare the different upper bounds and demonstrate how we can achieve a non-trivial guarantee that a heuristic solution for problems with hundreds of agents is close to optimal. Furthermore, we provide evidence that the upper bounds may improve the effectiveness of heuristic influence search, and discuss further potential applications to multiagent planning. (Comment: long version of an IJCAI 2015 paper, with an extended abstract at AAMAS 2015.)
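    The intuition behind the bound can be sketched in a few lines. The fragment below is a hypothetical illustration, not the paper's construction: it assumes the candidate influences on each sub-problem can simply be enumerated and each influence-fixed sub-problem solved exactly, whereas the paper works with structured influences in factored Dec-POMDPs.

```python
def influence_optimistic_bound(subproblems, influences, solve):
    """Hypothetical sketch of an influence-optimistic upper bound.

    subproblems : local planning problems (e.g. one per cluster of agents).
    influences  : candidate influences the rest of the system could exert
                  on a sub-problem (assumed enumerable in this toy version).
    solve(sub, infl) : optimal local value of `sub` with the external
                  influence fixed to `infl`.

    Each sub-problem is evaluated under its most favorable influence; no
    joint policy can do better for every sub-problem simultaneously, so
    the sum upper-bounds the optimal value of the full problem.
    """
    return sum(max(solve(sub, infl) for infl in influences)
               for sub in subproblems)
```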

    ConTaCT: Deciding to Communicate during Time-Critical Collaborative Tasks in Unknown, Deterministic Domains

    Communication between agents has the potential to improve team performance on collaborative tasks. However, communication is not free in most domains, requiring agents to reason about the costs and benefits of sharing information. In this work, we develop an online, decentralized communication policy, ConTaCT, that enables agents to decide whether or not to communicate during time-critical collaborative tasks in unknown, deterministic environments. Our approach is motivated by real-world applications, including the coordination of disaster-response and search-and-rescue teams. These settings motivate a model structure that explicitly represents the world model as initially unknown but deterministic in nature, and that de-emphasizes uncertainty about action outcomes. In simulated experiments comparing ConTaCT with other multi-agent communication policies, ConTaCT achieves comparable task performance while substantially reducing communication overhead.
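    At its core, an agent following a ConTaCT-style policy runs a cost-benefit test each time it obtains new information. The sketch below is an illustrative reduction of that decision rule; the value estimates would come from the agent's model of its teammates, and the names are assumptions of this sketch, not the paper's API.

```python
def should_communicate(value_if_shared, value_if_silent, message_cost):
    """Cost-benefit test at the heart of a ConTaCT-style policy (sketch).

    value_if_shared : agent's estimate of expected team performance if it
                      transmits its new observation to teammates.
    value_if_silent : the same estimate if it stays silent.
    message_cost    : fixed cost of sending the message.
    All quantities are assumed to come from the agent's (deterministic but
    initially unknown) world model; the names are illustrative.
    """
    return (value_if_shared - value_if_silent) > message_cost

# Example: a 2.5-point expected gain justifies a message costing 1.0.
assert should_communicate(10.0, 7.5, 1.0)
```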