371 research outputs found
Influence-Optimistic Local Values for Multiagent Planning --- Extended Version
Recent years have seen the development of methods for multiagent planning
under uncertainty that scale to tens or even hundreds of agents. However, most
of these methods either make restrictive assumptions on the problem domain, or
provide approximate solutions without any guarantees on quality. Methods in the
former category typically build on heuristic search using upper bounds on the
value function. Unfortunately, no techniques exist to compute such upper bounds
for problems with non-factored value functions. To allow for meaningful
benchmarking through measurable quality guarantees on a very general class of
problems, this paper introduces a family of influence-optimistic upper bounds
for factored decentralized partially observable Markov decision processes
(Dec-POMDPs) that do not have factored value functions. Intuitively, we derive
bounds on very large multiagent planning problems by subdividing them in
sub-problems, and at each of these sub-problems making optimistic assumptions
with respect to the influence that will be exerted by the rest of the system.
We numerically compare the different upper bounds and demonstrate how we can
achieve a non-trivial guarantee that a heuristic solution for problems with
hundreds of agents is close to optimal. Furthermore, we provide evidence that
the upper bounds may improve the effectiveness of heuristic influence search,
and discuss further potential applications to multiagent planning.Comment: Long version of IJCAI 2015 paper (and extended abstract at AAMAS
2015
Scalable Planning and Learning for Multiagent POMDPs: Extended Version
Online, sample-based planning algorithms for POMDPs have shown great promise
in scaling to problems with large state spaces, but they become intractable for
large action and observation spaces. This is particularly problematic in
multiagent POMDPs where the action and observation space grows exponentially
with the number of agents. To combat this intractability, we propose a novel
scalable approach based on sample-based planning and factored value functions
that exploits structure present in many multiagent settings. This approach
applies not only in the planning case, but also in the Bayesian reinforcement
learning setting. Experimental results show that we are able to provide high
quality solutions to large multiagent planning and learning problems
Exploiting Anonymity in Approximate Linear Programming: Scaling to Large Multiagent MDPs (Extended Version)
Many exact and approximate solution methods for Markov Decision Processes
(MDPs) attempt to exploit structure in the problem and are based on
factorization of the value function. Especially multiagent settings, however,
are known to suffer from an exponential increase in value component sizes as
interactions become denser, meaning that approximation architectures are
restricted in the problem sizes and types they can handle. We present an
approach to mitigate this limitation for certain types of multiagent systems,
exploiting a property that can be thought of as "anonymous influence" in the
factored MDP. Anonymous influence summarizes joint variable effects efficiently
whenever the explicit representation of variable identity in the problem can be
avoided. We show how representational benefits from anonymity translate into
computational efficiencies, both for general variable elimination in a factor
graph but in particular also for the approximate linear programming solution to
factored MDPs. The latter allows to scale linear programming to factored MDPs
that were previously unsolvable. Our results are shown for the control of a
stochastic disease process over a densely connected graph with 50 nodes and 25
agents.Comment: Extended version of AAAI 2016 pape
Scaling POMDPs For Selecting Sellers in E-markets-Extended Version
In multiagent e-marketplaces, buying agents need to select good sellers by querying other buyers (called advisors). Partially Observable Markov Decision Processes (POMDPs) have shown to be an effective framework for optimally selecting sellers by selectively querying advisors. However, current solution methods do not scale to hundreds or even tens of agents operating in the e-market. In this paper, we propose the Mixture of POMDP Experts (MOPE) technique, which exploits the inherent structure of trust-based domains, such as the seller selection problem in e-markets, by aggregating the solutions of smaller sub-POMDPs. We propose a number of variants of the MOPE approach that we analyze theoretically and empirically. Experiments show that MOPE can scale up to a hundred agents thereby leveraging the presence of more advisors to significantly improve buyer satisfaction
Integrating Human-Provided Information Into Belief State Representation Using Dynamic Factorization
In partially observed environments, it can be useful for a human to provide
the robot with declarative information that represents probabilistic relational
constraints on properties of objects in the world, augmenting the robot's
sensory observations. For instance, a robot tasked with a search-and-rescue
mission may be informed by the human that two victims are probably in the same
room. An important question arises: how should we represent the robot's
internal knowledge so that this information is correctly processed and combined
with raw sensory information? In this paper, we provide an efficient belief
state representation that dynamically selects an appropriate factoring,
combining aspects of the belief when they are correlated through information
and separating them when they are not. This strategy works in open domains, in
which the set of possible objects is not known in advance, and provides
significant improvements in inference time over a static factoring, leading to
more efficient planning for complex partially observed tasks. We validate our
approach experimentally in two open-domain planning problems: a 2D discrete
gridworld task and a 3D continuous cooking task. A supplementary video can be
found at http://tinyurl.com/chitnis-iros-18.Comment: IROS 2018 final versio
A Scalable Framework to Choose Sellers in E-Marketplaces Using POMDPs
In multiagent e-marketplaces, buying agents need to select good sellers by querying other buyers (called advisors). Partially Observable Markov Decision Processes (POMDPs) have shown to be an effective framework for optimally selecting sellers by selectively querying advisors. However, current solution methods do not scale to hundreds or even tens of agents operating in the e-market. In this paper, we propose the Mixture of POMDP Experts (MOPE) technique, which exploits the inherent structure of trust-based domains, such as the seller selection problem in e-markets, by aggregating the solutions of smaller sub-POMDPs. We propose a number of variants of the MOPE approach that we analyze theoretically and empirically. Experiments show that MOPE can scale up to a hundred agents thereby leveraging the presence of more advisors to significantly improve buyer satisfaction
- …