Task-Based Information Compression for Multi-Agent Communication Problems with Channel Rate Constraints
A collaborative task is assigned to a multiagent system (MAS) in which agents
are allowed to communicate. The MAS runs over an underlying Markov decision
process and its task is to maximize the averaged sum of discounted one-stage
rewards. Although knowing the global state of the environment is necessary for
the optimal action selection of the MAS, agents are limited to individual
observations. Inter-agent communication can mitigate this local
observability; however, its limited rate prevents each agent from
acquiring precise global state information. To
overcome this challenge, agents need to communicate their observations
compactly, so that the MAS sacrifices as little of the sum of rewards as possible.
We show that this problem is equivalent to a form of rate-distortion problem,
which we call task-based information compression. We introduce a scheme for
task-based information compression titled State aggregation for information
compression (SAIC), for which a state aggregation algorithm is analytically
designed. The SAIC is shown to be capable of achieving near-optimal performance
in terms of the achieved sum of discounted rewards. The proposed algorithm is
applied to a rendezvous problem and its performance is compared with several
benchmarks. Numerical experiments confirm the superiority of the proposed
algorithm.
Comment: 13 pages, 9 figures
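The idea of aggregating states so that they fit a limited message alphabet can be sketched as follows. This is an illustration under stated assumptions, not the SAIC algorithm itself: the abstract does not say which criterion drives the aggregation, so grouping states by a per-state value estimate is an assumption made here, and `aggregate_states`, `values`, and `rate_bits` are hypothetical names.

```python
def aggregate_states(values, rate_bits):
    """Map each state index to one of 2**rate_bits message symbols.

    Assumption: states are grouped by uniformly quantizing a per-state
    value estimate, so states with similar value share one symbol.
    This is an illustrative stand-in, not the SAIC aggregation rule.
    """
    num_symbols = 2 ** rate_bits
    lo, hi = min(values), max(values)
    if hi == lo:
        # All states look alike: a single symbol is enough.
        return [0] * len(values)
    width = (hi - lo) / num_symbols
    labels = []
    for v in values:
        idx = int((v - lo) / width)
        # The maximum value falls on the upper edge; clamp it into the last bin.
        labels.append(min(idx, num_symbols - 1))
    return labels


# Five states with hypothetical value estimates, sent over a 1-bit channel.
state_values = [0.0, 0.1, 0.9, 1.0, 0.5]
labels = aggregate_states(state_values, rate_bits=1)
print(labels)
```

With a rate of `rate_bits` bits per message, an agent can distinguish at most `2**rate_bits` groups of states, which is why the number of groups is tied to the channel rate in this sketch.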
Decentralization of Multiagent Policies by Learning What to Communicate
Effective communication is required for teams of robots to solve
sophisticated collaborative tasks. In practice it is typical for both the
encoding and semantics of communication to be manually defined by an expert;
this is true regardless of whether the behaviors themselves are bespoke,
optimization-based, or learned. We present an agent architecture and training
methodology using neural networks to learn task-oriented communication
semantics based on the example of a communication-unaware expert policy. A
perimeter defense game illustrates the system's ability to handle dynamically
changing numbers of agents and its graceful degradation in performance as
communication constraints are tightened or the expert's observability
assumptions are broken.
Comment: 7 pages
Exploiting Anonymity in Approximate Linear Programming: Scaling to Large Multiagent MDPs (Extended Version)
Many exact and approximate solution methods for Markov Decision Processes
(MDPs) attempt to exploit structure in the problem and are based on
factorization of the value function. Multiagent settings in particular, however,
are known to suffer from an exponential increase in value component sizes as
interactions become denser, meaning that approximation architectures are
restricted in the problem sizes and types they can handle. We present an
approach to mitigate this limitation for certain types of multiagent systems,
exploiting a property that can be thought of as "anonymous influence" in the
factored MDP. Anonymous influence summarizes joint variable effects efficiently
whenever the explicit representation of variable identity in the problem can be
avoided. We show how representational benefits from anonymity translate into
computational efficiencies, both for general variable elimination in a factor
graph and, in particular, for the approximate linear programming solution to
factored MDPs. The latter allows linear programming to scale to factored MDPs
that were previously unsolvable. Our results are shown for the control of a
stochastic disease process over a densely connected graph with 50 nodes and 25
agents.
Comment: Extended version of AAAI 2016 paper
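The representational saving from avoiding variable identity can be illustrated with a small sketch. It assumes a hypothetical infection rule in which a node's infection probability depends only on how many of its neighbors are infected, not on which ones; `infection_prob`, `beta`, and both table builders are illustrative names and do not come from the paper.

```python
from itertools import product


def infection_prob(num_infected_neighbors, beta=0.1):
    """Hypothetical rule: each infected neighbor transmits independently."""
    return 1.0 - (1.0 - beta) ** num_infected_neighbors


def explicit_table(n, beta=0.1):
    """One entry per joint assignment of n binary neighbors: 2**n entries."""
    return {
        assignment: infection_prob(sum(assignment), beta)
        for assignment in product((0, 1), repeat=n)
    }


def count_table(n, beta=0.1):
    """One entry per possible count of infected neighbors: n + 1 entries."""
    return [infection_prob(k, beta) for k in range(n + 1)]


n = 10
explicit = explicit_table(n)
counts = count_table(n)
print(len(explicit), len(counts))
```

Both tables encode the same function, since every joint assignment can be looked up by its count of infected neighbors, but the count-based table grows linearly in the number of neighbors while the explicit one grows exponentially.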