5 research outputs found

Decentralized multi-agent reinforcement learning in average-reward dynamic DCOPs

Author: LAU Hoong Chuin
Nguyen Duc Thien
YEOH William
ZHANG Chongjie
Zilberstein Shlomo
Publication venue: Association for the Advancement of Artificial Intelligence (AAAI)
Publication date: 01/05/2014

Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multi-agent coordination problems, where a dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in other time steps, which might not hold in some applications. Therefore, in this paper, we make the following contributions: (i) We introduce a new model, called Markovian Dynamic DCOPs (MD-DCOPs), where the DCOP in the next time step is a function of the value assignments in the current time step; (ii) We introduce two distributed reinforcement learning algorithms, the Distributed RVI Q-learning algorithm and the Distributed R-learning algorithm, that balance exploration and exploitation to solve MD-DCOPs in an online manner; and (iii) We empirically evaluate them against an existing multi-arm bandit DCOP algorithm on dynamic DCOPs.

Institutional Knowledge at Singapore Management University

Association for the Advancement of Artificial Intelligence: AAAI Publications

Incremental DCOP Search Algorithms for Solving Dynamic DCOP Problems

Author: KOENIG Sven
VARAKANTHAM Pradeep
SUN Xiaoxun
YEOH William
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/12/2015

Crossref

Institutional Knowledge at Singapore Management University

Incremental DCOP Search Algorithms for Solving Dynamic DCOP Problems

Author: KOENIG Sven
SUN Xiaoxun
VARAKANTHAM Pradeep
YEOH William
Publication venue: IFAAMAS
Publication date: 01/01/2011

Distributed constraint optimization problems (DCOPs) are well-suited for modeling multi-agent coordination problems. However, most research has focused on developing algorithms for solving static DCOPs. In this paper, we model dynamic DCOPs as sequences of (static) DCOPs with changes from one DCOP to the next one in the sequence. We introduce the ReuseBounds procedure, which can be used by any-space ADOPT and any-space BnB-ADOPT to find cost-minimal solutions for all DCOPs in the sequence faster than by solving each DCOP individually. This procedure allows those agents that are guaranteed to remain unaffected by a change to reuse their lower and upper bounds from the previous DCOP when solving the next one in the sequence. Our experimental results show that the speedup gained from this procedure increases with the amount of memory the agents have available.

CiteSeerX

Institutional Knowledge at Singapore Management University

Automatic construction, maintenance, and optimization of dynamic agent organizations

Author: Sultanik Evan Andrew
Publication venue: Drexel University

The goal of this dissertation is to generate organizational structures that increase the overall performance of a multiagent coalition, subject to the system's complex coordination requirements and maintenance of a certain operating point. To this end, a generalized framework capable of producing distributed approximation algorithms based on the new concept of multidirectional graph search is proposed and applied to a family of connectivity problems. It is shown that a wide variety of seemingly unrelated multiagent organization problems live within this family. Sufficient conditions are identified in which the approach is guaranteed to discover a solution that is within a constant factor of the cost of the optimal solution. The procedure is guaranteed to require no more than linear (and in some well-defined cases logarithmic) communication rounds. A number of examples are given as to how the framework can be applied to create, maintain, and optimize multiagent organizations in the context of real world problems. Finally, algorithmic extensions are introduced that allow for the framework to handle problems in which the agent topology and/or coordination constraints are dynamic, without significant consequences to the general runtime, memory, and quality guarantees. Ph.D., Computer Science -- Drexel University, 201

Drexel Libraries E-Repository and Archives