5 research outputs found

    Decentralized multi-agent reinforcement learning in average-reward dynamic DCOPs

    Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multi-agent coordination problems, where a dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in other time steps, which might not hold in some applications. Therefore, in this paper, we make the following contributions: (i) We introduce a new model, called Markovian Dynamic DCOPs (MD-DCOPs), where the DCOP in the next time step is a function of the value assignments in the current time step; (ii) We introduce two distributed reinforcement learning algorithms, the Distributed RVI Q-learning algorithm and the Distributed R-learning algorithm, that balance exploration and exploitation to solve MD-DCOPs in an online manner; and (iii) We empirically evaluate them against an existing multi-armed bandit DCOP algorithm on dynamic DCOPs.
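
    The abstract names R-learning, an average-reward reinforcement learning method. As a rough illustration of the underlying idea only, the sketch below shows one common single-agent, tabular variant of R-learning with epsilon-greedy exploration. It is not the paper's Distributed R-learning algorithm, which the abstract does not specify. The function names, the `step` callback, the toy two-state environment, and all parameter values are assumptions made for this example.

```python
import random


def r_learning(step, n_states, n_actions, n_steps=50000,
               alpha=0.1, beta=0.01, epsilon=0.2, seed=0):
    """Single-agent tabular R-learning (average-reward setting).

    `step(state, action)` is a hypothetical environment callback that
    returns (reward, next_state). This is an illustrative sketch, not
    the distributed algorithm from the paper.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    rho = 0.0   # running estimate of the average reward per step
    state = 0
    for _ in range(n_steps):
        greedy = max(range(n_actions), key=lambda a: q[state][a])
        # Epsilon-greedy: explore with probability epsilon, else exploit.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = greedy
        reward, nxt = step(state, action)
        best_here = max(q[state])
        best_next = max(q[nxt])
        # Relative-value update: rewards are measured against rho.
        q[state][action] += alpha * (reward - rho + best_next - q[state][action])
        # The average-reward estimate is adjusted only on greedy steps.
        if action == greedy:
            rho += beta * (reward - rho + best_next - best_here)
        state = nxt
    return q, rho


def toy_step(state, action):
    """Hypothetical deterministic two-state environment.

    Staying in state 0 pays 1 per step; the cycle 0 -> 1 -> 0 pays 0
    and then 3, for a higher average reward of 1.5 per step.
    """
    if state == 0:
        return (1.0, 0) if action == 0 else (0.0, 1)
    return (0.0, 0) if action == 0 else (3.0, 0)


if __name__ == "__main__":
    q_table, avg_reward = r_learning(toy_step, n_states=2, n_actions=2)
    print("estimated average reward:", avg_reward)
    print("Q-table:", q_table)
```

    In this toy problem the learner has to give up the immediate reward of staying in state 0 in order to find the better long-run cycle, which is the kind of exploration and exploitation trade-off the abstract refers to.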

    Incremental DCOP Search Algorithms for Solving Dynamic DCOP Problems

    Distributed constraint optimization problems (DCOPs) are well-suited for modeling multi-agent coordination problems. However, most research has focused on developing algorithms for solving static DCOPs. In this paper, we model dynamic DCOPs as sequences of (static) DCOPs with changes from one DCOP to the next one in the sequence. We introduce the ReuseBounds procedure, which can be used by any-space ADOPT and any-space BnB-ADOPT to find cost-minimal solutions for all DCOPs in the sequence faster than by solving each DCOP individually. This procedure allows those agents that are guaranteed to remain unaffected by a change to reuse their lower and upper bounds from the previous DCOP when solving the next one in the sequence. Our experimental results show that the speedup gained from this procedure increases with the amount of memory the agents have available.

    Automatic construction, maintenance, and optimization of dynamic agent organizations

    The goal of this dissertation is to generate organizational structures that increase the overall performance of a multiagent coalition, subject to the system's complex coordination requirements and maintenance of a certain operating point. To this end, a generalized framework capable of producing distributed approximation algorithms based on the new concept of multidirectional graph search is proposed and applied to a family of connectivity problems. It is shown that a wide variety of seemingly unrelated multiagent organization problems live within this family. Sufficient conditions are identified in which the approach is guaranteed to discover a solution that is within a constant factor of the cost of the optimal solution. The procedure is guaranteed to require no more than linear (and in some well-defined cases logarithmic) communication rounds. A number of examples are given as to how the framework can be applied to create, maintain, and optimize multiagent organizations in the context of real-world problems. Finally, algorithmic extensions are introduced that allow for the framework to handle problems in which the agent topology and/or coordination constraints are dynamic, without significant consequences to the general runtime, memory, and quality guarantees. Ph.D., Computer Science -- Drexel University, 201