11 research outputs found

    Embedding Preference Elicitation Within the Search for DCOP Solutions

    The Distributed Constraint Optimization Problem (DCOP) formulation is a powerful tool to model cooperative multi-agent problems, especially when the agents are sparsely constrained with one another. A key assumption in this model is that all constraints are fully specified or known a priori, which may not hold in applications where constraints encode preferences of human users. In this thesis, we extend the model to Incomplete DCOPs (I-DCOPs), where some constraints can be partially specified. User preferences for these partially specified constraints can be elicited during the execution of I-DCOP algorithms, but they incur some elicitation costs. Additionally, we propose two parameterized heuristics that can be used in conjunction with Synchronous Branch-and-Bound to solve I-DCOPs. These heuristics allow users to trade off solution quality for faster runtimes and a smaller number of elicitations. They also provide theoretical quality guarantees for problems where elicitations are free. Our model and heuristics thus extend the state of the art in distributed constraint reasoning to better model and solve distributed agent-based applications with user preferences.

    Decentralized multi-agent reinforcement learning in average-reward dynamic DCOPs

    Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multi-agent coordination problems, where a dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in other time steps, which might not hold in some applications. Therefore, in this paper, we make the following contributions: (i) We introduce a new model, called Markovian Dynamic DCOPs (MD-DCOPs), where the DCOP in the next time step is a function of the value assignments in the current time step; (ii) We introduce two distributed reinforcement learning algorithms, the Distributed RVI Q-learning algorithm and the Distributed R-learning algorithm, that balance exploration and exploitation to solve MD-DCOPs in an online manner; and (iii) We empirically evaluate them against an existing multi-armed bandit DCOP algorithm on dynamic DCOPs.
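
    For orientation, the average-reward setting mentioned in this abstract can be illustrated with a minimal sketch of the textbook single-agent, tabular R-learning update. This is not the Distributed R-learning algorithm of the paper, whose details are not given here; the function name `r_learning_step`, the dictionary-based table `q`, and the step sizes `alpha` and `beta` are assumptions chosen for illustration.

```python
def r_learning_step(q, rho, s, a, r, s_next, actions, alpha=0.1, beta=0.01):
    """One tabular R-learning update (single-agent, average-reward form).

    q       : dict mapping (state, action) -> relative value, updated in place
    rho     : current estimate of the average reward per step
    returns : the updated average-reward estimate
    All names here are illustrative assumptions, not taken from the paper.
    """
    # Greedy value of the successor state (unseen entries default to 0).
    best_next = max(q.get((s_next, b), 0.0) for b in actions)

    # Move the relative value toward the reward measured against rho.
    q_sa = q.get((s, a), 0.0)
    q[(s, a)] = q_sa + alpha * (r - rho + best_next - q_sa)

    # Adjust rho only when the action taken is greedy after the update,
    # so exploratory actions do not bias the average-reward estimate.
    best_here = max(q.get((s, b), 0.0) for b in actions)
    if q[(s, a)] == best_here:
        rho = rho + beta * (r - rho + best_next - best_here)
    return rho
```

    In this sketch the average reward `rho` takes the place of the discount factor used in standard Q-learning.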

    Distributed Gibbs: A memory-bounded sampling-based DCOP algorithm

    National Research Foundation (NRF) Singapore under International Research Centres in Singapore Funding Initiative

    Development of an Entropy-Based Swarm Algorithm for Continuous Dynamic Constrained Optimization

    Dynamic constrained optimization problems form a class of problems where the objective function or the constraints can change over time. In static optimization, finding a global optimum is considered the main goal. In dynamic optimization, the goal is not only to find an optimal solution, but also to track its trajectory as closely as possible over time. Changes in the environment must be taken into account during the optimization process, in such a way that these problems are solved online. Many real-world problems can be formulated within this framework. This thesis proposes an entropy-based bare bones particle swarm for solving dynamic constrained optimization problems. Shannon's entropy is established as a phenotypic diversity index, and the proposed algorithm uses Shannon's index of diversity to aggregate the global-best and local-best bare bones particle swarm variants. The proposed approach applies the idea of a mixture of search directions by using the index of diversity as a factor to balance the influence of the global-best and local-best search directions. High diversity promotes the search guided by the global-best solution, with a normal distribution for exploitation. Low diversity promotes the search guided by the local-best solution, with a heavy-tailed distribution for exploration. A constraint-handling strategy is also proposed, which uses a ranking method with selection based on the technique for order of preference by similarity to ideal solution to obtain the best solution within a specific population of candidate solutions. Mechanisms to detect changes in the environment and to update particles' memories are also implemented in the proposed algorithm. These strategies do not act independently. They operate in relation to each other to tackle problems such as diversity loss due to convergence and outdated memories due to changes in the environment. The combined effect of these strategies provides an algorithm with the ability to maintain a proper balance between exploration and exploitation at any stage of the search process without losing the ability to track an optimal solution that is changing over time. An empirical study was carried out to evaluate the performance of the proposed approach. Experimental results show the suitability of the algorithm in terms of effectiveness in finding good solutions for the benchmark problems investigated. Finally, an application is developed, where the proposed algorithm is applied to solve the dynamic economic dispatch problem in power systems.
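
    The role of the diversity index described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's algorithm: the histogram binning of fitness values, the normalization to [0, 1], and the rule of using the index directly as the probability of following the global-best guide are all hypothetical choices, as are the names `shannon_diversity` and `choose_guide`.

```python
import math
import random
from collections import Counter


def shannon_diversity(fitness_values, bins=10):
    """Normalized Shannon entropy (0..1) of a histogram of fitness values.

    Assumption: phenotypic diversity is measured by binning fitness values
    into `bins` equal-width intervals; the thesis may define it differently.
    """
    lo, hi = min(fitness_values), max(fitness_values)
    if hi == lo or bins < 2:
        return 0.0  # every particle has the same fitness: no diversity
    width = (hi - lo) / bins
    # The largest value is clamped into the last bin.
    counts = Counter(min(int((f - lo) / width), bins - 1) for f in fitness_values)
    n = len(fitness_values)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(bins)  # divide by the maximum possible entropy


def choose_guide(diversity, rng=random):
    """Pick a search direction: high diversity favours the global best.

    Assumption: the diversity index is used directly as a probability.
    """
    return "global-best" if rng.random() < diversity else "local-best"
```

    With this rule, a swarm that has collapsed onto a single fitness value always follows the local-best guide, which the abstract associates with heavy-tailed, exploratory sampling.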

    Distributed Gibbs: A linear-space sampling-based DCOP algorithm

    National Research Foundation (NRF) Singapore under International Research Centres in Singapore Funding Initiative

    Managing distributed situation awareness in a team of agents

    The research presented in this thesis investigates the best ways to manage Distributed Situation Awareness (DSA) for a team of agents tasked to conduct search activity with limited resources (battery life, memory use, computational power, etc.). In the first part of the thesis, an algorithm to coordinate agents (e.g., UAVs) is developed. This is based on Delaunay triangulation with the aim of supporting efficient, adaptable, scalable, and predictable search. Results from simulation and physical experiments with UAVs show good performance in terms of resource utilisation, adaptability, scalability, and predictability of the developed method in comparison with the existing fixed-pattern, pseudorandom, and hybrid methods. The second part of the thesis employs Bayesian Belief Networks (BBNs) to define and manage DSA based on the information obtained from the agents' search activity. Algorithms and methods were developed to describe how agents update the BBN to model the system’s DSA, predict plausible future states of the agents’ search area, handle uncertainties, manage agents’ beliefs (based on sensor differences), monitor agents’ interactions, and maintain an adaptable BBN for DSA management using structural learning. The evaluation uses environment situation information obtained from agents’ sensors during search activity, and the results show superior performance over well-known alternative methods in terms of situation prediction accuracy, uncertainty handling, and adaptability. Therefore, the thesis’s main contributions are (i) the development of a simple search planning algorithm that combines the strengths of fixed-pattern and pseudorandom methods with resource utilisation, scalability, adaptability, and predictability features; (ii) a formal model of DSA using BBNs that can be updated and learnt during the mission; and (iii) an investigation of the relationship between agents’ search coordination and DSA management.