Advances in practical optimal coalition structure algorithms
This thesis presents a number of algorithms for forming coalitions among cooperative agents in pragmatic domains where traditional cooperative game theory solution concepts do not apply due to bounded rationality of agents. While previous work on coalition formation in multi-agent systems research operated on relatively small numbers of agents (e.g. fewer than 30), this work explores coalition formation among 100 agents; this limit is due to limited computational resources, not the performance of our algorithms. We explore a centralized best-first search algorithm for optimal coalition structures, based on a novel idea for deciding which coalition is best to put into the coalition structure being generated. Empirical results show that the solution reaches optimality quickly and terminates quickly in pragmatic domains. We further explore optimal coalition structures with distributed algorithms in linear and non-linear domains. For the linear domains, we explore linear production and integer programming. For the non-linear domains, we explore logistics providers. Based on existing algorithms, we explore a novel environment of forming coalitions in supply networks involving buyer, seller and logistics provider agents. In this setting, buyers form coalitions to increase their negotiation power while sellers and logistics providers form coalitions to aggregate their supply power and optimize their resource usage.
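To make the notion of an optimal coalition structure concrete, here is a minimal sketch of the classic dynamic programme over agent subsets. It is not the best-first search described in the abstract; the function name `optimal_coalition_structure` and the toy characteristic function `toy_value` are assumptions made purely for illustration.

```python
def optimal_coalition_structure(agents, value):
    """Return (best total value, list of coalitions) by dynamic programming.

    `value` maps a frozenset of agents to a number (the characteristic
    function). Runtime grows as O(3^n), so this only suits small n.
    """
    n = len(agents)
    full = (1 << n) - 1
    best = [0.0] * (full + 1)    # best[mask]: best value for the agents in mask
    choice = [0] * (full + 1)    # choice[mask]: one coalition achieving best[mask]

    def members(mask):
        return frozenset(agents[i] for i in range(n) if mask >> i & 1)

    for mask in range(1, full + 1):
        low = mask & -mask       # fix the lowest agent to avoid counting splits twice
        best[mask] = float("-inf")
        sub = mask
        while sub:               # enumerate every non-empty subset of mask
            if sub & low:
                candidate = value(members(sub)) + best[mask ^ sub]
                if candidate > best[mask]:
                    best[mask] = candidate
                    choice[mask] = sub
            sub = (sub - 1) & mask

    structure, mask = [], full
    while mask:                  # walk the recorded choices back to a partition
        structure.append(members(choice[mask]))
        mask ^= choice[mask]
    return best[full], structure


# Hypothetical characteristic function for three agents.
TOY_VALUES = {frozenset("ab"): 5.0, frozenset("abc"): 5.5}

def toy_value(coalition):
    # Singletons are worth 1.0 and unlisted pairs 1.5 (illustrative numbers).
    return TOY_VALUES.get(coalition, 1.0 if len(coalition) == 1 else 1.5)

total, structure = optimal_coalition_structure(["a", "b", "c"], toy_value)
```

In this toy instance the grand coalition is worth 5.5, but pairing `a` with `b` and leaving `c` alone yields more, which is why searching over structures rather than assuming the grand coalition matters.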
A Game-Theoretic Approach to Strategic Resource Allocation Mechanisms in Edge and Fog Computing
With the rapid growth of Internet of Things (IoT), cloud-centric application management raises
questions related to quality of service for real-time applications. Fog and edge computing
(FEC) provide a complement to the cloud by filling the gap between cloud and IoT. Resource
management on multiple resources from distributed and administrative FEC nodes is a key
challenge to ensure the quality of end-user’s experience. To improve resource utilisation and
system performance, researchers have proposed many fair allocation mechanisms for
resource management. Dominant Resource Fairness (DRF), a resource allocation policy for
multiple resource types, meets most of the required fair allocation characteristics. However,
DRF is suitable for centralised resource allocation without considering the effects (or
feedbacks) of large-scale distributed environments like multi-controller software defined
networking (SDN). Nash bargaining from micro-economic theory or competitive equilibrium
equal incomes (CEEI) are well suited to solving dynamic optimisation problems proposing to
‘proportionately’ share resources among distributed participants. Although CEEI’s
decentralised policy guarantees load balancing for performance isolation, it is not fault-proof
for computation offloading.
The thesis aims to propose a hybrid and fair allocation mechanism for rejuvenation of
decentralised SDN controller deployment. We apply multi-agent reinforcement learning
(MARL) with robustness against adversarial controllers to enable efficient priority scheduling
for FEC. Motivated by software cybernetics and homeostasis, weighted DRF is generalised by
applying the principles of feedback (positive and/or negative network effects) in reverse game
theory (GT) to design hybrid scheduling schemes for joint multi-resource and multitask
offloading/forwarding in FEC environments.
In the first study, monotonic scheduling for joint offloading at the federated edge is
addressed by proposing a truthful algorithmic mechanism to neutralise harmful negative and
positive distributive bargain externalities respectively. The IP-DRF scheme is a MARL
approach applying partition form game (PFG) to guarantee second-best Pareto optimality
(SBPO) in allocation of multi-resources from deterministic policy in both population and
resource non-monotonicity settings. In the second study, we propose the DFog-DRF scheme to
address truthful fog scheduling with bottleneck fairness in fault-probable wireless hierarchical
networks by applying constrained coalition formation (CCF) games to implement MARL. The
multi-objective optimisation problem for fog throughput maximisation is solved via a
constraint dimensionality reduction methodology using fairness constraints for efficient
gateway and low-level controller placement.
For evaluation, we develop an agent-based framework to implement fair allocation policies in
distributed data centre environments. In empirical results, the deterministic policy of the IP-DRF
scheme provides SBPO and reduces the average execution and turnaround time by 19% and
11.52% as compared to the Nash bargaining or CEEI deterministic policy for 57,445 cloudlets
in population non-monotonic settings. The processing cost of tasks shows significant
improvement (6.89% and 9.03% for fixed and variable pricing) for the resource non-monotonic
setting, using 38,000 cloudlets. The DFog-DRF scheme, when benchmarked against the asset fair
(MIP) policy, shows superior performance (less than 1% in time complexity) for up to 30 FEC
nodes. Furthermore, empirical results using 210 mobiles and 420 applications prove the
efficacy of our hybrid scheduling scheme for hierarchical clustering considering latency and
network usage for throughput maximisation.
Abubakar Tafawa Balewa University, Bauchi (Tetfund, Nigeria)
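For readers unfamiliar with the baseline policy named in this abstract, the following is a minimal sketch of plain, unweighted Dominant Resource Fairness by progressive filling. It is not the IP-DRF or DFog-DRF scheme of the thesis. The function name `drf_allocate` is an assumption, every user is assumed to have a strictly positive demand, and the sketch simply stops once the next task of the user with the lowest dominant share no longer fits.

```python
from fractions import Fraction


def drf_allocate(capacity, demands):
    """Allocate whole tasks by unweighted DRF (progressive filling).

    capacity: total amount of each resource, e.g. [cpus, memory_gb]
    demands:  {user: per-task demand vector}, same resource order as capacity
    Returns {user: number of tasks granted}.
    """
    used = [0] * len(capacity)
    tasks = {user: 0 for user in demands}

    def dominant_share(user):
        # Largest fraction of any single resource currently held by this user.
        return max(Fraction(tasks[user] * d, c)
                   for d, c in zip(demands[user], capacity))

    while True:
        # Serve the user with the smallest dominant share (ties broken by name).
        user = min(sorted(demands), key=dominant_share)
        need = demands[user]
        if any(u + d > c for u, d, c in zip(used, need, capacity)):
            return tasks  # the next task does not fit: stop filling
        used = [u + d for u, d in zip(used, need)]
        tasks[user] += 1


# 9 CPUs and 18 GB shared by a memory-heavy user A and a CPU-heavy user B.
allocation = drf_allocate([9, 18], {"A": (1, 4), "B": (3, 1)})
```

Here user A ends up with three tasks and user B with two, so both hold two thirds of their respective dominant resource.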
Algorithms for Modular Self-reconfigurable Robots: Decision Making, Planning, and Learning
Modular self-reconfigurable robots (MSRs) are composed of multiple robotic modules which can change their connections with each other to take different shapes, commonly known as configurations. Forming different configurations helps the MSR to accomplish different types of tasks in different environments. In this dissertation, we study three different problems in MSRs: partitioning of modules, configuration formation planning and locomotion learning, and we propose algorithmic solutions to solve these problems.
Partitioning of modules is a decision-making problem for MSRs where each module decides which partition or team of modules it should be in. Finding the best set of partitions is an NP-complete problem. We propose both centralized and distributed game-theoretic solutions to this problem. Once the modules know which set of modules they should team up with, they self-aggregate to form a configuration of a specific shape; this is known as the configuration formation planning problem. Modules can be either singletons or connected in smaller configurations from which they need to form the target configuration. The configuration formation problem is difficult because multiple modules may select the same location in the target configuration to move to, which might result in occlusion and consequently failure of the configuration formation process. On the other hand, if the modules are already in connected configurations at the beginning, then it is beneficial to preserve those initial configurations when placing them into the target configuration, as disconnections and re-connections are costly operations. We propose solutions based on an auction-like algorithm and a (sub)graph-isomorphism technique to solve the configuration formation problem.
Once the configuration is built, the MSR needs to move towards its goal location as a whole configuration to complete its task. If the configuration’s shape and size are not known a priori, then planning its locomotion is a difficult task, as it needs to learn the locomotion pattern in dynamic time; this problem is known as adaptive locomotion learning. We have proposed reinforcement-learning-based fault-tolerant solutions for locomotion learning by MSRs.
Gaining Insight into Determinants of Physical Activity using Bayesian Network Learning
Contains fulltext: 228326pre.pdf (preprint version) (Open Access)
Contains fulltext: 228326pub.pdf (publisher's version) (Open Access)
BNAIC/BeneLearn 202
A Practical Guide to Multi-Objective Reinforcement Learning and Planning
Real-world decision-making tasks are generally complex, requiring trade-offs
between multiple, often conflicting, objectives. Despite this, the majority of
research in reinforcement learning and decision-theoretic planning either
assumes only a single objective, or that multiple objectives can be adequately
handled via a simple linear combination. Such approaches may oversimplify the
underlying problem and hence produce suboptimal results. This paper serves as a
guide to the application of multi-objective methods to difficult problems, and
is aimed at researchers who are already familiar with single-objective
reinforcement learning and planning methods who wish to adopt a multi-objective
perspective on their research, as well as practitioners who encounter
multi-objective decision problems in practice. It identifies the factors that
may influence the nature of the desired solution, and illustrates by example
how these influence the design of multi-objective decision-making systems for
complex problems.
A practical guide to multi-objective reinforcement learning and planning
Real-world sequential decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective methods to difficult problems, and is aimed at researchers who are already familiar with single-objective reinforcement learning and planning methods who wish to adopt a multi-objective perspective on their research, as well as practitioners who encounter multi-objective decision problems in practice. It identifies the factors that may influence the nature of the desired solution, and illustrates by example how these influence the design of multi-objective decision-making systems for complex problems. © 2022, The Author(s)
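One way to see why a simple linear combination can oversimplify a multi-objective problem is the small sketch below. The candidate return vectors and the helper names (`dominates`, `pareto_front`, `best_by_linear_weights`) are illustrative assumptions, not taken from the paper. The balanced policy `(0.45, 0.45)` is Pareto-optimal, yet no choice of linear weights ever selects it.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def best_by_linear_weights(points, weights):
    """Pick the point that maximises the weighted sum of its objectives."""
    return max(points, key=lambda p: sum(w * x for w, x in zip(weights, p)))


# Hypothetical expected returns of four policies on two objectives.
returns = [(1.0, 0.0), (0.0, 1.0), (0.45, 0.45), (0.2, 0.2)]

front = pareto_front(returns)

# Sweep the weight on the first objective from 0 to 1 in steps of 0.01.
chosen = {best_by_linear_weights(returns, (w / 100, 1 - w / 100))
          for w in range(101)}
```

The balanced policy sits in a concave region of the front, which is exactly the kind of trade-off a weighted sum cannot reach.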
Application of Techniques for MAP Estimation to Distributed Constraint Optimization Problem
The problem of efficiently finding near-optimal decisions in multi-agent systems has become increasingly important because of the growing number of multi-agent applications with large numbers of agents operating in real-world environments. In these systems, agents are often subject to tight resource constraints and have only local views. When agents have non-global constraints, each of which is independent, the problem can be formalized as a distributed constraint optimization problem (DCOP). The DCOP is closely associated with the problem of inference on graphical models, and many approaches from the inference literature have been adopted to solve DCOPs. We focus on the Max-Sum algorithm and the Action-GDL algorithm, which are DCOP variants of the popular inference algorithms Max-Product and Belief Propagation, respectively. The Max-Sum and Action-GDL algorithms are well suited to multi-agent systems because they are distributed by nature and require less communication than most DCOP algorithms. However, the resource requirements of these algorithms are still high for some multi-agent domains, and various aspects of the algorithms have not been well studied for use in general multi-agent settings.
This thesis is concerned with a variety of issues in applying the Max-Sum algorithm and the Action-GDL algorithm to general multi-agent settings. First, we develop a hybrid algorithm of ADOPT and Action-GDL in order to overcome the communication complexity of DCOPs. Secondly, we extend the Max-Sum algorithm to operate more efficiently in more general multi-agent settings in which computational complexity is high. We provide an algorithm that has a lower expected computational complexity for DCOPs even with n-ary constraints. Finally, in most DCOP literature, a one-to-one mapping between a variable and an agent is assumed. However, in real applications, many-to-one mappings are prevalent and can also be beneficial in terms of communication and hardware cost in situations where agents are acting as independent computing units. We consider how to exploit such mappings in order to increase efficiency.
Monte Carlo Tree Search Applied to a Modified Pursuit/Evasion Scotland Yard Game with Rendezvous Spaceflight Operation Applications
This thesis takes the Scotland Yard board game and modifies its rules to mimic important aspects of space in order to facilitate the creation of artificial intelligence for space asset pursuit/evasion scenarios. Space has become a physical warfighting domain. To combat threats, an understanding of the tactics, techniques, and procedures must be captured and studied. Games and simulations are effective tools to capture data lacking historical context. Artificial intelligence and machine learning models can use simulations to develop proper defensive and offensive tactics, techniques, and procedures capable of protecting systems against potential threats. Monte Carlo Tree Search is a bandit-based reinforcement learning model known for using limited domain knowledge to produce favorable results. Monte Carlo agents have been used in a multitude of imperfect domain knowledge games. One such game, in which Monte Carlo agents were produced and studied for pursuit-evasion tactics, is Scotland Yard. This thesis builds on the Monte Carlo agents previously produced by Mark Winands and Pim Nijssen for Scotland Yard. In the research presented here, the rules for Scotland Yard are analyzed and presented in an expansion that partially accounts for spaceflight dynamics in order to study the agents within a simplified model, while having some foundation for use within space environments. Results show promise for the use of Monte Carlo agents in pursuit/evasion autonomous space scenarios while also illuminating some major challenges for future work in more realistic three-dimensional space environments.
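The "bandit-based" aspect of Monte Carlo Tree Search can be illustrated with the UCB1 rule that is commonly used to choose which child node to explore next. This is a minimal sketch under that assumption; the abstract does not state which selection rule the thesis uses, and the function names and move statistics below are hypothetical.

```python
import math


def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: average reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited moves are always tried first
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(children, c=math.sqrt(2)):
    """Pick the move with the highest UCB1 score.

    children: {move: (total_reward, visits)} gathered from earlier playouts.
    """
    parent_visits = sum(visits for _, visits in children.values())
    return max(sorted(children),
               key=lambda move: ucb1(*children[move], parent_visits, c))


# Hypothetical statistics for three transport options from one board position.
stats = {"taxi": (6.0, 10), "bus": (3.0, 4), "underground": (0.0, 0)}
first_pick = select_child(stats)

# Once every move has been visited, the bonus trades off against the average.
visited = {"taxi": (6.0, 10), "bus": (3.0, 4)}
greedy_pick = select_child(visited, c=0.0)
balanced_pick = select_child(visited)
```

Setting `c` to zero makes the rule purely greedy, while larger values favour moves that have been sampled less often.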