2 research outputs found

Resource Abstraction for Reinforcement Learning in Multiagent Congestion Problems

Authors: Devlin Sam, Kudenko Daniel, Malialis Kleanthis
Publication date: 13/03/2019

Real-world congestion problems (e.g. traffic congestion) are typically very complex and large-scale. Multiagent reinforcement learning (MARL) is a promising candidate for dealing with this emerging complexity by providing an autonomous and distributed solution to these problems. However, three limiting factors affect the deployability of MARL approaches to congestion problems: learning time, scalability, and decentralised coordination, i.e. no communication between the learning agents. In this paper we introduce Resource Abstraction, an approach that addresses these challenges by allocating the available resources into abstract groups. This abstraction creates new reward functions that provide a more informative signal to the learning agents and aid coordination amongst them. Experimental work is conducted on two benchmark domains from the literature: an abstract congestion problem and a realistic traffic congestion problem. The current state of the art for solving multiagent congestion problems is a form of reward shaping called difference rewards. We show that the system using Resource Abstraction significantly improves learning speed and scalability, and achieves the highest possible or near-highest joint performance/social welfare for both congestion problems in large-scale scenarios involving up to 1000 reinforcement learning agents.

Comment: Keywords: congestion problems, resource management, multiagent reinforcement learning, multiagent systems, multiagent learning, resource abstraction. In Proceedings of the 2016 International Conference on Autonomous Agents and Multiagent Systems (AAMAS '16)

arXiv.org e-Print Archive
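
The abstract above names difference rewards as the state-of-the-art baseline. As a rough illustration only, the sketch below computes a difference reward in the standard sense, D_i = G(z) - G(z without agent i), for a toy congestion problem. The utility function load * exp(-load / capacity), the function names, and the example numbers are assumptions made for illustration; they are not taken from the paper, and this is not the paper's Resource Abstraction method.

```python
import math
from collections import Counter


def local_utility(load, capacity):
    # Assumed per-resource utility: grows with load, then decays once
    # the resource becomes congested (hypothetical choice for this sketch).
    return load * math.exp(-load / capacity)


def global_utility(choices, capacities):
    # G(z): sum of local utilities over all resources.
    # choices[i] is the index of the resource picked by agent i.
    loads = Counter(choices)
    return sum(local_utility(loads[r], cap) for r, cap in enumerate(capacities))


def difference_reward(agent, choices, capacities):
    # D_i = G(z) - G(z_{-i}): the global utility minus the global utility
    # of the same joint action with agent i removed from the system.
    without_agent = choices[:agent] + choices[agent + 1:]
    return global_utility(choices, capacities) - global_utility(without_agent, capacities)


# Toy scenario (assumed): three agents crowd resource 0, one agent uses resource 1.
choices = [0, 0, 0, 1]
capacities = [2.0, 2.0]
rewards = [difference_reward(i, choices, capacities) for i in range(len(choices))]
```

In this toy scenario each agent on the crowded resource receives a negative difference reward, because removing it would raise the global utility, while the lone agent on resource 1 receives a positive one. This is the kind of per-agent signal that reward shaping aims to provide.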

Adversarial Deep Reinforcement Learning based Adaptive Moving Target Defense

Authors: Eghtesad Taha, Laszka Aron, Vorobeychik Yevgeniy
Publication date: 20/08/2020

Moving target defense (MTD) is a proactive defense approach that aims to thwart attacks by continuously changing the attack surface of a system (e.g., changing host or network configurations), thereby increasing the adversary's uncertainty and attack cost. To maximize the impact of MTD, a defender must strategically choose when and what changes to make, taking into account both the characteristics of its system as well as the adversary's observed activities. Finding an optimal strategy for MTD presents a significant challenge, especially when facing a resourceful and determined adversary who may respond to the defender's actions. In this paper, we propose a multi-agent partially-observable Markov Decision Process model of MTD and formulate a two-player general-sum game between the adversary and the defender. Based on an established model of adaptive MTD, we propose a multi-agent reinforcement learning framework based on the double oracle algorithm to solve the game. In the experiments, we show the effectiveness of our framework in finding optimal policies

arXiv.org e-Print Archive
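
The abstract above builds on the double oracle algorithm. As a hedged illustration, the sketch below runs the classical double oracle loop on a small zero-sum matrix game. This is a deliberate simplification: the paper describes a general-sum game with reinforcement learning agents, whereas here the best-response "oracles" are exact lookups over a payoff matrix, and the restricted game is solved approximately by fictitious play rather than by an exact solver. The payoff matrix and all function names are assumptions made for illustration.

```python
def solve_restricted(payoff, rows, cols, iterations=2000):
    # Approximate equilibrium of the restricted game via fictitious play:
    # each player repeatedly best-responds to the opponent's empirical play.
    row_counts = {r: 0 for r in rows}
    col_counts = {c: 0 for c in cols}
    row_counts[rows[0]] = 1
    col_counts[cols[0]] = 1
    for _ in range(iterations):
        r = max(rows, key=lambda i: sum(col_counts[c] * payoff[i][c] for c in cols))
        c = min(cols, key=lambda j: sum(row_counts[i] * payoff[i][j] for i in rows))
        row_counts[r] += 1
        col_counts[c] += 1
    total = iterations + 1
    row_mix = {r: n / total for r, n in row_counts.items()}
    col_mix = {c: n / total for c, n in col_counts.items()}
    return row_mix, col_mix


def best_row(payoff, col_mix):
    # Row player (maximiser): best pure response over the FULL strategy set.
    values = [sum(q * row[c] for c, q in col_mix.items()) for row in payoff]
    return max(range(len(values)), key=values.__getitem__)


def best_col(payoff, row_mix):
    # Column player (minimiser): best pure response over the FULL strategy set.
    n_cols = len(payoff[0])
    values = [sum(p * payoff[r][c] for r, p in row_mix.items()) for c in range(n_cols)]
    return min(range(n_cols), key=values.__getitem__)


def double_oracle(payoff):
    rows, cols = [0], [0]  # start from one arbitrary strategy per player
    while True:
        row_mix, col_mix = solve_restricted(payoff, rows, cols)
        new_row = best_row(payoff, col_mix)
        new_col = best_col(payoff, row_mix)
        if new_row in rows and new_col in cols:
            # Neither oracle found a strategy outside the restricted sets: stop.
            value = sum(p * q * payoff[r][c]
                        for r, p in row_mix.items() for c, q in col_mix.items())
            return rows, cols, value
        if new_row not in rows:
            rows.append(new_row)
        if new_col not in cols:
            cols.append(new_col)


# Hypothetical payoff matrix (row player's gain); saddle point at row 2, column 1.
PAYOFF = [[3, 1, 4],
          [2, 0, 1],
          [5, 2, 6]]
rows, cols, value = double_oracle(PAYOFF)
```

For this matrix the loop stops with rows [0, 2] and columns [0, 1], without ever adding row 1 or column 2, and the estimated value is close to 2. The point of the method is that only strategies some oracle proposes enter the restricted game, which matters when the full strategy space is too large to enumerate.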