Learning to Schedule Communication in Multi-agent Reinforcement Learning
Many real-world reinforcement learning tasks require multiple interacting
agents to make sequential decisions, and well-coordinated actions among the
agents are crucial for achieving the target goal in these tasks. One way to
strengthen the coordination effect is to enable multiple
agents to communicate with each other in a distributed manner and behave as a
group. In this paper, we study a practical scenario when (i) the communication
bandwidth is limited and (ii) the agents share the communication medium so that
only a restricted number of agents are able to simultaneously use the medium,
as in the state-of-the-art wireless networking standards. This calls for a
certain form of communication scheduling. In that regard, we propose a
multi-agent deep reinforcement learning framework, called SchedNet, in which
agents learn how to schedule themselves, how to encode the messages, and how to
select actions based on received messages. SchedNet is capable of deciding
which agents should be entitled to broadcast their (encoded) messages, by
learning the importance of each agent's partially observed information. We
evaluate SchedNet against multiple baselines under two different applications,
namely, cooperative communication and navigation, and predator-prey. Our
experiments show a non-negligible performance gap between SchedNet and other
mechanisms such as the ones without communication and with vanilla scheduling
methods, e.g., round robin, ranging from 32% to 43%.
Comment: Accepted in ICLR 201
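The weight-based scheduling idea can be pictured as granting the medium to the top-k agents by learned importance weight. The sketch below is a minimal illustration of that selection step only, with a hypothetical `schedule_topk` helper, not SchedNet's actual learned architecture:

```python
import numpy as np

def schedule_topk(weights, k):
    """Grant the k agents with the largest learned importance
    weights the right to broadcast on the shared medium."""
    order = np.argsort(np.asarray(weights))[::-1]  # descending by weight
    return set(int(i) for i in order[:k])

# Hypothetical importance weights for 4 agents; bandwidth admits k = 2.
weights = [0.1, 0.7, 0.3, 0.9]
print(schedule_topk(weights, 2))  # agents 3 and 1 broadcast
```

In SchedNet the weights themselves are produced by per-agent weight generators trained end-to-end; the top-k selection shown here stands in for that scheduling decision.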
A Multi-Agent Deep Reinforcement Learning based Spectrum Allocation Framework for D2D Communications
Device-to-device (D2D) communication has been recognized as a promising
technique to improve spectrum efficiency. However, D2D transmission as an
underlay causes severe interference, which imposes a technical challenge to
spectrum allocation. Existing centralized schemes require global information,
which can cause serious signaling overhead, while existing distributed
solutions require frequent information exchange between users and cannot
achieve global optimization. In this paper, a distributed spectrum allocation
framework based
on multi-agent deep reinforcement learning is proposed, named Neighbor-Agent
Actor Critic (NAAC). NAAC uses neighbor users' historical information for
centralized training but is executed in a distributed manner without that
information, so it requires no signaling during execution while still
exploiting cooperation between users to further optimize system performance. The
simulation results show that the proposed framework can effectively reduce the
outage probability of cellular links, improve the sum rate of D2D links and
have good convergence.
Comment: Accepted to Globecom 201
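The centralized-training, decentralized-execution split described above can be sketched as follows. All class and variable names here are hypothetical, and the linear "networks" stand in for the actual neural networks; the point is only which inputs each component sees:

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, NBR_DIM, N_CHANNELS = 4, 6, 3  # hypothetical sizes

class NAACAgentSketch:
    """Centralized training, decentralized execution: the critic may use
    neighbor users' historical information, but the actor used at
    execution time sees only the local observation."""
    def __init__(self):
        self.actor_w = rng.normal(size=(OBS_DIM, N_CHANNELS))
        self.critic_w = rng.normal(size=(OBS_DIM + NBR_DIM,))

    def act(self, obs):
        # Execution: no information exchange with neighbors is needed.
        return int(np.argmax(obs @ self.actor_w))

    def value(self, obs, neighbor_hist):
        # Training only: neighbor history enters the centralized critic.
        return float(np.concatenate([obs, neighbor_hist]) @ self.critic_w)

agent = NAACAgentSketch()
channel = agent.act(np.zeros(OBS_DIM))          # decentralized decision
v = agent.value(np.zeros(OBS_DIM), np.ones(NBR_DIM))  # centralized estimate
```

Because the actor never consumes neighbor information, dropping the critic after training leaves a policy that runs with zero signaling overhead, which is the property the abstract emphasizes.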
A Survey and Critique of Multiagent Deep Reinforcement Learning
Deep reinforcement learning (RL) has achieved outstanding results in recent
years. This has led to a dramatic increase in the number of applications and
methods. Recent works have explored learning beyond single-agent scenarios and
have considered multiagent learning (MAL) scenarios. Initial results report
successes in complex multiagent domains, although there are several challenges
to be addressed. The primary goal of this article is to provide a clear
overview of current multiagent deep reinforcement learning (MDRL) literature.
Additionally, we complement the overview with a broader analysis: (i) we
revisit previous key components, originally presented in MAL and RL, and
highlight how they have been adapted to multiagent deep reinforcement learning
settings. (ii) We provide general guidelines to new practitioners in the area:
describing lessons learned from MDRL works, pointing to recent benchmarks, and
outlining open avenues of research. (iii) We take a more critical tone, raising
practical challenges of MDRL (e.g., implementation and computational demands).
We expect this article will help unify and motivate future research to take
advantage of the abundant literature that exists (e.g., RL and MAL) in a joint
effort to promote fruitful research in the multiagent community.
Comment: Under review since Oct 2018. Earlier versions of this work had the
title: "Is multiagent deep reinforcement learning the answer or the question?
A brief survey"
Applications of Deep Reinforcement Learning in Communications and Networking: A Survey
This paper presents a comprehensive literature review on applications of deep
reinforcement learning in communications and networking. Modern networks, e.g.,
Internet of Things (IoT) and Unmanned Aerial Vehicle (UAV) networks, are becoming
more decentralized and autonomous. In such networks, network entities need to
make decisions locally to maximize the network performance under uncertainty of
the network environment. Reinforcement learning has been effectively used to enable
the network entities to obtain the optimal policy including, e.g., decisions or
actions, given their states when the state and action spaces are small.
However, in complex and large-scale networks, the state and action spaces are
usually large, and reinforcement learning may not be able to find the
optimal policy in a reasonable time. Therefore, deep reinforcement learning, a
combination of reinforcement learning with deep learning, has been developed to
overcome these shortcomings. In this survey, we first give a tutorial on deep
reinforcement learning from fundamental concepts to advanced models. Then, we
review deep reinforcement learning approaches proposed to address emerging
issues in communications and networking. The issues include dynamic network
access, data rate control, wireless caching, data offloading, network security,
and connectivity preservation, which are all important to next-generation
networks such as 5G and beyond. Furthermore, we present applications of deep
reinforcement learning for traffic routing, resource sharing, and data
collection. Finally, we highlight important challenges, open issues, and future
research directions of applying deep reinforcement learning.
Comment: 37 pages, 13 figures, 6 tables, 174 reference papers
Application of Machine Learning in Wireless Networks: Key Techniques and Open Issues
As a key technique for enabling artificial intelligence, machine learning
(ML) is capable of solving complex problems without explicit programming.
Motivated by its successful applications to many practical tasks like image
recognition, both industry and the research community have advocated the
applications of ML in wireless communication. This paper comprehensively
surveys the recent advances of the applications of ML in wireless
communication, which are classified as: resource management in the MAC layer,
networking and mobility management in the network layer, and localization in
the application layer. The applications in resource management further include
power control, spectrum management, backhaul management, cache management,
beamformer design and computation resource management, while ML based
networking focuses on the applications in clustering, base station switching
control, user association and routing. Moreover, the literature in each aspect is
organized according to the adopted ML techniques. In addition, several
conditions for applying ML to wireless communication are identified to help
readers decide whether to use ML and which kind of ML techniques to use, and
traditional approaches are also summarized together with their performance
comparison with ML-based approaches, based on which the motivations of the
surveyed works to adopt ML are clarified. Given the extensiveness of the research
area, challenges and unresolved issues are presented to facilitate future
studies, where ML based network slicing, infrastructure update to support ML
based paradigms, open data sets and platforms for researchers, theoretical
guidance for ML implementation and so on are discussed.
Comment: 34 pages, 8 figures
Multi-Agent Deep Reinforcement Learning for Large-scale Traffic Signal Control
Reinforcement learning (RL) is a promising data-driven approach for adaptive
traffic signal control (ATSC) in complex urban traffic networks, and deep
neural networks further enhance its learning power. However, centralized RL is
infeasible for large-scale ATSC due to the extremely high dimension of the
joint action space. Multi-agent RL (MARL) overcomes the scalability issue by
distributing the global control to each local RL agent, but it introduces new
challenges: now the environment becomes partially observable from the viewpoint
of each local agent due to limited communication among agents. Most existing
studies in MARL focus on designing efficient communication and coordination
among traditional Q-learning agents. This paper presents, for the first time, a
fully scalable and decentralized MARL algorithm for the state-of-the-art deep
RL agent: advantage actor critic (A2C), within the context of ATSC. In
particular, two methods are proposed to stabilize the learning procedure, by
improving the observability and reducing the learning difficulty of each local
agent. The proposed multi-agent A2C is compared against independent A2C and
independent Q-learning algorithms, in both a large synthetic traffic grid and a
large real-world traffic network of Monaco city, under simulated peak-hour
traffic dynamics. Results demonstrate its optimality, robustness, and sample
efficiency over other state-of-the-art decentralized MARL algorithms.
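One common way to improve each local agent's observability and ease its learning in settings like this is to let the agent optimize its own reward plus a spatially discounted sum of its immediate neighbors' rewards. The sketch below illustrates that general idea under stated assumptions (the function name, the line-graph example, and the discount value are all hypothetical, not necessarily the paper's exact scheme):

```python
def spatially_discounted_reward(rewards, neighbors, agent, alpha=0.9):
    """Combine an agent's own reward with discounted rewards of its
    immediate neighbors in the traffic network (illustrative sketch)."""
    return rewards[agent] + alpha * sum(rewards[j] for j in neighbors[agent])

# A 3-intersection line graph: 0 - 1 - 2.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
rewards = [1.0, 2.0, 3.0]
r1 = spatially_discounted_reward(rewards, neighbors, 1, alpha=0.5)
# r1 = 2.0 + 0.5 * (1.0 + 3.0) = 4.0
```

Weighting neighbor rewards below the agent's own keeps the objective mostly local (helping scalability) while still propagating a coordination signal through the road network.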
A Review of Reinforcement Learning for Autonomous Building Energy Management
The area of building energy management has received a significant amount of
interest in recent years. This area is concerned with combining advancements in
sensor technologies, communications and advanced control algorithms to optimize
energy utilization. Reinforcement learning is one of the most prominent machine
learning algorithms used for control problems and has had many successful
applications in the area of building energy management. This research gives a
comprehensive review of the literature relating to the application of
reinforcement learning to developing autonomous building energy management
systems. The main direction for future research and challenges in reinforcement
learning are also outlined.
Comment: 17 pages, 3 figures
Multi-Agent Actor-Critic with Generative Cooperative Policy Network
We propose an efficient multi-agent reinforcement learning approach to derive
equilibrium strategies for multiple agents participating in a Markov game.
Mainly, we are focused on obtaining decentralized policies for agents to
maximize the performance of a collaborative task by all the agents, which is
similar to solving a decentralized Markov decision process. We propose to use
two different policy networks: (1) a decentralized greedy policy network used
to generate greedy actions during both training and execution, and (2) a
generative cooperative policy network (GCPN) used to generate action samples
that help other agents improve their objectives during training. We show that the
samples generated by GCPN enable other agents to explore the policy space more
effectively, helping them reach a better policy for the collaborative tasks.
Comment: 10 pages, 9 figures in total, including all sub-figures
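The interplay of the two networks during training can be sketched as below. The GCPN samplers are stubbed with fixed-action lambdas purely for illustration; the function name and stubs are hypothetical, not the paper's implementation:

```python
def joint_training_action(greedy_actions, gcpn_samplers, agent_idx):
    """During training, agent `agent_idx` keeps its greedy action while
    the other agents' actions are drawn from their generative
    cooperative policy networks (stubbed here as simple samplers)."""
    joint = list(greedy_actions)
    for j, sample in enumerate(gcpn_samplers):
        if j != agent_idx:
            joint[j] = sample()  # exploratory, cooperation-oriented sample
    return joint

# Three agents; GCPN samplers stubbed with fixed actions for illustration.
samplers = [lambda: "left", lambda: "stay", lambda: "right"]
print(joint_training_action(["up", "up", "up"], samplers, 1))
# ['left', 'up', 'right']
```

Sampling the other agents' actions from the GCPN, rather than from their greedy policies, is what lets each agent see a wider and more cooperative slice of the joint policy space during training.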
Vehicular Edge Computing via Deep Reinforcement Learning
Smart vehicles construct the Vehicle of Internet, which can execute various
intelligent services. Although the computation capability of a vehicle is
limited, multiple types of edge computing nodes provide heterogeneous resources
for vehicular services. When offloading a complicated service to a vehicular
edge computing node, the decision should consider numerous factors. Existing
offloading decision work mostly formulates the decision as a resource
scheduling problem with single or multiple objective functions and some
constraints, and explores customized heuristic algorithms. However, offloading multiple data
dependency tasks in a service is a difficult decision, as an optimal solution
must understand the resource requirement, the access network, the user
mobility, and importantly the data dependency. Inspired by recent advances in
machine learning, we propose a knowledge driven (KD) service offloading
decision framework for Vehicle of Internet, which provides the optimal policy
directly from the environment. We formulate the offloading decision of the
multiple tasks in a service as a long-term planning problem, and leverage
recent deep reinforcement learning to obtain the optimal solution. Using the
learned offloading knowledge, the framework considers the future data
dependency of subsequent tasks when making a decision for the current
task. Moreover, the framework
supports pre-training at a powerful edge computing node and continual online
learning while the vehicular service is executed, so that it can adapt to
environment changes and learn policies that are sensible in hindsight. The
simulation results show that the KD service offloading decision converges
quickly, adapts to different conditions, and outperforms the greedy offloading
decision algorithm.
Comment: Preliminary report of ongoing work
Experience Augmentation: Boosting and Accelerating Off-Policy Multi-Agent Reinforcement Learning
Exploration of the high-dimensional state-action space is one of the biggest
challenges in Reinforcement Learning (RL), especially in the multi-agent domain. We
present a novel technique called Experience Augmentation, which enables a
time-efficient and boosted learning based on fast, fair and thorough
exploration of the environment. It can be combined with arbitrary off-policy
MARL algorithms and is applicable to either homogeneous or heterogeneous
environments. We demonstrate our approach by combining it with MADDPG and
verifying the performance in two homogeneous environments and one heterogeneous
environment. In the best-performing scenario, MADDPG with experience
augmentation reaches the convergence reward of vanilla MADDPG in 1/4 of the
wall-clock time, and its converged performance beats the original model by a
significant margin. Our
ablation studies show that experience augmentation is a crucial ingredient
which accelerates the training process and boosts the convergence.
Comment: 10 pages, 4 figures, 4 tables
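One concrete way to augment multi-agent experience, shown here as an illustrative scheme rather than the paper's exact method, exploits agent symmetry: when agents are homogeneous, permuting agent indices within a stored transition yields equally valid transitions for the replay buffer:

```python
from itertools import permutations

def augment_transition(obs, acts, rews):
    """For homogeneous agents, permuting agent indices in a stored
    (observations, actions, rewards) transition yields equally valid
    transitions, multiplying the replay data (illustrative sketch;
    the paper's exact augmentation may differ)."""
    n = len(obs)
    return [([obs[i] for i in p], [acts[i] for i in p], [rews[i] for i in p])
            for p in permutations(range(n))]

# Two homogeneous agents -> 2! = 2 transitions from one environment sample.
augmented = augment_transition(["o1", "o2"], ["a1", "a2"], [1.0, 2.0])
```

Each environment step then contributes n! transitions instead of one, which is the kind of faster, more thorough coverage of the joint experience space that the abstract attributes to experience augmentation.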