13,269 research outputs found
Mobile Edge Computation Offloading Using Game Theory and Reinforcement Learning
Due to the ever-increasing popularity of resource-hungry and
delay-constrained mobile applications, the computation and storage capabilities
of the remote cloud have partially migrated towards the mobile edge, giving rise to
the concept known as Mobile Edge Computing (MEC). While MEC servers enjoy the
close proximity to the end-users to provide services at reduced latency and
lower energy costs, they suffer from limitations in computational and radio
resources, which calls for fair and efficient resource management in the MEC
servers. The problem is, however, challenging due to the ultra-high density,
distributed nature, and intrinsic randomness of next generation wireless
networks. In this article, we focus on the application of game theory and
reinforcement learning for efficient distributed resource management in MEC, in
particular, for computation offloading. We briefly review the cutting-edge
research and discuss future challenges. Furthermore, we develop a
game-theoretical model for energy-efficient distributed edge server activation
and study several learning techniques. Numerical results are provided to
illustrate the performance of these distributed learning techniques. Also, open
research issues in the context of resource management in MEC servers are
discussed.
Unsupervised Real-Time Control through Variational Empowerment
We introduce a methodology for efficiently computing a lower bound to
empowerment, allowing it to be used as an unsupervised cost function for policy
learning in real-time control. Empowerment, being the channel capacity between
actions and states, maximises the influence of an agent on its near future. It
has been shown to be a good model of biological behaviour in the absence of an
extrinsic goal. But empowerment is also prohibitively hard to compute,
especially in nonlinear continuous spaces. We introduce an efficient, amortised
method for learning empowerment-maximising policies. We demonstrate that our
algorithm can reliably handle continuous dynamical systems using system
dynamics learned from raw data. The resulting policies consistently drive the
agents into states where they can use their full potential.
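For reference, the quantity described in this abstract can be written out as below. This is a sketch in assumed notation: the symbols $\omega$ (action distribution), $q$ (approximate posterior over actions), $a$, $s$, and $s'$ do not appear in the abstract, and a single-step action is used for simplicity. The first line is empowerment as the channel capacity from actions to the resulting state; the second is the standard variational lower bound on mutual information, obtained by replacing the true posterior $p(a \mid s', s)$ with $q$. The paper's own bound may differ in detail.

```latex
\begin{align}
\mathcal{E}(s) &= \max_{\omega} I(A; S' \mid s)
  = \max_{\omega} \mathbb{E}_{\omega(a \mid s)\, p(s' \mid s, a)}
    \left[ \log \frac{p(a \mid s', s)}{\omega(a \mid s)} \right] \\
I(A; S' \mid s) &\ge \mathbb{E}_{\omega(a \mid s)\, p(s' \mid s, a)}
    \left[ \log q(a \mid s', s) - \log \omega(a \mid s) \right]
\end{align}
```

The gap between the two sides is the expected KL divergence from $p(a \mid s', s)$ to $q(a \mid s', s)$, so the bound is tight when $q$ matches the true posterior.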
Agent Embeddings: A Latent Representation for Pole-Balancing Networks
We show that it is possible to reduce a high-dimensional object like a neural
network agent into a low-dimensional vector representation with semantic
meaning that we call agent embeddings, akin to word or face embeddings. This
can be done by collecting examples of existing networks, vectorizing their
weights, and then learning a generative model over the weight space in a
supervised fashion. We investigate a pole-balancing task, Cart-Pole, as a case
study and show that multiple new pole-balancing networks can be generated from
their agent embeddings without direct access to training data from the
Cart-Pole simulator. In general, the learned embedding space is helpful for
mapping out the space of solutions for a given task. We observe in the case of
Cart-Pole the surprising finding that good agents make different decisions
despite learning similar representations, whereas bad agents make similar (bad)
decisions while learning dissimilar representations. Linearly interpolating
between the latent embeddings for a good agent and a bad agent yields an agent
embedding that generates a network with intermediate performance, where the
performance can be tuned according to the coefficient of interpolation. Linear
extrapolation in the latent space also results in performance boosts, up to a
point.
Risk-Aware Energy Scheduling for Edge Computing with Microgrid: A Multi-Agent Deep Reinforcement Learning Approach
In recent years, multi-access edge computing (MEC) has become a key enabler for
handling the massive expansion of Internet of Things (IoT) applications and
services. However, the energy consumption of an MEC network depends on volatile
tasks that induce risk for energy demand estimation. As an energy supplier, a
microgrid can facilitate seamless energy supply. However, the risk associated
with energy supply is also increased due to unpredictable energy generation
from renewable and non-renewable sources. In particular, the risk of an energy
shortfall stems from uncertainties in both energy consumption and
generation. In this paper, we study a risk-aware energy scheduling problem for
a microgrid-powered MEC network. First, we formulate an optimization problem
considering the conditional value-at-risk (CVaR) measurement for both energy
consumption and generation, where the objective is to minimize the expected
residual of scheduled energy for the MEC networks, and we show that this problem
is NP-hard. Second, we analyze our formulated problem using a
multi-agent stochastic game that ensures the joint policy Nash equilibrium, and
show the convergence of the proposed model. Third, we derive the solution by
applying a multi-agent deep reinforcement learning (MADRL)-based asynchronous
advantage actor-critic (A3C) algorithm with shared neural networks. This method
mitigates the curse of dimensionality of the state space and chooses the best
policy among the agents for the proposed problem. Finally, the experimental
results show that considering CVaR yields a significant performance gain in
high-accuracy energy scheduling for the proposed model compared with both the
single-agent and random-agent models.
Comment: Accepted article by IEEE Transactions on Network and Service
Management, DOI: 10.1109/TNSM.2021.304938
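The conditional value-at-risk (CVaR) measure named in this abstract can be illustrated with a small empirical estimator. This is a minimal sketch, not the paper's formulation: the function names `value_at_risk` and `empirical_cvar`, the sample-based definition, and the example numbers are all assumptions made for illustration. At confidence level `alpha`, the CVaR is taken here as the mean of the worst `(1 - alpha)` fraction of sampled losses.

```python
import math


def _tail_size(n, alpha):
    """Number of samples in the worst (1 - alpha) tail, at least one."""
    if n == 0:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    # Round before ceil so that floating-point noise such as
    # 1.0000000000000009 does not add a spurious extra sample.
    return max(1, math.ceil(round((1.0 - alpha) * n, 9)))


def value_at_risk(losses, alpha=0.95):
    """Smallest loss inside the worst (1 - alpha) tail of the samples."""
    ordered = sorted(losses)
    return ordered[-_tail_size(len(ordered), alpha)]


def empirical_cvar(losses, alpha=0.95):
    """Mean of the worst (1 - alpha) fraction of the sampled losses."""
    ordered = sorted(losses)
    tail = ordered[-_tail_size(len(ordered), alpha):]
    return sum(tail) / len(tail)


if __name__ == "__main__":
    # Hypothetical energy-shortfall samples (arbitrary units).
    shortfalls = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(value_at_risk(shortfalls, alpha=0.8))
    print(empirical_cvar(shortfalls, alpha=0.8))
```

Because CVaR averages over the whole tail rather than reading off a single quantile, it is never smaller than the value-at-risk at the same level, which is why it is often preferred when the size of rare shortfalls matters.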
Cross-Domain Transfer in Reinforcement Learning using Target Apprentice
In this paper, we present a new approach to Transfer Learning (TL) in
Reinforcement Learning (RL) for cross-domain tasks. Many of the available
techniques approach the transfer architecture as a method of speeding up the
target task learning. We propose to adapt and reuse the mapped source task
optimal-policy directly in related domains. We show the optimal policy from a
related source task can be near-optimal in the target domain provided an adaptive
policy accounts for the model error between target and source. The main benefit
of this policy augmentation is generalizing policies across multiple related
domains without having to re-learn the new tasks. Our results show that this
architecture leads to better sample efficiency in the transfer, reducing sample
complexity of target task learning to target apprentice learning.
Comment: To appear as conference paper in ICRA 201
Application of Machine Learning in Wireless Networks: Key Techniques and Open Issues
As a key technique for enabling artificial intelligence, machine learning
(ML) is capable of solving complex problems without explicit programming.
Motivated by its successful applications to many practical tasks like image
recognition, both industry and the research community have advocated the
applications of ML in wireless communication. This paper comprehensively
surveys the recent advances of the applications of ML in wireless
communication, which are classified as: resource management in the MAC layer,
networking and mobility management in the network layer, and localization in
the application layer. The applications in resource management further include
power control, spectrum management, backhaul management, cache management,
beamformer design and computation resource management, while ML based
networking focuses on the applications in clustering, base station switching
control, user association and routing. Moreover, the literature in each aspect is
organized according to the adopted ML techniques. In addition, several
conditions for applying ML to wireless communication are identified to help
readers decide whether to use ML and which kind of ML techniques to use, and
traditional approaches are also summarized together with their performance
comparison with ML based approaches, based on which the motivations of the surveyed
literature for adopting ML are clarified. Given the extensiveness of the research
area, challenges and unresolved issues are presented to facilitate future
studies, where ML based network slicing, infrastructure update to support ML
based paradigms, open data sets and platforms for researchers, theoretical
guidance for ML implementation and so on are discussed.
Comment: 34 pages, 8 figures
A Tour of Reinforcement Learning: The View from Continuous Control
This manuscript surveys reinforcement learning from the perspective of
optimization and control with a focus on continuous control applications. It
surveys the general formulation, terminology, and typical experimental
implementations of reinforcement learning and reviews competing solution
paradigms. In order to compare the relative merits of various techniques, this
survey presents a case study of the Linear Quadratic Regulator (LQR) with
unknown dynamics, perhaps the simplest and best-studied problem in optimal
control. The manuscript describes how merging techniques from learning theory
and control can provide non-asymptotic characterizations of LQR performance and
shows that these characterizations tend to match experimental behavior. In
turn, when revisiting more complex applications, many of the observed phenomena
in LQR persist. In particular, theory and experiment demonstrate the role and
importance of models and the cost of generality in reinforcement learning
algorithms. This survey concludes with a discussion of some of the challenges
in designing learning systems that safely and reliably interact with complex
and uncertain environments and how tools from reinforcement learning and
control might be combined to approach these challenges.
Comment: minor revision with a few clarifying passages and corrected typos
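As a concrete anchor for the Linear Quadratic Regulator case study mentioned in this abstract, the sketch below solves the simplest possible instance: a scalar, discrete-time LQR with known dynamics. It is illustrative only. The survey studies the unknown-dynamics setting, and the names `scalar_lqr` and `rollout_cost`, the dynamics x_next = a*x + b*u, and the stage cost q*x^2 + r*u^2 are assumptions made here, not taken from the manuscript.

```python
def scalar_lqr(a, b, q, r, tol=1e-12, max_iter=10_000):
    """Iterate the scalar discrete-time Riccati recursion to a fixed point.

    Dynamics: x_next = a * x + b * u.  Stage cost: q * x**2 + r * u**2.
    Returns (p, k): the cost-to-go coefficient and the gain for u = -k * x.
    """
    if q < 0 or r <= 0:
        raise ValueError("need q >= 0 and r > 0")
    p = q
    for _ in range(max_iter):
        gain = a * b * p / (r + b * b * p)
        # Riccati update: p = q + a^2 p - (a b p)^2 / (r + b^2 p)
        p_next = q + a * a * p - a * b * p * gain
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    return p, a * b * p / (r + b * b * p)


def rollout_cost(a, b, q, r, k, x0, steps=200):
    """Accumulated cost of running the linear policy u = -k * x from x0."""
    x, total = x0, 0.0
    for _ in range(steps):
        u = -k * x
        total += q * x * x + r * u * u
        x = a * x + b * u
    return total


if __name__ == "__main__":
    p, k = scalar_lqr(a=1.0, b=1.0, q=1.0, r=1.0)
    print(p, k)
    # For the optimal gain, the rollout cost from x0 approaches p * x0**2.
    print(rollout_cost(1.0, 1.0, 1.0, 1.0, k, x0=2.0), p * 4.0)
```

With a = b = q = r = 1 the fixed point satisfies p^2 - p - 1 = 0, so p is the golden ratio and the gain is roughly 0.618. When the dynamics are unknown, a and b would first have to be estimated from data or bypassed by a model-free method, which is the comparison the survey is concerned with.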
Flow: A Modular Learning Framework for Autonomy in Traffic
The rapid development of autonomous vehicles (AVs) holds vast potential for
transportation systems through improved safety, efficiency, and access to
mobility. However, due to numerous technical, political, and human factors
challenges, new methodologies are needed to design vehicles and transportation
systems for these positive outcomes. This article tackles technical challenges
arising from the partial adoption of autonomy: partial control, partial
observation, complex multi-vehicle interactions, and the sheer variety of
traffic settings represented by real-world networks. The article presents a
modular learning framework which leverages deep Reinforcement Learning methods
to address complex traffic dynamics. Modules are composed to capture common
traffic phenomena (traffic jams, lane changing, intersections). Learned control
laws are found to exceed human driving performance by at least 40% with only
5-10% adoption of AVs. In partially-observed single-lane traffic, a small
neural network control law can eliminate stop-and-go traffic -- surpassing all
known model-based controllers, achieving near-optimal performance, and
generalizing to out-of-distribution traffic densities.
Comment: 14 pages, 8 figures; new experiments and analysis
Multi-Task Generative Adversarial Nets with Shared Memory for Cross-Domain Coordination Control
Generating sequential decision processes from large amounts of measured process
data is a future research direction for collaborative factory automation: it
makes full use of online or offline process data to directly design flexible
decision-making policies and evaluate performance. The key challenges for the
sequential decision process are generating a sequential decision-making policy
online and transferring knowledge across task domains. Most multi-task
policy-generating algorithms suffer from insufficient cross-task sharing
structure in discrete-time nonlinear systems with applications. This paper
proposes multi-task generative adversarial nets with shared memory for
cross-domain coordination control, which can generate a sequential decision
policy directly from the raw sensory input of all tasks and evaluate the
performance of system actions online in discrete-time nonlinear systems.
Experiments have been undertaken using a professional flexible manufacturing
testbed deployed within a smart factory of Weichai Power in China. Results on
three groups of discrete-time nonlinear control tasks show that our proposed
model can effectively improve the performance of a task with the help of other
related tasks.
Towards a Hands-Free Query Optimizer through Deep Learning
Query optimization remains one of the most important and well-studied
problems in database systems. However, traditional query optimizers are complex
heuristically-driven systems, requiring large amounts of time to tune for a
particular database and requiring even more time to develop and maintain in the
first place. In this vision paper, we argue that a new type of query optimizer,
based on deep reinforcement learning, can drastically improve on the
state-of-the-art. We identify potential complications for future research that
integrates deep learning with query optimization, and we describe three novel
deep learning based approaches that can lead the way to end-to-end
learning-based query optimizers.
Comment: Published in CIDR1