
    Mobile Edge Computation Offloading Using Game Theory and Reinforcement Learning

    Due to the ever-increasing popularity of resource-hungry and delay-constrained mobile applications, the computation and storage capabilities of the remote cloud have partially migrated towards the mobile edge, giving rise to the concept of Mobile Edge Computing (MEC). While MEC servers enjoy close proximity to end-users and can provide services at reduced latency and lower energy cost, they suffer from limited computational and radio resources, which calls for fair and efficient resource management on MEC servers. The problem is challenging, however, due to the ultra-high density, distributed nature, and intrinsic randomness of next-generation wireless networks. In this article, we focus on the application of game theory and reinforcement learning to efficient distributed resource management in MEC, in particular for computation offloading. We briefly review the cutting-edge research and discuss future challenges. Furthermore, we develop a game-theoretical model for energy-efficient distributed edge server activation and study several learning techniques. Numerical results illustrate the performance of these distributed learning techniques, and open research issues in the context of resource management on MEC servers are discussed.
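
    To make the distributed server-activation game concrete, the sketch below runs log-linear learning, a standard distributed learning rule for this kind of game, on a toy congestion-style payoff: each active server earns an equal share of the offloaded demand minus an energy cost. The payoff function, demand, cost, and temperature are illustrative assumptions, not the article's exact model.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n = 6         # number of edge servers
    demand = 3.0  # total offloaded workload (hypothetical units)
    cost = 0.7    # hypothetical per-server activation cost
    tau = 0.05    # temperature; tau -> 0 concentrates play on equilibria

    def payoff(active, i):
        """Payoff of server i: equal share of demand if active, minus energy cost."""
        if not active[i]:
            return 0.0
        return demand / active.sum() - cost

    active = rng.integers(0, 2, n).astype(bool)
    for _ in range(5000):
        i = rng.integers(n)                  # one randomly chosen server revises
        u = np.empty(2)
        for a in (0, 1):
            trial = active.copy()
            trial[i] = bool(a)
            u[a] = payoff(trial, i)
        p = np.exp(u / tau)                  # log-linear (softmax) response
        active[i] = bool(rng.choice(2, p=p / p.sum()))

    # With these numbers, demand/k - cost > 0 only for k <= 4 active servers,
    # so play concentrates on profiles with four servers switched on.
    print("active servers:", active.astype(int))
    ```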

    Unsupervised Real-Time Control through Variational Empowerment

    We introduce a methodology for efficiently computing a lower bound to empowerment, allowing it to be used as an unsupervised cost function for policy learning in real-time control. Empowerment, being the channel capacity between actions and states, maximises the influence of an agent on its near future. It has been shown to be a good model of biological behaviour in the absence of an extrinsic goal. But empowerment is also prohibitively hard to compute, especially in nonlinear continuous spaces. We introduce an efficient, amortised method for learning empowerment-maximising policies. We demonstrate that our algorithm can reliably handle continuous dynamical systems using system dynamics learned from raw data. The resulting policies consistently drive the agents into states where they can use their full potential.
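
    Since empowerment is the channel capacity I(A; S') between actions and resulting states, a standard way to lower-bound it is the Barber-Agakov variational bound I(A; S') >= H(A) + E[log q(a | s')], where q is a learned inverse model. The sketch below estimates that bound by Monte Carlo on toy discrete dynamics; the tabular q and the dynamics are illustrative stand-ins for the paper's learned, continuous models.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_actions, n_states = 4, 4

    def step(a):
        """Noisy toy dynamics: action a usually lands in state a."""
        return a if rng.random() < 0.9 else int(rng.integers(n_states))

    pi = np.full(n_actions, 1.0 / n_actions)   # source distribution over actions
    samples = []
    for _ in range(20000):
        a = int(rng.choice(n_actions, p=pi))
        samples.append((a, step(a)))

    # Fit the variational inverse model q(a | s') by empirical counts
    # (the paper learns a parametric model instead).
    counts = np.zeros((n_states, n_actions))
    for a, s in samples:
        counts[s, a] += 1
    q = counts / counts.sum(axis=1, keepdims=True)

    entropy = -np.sum(pi * np.log(pi))         # H(A)
    bound = entropy + np.mean([np.log(q[s, a]) for a, s in samples])
    print(f"lower bound: {bound:.3f} nats (true I(A;S') is about 1.04 here)")
    ```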

    Agent Embeddings: A Latent Representation for Pole-Balancing Networks

    We show that it is possible to reduce a high-dimensional object like a neural network agent to a low-dimensional vector representation with semantic meaning, which we call an agent embedding, akin to word or face embeddings. This can be done by collecting examples of existing networks, vectorizing their weights, and then learning a generative model over the weight space in a supervised fashion. We investigate a pole-balancing task, Cart-Pole, as a case study and show that multiple new pole-balancing networks can be generated from their agent embeddings without direct access to training data from the Cart-Pole simulator. In general, the learned embedding space is helpful for mapping out the space of solutions for a given task. In the case of Cart-Pole, we observe the surprising finding that good agents make different decisions despite learning similar representations, whereas bad agents make similar (bad) decisions while learning dissimilar representations. Linearly interpolating between the latent embeddings of a good agent and a bad agent yields an agent embedding that generates a network with intermediate performance, where the performance can be tuned according to the interpolation coefficient. Linear extrapolation in the latent space also yields performance boosts, up to a point.
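
    The interpolation experiment reduces to arithmetic in the latent space followed by a decode back to weights. In the sketch below, a random linear map stands in for the paper's trained generative decoder, and z_good, z_bad, and the dimensions are hypothetical.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    latent_dim, n_weights = 8, 64
    W_dec = rng.normal(size=(n_weights, latent_dim))

    def decode(z):
        """Map an agent embedding z to a flat weight vector (toy linear decoder)."""
        return W_dec @ z

    z_good = rng.normal(size=latent_dim)   # embedding of a well-performing agent
    z_bad = rng.normal(size=latent_dim)    # embedding of a poorly-performing agent

    for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
        z = (1 - alpha) * z_good + alpha * z_bad  # linear interpolation in latent space
        weights = decode(z)                       # reshape into the network's layers
        print(f"alpha={alpha:.2f} -> weight norm {np.linalg.norm(weights):.2f}")
    ```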

    Risk-Aware Energy Scheduling for Edge Computing with Microgrid: A Multi-Agent Deep Reinforcement Learning Approach

    In recent years, multi-access edge computing (MEC) has become a key enabler for handling the massive expansion of Internet of Things (IoT) applications and services. However, the energy consumption of a MEC network depends on volatile tasks, which induces risk in energy demand estimation. As an energy supplier, a microgrid can facilitate seamless energy supply; however, the risk associated with the supply side also increases due to unpredictable energy generation from renewable and non-renewable sources. In particular, the risk of energy shortfall involves uncertainties in both energy consumption and generation. In this paper, we study a risk-aware energy scheduling problem for a microgrid-powered MEC network. First, we formulate an optimization problem incorporating the conditional value-at-risk (CVaR) measure for both energy consumption and generation, where the objective is to minimize the expected residual of scheduled energy for the MEC network, and we show that this problem is NP-hard. Second, we analyze the formulated problem using a multi-agent stochastic game that ensures a joint-policy Nash equilibrium, and we show the convergence of the proposed model. Third, we derive the solution by applying a multi-agent deep reinforcement learning (MADRL)-based asynchronous advantage actor-critic (A3C) algorithm with shared neural networks. This method mitigates the curse of dimensionality of the state space and chooses the best policy among the agents for the proposed problem. Finally, the experimental results establish a significant performance gain from considering CVaR for high-accuracy energy scheduling, compared with both the single-agent and random-agent models.
    Comment: Accepted article by IEEE Transactions on Network and Service Management, DOI: 10.1109/TNSM.2021.304938
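
    For reference, CVaR at level alpha is the expected loss in the worst (1 - alpha) tail, and by the Rockafellar-Uryasev representation CVaR_alpha(X) = min_v { v + E[(X - v)+] / (1 - alpha) }, whose minimizer is the value-at-risk. The sketch below computes an empirical CVaR on synthetic shortfall samples; the distribution and units are illustrative, not the paper's microgrid data.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    shortfall = rng.normal(2.0, 1.5, 10000)  # hypothetical energy shortfall (kWh)
    alpha = 0.95

    var = np.quantile(shortfall, alpha)      # value-at-risk at level alpha
    cvar = var + np.mean(np.maximum(shortfall - var, 0)) / (1 - alpha)
    print(f"VaR_{alpha}: {var:.2f} kWh, CVaR_{alpha}: {cvar:.2f} kWh")
    # CVaR is the mean shortfall in the worst (1 - alpha) tail, so CVaR >= VaR.
    ```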

    Cross-Domain Transfer in Reinforcement Learning using Target Apprentice

    In this paper, we present a new approach to Transfer Learning (TL) in Reinforcement Learning (RL) for cross-domain tasks. Many of the available techniques treat the transfer architecture as a method of speeding up learning of the target task. We instead propose to adapt and reuse the mapped source-task optimal policy directly in related domains. We show that the optimal policy from a related source task can be near-optimal in the target domain, provided an adaptive policy accounts for the model error between the target and the source. The main benefit of this policy augmentation is generalizing policies across multiple related domains without having to re-learn the new tasks. Our results show that this architecture leads to better sample efficiency in the transfer, reducing the sample complexity of target-task learning to that of target apprentice learning.
    Comment: To appear as a conference paper in ICRA 201
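
    One simple way to realize such an adaptive correction is to choose an action offset so that the target transition matches the source model's prediction under the source-optimal action. The 1-D linear dynamics, gains, and inverse-dynamics correction below are illustrative assumptions, not the paper's exact method.

    ```python
    # Reuse a source-optimal policy in a target domain with an additive
    # correction for the model error between the two domains.
    def f_source(s, a):   # source-domain model: next state
        return 0.9 * s + a

    def f_target(s, a):   # target domain differs slightly (model error)
        return 0.8 * s + 1.1 * a

    def pi_source(s):     # source-optimal policy (drives the state to zero)
        return -0.9 * s

    s = 5.0
    for _ in range(10):
        a = pi_source(s)
        predicted = f_source(s, a)
        # Pick da so the target step matches the source prediction
        # (1-D inverse dynamics: target input gain is 1.1).
        da = (predicted - f_target(s, a)) / 1.1
        s = f_target(s, a + da)
    print(f"state after 10 corrected steps: {s:.4f}")  # ~0, as under the source policy
    ```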

    Application of Machine Learning in Wireless Networks: Key Techniques and Open Issues

    As a key technique for enabling artificial intelligence, machine learning (ML) is capable of solving complex problems without explicit programming. Motivated by its successful applications to many practical tasks like image recognition, both industry and the research community have advocated applications of ML in wireless communication. This paper comprehensively surveys recent advances in the applications of ML in wireless communication, classified as: resource management in the MAC layer, networking and mobility management in the network layer, and localization in the application layer. The applications in resource management further include power control, spectrum management, backhaul management, cache management, beamformer design, and computation resource management, while ML-based networking focuses on applications in clustering, base station switching control, user association, and routing. Moreover, the literature on each aspect is organized according to the adopted ML techniques. In addition, several conditions for applying ML to wireless communication are identified to help readers decide whether to use ML and which kind of ML techniques to use, and traditional approaches are summarized together with a performance comparison against ML-based approaches, based on which the motivations of the surveyed works to adopt ML are clarified. Given the extensiveness of the research area, challenges and unresolved issues are presented to facilitate future studies, including ML-based network slicing, infrastructure updates to support ML-based paradigms, open data sets and platforms for researchers, and theoretical guidance for ML implementation.
    Comment: 34 pages, 8 figures

    A Tour of Reinforcement Learning: The View from Continuous Control

    This manuscript surveys reinforcement learning from the perspective of optimization and control, with a focus on continuous control applications. It surveys the general formulation, terminology, and typical experimental implementations of reinforcement learning and reviews competing solution paradigms. In order to compare the relative merits of various techniques, the survey presents a case study of the Linear Quadratic Regulator (LQR) with unknown dynamics, perhaps the simplest and best-studied problem in optimal control. The manuscript describes how merging techniques from learning theory and control can provide non-asymptotic characterizations of LQR performance and shows that these characterizations tend to match experimental behavior. In turn, when revisiting more complex applications, many of the phenomena observed in LQR persist. In particular, theory and experiment demonstrate the role and importance of models and the cost of generality in reinforcement learning algorithms. The survey concludes with a discussion of some of the challenges in designing learning systems that safely and reliably interact with complex and uncertain environments, and how tools from reinforcement learning and control might be combined to approach these challenges.
    Comment: minor revision with a few clarifying passages and corrected typos
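
    The LQR-with-unknown-dynamics case study has a compact certainty-equivalence baseline: estimate (A, B) by least squares from excited rollouts, then solve the discrete Riccati equation on the estimate. The sketch below does this for a toy double integrator; the system, noise level, and rollout length are illustrative choices.

    ```python
    import numpy as np
    from scipy.linalg import solve_discrete_are

    rng = np.random.default_rng(0)
    A = np.array([[1.0, 0.1], [0.0, 1.0]])  # toy double integrator (unknown to the learner)
    B = np.array([[0.0], [0.1]])
    Q, R = np.eye(2), np.eye(1)

    # Collect (x, u, x') transitions with random excitation and process noise.
    X, U, Xn = [], [], []
    x = np.zeros(2)
    for _ in range(500):
        u = rng.normal(size=1)
        xn = A @ x + B @ u + 0.01 * rng.normal(size=2)
        X.append(x); U.append(u); Xn.append(xn)
        x = xn

    # Least squares: x' ~ [A B] [x; u]
    Z = np.hstack([np.array(X), np.array(U)])
    theta, *_ = np.linalg.lstsq(Z, np.array(Xn), rcond=None)
    A_hat, B_hat = theta.T[:, :2], theta.T[:, 2:]

    # Certainty-equivalent LQR gain from the estimated model.
    P = solve_discrete_are(A_hat, B_hat, Q, R)
    K = np.linalg.solve(R + B_hat.T @ P @ B_hat, B_hat.T @ P @ A_hat)
    print("estimated gain K:", K.round(3))
    ```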

    Flow: A Modular Learning Framework for Autonomy in Traffic

    The rapid development of autonomous vehicles (AVs) holds vast potential for transportation systems through improved safety, efficiency, and access to mobility. However, due to numerous technical, political, and human-factors challenges, new methodologies are needed to design vehicles and transportation systems for these positive outcomes. This article tackles technical challenges arising from the partial adoption of autonomy: partial control, partial observation, complex multi-vehicle interactions, and the sheer variety of traffic settings represented by real-world networks. The article presents a modular learning framework which leverages deep reinforcement learning methods to address complex traffic dynamics. Modules are composed to capture common traffic phenomena (traffic jams, lane changing, intersections). Learned control laws are found to exceed human driving performance by at least 40% with only 5-10% adoption of AVs. In partially-observed single-lane traffic, a small neural network control law can eliminate stop-and-go traffic -- surpassing all known model-based controllers, achieving near-optimal performance, and generalizing to out-of-distribution traffic densities.
    Comment: 14 pages, 8 figures; new experiments and analysis

    Multi-Task Generative Adversarial Nets with Shared Memory for Cross-Domain Coordination Control

    Generating sequential decision processes from large amounts of measured process data is a future research direction for collaborative factory automation: such online or offline process data can be used directly to design flexible decision-making policies and to evaluate performance. The key challenges are to generate sequential decision-making policies online and to transfer knowledge across task domains. Most multi-task policy-generation algorithms suffer from insufficient cross-task sharing structure when applied to discrete-time nonlinear systems. This paper proposes multi-task generative adversarial nets with shared memory for cross-domain coordination control, which generate sequential decision policies directly from the raw sensory input of all tasks and evaluate the performance of system actions online in discrete-time nonlinear systems. Experiments have been undertaken using a professional flexible manufacturing testbed deployed within a smart factory of Weichai Power in China. Results on three groups of discrete-time nonlinear control tasks show that the proposed model can effectively improve the performance of a task with the help of other related tasks.
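
    A common way to realize this kind of cross-task sharing is a policy network whose external memory is shared across tasks while the output heads stay task-specific. The PyTorch sketch below is a minimal illustration under assumed layer sizes and an attention-style memory read; it is not the paper's exact architecture.

    ```python
    import torch
    import torch.nn as nn

    class SharedMemoryGenerator(nn.Module):
        def __init__(self, obs_dim=16, act_dim=4, n_tasks=3, slots=32, width=64):
            super().__init__()
            self.memory = nn.Parameter(torch.randn(slots, width))  # shared across tasks
            self.encoder = nn.Linear(obs_dim, width)
            self.heads = nn.ModuleList(
                [nn.Linear(2 * width, act_dim) for _ in range(n_tasks)]
            )

        def forward(self, obs, task_id):
            h = torch.tanh(self.encoder(obs))                # encode raw sensory input
            attn = torch.softmax(h @ self.memory.T, dim=-1)  # content-based read weights
            read = attn @ self.memory                        # read from shared memory
            return self.heads[task_id](torch.cat([h, read], dim=-1))  # task-specific logits

    gen = SharedMemoryGenerator()
    obs = torch.randn(8, 16)               # a batch of observations
    print(gen(obs, task_id=0).shape)       # torch.Size([8, 4])
    ```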

    Towards a Hands-Free Query Optimizer through Deep Learning

    Query optimization remains one of the most important and well-studied problems in database systems. However, traditional query optimizers are complex, heuristically-driven systems, requiring large amounts of time to tune for a particular database and requiring even more time to develop and maintain in the first place. In this vision paper, we argue that a new type of query optimizer, based on deep reinforcement learning, can drastically improve on the state of the art. We identify potential complications for future research that integrates deep learning with query optimization, and we describe three novel deep learning based approaches that can lead the way to end-to-end learning-based query optimizers.
    Comment: Published in CIDR1
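
    As a toy illustration of how one optimizer decision can be cast as reinforcement learning, the sketch below learns a join order with tabular Monte-Carlo updates over a synthetic cost model. The tables, cardinalities, selectivity, and learning constants are all hypothetical, and the paper's proposals use deep networks rather than a lookup table.

    ```python
    import itertools
    import random

    random.seed(0)
    tables = ("A", "B", "C", "D")
    card = {"A": 1000, "B": 10, "C": 500, "D": 50}  # hypothetical cardinalities

    def plan_cost(order):
        """Toy left-deep plan cost: sum of intermediate result sizes."""
        size, cost = card[order[0]], 0.0
        for t in order[1:]:
            size *= card[t] * 0.01                  # fixed join selectivity of 0.01
            cost += size
        return cost

    Q = {}                                          # key: (tables joined so far, next table)
    for _ in range(2000):
        state, order = frozenset(), []
        while len(order) < len(tables):
            remaining = [t for t in tables if t not in state]
            if random.random() < 0.1:               # epsilon-greedy exploration
                t = random.choice(remaining)
            else:
                t = min(remaining, key=lambda a: Q.get((state, a), 0.0))
            order.append(t)
            state = state | {t}
        cost = plan_cost(order)                     # episode return = total plan cost
        s = frozenset()
        for t in order:                             # Monte-Carlo update toward the cost
            Q[(s, t)] = Q.get((s, t), 0.0) + 0.1 * (cost - Q.get((s, t), 0.0))
            s = s | {t}

    state, learned = frozenset(), []
    while len(learned) < len(tables):
        remaining = [t for t in tables if t not in state]
        t = min(remaining, key=lambda a: Q.get((state, a), float("inf")))
        learned.append(t)
        state = state | {t}
    print("exhaustive best:", min(itertools.permutations(tables), key=plan_cost))
    print("learned greedy :", tuple(learned))
    ```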