25,109 research outputs found

Reinforcement Learning applied to Single Neuron

Authors: Cai Mingbo; Wang Zhipeng
Publication date: 15/05/2015

This paper extends reinforcement learning ideas to multi-agent systems, which are far more complicated than the previously studied single-agent systems. We studied two different multi-agent systems. One is a fully connected neural network consisting of multiple single neurons. The other is a simplified mechanical arm system controlled by multiple neurons. We suppose that each neuron acts as an agent that can perform Gibbs sampling of the posterior probability of stimulus features. The policy is optimized so that the cumulative global reward is maximized. The algorithm for the second system is based on the same idea, but we incorporate the physics model into the constraints. The simulation results show that our algorithm converges well for the first system. For the second system, it does not converge well within a reasonable simulation time. In summary, we made an initial attempt to study reinforcement learning for multi-agent systems. Computational complexity remains an issue, and a significant amount of work is still needed to better understand the problem.

arXiv.org e-Print Archive

Metis: Multi-Agent Based Crisis Simulation System

Authors: Kiourt Chairi; Moussiades Lefteris; Sidiropoulos George
Publication date: 08/09/2020

With the advent of new computational technologies (Graphics Processing Units, GPUs) and Machine Learning, the research domain of crowd simulation for crisis management has flourished. Alongside the new techniques and methodologies proposed over the years to increase the realism of crowd simulation, several crisis simulation systems and tools have been developed, but most of them focus on special cases and do not let users adapt them to their needs. To address this, we introduce a novel multi-agent-based crisis simulation system for indoor cases. The main advantage of the system is its ease of use: it targets non-expert users (users with little to no programming skills), who can exploit its capabilities, adapt the entire environment to their needs (case studies), and set up building evacuation planning experiments with some of the most popular Reinforcement Learning algorithms. Simply put, the system's features focus on dynamic environment design and crisis management, interconnection with popular Reinforcement Learning libraries, agents with different characteristics (behaviors), fire propagation parameterization, realistic physics based on a popular game engine, GPU-accelerated agent training, and simulation end conditions. A case study that uses a popular reinforcement learning algorithm to train the agents presents the dynamics and capabilities of the proposed system, and the paper concludes with the highlights of the system and some future directions.

arXiv.org e-Print Archive

Autonomous Air Traffic Controller: A Deep Multi-Agent Reinforcement Learning Approach

Authors: Brittain Marc; Wei Peng
Publication date: 02/05/2019

Air traffic control is a real-time, safety-critical decision-making process in highly dynamic and stochastic environments. In today's aviation practice, a human air traffic controller monitors and directs many aircraft flying through their designated airspace sector. With the fast-growing air traffic complexity in traditional (commercial airliners) and low-altitude (drones and eVTOL aircraft) airspace, an autonomous air traffic control system is needed to accommodate high-density air traffic and ensure safe separation between aircraft. We propose a deep multi-agent reinforcement learning framework that is able to identify and resolve conflicts between aircraft in a high-density, stochastic, and dynamic en-route sector with multiple intersections and merging points. The proposed framework utilizes an actor-critic model, A2C, that incorporates the loss function from Proximal Policy Optimization (PPO) to help stabilize the learning process. In addition, we use a centralized learning, decentralized execution scheme in which one neural network is learned and shared by all agents in the environment. We show that our framework is both scalable and efficient for a large number of incoming aircraft, achieving extremely high traffic throughput with a safety guarantee. We evaluate our model via extensive simulations in the BlueSky environment. Results show that our framework is able to resolve 99.97% and 100% of all conflicts at intersections and merging points, respectively, in extreme high-density air traffic scenarios.
Comment: 10 pages
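
As an illustration of the PPO loss mentioned in this abstract, the following is a minimal sketch of the standard clipped surrogate objective. It is not the authors' implementation: the function name `ppo_clipped_objective` and the default `clip_eps=0.2` are assumptions chosen for illustration, and the abstract does not state which hyperparameters were used.

```python
def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    ratios:     new_policy_prob / old_policy_prob for each sampled action
    advantages: advantage estimate for each sampled action
    clip_eps:   clipping range (0.2 is an assumed, commonly used value)
    """
    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)
```

Taking the minimum removes the incentive to move the policy far from the one that collected the data, which is the stabilizing effect the abstract refers to.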

arXiv.org e-Print Archive

Fuzzy Q-Learning Based Multi-Agent System for Intelligent Traffic Control by a Game Theory Approach

Authors: Daeichian Abolghasem; Haghani Amir
Publication venue: Springer Science and Business Media LLC
Publication date: 23/04/2019

This paper introduces a multi-agent approach that adjusts traffic lights based on the traffic situation in order to reduce the average delay time. In the traffic model, the lights of each intersection are controlled by an autonomous agent. Since the decision of each agent affects neighboring agents, this approach creates a classical non-stationary environment. Thus, each agent not only needs to learn from past experience but also has to consider the decisions of its neighbors to cope with dynamic changes in the traffic network. Fuzzy Q-learning and game theory are employed to form a policy based on previous experience and the decisions of neighboring agents. Simulation results illustrate the advantage of the proposed method over fixed-time, fuzzy, Q-learning, and fuzzy Q-learning control methods.
Comment: 10 pages, 10 figures

arXiv.org e-Print Archive

Intelligent Residential Energy Management System using Deep Reinforcement Learning

Authors: Mathew Alwyn; Mathew Jimson; Roy Abhijit
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 28/05/2020

The rising demand for electricity and its essential nature in today's world call for intelligent home energy management (HEM) systems that can reduce energy usage. This involves shifting loads from the peak hours of the day, when energy consumption is at its highest, to leaner off-peak periods, when energy consumption is relatively lower. Doing so reduces the system's peak load demand, which in turn results in lower energy bills and an improved load demand profile. This work introduces a novel way to develop a learning system that can learn from experience to shift loads from one time instance to another and achieve the goal of minimizing the aggregate peak load. This paper proposes a Deep Reinforcement Learning (DRL) model for demand response in which the virtual agent learns the task as humans do. The agent receives feedback for every action it takes in the environment; this feedback drives the agent to learn about the environment and take much smarter steps later in its learning stages. Our method outperformed the state-of-the-art mixed integer linear programming (MILP) approach for load peak reduction. The authors have also designed an agent that learns to minimize both consumers' electricity bills and utilities' system peak load demand simultaneously. The proposed model was analyzed with loads from five different residential consumers; when time-shiftable loads are handled by the proposed method, it increases the monthly savings of each consumer by drastically reducing their electricity bill while also minimizing the peak load on the system.
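
To make the load-shifting objective in this abstract concrete, here is a small sketch of how the aggregate peak load changes when a time-shiftable load is moved. It deliberately uses a simple greedy placement rather than the DRL agent or the MILP baseline described in the abstract. The function names, the hourly-slot representation, and the example numbers are all hypothetical.

```python
def peak_load(base_profile, scheduled):
    """Peak of the aggregate profile.

    base_profile: non-shiftable demand per time slot (kW)
    scheduled:    list of (start_slot, duration_slots, power_kw) tuples
    """
    profile = list(base_profile)
    for start, duration, power in scheduled:
        for slot in range(start, start + duration):
            # Wrap around the end of the day (an assumed modelling choice).
            profile[slot % len(profile)] += power
    return max(profile)


def greedy_shift(base_profile, loads):
    """Place each (duration_slots, power_kw) load at the start slot that
    yields the lowest aggregate peak given the loads placed so far."""
    scheduled = []
    for duration, power in loads:
        best_start = min(
            range(len(base_profile)),
            key=lambda s: peak_load(base_profile, scheduled + [(s, duration, power)]),
        )
        scheduled.append((best_start, duration, power))
    return scheduled
```

For example, with a hypothetical base profile `[1, 1, 5, 5, 1, 1]` and one 2-slot, 2 kW load, running the load during the peak slots raises the peak to 7 kW, whereas the greedy placement keeps it at 5 kW. A learning agent would have to discover such placements from reward feedback instead of enumerating them.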

arXiv.org e-Print Archive

Applications of Deep Reinforcement Learning in Communications and Networking: A Survey

Authors: Gong Shimin; Hoang Dinh Thai; Kim Dong In; Liang Ying-Chang; Luong Nguyen Cong; Niyato Dusit; Wang Ping
Publication date: 17/10/2018

This paper presents a comprehensive literature review on applications of deep reinforcement learning in communications and networking. Modern networks, e.g., Internet of Things (IoT) and Unmanned Aerial Vehicle (UAV) networks, are becoming more decentralized and autonomous. In such networks, network entities need to make decisions locally to maximize network performance under uncertainty in the network environment. Reinforcement learning has been used efficiently to enable network entities to obtain the optimal policy (e.g., decisions or actions) given their states when the state and action spaces are small. However, in complex and large-scale networks, the state and action spaces are usually large, and reinforcement learning may not be able to find the optimal policy in reasonable time. Therefore, deep reinforcement learning, a combination of reinforcement learning with deep learning, has been developed to overcome these shortcomings. In this survey, we first give a tutorial on deep reinforcement learning, from fundamental concepts to advanced models. Then, we review deep reinforcement learning approaches proposed to address emerging issues in communications and networking. The issues include dynamic network access, data rate control, wireless caching, data offloading, network security, and connectivity preservation, all of which are important to next-generation networks such as 5G and beyond. Furthermore, we present applications of deep reinforcement learning for traffic routing, resource sharing, and data collection. Finally, we highlight important challenges, open issues, and future research directions in applying deep reinforcement learning.
Comment: 37 pages, 13 figures, 6 tables, 174 reference papers

arXiv.org e-Print Archive

MARL-FWC: Optimal Coordination of Freeway Traffic Control Measures

Authors: Fares Ahmed; Gomaa Walid; Khamis Mohamed A.
Publication date: 27/08/2018

The objective of this article is to optimize the overall traffic flow on freeways using multiple ramp metering controls plus their complementary Dynamic Speed Limits (DSLs). An optimal freeway operation can be reached by minimizing the difference between the freeway density and the critical ratio for maximum traffic flow. In this article, a Multi-Agent Reinforcement Learning for Freeways Control (MARL-FWC) system for ramp metering and DSLs is proposed. MARL-FWC introduces a new microscopic framework at the network level based on collaborative Markov Decision Process modeling (Markov game) and an associated cooperative Q-learning algorithm. The technique incorporates payoff propagation (Max-Plus algorithm) under the coordination graphs framework, which is particularly suited for optimal control purposes. MARL-FWC provides three control designs (fully independent, fully distributed, and centralized) suited for different network architectures. MARL-FWC was extensively tested in order to assess the proposed model of the joint payoff, as well as the global payoff. Experiments are conducted with heavy traffic flow under the renowned VISSIM traffic simulator to evaluate MARL-FWC. The experimental results show a significant decrease in the total travel time and an increase in the average speed (when compared with the base case) while maintaining an optimal traffic flow.
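
The control objective described in this abstract (keeping freeway density close to its critical value) can be sketched as a reward signal fed into a plain tabular Q-learning update. This is a single-agent simplification for illustration only: it omits the cooperative Q-learning, coordination graphs, and Max-Plus payoff propagation that MARL-FWC relies on. The function names, state and action labels, and the values of `alpha` and `gamma` are assumptions.

```python
from collections import defaultdict


def density_reward(density, critical_density):
    """Reward is highest (zero) when density equals the critical value."""
    return -abs(density - critical_density)


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One standard Q-learning step on the table `q` (updated in place).

    q maps (state, action) pairs to value estimates; alpha is the learning
    rate and gamma the discount factor (both assumed values).
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: one ramp-metering agent with two actions.
q_table = defaultdict(float)
signal_actions = ["red", "green"]
r = density_reward(density=40.0, critical_density=30.0)
q_update(q_table, "congested", "green", r, "congested", signal_actions)
```

In a multi-agent setting, each agent's update would additionally depend on the actions of its neighbors, which is the role of the coordination mechanism in the abstract.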

arXiv.org e-Print Archive

Toward Packet Routing with Fully-distributed Multi-agent Deep Reinforcement Learning

Authors: Feng Hui; Li Xuanjie; Xu Yuedong; Yan Huaicheng; You Xinyu; Zhao Jin
Publication date: 14/11/2019

Packet routing is one of the fundamental problems in computer networks, in which a router determines the next hop of each packet in its queue to get it to its destination as quickly as possible. Reinforcement learning (RL) has been introduced to design autonomous packet routing policies using local information about stochastic packet arrival and service. However, the curse of dimensionality in RL prohibits a more comprehensive representation of dynamic network states, thus limiting its potential benefit. In this paper, we propose a novel packet routing framework based on multi-agent deep reinforcement learning (DRL) in which each router possesses an independent LSTM recurrent neural network for training and decision making in a fully distributed environment. The LSTM recurrent neural network extracts routing features from rich information regarding backlogged packets and past actions, and effectively approximates the value function of Q-learning. We further allow each router to communicate periodically with its direct neighbors so that a broader view of the network state can be incorporated. Experimental results show that our multi-agent DRL policy can strike a delicate balance between congestion-aware and shortest routes, and significantly reduces the packet delivery time in general network topologies compared with its counterparts.
Comment: 12 pages, 10 figures

arXiv.org e-Print Archive

Application of Machine Learning in Wireless Networks: Key Techniques and Open Issues

Authors: Huang Yuzhe; Mao Shiwen; Peng Mugen; Sun Yaohua; Zhou Yangcheng
Publication date: 28/02/2019

As a key technique for enabling artificial intelligence, machine learning (ML) is capable of solving complex problems without explicit programming. Motivated by its successful application to many practical tasks such as image recognition, both industry and the research community have advocated the application of ML in wireless communication. This paper comprehensively surveys recent advances in the application of ML in wireless communication, which are classified as resource management in the MAC layer, networking and mobility management in the network layer, and localization in the application layer. The applications in resource management further include power control, spectrum management, backhaul management, cache management, beamformer design, and computation resource management, while ML-based networking focuses on applications in clustering, base station switching control, user association, and routing. Moreover, the literature in each aspect is organized according to the adopted ML techniques. In addition, several conditions for applying ML to wireless communication are identified to help readers decide whether to use ML and which kind of ML technique to use. Traditional approaches are also summarized, together with a comparison of their performance with ML-based approaches, which clarifies the motivations of the surveyed literature for adopting ML. Given the extensiveness of the research area, challenges and unresolved issues are presented to facilitate future studies, including ML-based network slicing, infrastructure updates to support ML-based paradigms, open data sets and platforms for researchers, and theoretical guidance for ML implementation.
Comment: 34 pages, 8 figures

arXiv.org e-Print Archive

A Review of Reinforcement Learning for Autonomous Building Energy Management

Authors: Grijalva Santiago; Mason Karl
Publication date: 15/03/2019

The area of building energy management has received a significant amount of interest in recent years. This area is concerned with combining advancements in sensor technologies, communications, and advanced control algorithms to optimize energy utilization. Reinforcement learning is one of the most prominent machine learning approaches used for control problems and has had many successful applications in the area of building energy management. This research gives a comprehensive review of the literature relating to the application of reinforcement learning to developing autonomous building energy management systems. The main directions for future research and the challenges in reinforcement learning are also outlined.
Comment: 17 pages, 3 figures

arXiv.org e-Print Archive