Learning Team-Based Navigation: A Review of Deep Reinforcement Learning Techniques for Multi-Agent Pathfinding
Multi-agent pathfinding (MAPF) is a critical field in many large-scale
robotic applications, often being the fundamental step in multi-agent systems.
The growing complexity of MAPF in crowded environments, however,
critically diminishes the effectiveness of existing solutions. In contrast to
other studies that have either presented a general overview of the recent
advancements in MAPF or extensively reviewed Deep Reinforcement Learning (DRL)
within multi-agent system settings independently, this review focuses on the
integration of DRL-based approaches into
MAPF. Moreover, we aim to bridge the current gap in evaluating MAPF solutions
by addressing the lack of unified evaluation metrics and providing
comprehensive clarification on these metrics. Finally, our paper discusses the
potential of model-based DRL as a promising future direction and provides its
required foundational understanding to address current challenges in MAPF. Our
objective is to assist readers in gaining insight into the current research
direction, providing unified metrics for comparing different MAPF algorithms
and expanding their knowledge of model-based DRL to address the existing
challenges in MAPF.
Comment: 36 pages, 10 figures, published in Artif Intell Rev 57, 41 (2024).
Policy optimization for industrial benchmark using deep reinforcement learning
2020 Summer. Includes bibliographical references. Significant advancements have been made in the field of Reinforcement Learning (RL) in recent decades, and numerous novel RL environments and algorithms have been studied, evaluated, and published. The most popular RL benchmark environments, produced by OpenAI Gym and DeepMind Lab, are modeled after single- or multi-player board and video games or single-purpose robots, and the RL algorithms that learn optimal policies for playing those games have outperformed humans in almost all of them. However, real-world applications of RL remain very limited, as the academic community has little access to real industrial data and applications. The Industrial Benchmark (IB) is a novel RL benchmark motivated by industrial control problems, with properties such as continuous state and action spaces, high dimensionality, a partially observable state space, and delayed effects combined with complex heteroscedastic stochastic behavior. We have used Deep Reinforcement Learning (DRL) algorithms such as Deep Q-Networks (DQN) and Double DQN (DDQN) to study and model optimal policies on the IB. Our empirical results show various DRL models outperforming previously published models on the same benchmark.
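The DQN-versus-DDQN distinction this abstract leans on can be sketched concisely. The snippet below is a generic illustration of the two bootstrap target rules, not code from the thesis; the toy Q-tables and the discount factor are invented for the example.

```python
# Sketch of the DQN vs Double-DQN bootstrap targets (illustrative only).
# q_online and q_target map a state to a list of per-action values.

GAMMA = 0.99  # discount factor (assumed for the example)

def dqn_target(reward, next_state, q_target):
    # Standard DQN: the target network both selects and evaluates the
    # greedy action, which tends to overestimate action values.
    return reward + GAMMA * max(q_target[next_state])

def ddqn_target(reward, next_state, q_online, q_target):
    # Double-DQN: the online network selects the greedy action and the
    # target network evaluates it, reducing the overestimation bias.
    values = q_online[next_state]
    best_action = max(range(len(values)), key=lambda a: values[a])
    return reward + GAMMA * q_target[next_state][best_action]

# Toy example: the online and target networks disagree about state "s1".
q_online = {"s1": [1.0, 2.0]}
q_target = {"s1": [3.0, 0.5]}
print(dqn_target(0.0, "s1", q_target))             # 0.99 * 3.0
print(ddqn_target(0.0, "s1", q_online, q_target))  # 0.99 * 0.5
```

When the two networks disagree, the DDQN target (0.495) is far below the DQN target (2.97); correcting that overestimation is the motivation for the DDQN variant the thesis evaluates.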
Secure Energy Aware Optimal Routing using Reinforcement Learning-based Decision-Making with a Hybrid Optimization Algorithm in MANET
Mobile ad hoc networks (MANETs) are wireless networks that are perfect for applications such as special outdoor events, communications in areas without wireless infrastructure, crises and natural disasters, and military activities, because they do not require any preexisting network infrastructure and can be deployed quickly. Mobile ad hoc networks can be made to last longer through the use of clustering, which is one of the most effective uses of energy. Security is a key issue in the development of ad hoc networks, and many studies have been conducted on how to reduce the energy expenditure of the nodes in such networks; the majority of these approaches can conserve energy and extend the life of the nodes. The major goal of this research is to develop an energy-aware, secure mechanism for MANETs. Secure Energy Aware Reinforcement Learning based Decision Making with Hybrid Optimization Algorithm (RL-DMHOA) is proposed for detecting malicious nodes in the network. With the assistance of the optimization algorithm, data can be transferred more efficiently by choosing aggregation points that allow individual nodes to conserve power. The optimum path is chosen by combining Particle Swarm Optimization (PSO) and the Bat Algorithm (BA) to create a fitness function that accounts for across-cluster distance, delay, and node energy. Three state-of-the-art methods are compared to the suggested method on a variety of metrics. A throughput of 94.8 percent, an average latency of 28.1 percent, a malicious detection rate of 91.4 percent, a packet delivery ratio of 92.4 percent, and a network lifetime of 85.2 percent are all attained with the suggested RL-DMHOA approach.
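As a rough sketch of the kind of composite fitness function the abstract describes, the snippet below combines across-cluster distance, delay, and residual node energy into a single route score. The weights, the sign convention (delay penalised), and the function name are assumptions for illustration, not values taken from the paper.

```python
# Hypothetical routing fitness in the spirit of RL-DMHOA's PSO/BA objective.
# The weights and the penalty on delay are assumptions, not the paper's values.

def route_fitness(cluster_distance, delay, node_energy,
                  w_dist=0.4, w_delay=0.3, w_energy=0.3):
    # Reward well-separated clusters and energy-rich nodes; penalise delay
    # so that lower-latency routes score higher.
    return w_dist * cluster_distance + w_energy * node_energy - w_delay * delay

# Pick the better of two candidate routes (normalised toy metrics:
# (cluster_distance, delay, node_energy)).
candidates = {"route_A": (0.8, 0.2, 0.9), "route_B": (0.5, 0.6, 0.7)}
best = max(candidates, key=lambda r: route_fitness(*candidates[r]))
print(best)  # route_A
```

In an actual PSO/BA hybrid, each particle or bat would encode a candidate path and this score would drive the position updates; the sketch only shows the scoring step.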
Deep Reinforcement Learning for Smart Energy Networks
To reduce global greenhouse gas emissions, the world must find intelligent solutions to maximise the utilisation of carbon-free renewable energy sources (RES). Energy storage systems (ESS) can be used to store energy when RES generation exceeds demand, to be discharged later at peak times to maximise utilisation of the RES, as well as to profit from dynamic energy prices by performing energy arbitrage. Both RES and ESSs are difficult to implement at large scales but can be applied in localised microgrids that trade with the main utility grid. However, these microgrids require an intelligent energy management system able to account for the intermittent RES, fluctuating demand, and volatile dynamic energy prices. For this, the use of reinforcement learning (RL), in which a control agent learns to interact with its environment to maximise a reward, is investigated. RL agents can learn to control ESSs with incomplete information of the environment, which is ideal for energy networks with complex and potentially unknown dynamics that are difficult to model and solve with heuristic optimisation methods. Although the use of RL for ESS control in smart energy networks has increased over the past decade, many of the state-of-the-art algorithms in RL have yet to be applied to smart energy network applications, meaning researchers may be missing considerable performance benefits. In this thesis, a microgrid environment is designed for RL agent training using demand and weather data collected from Keele University, as well as dynamic energy prices from real wholesale markets, to train agents for both RES integration and energy arbitrage. Variants of this environment are used to evaluate different RL algorithms for RES integration and energy arbitrage, where sample efficiency is key due to the limited amount of data available to train from. The findings showed that RL is able to learn effective policies for ESS control.
In particular, the off-policy methods Deep Q-Networks (DQN) and Deep Deterministic Policy Gradients (DDPG) were able to achieve good performance, as using an experience replay buffer to reuse transitions provided much better sample efficiency. By investigating different types of action space, it was found that using functional actions which vary depending on RES output allowed the discrete control of DQN to match and surpass the performance of the continuous-control DDPG. The Rainbow algorithm, an advancement over DQN, was applied to an energy arbitrage problem. The method is notable for its good sample efficiency, which is important for this work in which agents only have a limited amount of data to learn from. The use of a distributional value function estimate was novel in the field of smart energy applications, where only scalar estimates had been used in the literature. The results found that Rainbow and its component C51 performed the best due to this distributional value function, which allows the agent to capture the stochasticity of the environment. Finally, multi-agent RL is used to cooperatively control different types of electrical ESS in a hybrid ESS (HESS), as well as to trade with self-interested external microgrids looking to reduce their own energy bills. Different single-agent and multi-agent approaches were tested using variants of DDPG and Multi-Agent DDPG (MADDPG) to assess whether the energy network should be managed by a single centralised controller or by multiple distributed agents. The results found that the multi-agent approaches performed the best because each component agent is given its own reward function based on marginal contribution, allowing each to assess its own individual performance within the wider system.
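The experience replay buffer that gives DQN and DDPG their sample efficiency is simple to sketch. The generic version below (not the thesis code) shows the two operations that matter: storing transitions and sampling decorrelated minibatches for reuse.

```python
import random
from collections import deque

# Minimal experience replay buffer of the kind DQN/DDPG rely on
# (a generic sketch, not code from the thesis).

class ReplayBuffer:
    def __init__(self, capacity):
        # Bounded deque: the oldest transitions are evicted automatically.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation between
        # consecutive transitions, improving stability and letting each
        # transition be reused across many updates (sample efficiency).
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

# Demo: store 100 toy transitions in a buffer of capacity 50.
buf = ReplayBuffer(capacity=50)
for t in range(100):
    buf.push(t, 0, 0.0, t + 1, False)
print(len(buf), len(buf.sample(32)))  # 50 32
```

Reusing each stored transition in many gradient updates is precisely why the off-policy methods in this thesis outperformed on-policy alternatives when training data was scarce.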
A survey on autonomous environmental monitoring approaches: towards unifying active sensing and reinforcement learning
The environmental pollution caused by various sources has escalated the climate crisis, making the need to establish reliable, intelligent, and persistent environmental monitoring solutions more crucial than ever. Mobile sensing systems are a popular platform due to their cost-effectiveness and adaptability. However, in practice, operation environments demand highly intelligent and robust systems that can cope with an environment’s changing dynamics. To achieve this, reinforcement learning has become a popular tool, as it facilitates the training of intelligent and robust sensing agents that can handle unknown and extreme conditions. In this paper, a framework that formulates active sensing as a reinforcement learning problem is proposed. This framework allows unification with multiple essential environmental monitoring tasks and algorithms, such as coverage, patrolling, source seeking, exploration, and search and rescue. The unified framework represents a step towards bridging the divide between theoretical advancements in reinforcement learning and real-world applications in environmental monitoring. A critical review of the literature in this field is carried out, and it is found that, despite the potential of reinforcement learning for environmental active sensing applications, there is still a lack of practical implementation and most work remains in the simulation phase. It is also noted that, despite the consensus that multi-agent systems are crucial to fully realize the potential of active sensing, there is a lack of research in this area.
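A minimal sketch of what "active sensing as a reinforcement learning problem" can look like, in the spirit of the framework this survey proposes. The state, action, and coverage-style reward below are illustrative assumptions, not the paper's formulation.

```python
# Toy active-sensing environment: an agent sweeps a 1-D field and is
# rewarded for measuring cells it has not visited (a coverage objective).
# All modelling choices here are assumptions made for illustration.

class ActiveSensingEnv:
    def __init__(self, n_cells=10):
        self.n_cells = n_cells
        self.reset()

    def reset(self):
        self.position = 0
        self.visited = {0}
        return self.position

    def step(self, action):
        # action: -1 (move left) or +1 (move right); moves are clamped.
        self.position = min(max(self.position + action, 0), self.n_cells - 1)
        reward = 0.0 if self.position in self.visited else 1.0  # coverage reward
        self.visited.add(self.position)
        done = len(self.visited) == self.n_cells  # episode ends at full coverage
        return self.position, reward, done

# A trivial sweep-right policy achieves full coverage of a 5-cell field.
env = ActiveSensingEnv(n_cells=5)
total, done = 0.0, False
while not done:
    _, reward, done = env.step(+1)
    total += reward
print(total)  # 4.0: every cell except the start yields a coverage reward
```

Patrolling, source seeking, or search and rescue would reuse the same interface with a different reward definition, which is the unification the survey argues for.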
A Comprehensive Overview on 5G-and-Beyond Networks with UAVs: From Communications to Sensing and Intelligence
Due to the advancements in cellular technologies and the dense deployment of
cellular infrastructure, integrating unmanned aerial vehicles (UAVs) into the
fifth-generation (5G) and beyond cellular networks is a promising solution to
achieve safe UAV operation as well as enabling diversified applications with
mission-specific payload data delivery. In particular, 5G networks need to
support three typical usage scenarios, namely, enhanced mobile broadband
(eMBB), ultra-reliable low-latency communications (URLLC), and massive
machine-type communications (mMTC). On the one hand, UAVs can be leveraged as
cost-effective aerial platforms to provide ground users with enhanced
communication services by exploiting their high cruising altitude and
controllable maneuverability in three-dimensional (3D) space. On the other
hand, providing such communication services simultaneously for both UAV and
ground users poses new challenges due to the need for ubiquitous 3D signal
coverage as well as the strong air-ground network interference. Besides the
requirement of high-performance wireless communications, the ability to support
effective and efficient sensing as well as network intelligence is also
essential for 5G-and-beyond 3D heterogeneous wireless networks with coexisting
aerial and ground users. In this paper, we provide a comprehensive overview of
the latest research efforts on integrating UAVs into cellular networks, with an
emphasis on how to exploit advanced techniques (e.g., intelligent reflecting
surface, short packet transmission, energy harvesting, joint communication and
radar sensing, and edge intelligence) to meet the diversified service
requirements of next-generation wireless systems. Moreover, we highlight
important directions for further investigation in future work.
Comment: Accepted by IEEE JSA
AI meets CRNs : a prospective review on the application of deep architectures in spectrum management
The conundrum of low spectrum utilization and high demand has created a bottleneck towards
fulfilling the requirements of next-generation networks. The cognitive radio (CR) technology was advocated
as a de facto technology to alleviate the scarcity and under-utilization of spectrum resources by exploiting
temporarily vacant spectrum holes of the licensed spectrum bands. As a result, the CR technology became
the first step towards the intelligentization of mobile and wireless networks, and in order to strengthen
its intelligent operation, the cognitive engine needs to be enhanced through the exploitation of artificial
intelligence (AI) strategies. Since comprehensive literature reviews covering the integration and application
of deep architectures in cognitive radio networks (CRNs) are still lacking, this article aims at filling the
gap by presenting a detailed review that addresses the integration of deep architectures into the intricacies
of spectrum management. This is a prospective review whose primary objective is to provide an in-depth
exploration of the recent trends in AI strategies employed in mobile and wireless communication networks.
The existing reviews in this area have not considered the relevance of incorporating the mathematical
fundamentals of each AI strategy and how to tailor them to specific mobile and wireless networking
problems. Therefore, this review addresses that problem by detailing how deep architectures can be integrated
into spectrum management problems. Beyond reviewing different ways in which deep architectures can be
integrated into spectrum management, model selection strategies and how different deep architectures can
be tailored into the CR space to achieve better performance in complex environments are then reported in
the context of future research directions.