21 research outputs found

    Learning Team-Based Navigation: A Review of Deep Reinforcement Learning Techniques for Multi-Agent Pathfinding

    Full text link
    Multi-agent pathfinding (MAPF) is a critical field in many large-scale robotic applications, often being the fundamental step in multi-agent systems. However, the increasing complexity of MAPF in complex and crowded environments critically diminishes the effectiveness of existing solutions. In contrast to other studies that have either presented a general overview of recent advancements in MAPF or extensively reviewed Deep Reinforcement Learning (DRL) within multi-agent system settings independently, this review focuses on the integration of DRL-based approaches into MAPF. Moreover, we aim to bridge the current gap in evaluating MAPF solutions by addressing the lack of unified evaluation metrics and providing comprehensive clarification of these metrics. Finally, we discuss the potential of model-based DRL as a promising future direction and provide the foundational understanding required to address current challenges in MAPF. Our objective is to help readers gain insight into the current research direction, to provide unified metrics for comparing different MAPF algorithms, and to expand their knowledge of model-based DRL towards addressing the existing challenges in MAPF. Comment: 36 pages, 10 figures, published in Artif Intell Rev 57, 41 (2024).
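
    To make the unified-metrics discussion concrete, the sketch below computes the evaluation metrics most commonly reported for MAPF solvers (success rate, makespan, and sum-of-costs). The function names and the path representation are our own illustrative assumptions, not code from the review.

```python
# Hypothetical sketch: common MAPF evaluation metrics computed from agent
# paths, each path a list of (x, y) cells. Not the review's code.

def makespan(paths):
    """Longest individual path length (steps until the last agent arrives)."""
    return max(len(p) - 1 for p in paths)

def sum_of_costs(paths):
    """Total steps summed over all agents (also called flowtime)."""
    return sum(len(p) - 1 for p in paths)

def success_rate(results):
    """Fraction of instances in which all agents reached their goals."""
    return sum(results) / len(results)

if __name__ == "__main__":
    paths = [[(0, 0), (0, 1), (0, 2)],           # agent 0: 2 steps
             [(1, 0), (1, 1), (1, 2), (1, 3)]]   # agent 1: 3 steps
    print(makespan(paths), sum_of_costs(paths))  # 3 5
    print(success_rate([True, True, False]))     # 0.666...
```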

    Policy optimization for industrial benchmark using deep reinforcement learning

    Get PDF
    Significant advancements have been made in the field of Reinforcement Learning (RL) in recent decades, and numerous novel RL environments and algorithms have been studied, evaluated, and published. The most popular RL benchmark environments, produced by OpenAI Gym and DeepMind Lab, are modeled after single- and multi-player board games, video games, or single-purpose robots, and the RL algorithms that learn optimal policies for playing those games have outperformed humans in almost all of them. However, real-world applications of RL remain very limited, as the academic community has little access to real industrial data and applications. Industrial Benchmark (IB) is a novel RL benchmark motivated by industrial control problems, with properties such as continuous state and action spaces, high dimensionality, a partially observable state space, and delayed effects combined with complex heteroscedastic stochastic behavior. We have used Deep Reinforcement Learning (DRL) algorithms such as Deep Q-Networks (DQN) and Double DQN (DDQN) to study and model optimal policies on IB. Our empirical results show various DRL models outperforming previously published models on the same IB.
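
    The difference between the two algorithms comes down to how the bootstrap target is formed. Below is a minimal sketch of the DQN versus Double-DQN target computation; the array-based interface is our own illustrative assumption, not the thesis code.

```python
import numpy as np

# Minimal sketch of the DQN vs. Double-DQN targets, assuming q_online_next
# and q_target_next are the online/target networks' Q-value arrays for the
# next state. Illustrative only, not the thesis implementation.

def dqn_target(reward, q_target_next, gamma=0.99, done=False):
    # Vanilla DQN: max over the *target* network's own estimates, which is
    # prone to overestimation under IB's heteroscedastic noise.
    return reward + (0.0 if done else gamma * np.max(q_target_next))

def ddqn_target(reward, q_online_next, q_target_next, gamma=0.99, done=False):
    # Double-DQN: the online network selects the action, the target network
    # evaluates it, decoupling action selection from value estimation.
    a = int(np.argmax(q_online_next))
    return reward + (0.0 if done else gamma * q_target_next[a])
```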

    Secure Energy Aware Optimal Routing using Reinforcement Learning-based Decision-Making with a Hybrid Optimization Algorithm in MANET

    Get PDF
    Mobile ad hoc networks (MANETs) are wireless networks that are ideal for applications such as special outdoor events, communications in areas without wireless infrastructure, crises and natural disasters, and military activities, because they require no preexisting network infrastructure and can be deployed quickly. MANETs can be made to last longer through clustering, one of the most effective energy-saving techniques. Security is a key issue in the development of ad hoc networks, and many studies have examined how to reduce the energy expenditure of the nodes in such networks; the majority of these approaches can conserve energy and extend the life of the nodes. The main goal of this research is to develop an energy-aware, secure mechanism for MANETs. Secure Energy Aware Reinforcement Learning based Decision Making with a Hybrid Optimization Algorithm (RL-DMHOA) is proposed for detecting malicious nodes in the network. With the assistance of the optimization algorithm, data can be transferred more efficiently by choosing aggregation points that allow individual nodes to conserve power. The optimal path is chosen by combining Particle Swarm Optimization (PSO) and the Bat Algorithm (BA) into a fitness function that balances across-cluster distance, delay, and node energy. The suggested method is compared to three state-of-the-art methods on a variety of metrics: the proposed RL-DMHOA approach attains a throughput of 94.8 percent, an average latency of 28.1 percent, a malicious detection rate of 91.4 percent, a packet delivery ratio of 92.4 percent, and a network lifetime of 85.2 percent.
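
    A hedged sketch of the kind of weighted fitness function a PSO/BA hybrid would optimize over candidate routes is shown below; the weights, normalisation, and sign conventions are our own assumptions, not the paper's formulation.

```python
# Illustrative route-scoring function of the kind PSO and BA candidates
# could be evaluated against. Weights and [0, 1] normalisation are assumed.

def route_fitness(distance, delay, residual_energy,
                  w_dist=0.4, w_delay=0.3, w_energy=0.3):
    """Score a candidate route: lower distance and delay are better,
    higher residual node energy is better (all inputs in [0, 1])."""
    return w_energy * residual_energy - w_dist * distance - w_delay * delay

# A candidate with shorter, faster, better-powered hops scores higher:
print(route_fitness(0.2, 0.1, 0.9))  # 0.16
print(route_fitness(0.8, 0.7, 0.3))  # -0.44
```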

    Deep Reinforcement Learning for Smart Energy Networks

    Get PDF
    To reduce global greenhouse gas emissions, the world must find intelligent solutions to maximise the utilisation of carbon-free renewable energy sources (RES). Energy storage systems (ESS) can store energy when RES generation exceeds demand and discharge it later at peak times, both maximising utilisation of the RES and profiting from dynamic energy prices through energy arbitrage. Both RES and ESSs are difficult to implement at large scales but can be applied in localised microgrids that trade with the main utility grid. However, these microgrids require an intelligent energy management system able to account for intermittent RES, fluctuating demand, and volatile dynamic energy prices. For this, the use of reinforcement learning (RL), in which a control agent learns to interact with its environment to maximise a reward, is investigated. RL agents can learn to control ESSs with incomplete information about the environment, which is ideal for energy networks with complex and potentially unknown dynamics that are difficult to model and solve with heuristic optimisation methods. Although the use of RL for ESS control in smart energy networks has increased over the past decade, many state-of-the-art RL algorithms have yet to be applied to smart energy network applications, meaning researchers may be missing considerable performance benefits.

    In this thesis, a microgrid environment is designed for RL agent training, using demand and weather data collected from Keele University as well as dynamic energy prices from real wholesale markets, to train agents for both RES integration and energy arbitrage. Variants of this environment are used to evaluate different RL algorithms for RES integration and energy arbitrage, where sample efficiency is key due to the limited amount of training data available. The findings showed that RL is able to learn effective control policies for ESS control. In particular, the off-policy methods Deep Q-Networks (DQN) and Deep Deterministic Policy Gradient (DDPG) achieved good performance, as using an experience replay buffer to reuse transitions provided much better sample efficiency. By investigating different types of action space, it was found that using functional actions, which vary depending on RES output, allowed the discrete control of DQN to match and surpass the performance of the continuous-control DDPG. The Rainbow algorithm, an advancement over DQN, was applied to an energy arbitrage problem; the method is notable for its good sample efficiency, which is important for this work in which agents have only a limited amount of data to learn from. The use of a distributional value function estimate was novel in the field of smart energy applications, with only scalar estimates used in the literature. The results found that Rainbow and its component C51 performed the best due to this distributional value function, which allows the agent to capture the stochasticity of the environment.

    Finally, multi-agent RL is used to cooperatively control different types of electrical ESS in a hybrid ESS (HESS), as well as to trade with self-interested external microgrids looking to reduce their own energy bills. Different single-agent and multi-agent approaches were tested using variants of DDPG and Multi-Agent DDPG (MADDPG) to assess whether the energy network should be managed by a single centralised controller or by multiple distributed agents. The results found that the multi-agent approaches performed the best because each component agent received its own reward function based on marginal contribution, allowing it to assess its own individual performance within the wider system.
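
    The functional-action idea lends itself to a short illustration: a minimal sketch, assuming a discrete action indexes a rule defined relative to the current RES surplus rather than a fixed power setpoint. The rule set, names, and units are our own assumptions, not the thesis implementation.

```python
# Hedged sketch of "functional actions": a discrete DQN action selects a
# rule whose actual power setpoint depends on the current RES surplus.
# The specific rules below are illustrative assumptions.

def functional_action(action_id, res_generation, demand, max_power):
    """Map a discrete action to an ESS power setpoint (+ charge, - discharge)."""
    surplus = res_generation - demand
    rules = {
        0: 0.0,                                  # idle
        1: min(max(surplus, 0.0), max_power),    # charge with any RES surplus
        2: -min(max(-surplus, 0.0), max_power),  # discharge to cover a deficit
        3: max_power,                            # full-rate charge (arbitrage)
        4: -max_power,                           # full-rate discharge
    }
    return rules[action_id]

# With 120 kW of RES against 80 kW of demand, action 1 charges at 40 kW:
print(functional_action(1, res_generation=120.0, demand=80.0, max_power=50.0))
```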

    A survey on autonomous environmental monitoring approaches: towards unifying active sensing and reinforcement learning

    Get PDF
    The environmental pollution caused by various sources has escalated the climate crisis, making the need to establish reliable, intelligent, and persistent environmental monitoring solutions more crucial than ever. Mobile sensing systems are a popular platform due to their cost-effectiveness and adaptability. In practice, however, operating environments demand highly intelligent and robust systems that can cope with an environment's changing dynamics. To achieve this, reinforcement learning has become a popular tool, as it facilitates the training of intelligent and robust sensing agents that can handle unknown and extreme conditions. In this paper, a framework that formulates active sensing as a reinforcement learning problem is proposed. This framework allows unification with multiple essential environmental monitoring tasks and algorithms, such as coverage, patrolling, source seeking, exploration, and search and rescue. The unified framework represents a step towards bridging the divide between theoretical advancements in reinforcement learning and real-world applications in environmental monitoring. A critical review of the literature in this field is carried out, and it is found that, despite the potential of reinforcement learning for environmental active sensing applications, there is still a lack of practical implementation, with most work remaining in the simulation phase. It is also noted that, despite the consensus that multi-agent systems are crucial to fully realize the potential of active sensing, there is a lack of research in this area.
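
    To illustrate what such a formulation looks like, here is a minimal sketch of an active-sensing task cast as a Markov decision process, using grid coverage as the example; the environment class, reward, and observation encoding are our own assumptions, not the paper's framework.

```python
import numpy as np

# Minimal sketch of active sensing as an MDP: an agent moves on a grid and
# is rewarded for newly covered cells (a coverage task). Illustrative only.

class CoverageEnv:
    ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # four axis-aligned moves

    def __init__(self, size=8):
        self.size = size
        self.reset()

    def reset(self):
        self.pos = np.array([0, 0])
        self.visited = np.zeros((self.size, self.size), dtype=bool)
        self.visited[0, 0] = True
        return self._obs()

    def _obs(self):
        # State: normalised position plus the coverage map, flattened.
        return np.concatenate([self.pos / self.size, self.visited.ravel()])

    def step(self, action):
        self.pos = np.clip(self.pos + self.ACTIONS[action], 0, self.size - 1)
        reward = 0.0 if self.visited[tuple(self.pos)] else 1.0
        self.visited[tuple(self.pos)] = True
        done = self.visited.all()  # episode ends when everything is covered
        return self._obs(), reward, done
```

    Patrolling, source seeking, or search and rescue would keep the same interface and swap only the reward and termination logic, which is the sense in which one framework can unify these tasks.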

    A Comprehensive Overview on 5G-and-Beyond Networks with UAVs: From Communications to Sensing and Intelligence

    Full text link
    Due to the advancements in cellular technologies and the dense deployment of cellular infrastructure, integrating unmanned aerial vehicles (UAVs) into the fifth-generation (5G) and beyond cellular networks is a promising solution to achieve safe UAV operation as well as enabling diversified applications with mission-specific payload data delivery. In particular, 5G networks need to support three typical usage scenarios, namely, enhanced mobile broadband (eMBB), ultra-reliable low-latency communications (URLLC), and massive machine-type communications (mMTC). On the one hand, UAVs can be leveraged as cost-effective aerial platforms to provide ground users with enhanced communication services by exploiting their high cruising altitude and controllable maneuverability in three-dimensional (3D) space. On the other hand, providing such communication services simultaneously for both UAV and ground users poses new challenges due to the need for ubiquitous 3D signal coverage as well as the strong air-ground network interference. Besides the requirement of high-performance wireless communications, the ability to support effective and efficient sensing as well as network intelligence is also essential for 5G-and-beyond 3D heterogeneous wireless networks with coexisting aerial and ground users. In this paper, we provide a comprehensive overview of the latest research efforts on integrating UAVs into cellular networks, with an emphasis on how to exploit advanced techniques (e.g., intelligent reflecting surface, short packet transmission, energy harvesting, joint communication and radar sensing, and edge intelligence) to meet the diversified service requirements of next-generation wireless systems. Moreover, we highlight important directions for further investigation in future work. Comment: Accepted by IEEE JSAC.
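
    As a worked illustration of the 3D-coverage considerations above, the sketch below evaluates a widely used probabilistic line-of-sight air-to-ground path-loss model (in the style of Al-Hourani et al.); the environment constants and carrier frequency are assumptions, and the model is not taken from this survey.

```python
import math

# Hedged sketch of a common air-to-ground channel model: LoS probability
# grows with the UAV's elevation angle, and the mean path loss blends LoS
# and NLoS excess losses. Urban-like constants are assumed, not surveyed.

def a2g_path_loss_db(horizontal_m, height_m, freq_hz=2e9,
                     a=9.61, b=0.16, eta_los=1.0, eta_nlos=20.0):
    d = math.hypot(horizontal_m, height_m)                    # 3D distance
    theta = math.degrees(math.atan2(height_m, horizontal_m))  # elevation angle
    p_los = 1.0 / (1.0 + a * math.exp(-b * (theta - a)))      # LoS probability
    fspl = 20 * math.log10(4 * math.pi * freq_hz * d / 3e8)   # free-space loss
    return fspl + p_los * eta_los + (1 - p_los) * eta_nlos

# Raising the UAV increases the elevation angle, trading extra distance
# for a higher LoS probability and often a lower overall loss:
print(round(a2g_path_loss_db(500.0, 100.0), 1))
print(round(a2g_path_loss_db(500.0, 300.0), 1))
```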

    AI meets CRNs : a prospective review on the application of deep architectures in spectrum management

    Get PDF
    The conundrum of low spectrum utilization amid high demand created a bottleneck towards fulfilling the requirements of next-generation networks. Cognitive radio (CR) technology was advocated as a de facto technology to alleviate the scarcity and under-utilization of spectrum resources by exploiting temporarily vacant spectrum holes in the licensed spectrum bands. As a result, CR technology became the first step towards the intelligentization of mobile and wireless networks, and in order to strengthen its intelligent operation, the cognitive engine needs to be enhanced through the exploitation of artificial intelligence (AI) strategies. Since comprehensive literature reviews covering the integration and application of deep architectures in cognitive radio networks (CRNs) are still lacking, this article aims to fill the gap by presenting a detailed review that addresses the integration of deep architectures into the intricacies of spectrum management. This is a prospective review whose primary objective is to provide an in-depth exploration of recent trends in the AI strategies employed in mobile and wireless communication networks. Existing reviews in this area have not considered the relevance of incorporating the mathematical fundamentals of each AI strategy, or how to tailor them to specific mobile and wireless networking problems. This review addresses that problem by detailing how deep architectures can be integrated into spectrum management problems. Beyond reviewing the different ways in which deep architectures can be integrated into spectrum management, model selection strategies and ways of tailoring different deep architectures to the CR space to achieve better performance in complex environments are reported in the context of future research directions.
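
    As one concrete example of tailoring a deep architecture to a spectrum management task, below is a hedged sketch of a small feed-forward detector that classifies a window of received-signal energy samples as occupied or vacant; the architecture, layer sizes, and input representation are our own assumptions, not drawn from the review.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: a small MLP spectrum-sensing detector that maps
# a window of received-signal energy samples to occupancy logits. Sizes and
# the input representation are assumptions.

class SpectrumSenseNet(nn.Module):
    def __init__(self, n_samples=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_samples, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 2),            # logits: [vacant, occupied]
        )

    def forward(self, x):
        return self.net(x)

model = SpectrumSenseNet()
window = torch.randn(1, 128)             # stand-in for |r(t)|^2 energy samples
print(model(window).softmax(dim=-1))     # P(vacant), P(occupied)
```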