1,439 research outputs found
UAV Path Planning for Wireless Data Harvesting: A Deep Reinforcement Learning Approach
Autonomous deployment of unmanned aerial vehicles (UAVs) supporting
next-generation communication networks requires efficient trajectory planning
methods. We propose a new end-to-end reinforcement learning (RL) approach to
UAV-enabled data collection from Internet of Things (IoT) devices in an urban
environment. An autonomous drone is tasked with gathering data from distributed
sensor nodes subject to limited flying time and obstacle avoidance. While
previous approaches, learning and non-learning based, must perform expensive
recomputations or relearn a behavior when important scenario parameters such as
the number of sensors, sensor positions, or maximum flying time, change, we
train a double deep Q-network (DDQN) with combined experience replay to learn a
UAV control policy that generalizes over changing scenario parameters. By
exploiting a multi-layer map of the environment fed through convolutional
network layers to the agent, we show that our proposed network architecture
enables the agent to make movement decisions for a variety of scenario
parameters that balance the data collection goal with flight time efficiency
and safety constraints. Considerable advantages in learning efficiency from
using a map centered on the UAV's position over a non-centered map are also
illustrated.
Comment: Code available at
https://github.com/hbayerlein/uav_data_harvesting, IEEE Global Communications
Conference (GLOBECOM) 202
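The map-centering idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a single map layer stored as a 2D list and pads it so that the UAV's cell always lands in the middle of the output; the function name `center_map`, the padding value, and the output size are illustrative assumptions, not taken from the paper's code.

```python
def center_map(grid, position, pad_value=0):
    """Return a (2H-1) x (2W-1) copy of `grid` with `position` at the center.

    grid: 2D list (one map layer), position: (row, col) of the UAV.
    Cells outside the original map are filled with `pad_value` (assumed).
    """
    h, w = len(grid), len(grid[0])
    row, col = position
    # Offset that moves the UAV cell to the center index (h-1, w-1).
    top, left = (h - 1) - row, (w - 1) - col
    centered = [[pad_value] * (2 * w - 1) for _ in range(2 * h - 1)]
    for r in range(h):
        for c in range(w):
            centered[top + r][left + c] = grid[r][c]
    return centered


# Usage: a 3x4 layer with one marked cell, UAV standing on that cell.
layer = [[0, 0, 0, 0],
         [0, 0, 7, 0],
         [0, 0, 0, 0]]
centered = center_map(layer, (1, 2))
```

For a multi-layer map, the same shift would be applied to every layer before the stack is passed to the convolutional layers.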
Learning to Recharge: UAV Coverage Path Planning through Deep Reinforcement Learning
Coverage path planning (CPP) is a critical problem in robotics, where the
goal is to find an efficient path that covers every point in an area of
interest. This work addresses the power-constrained CPP problem with recharge
for battery-limited unmanned aerial vehicles (UAVs). In this problem, a notable
challenge emerges from integrating recharge journeys into the overall coverage
strategy, highlighting the intricate task of making strategic, long-term
decisions. We propose a novel proximal policy optimization (PPO)-based deep
reinforcement learning (DRL) approach with map-based observations, utilizing
action masking and discount factor scheduling to optimize coverage trajectories
over the entire mission horizon. We further provide the agent with a position
history to handle emergent state loops caused by the recharge capability. Our
approach outperforms a baseline heuristic and generalizes to different target
zones and maps, although generalization to unseen maps is limited. We offer valuable
insights into DRL algorithm design for long-horizon problems and provide a
publicly available software framework for the CPP problem.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
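Action masking, as named in the abstract above, is commonly implemented by setting the logits of invalid actions to negative infinity before the softmax, so that those actions receive zero probability. The following is a minimal sketch of that general technique; the function name `masked_softmax` and the example logits are hypothetical, and the paper's own implementation may differ.

```python
import math


def masked_softmax(logits, valid):
    """Softmax over `logits` that assigns probability 0 to invalid actions.

    logits: list of floats, valid: list of bools of the same length.
    """
    masked = [x if ok else -math.inf for x, ok in zip(logits, valid)]
    peak = max(masked)
    if peak == -math.inf:
        raise ValueError("at least one action must be valid")
    # Subtract the maximum for numerical stability; exp(-inf) evaluates to 0.0.
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Usage: the second action (e.g. a move into an obstacle) is masked out.
probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
```

Sampling from `probs` can then never select a masked action, which is the property the policy relies on during both training and evaluation.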
Heterogeneous Multi-Robot Collaboration for Coverage Path Planning in Partially Known Dynamic Environments
This research presents a cooperation strategy for a heterogeneous group of robots, comprising two Unmanned Aerial Vehicles (UAVs) and one Unmanned Ground Vehicle (UGV), to perform tasks in dynamic scenarios. This paper defines specific roles for the UAVs and the UGV within the framework to address challenges such as partially known terrains and dynamic obstacles. The UAVs focus on aerial inspections and mapping, while the UGV conducts ground-level inspections. In addition, the UAVs can return to and land at the UGV base when their battery level is low, to perform hot swapping so that the inspection process is not interrupted. This research mainly emphasizes developing a robust Coverage Path Planning (CPP) algorithm that dynamically adapts paths to avoid collisions and ensure efficient coverage. The Wavefront algorithm was selected for the two-dimensional offline CPP. All robots must follow a predefined path generated by the offline CPP. The study also integrates Neural Networks (NN) and Deep Reinforcement Learning (DRL) for adaptive path planning for both robot types, to enable real-time responses to dynamic obstacles. Extensive simulations using the Robot Operating System (ROS) and Gazebo platforms were conducted to validate the approach in a specific real-world setting, an electrical substation, in order to demonstrate its functionality in addressing challenges in dynamic environments.
The authors would also like to thank their home institute, CEFET/RJ, the federal Brazilian
research agencies CAPES (code 001) and CNPq, and the Rio de Janeiro research agency, FAPERJ, for
supporting this work.
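The Wavefront algorithm named in the abstract above is, in its usual form, a breadth-first propagation of distance values from a goal cell across the free cells of a grid. The sketch below shows only that distance-propagation step on a 4-connected grid; the grid encoding (0 for free, 1 for obstacle) and the function name `wavefront` are assumptions, and how the paper derives a coverage path from these values is not described in the abstract.

```python
from collections import deque


def wavefront(grid, goal):
    """Breadth-first distance transform from `goal` over free cells.

    grid: 2D list with 0 = free and 1 = obstacle (assumed encoding).
    Returns a 2D list of step counts; obstacles and unreachable cells are None.
    """
    rows, cols = len(grid), len(grid[0])
    goal_row, goal_col = goal
    if grid[goal_row][goal_col] != 0:
        raise ValueError("goal must be a free cell")
    dist = [[None] * cols for _ in range(rows)]
    dist[goal_row][goal_col] = 0
    queue = deque([(goal_row, goal_col)])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


# Usage: a 3x3 area with one obstacle in the middle, goal in the top-left cell.
area = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
distances = wavefront(area, (0, 0))
```

A coverage planner built on this would typically start at the robot's cell and repeatedly step to an unvisited neighbour chosen by its wavefront value; that selection rule varies between implementations and is left out here.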