Coverage path planning (CPP) is a critical problem in robotics, where the
goal is to find an efficient path that covers every point in an area of
interest. This work addresses the power-constrained CPP problem with recharge
for battery-limited unmanned aerial vehicles (UAVs). In this problem, a notable
challenge emerges from integrating recharge journeys into the overall coverage
strategy, highlighting the intricate task of making strategic, long-term
decisions. We propose a novel proximal policy optimization (PPO)-based deep
reinforcement learning (DRL) approach with map-based observations, utilizing
action masking and discount factor scheduling to optimize coverage trajectories
over the entire mission horizon. We further provide the agent with a position
history to handle emergent state loops caused by the recharge capability. Our
approach outperforms a baseline heuristic and generalizes to different target
zones and maps, with limited generalization to entirely unseen maps. We offer valuable
insights into DRL algorithm design for long-horizon problems and provide a
publicly available software framework for the CPP problem.

Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
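The two training techniques named in the abstract, action masking and discount factor scheduling, can be sketched as follows. This is an illustrative sketch only, not the paper's implementation: the function names, the linear annealing schedule, and the gamma range are assumptions. Action masking zeroes the probability of invalid actions (e.g., moves that would strand the UAV without enough battery to reach a charger) before sampling; discount factor scheduling gradually extends the agent's effective planning horizon during training.

```python
import numpy as np

def masked_policy(logits, valid_mask):
    """Return action probabilities with invalid actions masked out.

    Invalid actions get a logit of -inf, so their probability is exactly 0
    and gradients (in a framework setting) would not flow through them.
    """
    masked_logits = np.where(valid_mask, logits, -np.inf)
    shifted = masked_logits - masked_logits.max()  # numerical stability
    probs = np.exp(shifted)
    return probs / probs.sum()

def gamma_schedule(step, total_steps, gamma_start=0.9, gamma_end=0.99):
    """Linearly anneal the discount factor from gamma_start to gamma_end.

    A low initial gamma eases early credit assignment; annealing toward a
    high final value lets the agent optimize over the full mission horizon.
    The linear shape and endpoint values here are assumptions.
    """
    frac = min(step / total_steps, 1.0)
    return gamma_start + frac * (gamma_end - gamma_start)
```

For example, with logits `[1.0, 2.0, 3.0]` and the middle action masked as invalid, `masked_policy` assigns it zero probability while the remaining probabilities still sum to one.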