17 research outputs found

    UAV Path Planning for Wireless Data Harvesting: A Deep Reinforcement Learning Approach

    Full text link
    Autonomous deployment of unmanned aerial vehicles (UAVs) supporting next-generation communication networks requires efficient trajectory planning methods. We propose a new end-to-end reinforcement learning (RL) approach to UAV-enabled data collection from Internet of Things (IoT) devices in an urban environment. An autonomous drone is tasked with gathering data from distributed sensor nodes subject to limited flying time and obstacle avoidance. While previous approaches, learning-based and otherwise, must perform expensive recomputations or relearn a behavior when important scenario parameters such as the number of sensors, sensor positions, or maximum flying time change, we train a double deep Q-network (DDQN) with combined experience replay to learn a UAV control policy that generalizes over changing scenario parameters. By feeding a multi-layer map of the environment through convolutional network layers to the agent, we show that our proposed network architecture enables the agent to make movement decisions for a variety of scenario parameters that balance the data collection goal with flight-time efficiency and safety constraints. Considerable advantages in learning efficiency from using a map centered on the UAV's position, rather than a non-centered map, are also illustrated.
    Comment: Code available under https://github.com/hbayerlein/uav_data_harvesting; IEEE Global Communications Conference (GLOBECOM) 2020
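    A minimal sketch of the two ingredients highlighted in the abstract: centering the multi-layer map on the UAV's position before it reaches the convolutional layers, and combined experience replay, which always injects the newest transition into each sampled minibatch. This is an illustration under assumed shapes and interfaces, not the authors' released code; see the linked repository for that.

    import random
    from collections import deque

    import numpy as np

    def center_map(global_map: np.ndarray, uav_pos: tuple) -> np.ndarray:
        """Shift an (H, W, C) multi-layer map so the UAV cell lands in the
        center of a zero-padded (2H-1, 2W-1, C) observation."""
        h, w, c = global_map.shape
        row, col = uav_pos
        centered = np.zeros((2 * h - 1, 2 * w - 1, c), dtype=global_map.dtype)
        # Placing the map at this offset puts (row, col) at index (h-1, w-1).
        centered[h - 1 - row : 2 * h - 1 - row,
                 w - 1 - col : 2 * w - 1 - col] = global_map
        return centered

    class CombinedReplayBuffer:
        """Uniform replay buffer with combined experience replay: the most
        recent transition is appended to every sampled minibatch."""

        def __init__(self, capacity: int):
            self.buffer = deque(maxlen=capacity)

        def store(self, transition) -> None:
            self.buffer.append(transition)

        def sample(self, batch_size: int) -> list:
            batch = random.sample(list(self.buffer),
                                  min(batch_size - 1, len(self.buffer)))
            batch.append(self.buffer[-1])  # guarantee the newest transition
            return batch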

    Learning to Recharge: UAV Coverage Path Planning through Deep Reinforcement Learning

    Full text link
    Coverage path planning (CPP) is a critical problem in robotics, where the goal is to find an efficient path that covers every point in an area of interest. This work addresses the power-constrained CPP problem with recharge for battery-limited unmanned aerial vehicles (UAVs). In this problem, a notable challenge emerges from integrating recharge journeys into the overall coverage strategy, highlighting the intricate task of making strategic, long-term decisions. We propose a novel proximal policy optimization (PPO)-based deep reinforcement learning (DRL) approach with map-based observations, utilizing action masking and discount factor scheduling to optimize coverage trajectories over the entire mission horizon. We further provide the agent with a position history to handle emergent state loops caused by the recharge capability. Our approach outperforms a baseline heuristic and generalizes to different target zones and maps, although generalization to unseen maps remains limited. We offer valuable insights into DRL algorithm design for long-horizon problems and provide a publicly available software framework for the CPP problem.
    Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
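    A sketch of how the two PPO modifications named above are commonly implemented: invalid-action masking via logits, and annealing of the discount factor. The schedule endpoints (0.95 to 0.999) and the linear ramp are illustrative assumptions, not values taken from the paper.

    import torch

    def masked_policy(logits: torch.Tensor,
                      action_mask: torch.Tensor) -> torch.distributions.Categorical:
        """Suppress invalid actions (e.g. moves into obstacles, or actions that
        would strand the UAV without battery) by sending their logits to -inf,
        so the policy assigns them zero probability."""
        masked_logits = logits.masked_fill(~action_mask, float("-inf"))
        return torch.distributions.Categorical(logits=masked_logits)

    def scheduled_gamma(step: int, total_steps: int,
                        gamma_start: float = 0.95,
                        gamma_end: float = 0.999) -> float:
        """Linearly anneal the discount factor so the agent first learns from
        short-horizon returns, then optimizes over the full mission horizon."""
        frac = min(step / total_steps, 1.0)
        return gamma_start + frac * (gamma_end - gamma_start)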

    Model-aided Federated Reinforcement Learning for Multi-UAV Trajectory Planning in IoT Networks

    Full text link
    Deploying teams of cooperative unmanned aerial vehicles (UAVs) to harvest data from distributed Internet of Things (IoT) devices requires efficient trajectory planning and coordination algorithms. Multi-agent reinforcement learning (MARL) has emerged as an effective solution, but often requires extensive and costly real-world training data. In this paper, we propose a novel model-aided federated MARL algorithm to coordinate multiple UAVs on a data harvesting mission with limited knowledge about the environment, significantly reducing the real-world training data demand. The proposed algorithm alternates between learning an environment model from real-world measurements and federated QMIX training in the simulated environment. Specifically, collected measurements from the real-world environment are used to learn the radio channel and estimate unknown IoT device locations to create a simulated environment. Each UAV agent trains a local QMIX model in its simulated environment and continuously consolidates it through federated learning with other agents, accelerating the learning process and further improving training sample efficiency. Simulation results demonstrate that our proposed model-aided FedQMIX algorithm substantially reduces the need for real-world training experiences while attaining data collection performance similar to that of standard MARL algorithms.
    Comment: 7 pages, 2 figures
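    A minimal sketch of the federated consolidation step described above, in the FedAvg style, assuming every agent holds a local QMIX network with an identical architecture. This is an illustration of the general technique, not the paper's implementation.

    import copy

    import torch

    def federated_average(local_models: list) -> dict:
        """Average the parameters of the agents' local QMIX networks into a
        single consolidated state dict (FedAvg over identical architectures)."""
        state_dicts = [m.state_dict() for m in local_models]
        avg = copy.deepcopy(state_dicts[0])
        for key in avg:
            stacked = torch.stack([sd[key].float() for sd in state_dicts])
            avg[key] = stacked.mean(dim=0).to(state_dicts[0][key].dtype)
        return avg

    # After each consolidation round, every agent resumes local training
    # from the shared parameters:
    #   shared = federated_average(agent_networks)
    #   for net in agent_networks:
    #       net.load_state_dict(shared)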

    Machine learning methods for the use of drones in wireless networks (Méthodes d’apprentissage automatique pour l’utilisation des drones dans les réseaux sans-fil)

    No full text
    Autonomous unmanned aerial vehicles (UAVs), spurred by rapid innovation in drone hardware and regulatory frameworks during the last decade, are envisioned for a multitude of applications in service of the society of the future. From the perspective of next-generation wireless networks, UAVs are anticipated not only in the role of passive cellular-connected users, but also as active enablers of connectivity as part of UAV-aided networks. The defining advantage of UAVs in all potential application scenarios is their mobility. To take full advantage of their capabilities, flexible and efficient path planning methods are necessary. This thesis explores machine learning (ML), specifically reinforcement learning (RL), as a promising class of solutions to UAV mobility management challenges. Deep RL is one of the few frameworks that allows us to tackle the complex task of UAV control and deployment in communication scenarios directly, given that these are generally NP-hard, non-convex optimization problems. Furthermore, deep RL can balance the multiple objectives of UAV-aided networks in a straightforward way, is flexible with respect to the availability of prior or model information, and is computationally efficient at inference time. The thesis also addresses the challenges of severely limited flying time, cooperation between multiple UAVs, and reducing the training data demand of deep RL methods, as well as the connection between drone-assisted networks and robotics, two generally disjoint research communities.

    Machine learning methods for UAV-aided wireless networks

    No full text