4 research outputs found

    Assessing Multi-Agent Reinforcement Learning Algorithms for Autonomous Sensor Resource Management

    Unmanned aerial vehicles (UAVs) have applications in search and rescue operations, and such operations could be made more efficient by using appropriate artificial intelligence (AI) to enable a UAV agent to operate autonomously. Sensor resource management (SRM), which leverages capabilities across location intelligence, facilitates the efficient and effective use of UAVs and their sensors to complete a set of tasks. Furthermore, multiple UAVs, each with different sensor configurations, must be considered when maximizing mission effects. Instantiating operational autonomy for such teams requires considerable coordination. One AI approach relevant to this task is multi-agent reinforcement learning (MARL). However, MARL has seen limited prior use in SRM. This work evaluates the trade-space of MARL algorithms with respect to performing heterogeneous SRM tasks, considers the concept of evaluating MARL in a test and evaluation framework, and compares a suite of algorithms with random and Bayesian hyperparameter optimization methods.

    UAV Maneuvering Target Tracking in Uncertain Environments based on Deep Reinforcement Learning and Meta-learning

    This paper combines Deep Reinforcement Learning (DRL) with Meta-learning and proposes a novel approach, named Meta Twin Delayed Deep Deterministic policy gradient (Meta-TD3), to control an Unmanned Aerial Vehicle (UAV), allowing it to quickly track a target in an environment where the target's motion is uncertain. This approach can be applied to a variety of scenarios, such as wildlife protection, emergency aid, and remote sensing. We use a multi-task experience replay buffer to provide data for multi-task learning of the DRL algorithm, and we combine it with Meta-learning to develop a multi-task reinforcement learning update method that ensures the generalization capability of reinforcement learning. Compared with the state-of-the-art algorithms Deep Deterministic Policy Gradient (DDPG) and Twin Delayed Deep Deterministic policy gradient (TD3), experimental results show that the Meta-TD3 algorithm achieves a great improvement in terms of both convergence value and convergence rate. In a UAV target tracking problem, Meta-TD3 requires only a few training steps to enable a UAV to adapt quickly to a new target movement mode and maintain better tracking effectiveness.

    Distributed Reinforcement Learning for Flexible and Efficient UAV Swarm Control

    Over the past few years, the use of swarms of Unmanned Aerial Vehicles (UAVs) in monitoring and remote area surveillance applications has become widespread thanks to the price reduction and the increased capabilities of drones. The drones in the swarm need to cooperatively explore an unknown area, in order to identify and monitor interesting targets, while minimizing their movements. In this work, we propose a distributed Reinforcement Learning (RL) approach that scales to larger swarms without modifications. The proposed framework relies on the possibility for the UAVs to exchange some information through a communication channel, in order to achieve context-awareness and implicitly coordinate the swarm's actions. Our experiments show that the proposed method can yield effective strategies, which are robust to communication channel impairments, and that can easily deal with non-uniform distributions of targets and obstacles. Moreover, when agents are trained in a specific scenario, they can adapt to a new one with minimal additional training. We also show that our approach achieves better performance compared to a computationally intensive look-ahead heuristic. Comment: Preprint of the paper published in IEEE Transactions on Cognitive Communications and Networking ( Early Access