4 research outputs found
Assessing Multi-Agent Reinforcement Learning Algorithms for Autonomous Sensor Resource Management
Unmanned aerial vehicles (UAVs) have applications in search and rescue operations, and such operations could be made more efficient by using appropriate artificial intelligence (AI) to enable a UAV agent to operate autonomously. Sensor resource management (SRM), which leverages capabilities across location intelligence, facilitates the efficient and effective use of UAVs and their sensors to complete a set of tasks. Furthermore, multiple UAVs, each with different sensor configurations, must be considered when maximizing mission effects. Instantiating operational autonomy for such teams requires considerable coordination. One AI approach relevant to this task is multi-agent reinforcement learning (MARL). However, MARL has seen limited prior use in SRM. This work evaluates the trade-space of MARL algorithms for heterogeneous SRM tasks, considers the concept of evaluating MARL in a test and evaluation framework, and compares a suite of algorithms with random and Bayesian hyperparameter optimization methods.
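The abstract above mentions random hyperparameter optimization without giving details. As an illustration only, a minimal random-search sketch might look like the following. The names `random_search`, `toy_objective`, and the search ranges are hypothetical and do not come from the paper; the toy objective stands in for "train a MARL algorithm and return its mean episode reward".

```python
import random


def random_search(objective, space, n_trials=20, seed=0):
    """Sample each hyperparameter uniformly from its (low, high) range
    and return the best-scoring configuration (higher is better)."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high)
                  for name, (low, high) in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    # Hypothetical stand-in for a full training run; peaks at
    # learning_rate = 0.01 and discount = 0.95.
    return (-(params["learning_rate"] - 0.01) ** 2
            - (params["discount"] - 0.95) ** 2)


# Assumed search ranges, chosen only for illustration.
space = {"learning_rate": (1e-4, 1e-1), "discount": (0.9, 0.999)}
best_params, best_score = random_search(toy_objective, space, n_trials=50)
print(best_params, best_score)
```

A Bayesian method would replace the uniform sampling step with a surrogate model that proposes the next configuration from past scores; that part is not sketched here.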
UAV Maneuvering Target Tracking in Uncertain Environments based on Deep Reinforcement Learning and Meta-learning
This paper combines Deep Reinforcement Learning (DRL) with Meta-learning and proposes a novel approach, named Meta Twin Delayed Deep Deterministic policy gradient (Meta-TD3), to control an Unmanned Aerial Vehicle (UAV), allowing the UAV to quickly track a target in an environment where the motion of the target is uncertain. This approach can be applied to a variety of scenarios, such as wildlife protection, emergency aid, and remote sensing. We use a multi-task experience replay buffer to provide data for multi-task learning of the DRL algorithm, and we combine it with Meta-learning to develop a multi-task reinforcement learning update method that ensures the generalization capability of reinforcement learning. Compared with the state-of-the-art algorithms Deep Deterministic Policy Gradient (DDPG) and Twin Delayed Deep Deterministic policy gradient (TD3), experimental results show that the Meta-TD3 algorithm achieves a great improvement in both convergence value and convergence rate. In a UAV target tracking problem, Meta-TD3 requires only a few training steps to enable a UAV to adapt quickly to a new target movement mode and maintain better tracking effectiveness.
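The abstract does not specify how the multi-task experience replay buffer is organized. One plausible reading, sketched below under that assumption, keeps a separate bounded buffer per task (for example, per target movement mode) and samples a batch from each. The class `MultiTaskReplayBuffer`, its methods, and the task names are hypothetical and are not taken from the paper.

```python
import random
from collections import deque


class MultiTaskReplayBuffer:
    """Hypothetical per-task replay buffer: one bounded FIFO per task."""

    def __init__(self, task_ids, capacity_per_task=10000, seed=0):
        self.buffers = {t: deque(maxlen=capacity_per_task) for t in task_ids}
        self.rng = random.Random(seed)

    def add(self, task_id, state, action, reward, next_state, done):
        # The oldest transition is dropped automatically once the deque is full.
        self.buffers[task_id].append((state, action, reward, next_state, done))

    def sample(self, task_id, batch_size):
        # Uniform sampling without replacement, capped at the buffer size.
        buf = self.buffers[task_id]
        k = min(batch_size, len(buf))
        return self.rng.sample(list(buf), k)

    def sample_all_tasks(self, batch_size):
        # One batch per non-empty task, e.g. for a multi-task update step.
        return {t: self.sample(t, batch_size)
                for t, buf in self.buffers.items() if buf}


# Usage with dummy scalar transitions; the task names are illustrative.
replay = MultiTaskReplayBuffer(["straight_line", "circular"], capacity_per_task=100)
for step in range(10):
    replay.add("straight_line", step, 0.0, 1.0, step + 1, False)
    replay.add("circular", step, 0.5, 0.5, step + 1, False)
batches = replay.sample_all_tasks(batch_size=4)
print({task: len(batch) for task, batch in batches.items()})
```

Keeping tasks in separate buffers means a rarely seen movement mode is not crowded out by a frequent one, which is one reason a multi-task update might be built this way.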
Distributed Reinforcement Learning for Flexible and Efficient UAV Swarm Control
Over the past few years, the use of swarms of Unmanned Aerial Vehicles (UAVs)
in monitoring and remote area surveillance applications has become widespread
thanks to the price reduction and the increased capabilities of drones. The
drones in the swarm need to cooperatively explore an unknown area, in order to
identify and monitor interesting targets, while minimizing their movements. In
this work, we propose a distributed Reinforcement Learning (RL) approach that
scales to larger swarms without modifications. The proposed framework relies on
the possibility for the UAVs to exchange some information through a
communication channel, in order to achieve context-awareness and implicitly
coordinate the swarm's actions. Our experiments show that the proposed method
can yield effective strategies, which are robust to communication channel
impairments, and that can easily deal with non-uniform distributions of targets
and obstacles. Moreover, when agents are trained in a specific scenario, they
can adapt to a new one with minimal additional training. We also show that our
approach achieves better performance compared to a computationally intensive
look-ahead heuristic.
Comment: Preprint of the paper published in IEEE Transactions on Cognitive
Communications and Networking (Early Access)