Bayesian Optimization Enhanced Deep Reinforcement Learning for Trajectory Planning and Network Formation in Multi-UAV Networks
In this paper, we employ multiple UAVs coordinated by a base station (BS) to
help the ground users (GUs) to offload their sensing data. Different UAVs can
adapt their trajectories and network formation to expedite data transmissions
via multi-hop relaying. The trajectory planning aims to collect all GUs' data,
while the UAVs' network formation optimizes the multi-hop UAV network topology
to minimize the energy consumption and transmission delay. The joint network
formation and trajectory optimization is solved by a two-step iterative
approach. Firstly, we devise the adaptive network formation scheme by using a
heuristic algorithm to balance the UAVs' energy consumption and data queue
size. Then, with the fixed network formation, the UAVs' trajectories are
further optimized by using multi-agent deep reinforcement learning without
knowing the GUs' traffic demands and spatial distribution. To improve the
learning efficiency, we further employ Bayesian optimization to estimate the
UAVs' flying decisions based on historical trajectory points. This helps avoid
inefficient action explorations and improves the convergence rate in the model
training. The simulation results reveal close spatial-temporal couplings
between the UAVs' trajectory planning and network formation. Compared with
several baselines, our solution can better exploit the UAVs' cooperation in
data offloading, thus improving energy efficiency and delay performance.
Comment: 15 pages, 10 figures, 2 algorithms
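To illustrate how a surrogate fitted to historical trajectory points can steer action exploration, the sketch below uses a simple kernel-regression surrogate with a novelty bonus as a lightweight stand-in for full Bayesian optimization; the function names, the scalar action space, and the exploration bonus are illustrative assumptions, not the authors' actual design.

```python
import math

def kernel(a, b, ls=1.0):
    """RBF similarity between two (scalar) trajectory points."""
    return math.exp(-((a - b) ** 2) / (2 * ls ** 2))

def surrogate(history, x):
    """Kernel-weighted reward estimate at candidate action x from
    (action, reward) history -- a cheap stand-in for a GP mean."""
    if not history:
        return 0.0
    w = [kernel(a, x) for a, _ in history]
    s = sum(w)
    if s == 0:
        return 0.0
    return sum(wi * r for wi, (_, r) in zip(w, history)) / s

def pick_action(history, candidates, explore=0.1):
    """Score candidates by surrogate mean plus a novelty bonus,
    steering exploration away from already-visited actions."""
    def score(x):
        novelty = 1.0 - max((kernel(a, x) for a, _ in history), default=0.0)
        return surrogate(history, x) + explore * novelty
    return max(candidates, key=score)
```

With a history rewarding actions near 0, the picker prefers the candidate closest to the observed high-reward region while the novelty term keeps unexplored actions competitive.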
Power allocation and energy cooperation for UAV-enabled MmWave networks: A Multi-Agent Deep Reinforcement Learning approach
Unmanned Aerial Vehicle (UAV)-assisted cellular networks operating over the millimeter-wave (mmWave) frequency band can meet the high-data-rate and flexible-coverage requirements of next-generation communication networks. However, higher propagation loss and the use of large numbers of antennas in mmWave networks give rise to high energy consumption, while UAVs are constrained by their low-capacity onboard batteries. Energy harvesting (EH) is a viable solution to reduce the energy cost of UAV-enabled mmWave networks. However, the random nature of renewable energy makes it challenging to maintain robust connectivity in UAV-assisted terrestrial cellular networks. Energy cooperation allows UAVs to transfer their surplus energy to other UAVs whose batteries are running low. In this paper, we propose a power allocation algorithm based on energy harvesting and energy cooperation to maximize the throughput of a UAV-assisted mmWave cellular network. Since the channel state is uncertain and the amount of harvested energy can be treated as a stochastic process, we propose a multi-agent deep reinforcement learning (DRL) algorithm, Multi-Agent Deep Deterministic Policy Gradient (MADDPG), to solve the renewable-energy resource allocation problem for throughput maximization. The simulation results show that the proposed algorithm outperforms the Random Power (RP), Maximal Power (MP), and value-based Deep Q-Learning (DQL) algorithms in terms of network throughput.
This work was supported by the Agencia Estatal de Investigación of Ministerio de Ciencia e Innovación of Spain under project PID2019-108713RB-C51 MCIN/AEI/10.13039/501100011033.
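The energy-cooperation idea can be sketched as a simple per-slot battery update rule; the sharing policy, threshold, and capacity values below are illustrative assumptions, not the learned MADDPG policy from the paper.

```python
def step_energy(batteries, harvested, consumed, capacity=100.0, share_thresh=80.0):
    """One slot of battery dynamics with energy cooperation: each UAV
    harvests and spends, then UAVs above share_thresh donate their
    surplus to the lowest-energy UAV (an illustrative greedy policy)."""
    # Harvest/consume, clipped to [0, capacity].
    b = [min(capacity, max(0.0, bi + h - c))
         for bi, h, c in zip(batteries, harvested, consumed)]
    # The most energy-starved UAV receives transfers.
    needy = min(range(len(b)), key=lambda i: b[i])
    for i in range(len(b)):
        if i != needy and b[i] > share_thresh:
            surplus = b[i] - share_thresh
            give = min(surplus, capacity - b[needy])
            b[i] -= give
            b[needy] += give
    return b
```

A richly charged UAV tops up a depleted one, which is exactly the robustness mechanism the abstract motivates under random harvesting.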
Meta-Reinforcement Learning for Timely and Energy-efficient Data Collection in Solar-powered UAV-assisted IoT Networks
Unmanned aerial vehicles (UAVs) have the potential to greatly aid Internet of
Things (IoT) networks in mission-critical data collection, thanks to their
flexibility and cost-effectiveness. However, challenges arise due to the UAV's
limited onboard energy and the unpredictable status updates from sensor nodes
(SNs), which impact the freshness of collected data. In this paper, we
investigate the energy-efficient and timely data collection in IoT networks
through the use of a solar-powered UAV. Each SN generates status updates at
stochastic intervals, while the UAV collects and subsequently transmits these
status updates to a central data center. Furthermore, the UAV harnesses solar
energy from the environment to maintain its energy level above a predetermined
threshold. To minimize both the average age of information (AoI) for SNs and
the energy consumption of the UAV, we jointly optimize the UAV trajectory, SN
scheduling, and offloading strategy. Then, we formulate this problem as a
Markov decision process (MDP) and propose a meta-reinforcement learning
algorithm to enhance the generalization capability. Specifically, the
compound-action deep reinforcement learning (CADRL) algorithm is proposed to
handle the discrete decisions related to SN scheduling and the UAV's offloading
policy, as well as the continuous control of UAV flight. Moreover, we
incorporate meta-learning into CADRL to improve the adaptability of the learned
policy to new tasks. To validate the effectiveness of our proposed algorithms,
we conduct extensive simulations and demonstrate their superiority over other
baseline algorithms.
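A compound action pairs a discrete choice with a continuous control. As a minimal sketch (not the CADRL network itself), the helper below samples one such action: epsilon-greedy over discrete SN-scheduling Q-values plus a Gaussian-perturbed continuous heading; all names and parameters are illustrative.

```python
import math
import random

def compound_action(q_values, mean_heading, std=0.2, eps=0.1):
    """Sample one compound action: epsilon-greedy over discrete
    SN-scheduling choices, Gaussian exploration noise on the
    continuous heading (an illustrative stand-in for CADRL heads)."""
    if random.random() < eps:
        sn = random.randrange(len(q_values))  # explore a random SN
    else:
        sn = max(range(len(q_values)), key=lambda i: q_values[i])
    heading = (mean_heading + random.gauss(0.0, std)) % (2 * math.pi)
    return sn, heading
```

With exploration switched off (eps=0, std=0) the action collapses to the greedy SN and the mean heading, which is how such a policy would be run at evaluation time.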
A Hybrid Framework of Reinforcement Learning and Convex Optimization for UAV-Based Autonomous Metaverse Data Collection
Unmanned aerial vehicles (UAVs) are promising for providing communication
services due to their advantages in cost and mobility, especially in the
context of the emerging Metaverse and Internet of Things (IoT). This paper
considers a UAV-assisted Metaverse network, in which UAVs extend the coverage
of the base station (BS) to collect the Metaverse data generated at roadside
units (RSUs). Specifically, to improve the data collection efficiency, resource
allocation and trajectory control are integrated into the system model. The
time-dependent nature of the optimization problem makes it non-trivial to
solve with traditional convex optimization methods alone. Based on the proposed
UAV-assisted Metaverse network system model, we design a hybrid framework with
reinforcement learning and convex optimization to cooperatively solve the
time-sequential optimization problem. Simulation results show that the proposed
framework is able to reduce the mission completion time with a given
transmission power budget.
Comment: This paper appears in IEEE Network magazine
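The convex half of such a hybrid loop is typically a per-step resource-allocation subproblem with known structure. As an illustration only (the paper does not specify this exact solver), classic water-filling power allocation solved by bisection shows the kind of subproblem an outer RL agent could call at each step:

```python
def waterfill(gains, p_total, tol=1e-9):
    """Water-filling power allocation over channels with the given
    gains, subject to a total power budget. Solved by bisection on
    the water level mu; illustrative, not the paper's exact solver."""
    lo, hi = 0.0, p_total + max(1.0 / g for g in gains)
    while hi - lo > tol:
        mu = (lo + hi) / 2
        used = sum(max(0.0, mu - 1.0 / g) for g in gains)
        if used > p_total:
            hi = mu  # water level too high, spending too much power
        else:
            lo = mu
    return [max(0.0, lo - 1.0 / g) for g in gains]
```

Stronger channels receive more power, and the budget is met exactly; the RL component would then only have to learn the trajectory, as the abstract describes.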
AI-based Radio and Computing Resource Allocation and Path Planning in NOMA NTNs: AoI Minimization under CSI Uncertainty
In this paper, we develop a hierarchical aerial computing framework composed
of high altitude platform (HAP) and unmanned aerial vehicles (UAVs) to compute
the fully offloaded tasks of terrestrial mobile users which are connected
through an uplink non-orthogonal multiple access (UL-NOMA). To better assess
the freshness of information in computation-intensive applications, the
criterion of age of information (AoI) is considered. In particular, the problem
is formulated to minimize the average AoI of users with elastic tasks, by
adjusting the UAVs' trajectories and resource allocation on both the UAVs and the HAP, which
is restricted by the channel state information (CSI) uncertainty and multiple
resource constraints of UAVs and HAP. In order to solve this non-convex
optimization problem, two methods of multi-agent deep deterministic policy
gradient (MADDPG) and federated reinforcement learning (FRL) are proposed to
design the UAVs' trajectories and obtain channel, power, and CPU allocations. It
is shown that task scheduling significantly reduces the average AoI. This
improvement is more pronounced for larger task sizes. It is also shown that
power allocation has only a marginal effect on the average AoI compared to
using full transmission power for all users. Compared with traditional
transmission schemes, the simulation results show that our scheduling scheme
yields a substantial improvement in average AoI.
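The AoI dynamics that such schedulers optimize have a simple recursive form: a served user's age resets while everyone else's grows. A minimal sketch, assuming a reset-to-one convention chosen for illustration:

```python
def aoi_step(ages, served):
    """One-slot age-of-information update: users whose update was
    just delivered reset to age 1; all other users age by 1."""
    return [1 if i in served else a + 1 for i, a in enumerate(ages)]

def avg_aoi(ages_trace):
    """Time-averaged AoI across users over a trace of per-slot ages --
    the quantity the schedulers above aim to minimize."""
    flat = [a for ages in ages_trace for a in ages]
    return sum(flat) / len(flat)
```

Serving the stalest user each slot is the intuition behind AoI-aware scheduling: the reset removes the largest term from the running average.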
Location prediction and trajectory optimization in multi-UAV application missions
Unmanned aerial vehicles (a.k.a. drones) have a wide range of applications in, e.g., aerial surveillance, mapping, imaging, monitoring, maritime operations, parcel delivery, and disaster response management. Their operations require reliable networking environments and location-based services over air-to-air links with cooperative drones, or air-to-ground links in concert with ground control stations. When drones are equipped with high-resolution video cameras or sensors to gain environmental situation awareness through object detection/tracking, precise location prediction for individual drones or drone groups at any instant is critical for continuous guidance. The location predictions can then be used in trajectory optimization to achieve efficient operations (i.e., through effective resource utilization in terms of energy or network bandwidth consumption) and safe operations (i.e., through avoidance of obstacles or sudden landings) within application missions. In this thesis, we explain a diverse set of techniques for drone location prediction, position and velocity estimation, and trajectory optimization involving: (i) Kalman filtering techniques, and (ii) machine learning models such as reinforcement learning and deep reinforcement learning. These techniques enable drones to follow intelligent paths and establish optimal trajectories while carrying out successful application missions under given resource and network constraints. We detail the techniques using two scenarios. The first involves location-prediction-based intelligent packet transfer between drones in a disaster response scenario using various Kalman filtering techniques. The second involves learning-based trajectory optimization that uses various reinforcement learning models to maintain high video resolution and effective network performance in a civil application scenario such as aerial monitoring of persons/objects.
We conclude with a list of open challenges and future work for intelligent path planning of drones using location prediction and trajectory optimization techniques.
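The Kalman filtering building block for location prediction can be sketched in its simplest scalar form; the random-walk motion model and the noise values below are illustrative simplifications of the constant-velocity/constant-acceleration filters a thesis like this would employ.

```python
def kalman_1d(estimate, variance, measurement, q=0.01, r=1.0):
    """One predict/update cycle of a scalar Kalman filter tracking a
    drone's position along one axis (random-walk motion model).
    q: process-noise variance, r: measurement-noise variance."""
    # Predict: motion noise inflates the uncertainty.
    variance += q
    # Update: blend prediction and measurement by the Kalman gain.
    k = variance / (variance + r)
    estimate += k * (measurement - estimate)
    variance *= (1 - k)
    return estimate, variance
```

With equal prior and measurement uncertainty the gain is 0.5, so the new estimate lands halfway between prediction and observation, and the posterior variance halves.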
Joint Trajectory and Passive Beamforming Design for Intelligent Reflecting Surface-Aided UAV Communications: A Deep Reinforcement Learning Approach
In this paper, the intelligent reflecting surface (IRS)-assisted unmanned
aerial vehicle (UAV) communication system is studied, where a UAV is deployed
to serve the user equipments (UEs) with the assistance of multiple IRSs mounted
on several buildings to enhance the communication quality between the UAV and the UEs.
We aim to maximize the overall weighted data rate and geographical fairness of
all the UEs via jointly optimizing the UAV's trajectory and the phase shifts of
reflecting elements of IRSs. Since the system is complex and the environment is
dynamic, it is challenging to derive low-complexity algorithms by using
conventional optimization methods. To address this issue, we first propose a
deep Q-network (DQN)-based low-complexity solution by discretizing the trajectory
and phase shift, which is suitable for practical systems with discrete
phase-shift control. Furthermore, we propose a deep deterministic policy
gradient (DDPG)-based solution to tackle the case with continuous trajectory
and phase shift design. The experimental results show that the proposed
solutions achieve better performance than traditional benchmarks.
Comment: 12 pages, 13 figures
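Discretizing the trajectory and phase shift turns the joint control into a finite action set a DQN can index. A minimal sketch of that enumeration, with illustrative grid sizes:

```python
import itertools
import math

def discrete_action_space(n_dirs=4, n_phases=4):
    """Enumerate the joint discrete action space a DQN variant would
    index: one flight direction paired with one quantized IRS phase
    shift, both uniform on [0, 2*pi). Grid sizes are illustrative."""
    dirs = [2 * math.pi * k / n_dirs for k in range(n_dirs)]
    phases = [2 * math.pi * k / n_phases for k in range(n_phases)]
    return list(itertools.product(dirs, phases))
```

The joint space grows as n_dirs * n_phases, which is exactly why the abstract's DDPG variant handles the continuous case directly instead of refining this grid.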