In this paper, we employ multiple unmanned aerial vehicles (UAVs) coordinated
by a base station (BS) to help ground users (GUs) offload their sensing data.
The UAVs can adapt their trajectories and network formation to expedite data
transmissions
via multi-hop relaying. The trajectory planning aims to collect all GUs' data,
while the UAVs' network formation optimizes the multi-hop UAV network topology
to minimize the energy consumption and transmission delay. We solve the joint
network formation and trajectory optimization problem with a two-step
iterative approach. First, we devise an adaptive network formation scheme that
uses a
heuristic algorithm to balance the UAVs' energy consumption and data queue
size. Then, with the network formation fixed, the UAVs' trajectories are
further optimized by multi-agent deep reinforcement learning, which requires
no prior knowledge of the GUs' traffic demands or spatial distribution. To
improve the
learning efficiency, we further employ Bayesian optimization to estimate the
UAVs' flying decisions based on historical trajectory points. This helps avoid
inefficient action exploration and improves the convergence rate of model
training. Simulation results reveal a close spatial-temporal coupling
between the UAVs' trajectory planning and network formation. Compared with
several baselines, our solution can better exploit the UAVs' cooperation in
data offloading, thus improving energy efficiency and delay performance.

Comment: 15 pages, 10 figures, 2 algorithms
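
The two-step alternation described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the paper's method: the names `form_network`, `step_toward`, and `alternate`, the `weight` parameter, the squared-distance proxy for transmission energy, and the `BS = -1` marker are all hypothetical. In particular, the trajectory step here is a simple move toward the nearest GU, standing in for the learned multi-agent policy, and Bayesian optimization is not modeled.

```python
import math

BS = -1  # hypothetical marker: "send directly to the base station"

def form_network(uavs, bs, queue, weight=0.5):
    """Step 1 (toy heuristic): choose a loop-free next hop for every UAV.

    Cost of a relay = squared link distance (assumed energy proxy)
    plus weight * relay's queue size. Only relays strictly closer to
    the BS are considered, so routes cannot loop.
    """
    next_hop = []
    for i, p in enumerate(uavs):
        d_bs = math.dist(p, bs)
        best, best_cost = BS, d_bs ** 2  # direct link to the BS
        for j, q in enumerate(uavs):
            if j != i and math.dist(q, bs) < d_bs:
                cost = math.dist(p, q) ** 2 + weight * queue[j]
                if cost < best_cost:
                    best, best_cost = j, cost
        next_hop.append(best)
    return next_hop

def step_toward(pos, target, step):
    """Move pos at most `step` units along the straight line to target."""
    d = math.dist(pos, target)
    if d <= step:
        return tuple(target)
    return tuple(a + step * (b - a) / d for a, b in zip(pos, target))

def alternate(uavs, bs, gus, queue, rounds=3, step=5.0):
    """Alternate the two steps: fix the topology, then move the UAVs."""
    for _ in range(rounds):
        form_network(uavs, bs, queue)  # step 1: topology for this round
        # Step 2 (stand-in for the learned policy): head to the nearest GU.
        uavs = [step_toward(p, min(gus, key=lambda g: math.dist(p, g)), step)
                for p in uavs]
    return uavs, form_network(uavs, bs, queue)
```

For example, with the BS at the origin and UAVs at `(10, 0)` and `(20, 0)`, the farther UAV relays through the nearer one while that relay's queue is empty, and switches to a direct link once the relay's queue grows large enough to outweigh the energy saving.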